Strangled by data?

A frustrated ex-Googler writes:
Yes, it's true that a team at Google couldn't decide between two blues, so they're testing 41 shades between each blue to see which one performs better. I had a recent debate over whether a border should be 3, 4 or 5 pixels wide, and was asked to prove my case. I can't operate in an environment like that. I've grown tired of debating such minuscule design decisions. There are more exciting design problems in this world to tackle.

So, Google observes people and their clicks to determine the color or line thickness. When your software phones back every time it is used, it's like having a microphone or camera in a car that detects every mistake, or that measures the response time.

It is easy to optimize the line thickness, but it's more difficult to optimize the overall design of the study. When your working day has 16 hours, and you spend 15 of them on optimization, there is not much time left for new designs.

The analogy carries over to statistical practice: your model is only as good as the data you're using. And the data, while plentiful and accurate, might be preventing you from solving the problem, as if you were looking for your keys under the lamp post. Methodology can often be just as constraining as the data.

Over the past few decades, most policy programs were focused on remediation based on easily measured demographic variables, such as age, gender, income, race, education, ideology, ability - at the expense of variables that are harder to model and measure, such as honor, talent, potential, trustworthiness, motivation.

7 Comments

Aleks- nice post ... a related issue to "optimization" online is that many tests are not conducted in a truly systematic manner. Many companies conduct an A-B or multivariate test and conclude that 3 pixels is better than 4. The problem is that many of these tests are simple two-week snapshots. If they were conducted again four weeks later, the results could be reversed. This occurs quite frequently. As someone who conducts numerous tests on the Internet I find reversals a regular phenomenon. Is it just chance that arises given the alpha or is it something else? I will suggest it is something else.
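To make the "is it just chance given the alpha" question concrete, here is a minimal sketch of how one might compare two such snapshots with a pooled two-proportion z-test. The click and view counts are made up for illustration, and the function name `two_proportion_z_test` is my own, not something from the comment or from any particular testing tool.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (click-rate difference B - A, two-sided p-value).

    Uses the pooled-variance normal approximation, which is reasonable
    only when both variants have many views and a fair number of clicks.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, p_value


# Hypothetical counts (clicks_a, views_a, clicks_b, views_b) for two
# separate two-week snapshots of the same 3-pixel vs 4-pixel test.
snapshots = {
    "weeks 1-2": (500, 10000, 560, 10000),
    "weeks 5-6": (540, 10000, 505, 10000),
}

for label, counts in snapshots.items():
    diff, p = two_proportion_z_test(*counts)
    print(f"{label}: diff = {diff:+.4f}, p = {p:.3f}")
```

With these invented counts the estimated difference changes sign from one snapshot to the next. A test like this only speaks to sampling noise within a snapshot; it cannot tell you whether the audience or its behavior shifted between snapshots, which is the "something else" raised above.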

I agree with this guy - Seems like a lot of work to go through just to decide on the color of a border. I understand that on one level it is important, but to have a design team actively working on this? Why not just hire a psychologist and ask him what the best color would be and be done with it?

Agonizing over optimizing the design to that precision is fool's gold in another respect: if you think the optimal border width is 3pix, I am obliged to ask, "and what's the 90% confidence interval on that," and what's more, "how is it correlated with the font size, text color, window width, and whatnot?" It is a silly exercise; there is no meaningful answer.

What does it mean to remediate age, gender or race?

Anonymous, one can't remediate age, gender or race - this is often referred to as genocide.

In the past decades, remediation in the form of additional funding, legal protection and the like was aimed at eliminating differences that could be explained using these easily measured variables. Also, easily measured test performance, GPA or IQ overwhelm other measures of an individual's value.

My point is that "statistical" approaches often favor easily measured superficial variables with plentiful data over more complex notions that are closer to truth, but might be harder to measure.

Aleks:

Very interesting post.

As background, Daryl Pregibon gave a talk on what they do at Google and it was encouraging to hear that "cheap" randomized studies were replacing rhetoric-based claims about consumer preferences with reason-based answers drawn from the experimental results. The idea that these answers had a limited shelf life and that the randomized experiments need to be regularly redone was also being considered - even if a bit reluctantly.

So my first reaction to this was that it was simply someone being upset at not being able to exert their rhetoric on their own and others' work -- hey, this is still what most successful people (in academia and business) do most of their time, as good randomized evidence is usually too expensive/limited to be competitive AND it is hard to claim randomized evidence as a personal contribution (as one of my old directors used to say - any idiot can randomize to two groups and then observe which group does better).

But as in your follow-up point - the "choice" of what to randomize remains a rhetorical matter (at least according to Popper/Peirce).

Also, if getting reasoned answers to parts of the question distracts from the important rhetorical work of addressing the overall question, that would be counter-productive.

But given that a question has been decided upon, and if randomization is feasible to address it, getting an answer any other way does not make sense, even if it affronts people's sensibilities and "lived" experiences.

there are psychologists at berkeley researching color and spatial preference, so you design folks soon enough won't have to worry about "optimization." here's a link to their website: http://socrates.berkeley.edu/~plab/projects.html (and email kschloss@berkeley.edu , if you are interested. they have a few manuscripts in preparation). interestingly, their research is being funded by google...
