Tuesday, March 27, 2012

No, you’re weird

Ever noticed how most behavioral research is based on studies of Western, upper-middle-class, undergraduate university students? If you, like me, are American, it might never occur to you to wonder whether those results can really be generalized to describe the behavior of "people." After reading The weirdest people in the world? (WEIRD stands for Western, Educated, Industrialized, Rich and Democratic), you may want to go back through your favorite studies on decision-making, collaboration, cognition, and symbol interpretation and question your first read.

This paper also has pretty much the best opening paragraph of any academic paper ever. Fair warning: it's not SFW.

Thursday, March 22, 2012

Where’s Gringo?

Because Americans are so geographically isolated, we are often less aware of the signature quirks of our own culture and perspective than are (say) people from patchwork continents like Africa, South America, Europe, The Artist Formerly Known as the Soviet Union, etc. Our biases hide in plain sight. For a dose of cultural perspective from the comfort of your own beanbag chair, do not miss American Cultural Patterns. I’m told this tiny little book was written as culture-shock prep for undergraduates who were entering the Peace Corps, and were traveling overseas for the first time. It delves deeply into kernel-level cultural assumptions about communication, values, morality, the perception of time and causality - the list goes on. In my experience, reading any three pages of this book provokes an hour of fascinated discussion over the late-night-coffee of your choice.

Statistics in Plain English

Yesterday I picked up Statistics in Plain English, by Timothy C. Urdan, and I tell ya I can't put it down. From my review:
I think I can say, without fear of hyperbole, that this is the best math book in the history of the entire universe. The fact that there are only six reviews of the book so far, instead of six hundred, hints at the fundamental problem I personally see in math education: it looks harder than it is because we communicate so poorly about it. Urdan communicates clearly and naturally, so the chilly math textbook mystique drops away, and you are left with a functional vocabulary of basic stats techniques.
Urdan starts with the assumption that all humans can understand and benefit from statistical techniques. By assuming that, he makes it true. He not only defines every term and every symbol he uses -- which is already amazing -- but the new terms and definitions are summarized at the end of each chapter. He lays out lots of context and many straightforward and interesting examples. The chapters are short, which gives you a nice feeling of accomplishment and plenty of breaks to think. He even humanizes the experience by speaking in the first person, expressing personal preferences, and even cracking the occasional joke. It's like talking about math over tea with a good friend. 
In the modern data space, there's a great shortage of people who have a comfortable intuition for stats. If this book were in every undergraduate class, I'd wager that shortage would just go away. 

Monday, March 19, 2012

Perception of self-efficacy, and technology for developing countries

Kentaro Toyama, assistant director of Microsoft Research India, has spent a lot of time thinking about how to use technology to change social systems. He's focused on using technology to further development in rural India (ICT4D, or Information and Communication Technologies for Development). He published a very cool set of essays in The Atlantic in 2011 about the topic. I've been thinking about them ever since. Check 'em out.

Kentaro talks about technology as basically an amplifier for people's will. Don't let the "virtue" language deter you; fully unpacked, it's a pretty loaded concept.

Sunday, March 18, 2012

Why you need an excellent data scientist

The data science hiring space in a nutshell:
Remember, the driver is as important as the car. If you want to make the best use of your BI application, your organization needs the right people to exploit it. BI is not just about reporting and visualization anymore. It involves intensive and creative analysis, along with data management, to create value for an organization. - Got BI? Now You Need to Hire a Data Geek. Here’s What to Look For.


Hal Varian, Google’s Chief Economist, was interviewed a few months ago, and said the following in the McKinsey Quarterly: “The sexy job in the next ten years will be statisticians… The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill.”  - The Three Sexy Skills of Data Geeks 

Data geeks are a hot commodity. Why?

Data is piling up around the industry's ears. We humans are suddenly generating a mountainous drift of accumulating data, growing exponentially, that nobody anticipated having. That mountain is filled with profitable, scientific gems that we are just beginning to learn how to mine out.

The market is unprepared for the demand. Even with the rush to train data miners, the market isn't coming close to keeping up with the pace of the data mountain's growth. Folks like me are hounded by recruiters; folks with 5+ years of data mining experience or education are actively stalked.
CNN coverage: Companies that want to make sense of all their bits and bytes are hiring so-called data scientists - if they can find any. [...] A recent report from the McKinsey Global Institute says that by 2018 the U.S. could face a shortage of up to 190,000 workers with analytical skills.


Data science salaries are growing. The supply/demand disparity is driving up salaries. According to one survey, in 2010, the average data miner salary in the US was $103k; in 2011, it was $113k.

So if professional data miners are so hard to get, why do you need one of us?

[Your Company Here] needs an excellent data scientist. If humans use your digital product, your company is already generating an enormous quantity of ultra-rich data. Based on that fact alone, I can make the following safe bets:

Your data is buggy. No matter how good your testing is, there will be bugs. The bigger the data, the badder the bugs. You are going to need a data analyst who can identify dirty data, scope the damage, and prescribe a solution. Skip that, and you risk spending months acting on an interesting data trend that, in the end, describes nothing but a broken JavaScript call. I've seen it happen over and over.

Skill #2: Data Munging (Suffering). The second critical skill mentioned above is “data munging.” Among data geek circles, this refers to the painful process of cleaning, parsing, and proofing one’s data before it’s suitable for analysis. Real world data is messy. - The Three Sexy Skills of Data Geeks

    Your data has a fluid architecture. As time moves on, your product evolves. Add a new option? Remove a feature? Need to view user behavior through a whole new lens? Like it or not, you have to change your data architecture while it is live. Every time that happens, you add more complexity. You need an analyst who can keep up with that.

    Your data has, or should have, journeyman-level richness. If you send an apprentice-level data miner into that trove, you're going to come out with a handful of iron ore. You can hire an apprentice analyst to run the queries you specify and graph them. You can't hire an apprentice to ask big, hot, actionable, counter-intuitive questions. Those questions grow out of an elbows-deep daily dialogue with your data set, composed of statistics and good old-fashioned nerdly zeal. If you want the gems out of that mine, you need a real industry-level data miner.

    With big data come big headaches, but also a big, big opportunity to earn a reputation for genius product development. If your analyst can't handle the big hairy real-world data mess, or doesn't know statistical relevance from a hole in the wall, you get bunk analysis. If your analyst just plain isn't that into it, you will get shallow, token inquiry. If your analyst is a passionate data/social science geek, you get game-changing analysis, and stand to score media-worthy customer relationship coups.
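
For the curious, here is roughly what "identify dirty data, scope the damage" from the buggy-data bet can look like in practice. This is only a minimal Python sketch: the event log, its field names (user_id, event, ts), and the three checks are all made up for illustration, and a real product would need checks tailored to its own data.

```python
from collections import Counter

# Hypothetical event log: each record is one tracked user action.
# Field names and values are invented for this illustration.
events = [
    {"user_id": "u1", "event": "click", "ts": 1332806400},
    {"user_id": "u2", "event": "click", "ts": None},           # missing timestamp
    {"user_id": "", "event": "purchase", "ts": 1332806460},    # blank user
    {"user_id": "u3", "event": "click", "ts": -5},             # impossible timestamp
    {"user_id": "u4", "event": "view", "ts": 1332806520},
]


def find_problems(record):
    """Return a list of problem labels for one record (empty means clean)."""
    problems = []
    if not record.get("user_id"):
        problems.append("missing user_id")
    ts = record.get("ts")
    if ts is None:
        problems.append("missing ts")
    elif ts < 0:
        problems.append("negative ts")
    return problems


def scope_damage(records):
    """Count each kind of problem and the share of records affected."""
    counts = Counter()
    dirty = 0
    for record in records:
        problems = find_problems(record)
        if problems:
            dirty += 1
            counts.update(problems)
    share = dirty / len(records) if records else 0.0
    return counts, share


counts, dirty_share = scope_damage(events)
print(dict(counts))
print(f"{dirty_share:.0%} of records have at least one problem")
```

Counting problems by type, instead of just dropping bad rows, is the point: it tells you whether you are looking at a one-off glitch or a broken tracking call that has been quietly polluting a trend.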