Sunday, August 26, 2012

One of the biggest challenges as a data professional...

Lately, when I hear others having heated discussions about algorithms or models, my heart feels a little bitter (and no kidding, that is the taste of jealousy). How come? I feel like I am facing one of the biggest challenges as a data professional: having to fight with people in order to help them, and to show them the truth using data.

This sounds rather odd, doesn't it? I have seen people work really hard to add a new feature to a product and set their success metric to be an X% overall revenue lift Y days after roll-out. If you ignore the fact that the feature has to be activated by clicking a 4mm x 4mm icon, after users move their mouse over that otherwise invisible icon, the success metric does not seem to be a bad one, right?

First of all, it's a good thing that people try to set up metrics to measure the success of a project before they actually implement it. However, in my opinion, this metric still has a few aspects that need to be reconsidered and validated.
  1. From a product-integrity perspective, the UI design needs to "promote" the new feature, or at the very least do it no harm. Making it so invisible is very unfortunate. Other functional teams had better work on this together as well, for example by sending emails, messages, etc.
  2. Everybody wants a revenue lift. Who doesn't? But not everyone realizes that there are 99 steps before that goal can be reached. For example, users need time to discover the new feature, time to learn it, time to use it, and, if possible, time to increase their usage. The entire process is "time"-consuming. Will those Y days be enough? If not, then setting a metric far down the road under a tight time constraint does not seem like a smart move; the project is likely to fail according to this metric.
  3. An organization with a test system is going to make the feature available in a "rolling-out" fashion, instead of putting it in front of all users at the same time. If, during the entire Y days, only a small fraction of customers are in the test, it's very likely that one won't be able to see that X% lift.
  4. If the feature is new and only going to affect a specific group of users, then the baseline revenue needs to be chosen carefully, and a historical data set should be examined to get some idea of how the baseline changes over time. If the baseline has a bigger natural variation than the X%, it's very unlikely one is going to detect the change they desire to see.
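Points 3 and 4 can be made concrete with a quick back-of-the-envelope check. This is only a sketch with made-up numbers (the 5% target lift, the 10% rollout fraction, and the baseline revenue series are all hypothetical illustrations, not anyone's real product data):

```python
# Back-of-the-envelope check: is an X% lift detectable during a partial roll-out?
# All numbers below are hypothetical illustrations.

target_lift = 0.05        # the "X%" overall revenue lift the team hopes for
rollout_fraction = 0.10   # share of users actually exposed during the test

# Point 3: only the exposed users can contribute to the lift during the
# roll-out, so the observable overall lift is diluted accordingly.
observed_lift = target_lift * rollout_fraction
print(f"Observable overall lift: {observed_lift:.1%}")

# Point 4: compare that against how much the baseline wobbles on its own.
# Pretend this is week-over-week baseline revenue from historical data.
baseline = [100.0, 104.0, 97.0, 102.0, 95.0, 103.0, 99.0, 101.0]
mean = sum(baseline) / len(baseline)
variance = sum((x - mean) ** 2 for x in baseline) / (len(baseline) - 1)
noise = variance ** 0.5 / mean  # relative week-over-week variation

print(f"Baseline noise: {noise:.1%}")
if observed_lift < noise:
    print("Diluted lift is smaller than normal baseline variation "
          "-> a before/after comparison will likely miss it.")
```

With these toy numbers, the 5% lift shows up as only a 0.5% overall change, while the baseline itself swings by about 3% week to week, so the effect drowns in the noise. A proper power analysis would sharpen this, but even this crude comparison is enough to flag a doomed metric before the project starts.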
As more people and organizations realize the value of their data, some of them need to be "educated" on how to make sense of it. That's one of the roles data scientists play. Just as data professionals come in all sorts of shapes and sizes (I mean backgrounds and training :P), their jobs vary in the percentage mix among the roles of "interpreter, teacher, visualizer, programmer and data cruncher".
