Over the past decade or so, the digital revolution has given us a surplus of data. This is exciting for a number of reasons, chiefly because of how AI can use that data to further transform the enterprise.
However, in the world of B2B, the industry I’m deeply involved in, we still face a shortage of data, largely because the number of transactions is far lower than in B2C. So, for AI to deliver on its promise of revolutionizing the enterprise, it must be able to solve these small data problems as well. Thankfully, it can.
The problem is that many data scientists fall back on bad practices that create self-fulfilling prophecies. This reduces the effectiveness of AI in small data scenarios and ultimately limits AI’s influence in advancing the enterprise.
The trick to applying AI correctly to small data problems is in following correct data science practices and avoiding bad ones.
The term “self-fulfilling prophecy” is used in psychology, investing and elsewhere, but in the world of data science, it can simply be described as “predicting the obvious.” We see this when companies find a model that predicts what already works for them, sometimes even “by design,” and apply it to different scenarios.
For instance, a retail company determines that people who filled their cart online are more likely to purchase than people who didn’t, so it markets heavily to that group. It is predicting the obvious!
Instead, the company should apply models that help optimize what does not work well: converting first-time buyers who don’t already have items in their cart. By solving for this harder case, or predicting the non-obvious, the company will be much more likely to increase sales and acquire new customers instead of just keeping the same ones.
To avoid the trap of creating self-fulfilling prophecies, here’s the process you should follow for applying AI to small data problems:
- Enrich your data: When you don’t have much existing data to work from, the first step is to enrich the data you already have. You can do this by tapping into external data and applying look-alike modeling. This approach is more visible than ever thanks to the recommendation systems used by Amazon, Netflix, Spotify and others. Even if you have made only one or two purchases on Amazon, the company has so much information about products and the people who buy them that it can make fairly accurate predictions about your next purchase. If you’re a B2B company that uses a “single dimension” to categorize your deals (e.g., “large companies”), follow Pandora’s example: just as Pandora describes each song in fine detail (e.g., song title, artist, singer gender, melody construction, beat, etc.), describe each customer along as many detailed attributes as you can. The more you know about your data, the richer it gets. You can go from low-dimensional data with trivial predictions to high-dimensional knowledge with powerful prediction and recommendation models.
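To make the enrichment step concrete, here is a minimal Python sketch of look-alike modeling. Everything in it is an assumption for illustration: the company names, the three external attributes, their values and the choice of Euclidean distance as the similarity measure are all hypothetical, and a real project would use far more attributes and a proper model. The idea is simply to enrich a handful of accounts with external attributes, build a profile of the deals you won, and rank new prospects by how closely they resemble that profile.

```python
from math import dist

# Hypothetical CRM records: only one coarse dimension ("size") per account.
# All three accounts are "large", so size alone cannot explain who bought.
crm = {
    "Acme": {"size": "large", "won": True},
    "Globex": {"size": "large", "won": True},
    "Initech": {"size": "large", "won": False},
}

# Hypothetical external attributes (scaled to 0..1) used to enrich each account.
external = {
    "Acme": {"cloud_adoption": 0.9, "hiring_growth": 0.8, "ecommerce_share": 0.7},
    "Globex": {"cloud_adoption": 0.8, "hiring_growth": 0.9, "ecommerce_share": 0.6},
    "Initech": {"cloud_adoption": 0.2, "hiring_growth": 0.1, "ecommerce_share": 0.3},
    "Umbrella": {"cloud_adoption": 0.85, "hiring_growth": 0.7, "ecommerce_share": 0.8},
    "Hooli": {"cloud_adoption": 0.1, "hiring_growth": 0.3, "ecommerce_share": 0.2},
}

FEATURES = ["cloud_adoption", "hiring_growth", "ecommerce_share"]


def vector(name):
    """Turn an account's external attributes into a numeric feature vector."""
    return [external[name][feature] for feature in FEATURES]


def centroid(vectors):
    """Average the vectors column by column to get a typical profile."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Profile of the accounts we actually won, in the enriched feature space.
won_profile = centroid([vector(name) for name, rec in crm.items() if rec["won"]])

# Prospects are accounts we have external data for but no deal history with.
prospects = [name for name in external if name not in crm]

# Rank prospects by distance to the won profile: closest look-alikes first.
ranked = sorted(prospects, key=lambda name: dist(vector(name), won_profile))

for name in ranked:
    print(name, round(dist(vector(name), won_profile), 3))
```

With a single "size" dimension, every account here looks identical. Once enriched, the prospect that resembles past wins rises to the top of the list, even though the company has only three deals of its own to learn from.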