"If a company tries to do every aspect of data work internally, the amount of investment (software, hardware, storage, personnel, training, etc.) becomes so large, so that only huge jackpots would look like a real success," he said. "That is like setting up a ketchup factory for every household, without considering economy of scale. So, start small with a proof of concept."
3. Make Time for Data Refinement
Data refinement should precede any analytical activity. If you're waiting for a "genius data scientist" to show up and make sense of your mess, you're out of luck, Yu said. Marketers have to invest in this important step. During the refinement process, data is verified, edited, standardized, categorized and summarized, Yu explained. He offered the example of SKU-level data.
"Even small businesses may carry hundreds of thousands of unique products. If they are not categorized and tagged properly, the product data become unusable free-form data for analytics," Yu said.
"Imagine a case where the very top product doesn't even represent 1 percent of the total sales. But if the products are properly categorized, they become the most potent predictor in analytics."
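To make Yu's SKU example concrete, here is a minimal Python sketch of rolling product-level sales up into categories. The SKU codes, category tags and revenue figures are all invented for illustration, and the function name is hypothetical; a real refinement process would also verify and standardize the records first.

```python
from collections import defaultdict

# Hypothetical SKU-level sales rows: (sku, revenue). Invented for illustration.
sales = [
    ("SKU-1001", 120.0),
    ("SKU-1002", 80.0),
    ("SKU-2001", 200.0),
    ("SKU-2002", 50.0),
    ("SKU-3001", 50.0),
]

# Hypothetical category tags assigned during refinement.
sku_category = {
    "SKU-1001": "outdoor",
    "SKU-1002": "outdoor",
    "SKU-2001": "kitchen",
    "SKU-2002": "kitchen",
    "SKU-3001": "toys",
}


def share_by_category(sales, sku_category):
    """Return each category's share of total revenue.

    SKUs without a tag are grouped under "uncategorized" so that
    untagged products stay visible instead of silently disappearing.
    """
    totals = defaultdict(float)
    for sku, revenue in sales:
        totals[sku_category.get(sku, "uncategorized")] += revenue
    grand_total = sum(totals.values())
    return {cat: amount / grand_total for cat, amount in totals.items()}


category_share = share_by_category(sales, sku_category)
```

With thousands of SKUs, each product's individual share of sales can be tiny, while a handful of category shares like these give a model something it can actually use.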
If marketing is the goal, Yu said, databases must provide a "buyer-centric" view—not just brand-, division-, channel- or event-centric views. "That must start with the proper definition of an individual, but unfortunately, most databases are not capable of answering a simple question like 'How many 24-month active customers do you have?' If the answer is: 'We have a million-ish email addresses,'" he continued, "well, that is not even close."
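The gap between "email addresses" and "active customers" can be sketched in a few lines of Python. This assumes the database already resolves each transaction to one individual through a `customer_id` (a hypothetical field here), which is exactly the definitional work Yu says must come first. The records, dates and function names are invented for illustration.

```python
from datetime import date

# Hypothetical transactions: (customer_id, email, purchase_date).
# One individual (C1) appears under two email addresses.
transactions = [
    ("C1", "ann@example.com", date(2023, 11, 5)),
    ("C1", "ann.work@example.com", date(2021, 1, 10)),
    ("C2", "bob@example.com", date(2020, 6, 1)),
    ("C3", "cy@example.com", date(2022, 9, 15)),
]


def cutoff_24_months(as_of):
    """Return the date 24 months before as_of."""
    try:
        return as_of.replace(year=as_of.year - 2)
    except ValueError:
        # as_of is Feb 29 and the year two years earlier is not a leap year.
        return as_of.replace(year=as_of.year - 2, day=28)


def active_customers(transactions, as_of):
    """Return the set of individuals with a purchase in the last 24 months."""
    cutoff = cutoff_24_months(as_of)
    return {
        customer_id
        for customer_id, _email, purchased in transactions
        if cutoff < purchased <= as_of
    }


as_of = date(2024, 1, 1)
active = active_customers(transactions, as_of)
email_count = len({email for _cid, email, _purchased in transactions})
```

In this toy data set there are four email addresses but only two individuals who bought in the past 24 months, so counting addresses would overstate the active base.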
4. Let It Go
Holding on to old data is one of the biggest mistakes a marketer can make. Because companies are operating in an age of big data, they must be able to extract answers from their data, as Yu mentioned earlier. "Just like lots of rocks are thrown out during the refinement process of gold, irrelevant data must be dropped along the way," Yu instructed. "Now, the tricky part is that such relevancy depends on the business goals, and it should be determined by mathematics, not someone's hunch. That is the reason why the business goals must be set first, and enabling analytics must dictate database structure."