Most of us are used to thinking of marketing research data – or data in general – in terms of aggregates that change over time. While valuable at the macro level, it is not very useful when attempting to determine customer behavior which in turn drives the marketing strategy and resulting promotional campaign. Internal systems (POS) indicate when a transaction occurred, its monetary value and even who executed it. It is even possible to know whether the transaction is new or repeat business and it can easily be extended to link media exposure and campaign effectiveness directly to sales. Add data mining tools, CRM filters and a range of models and analysis gadgets into the mix, and strategic planners should be ready to fire out a bulletproof strategy based on historical data and probability. That is not the case; an important component is missing: sequence of events.
A customer that buys Nike running shoes and a Reebok track suit is the equivalent to one or two points in the data stream based on whether it is treated as a single event or two. Alone, this information is completely useless. With 100 additional customers of which 9 buy Nike running shoes and Reebok tracksuits, a pattern begins to emerge. Are sales of these two items the result of a promotional campaign, does it have to do with the weather, was there a blockbuster movie featuring these items, a special price or offer, or just fluke chance? What if we had this scenario:
Four of the nine customers that bought the two items had seen the same blockbuster movie less than three days earlier where the items were featured. This information can be gathered once smartphone shopping develops to the extent of competing with credit and debit cards (this is already happening). The day they bought the items, outside temperature has risen from chilly to warm and rain had given way to sunshine and clear skies. There was no special promotion in place nor discount pricing. Is this an indicator that the movie affected sales or was it the combination of the movie and the sudden change in the weather? While the former scenario was limited to internal information, this second scenario takes external information into account. Still, the ability to track customer spending behavior before the actual purchase is valuable but still not good enough. We need to know exactly why the customer came to the store in the first place and why nine customers bought the exact same items (as they bought two brands, we can eliminate brand loyalty to some degree, although they may be loyal to a brand for specific products). A third scenario could be along these lines:
All retailers featuring Nike running shoes and Reebok tracksuits and are located less than a 5 minute walk from a movie theater showing the blockbuster movie experienced an increase in sales of these two items three days after the customers saw the movie. If this happened at the national level, we are talking about a major statistic that can be used for strategic planning (and extends to outlet planning, distribution channels and campaign management). Also, if other retailers offering the same items but not located near the movie theater showing the blockbuster did not incur sales increase, it further supports the probability that the movie and item sales are related. Lastly, if retailers experiencing rain and continued chills in their area fail to experience the sales increase of the two items, we also know that weather does have effect which means that the movie alone is not enough. Our insight to the market has become so much greater than in the first scenario where our visibility was confined to data originating within a closed network.
Whereas the first scenario limited visibility to internal data and the second expanded that view to customer external and internal transactions, the third scenario includes data that lies outside the realm of traditional market research. The Big Data concept will not work unless it renders sales influencers and drivers useful for the end customer, which in this case are marketing managers and researchers. As such, they should not have to browse for anything or generate derivative indicators but get the final result delivered directly into their own strategic marketing model. This means that they will get information from competitor internal systems just like they themselves share internal information with competitors. I expect someone to gasp at this point and for good reason; who in his or her right mind would share internal data with competitors? The very idea is ludicrous … or is it? What if the data shared is of derivative nature instead of relaying actual values?
A useful tool to determine chain of events is correlation. If the correlation between any two indicators is strong (positive or negative), they either directly influence one another or are both influenced by a third indicator. Internal systems can yield this information but they cannot extend that ability outward as comparable data is locked inside other internal systems. Sharing the output of these systems directly is not an option for competitive reasons, but it is still possible to establish a data communications network that protects the internal data while making it available to the outside world, even competitors.
As long as data remains confined within organizations or institutions, long-term strategic planning will contain a great deal of uncertainty. With a mechanism in place that can explore emerging patterns as they develop, that uncertainty is largely reduced and opens up an opportunity for real-time strategic marketing models that shift based on what is happening right now and estimate with a high degree of accuracy what will happen further down the timeline. This does not mean that the entire strategy is constantly being updated, but is does provide the means to adjust it based on changes in market drivers. Such changes may happen very quickly and completely blindside a company. Now what does it take to make this real?
The solution is very simple as far as the framework is concerned. The challenges lie at the backend. Like adapters used for household appliances, the Derivative Data Sharing Framework (DDSF) transforms all data stream plugged into it into a common metric (or common current to continue the electricity analogy). These streams can be of any type, frequency, scale or nature and are used for the sole purpose of generating a running indicator as to how they are related. For the retailer in the scenarios above, there would be no need to look at internal or external data to figure out what the sales drivers in that particular case are. By selecting the Nike running shoes, the DDSF yields all internal and external quantifiable measures that show a relationship within the range set by the user (usually +/- o.8 or above) regardless of origin. The result is a fourth scenario:
Selecting Nike running shoes shows strong positive correlation to other similar products on the market and negative correlation to McDonald’s french fry sales. Temperatures ranging from 10-15°C show a +0.83 correlation while the range 15-25°C increase correlative strength to +0.88. Air humidity levels between 40-60% also reveal strong correlation but so does low inflation, comparative strength of the domestic currency, a drop in unemployment levels, increased labor force, increased demand for studio apartments in downtown Chicago and grain export to Canada. Suddenly, the marketing strategy is no longer confined to internal processing systems and external research entities, but takes the entire planet into account.
This is the theory in a nutshell. Implementing it may appear difficult but it is not as complicated as many would believe as long as the processing power to manage this kind of data flow is available. The bottom line is that it will be possible to estimate sales development based on factors that are located far ahead of actual point of sale just like a domino that falls on another eventually affects the last brick in the chain. Harnessing all data flow is to my knowledge the only means of achieving this, but the only way to do it is to configure the data so that actual values are completely concealed.