
Data, data, everywhere, not a drib to sniff. To say that almost every organization faces data overload would be an understatement. The problem is serious enough to have spawned entire lines of careers. And yet this data (once qualified) is not providing the value it can, primarily because pie charts, bar graphs, and PowerPoint are grossly inadequate for making sense of the flood; not to mention the lack of trust in how Artificial Intelligence (AI)/Machine Learning (ML)/Deep Learning (DL) interpret it. A new class of visual tools is now needed, one that not only presents data differently but also builds trust. This column will examine “Data Visualization”, perhaps one of the most talked-about emerging concepts and (therefore) one of the most widely misunderstood. And we will look at how to find the story that data is trying to tell us.
Data Visualization is the new way data is presented and used. At the highest level, Data Visualization focuses on the best and the worst, not on all the data. We will look at what this highlighting means a little later. This identification of the best and the worst is done by AI/ML/DL, with the ability for the user to make the final call. The end goal of all this is the ability to tell a “Data Story”. The nature of this interaction between humans and computers varies with the context. In a recent webinar, MIT listed five such variations. Let us look at these in the context of implementation by one of the largest retailers in the world.
At one end is the “Automator” mode, where humans would typically only slow down the decision-making process. An application is the Clearance Sale at the end of a season, where the only priority is to move items off the shelf through multiple micro-decisions, sometimes in the millions. Then there is the “Decider” mode, where the data provides information for humans to make the decision. The application here is Smart Substitution, the method of recommending alternatives to online shoppers when their specific choice is not in stock. Control starts to shift as we move to the “Recommender” mode; here the data suggests optimal solution(s), as in Replenishment. The next level of human-computer interaction is the “Illuminator” mode, where the data provides information to focus human thought in the right direction. An example is Assortments.
Last is the “Evaluator” mode, applicable when the options available are too many or too complex for the human mind to even comprehend, let alone select from and decide. Data is used to generate a comprehensible set of options for humans to finally decide upon. This is useful in extreme events, like a tornado, when Replenishment, Assortment, and the entire Supply Chain need to be rejigged on a dime. It has also been used during the current pandemic by ML-mature organizations, including in vaccine research.
It is clear that data needs to be used selectively, not in its entirety, and with full visibility if not interactivity. According to MIT, such a system requires three core modules: Visualization (to present multiple views side by side), Interactivity (to explore those views, including ‘what-ifs’), and Prescription (for analytics). Certainly, not all components are required in all the modes outlined earlier, at least not at full strength.
Let us now look at what kind of data presentation the three modules need to perform for the five modes. This presentation is a four-step process. First and foremost, identify the top and/or the bottom, and highlight it. Second, add context-specific labels. Third, to make sure the focus does not fall on outliers, bundle the data to show that the “highlight” is not an anomaly; this groups the data so that attention lands in the right place. Lastly, a deep dive may be required into this delineated data for detailed analysis and decision-making. The Visualization Engine mentioned above needs to walk the user through all four steps. The first three steps build trust in the user, and the last one provides interactivity.
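To keep things in plain English yet concrete, the four steps above can be sketched in a few lines of code. The data, the store names, and the choice of “top two/bottom two” are all hypothetical, purely for illustration:

```python
from statistics import mean

# Hypothetical weekly sales per store (the raw data)
sales = {"S1": 120, "S2": 95, "S3": 310, "S4": 88, "S5": 305, "S6": 101}

# Step 1: identify and highlight the top and bottom performers
ranked = sorted(sales.items(), key=lambda kv: kv[1], reverse=True)
top, bottom = ranked[:2], ranked[-2:]

# Step 2: add context-specific labels
labeled = {store: ("top" if (store, v) in top else
                   "bottom" if (store, v) in bottom else "mid")
           for store, v in sales.items()}

# Step 3: bundle -- show the highlight is a group, not a lone anomaly
top_group = [s for s, tag in labeled.items() if tag == "top"]

# Step 4: deep dive into the delineated subset only
avg_top = mean(sales[s] for s in top_group)
print(top_group, avg_top)
```

A real Visualization Engine would render each step visually and interactively, of course; the point here is only the order of operations: highlight, label, bundle, then drill down.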
A story gets created as one takes the Data Visuals at multiple points in time, so that a ‘plot’ starts to emerge. And just as in storytelling, one must establish the magnitude (in turn establishing the need to focus only on the critical subset), have a hero and a villain (both of which can be people, entities, events, circumstances, etc.) with the consequent struggle or opportunity (past, or to be planned by humans), and an emotional arc; and the story needs to sit within the context of the viewers. It usually helps to have conversations with the people who generated the data; this helps with every element of a good story listed above.
And this is how one gets the Data Story.
Let us end with a quick summary of a real Data Story. It is a well-established fact that the laundry bill of most households jumps as autumn sets in. But there is a set of households where this jump is manifold. On dissecting this data, it was found that these were High Net-worth Individual (HNI) households. And right there was a list of consumers more likely to buy luxury goods in the coming year.
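The kernel of that Data Story is a simple seasonal comparison. A minimal sketch, with entirely made-up households, spends, and a hypothetical “manifold” threshold of 3x:

```python
# Hypothetical monthly laundry spend per household: (summer, autumn)
spend = {
    "H1": (40, 55),
    "H2": (35, 150),
    "H3": (50, 62),
    "H4": (30, 140),
}

# Everyone's bill rises in autumn; flag only the households whose
# jump is manifold (here, autumn more than 3x summer)
manifold = [h for h, (summer, autumn) in spend.items()
            if autumn > 3 * summer]
print(manifold)
```

The output of a filter like this is the “list of consumers” in the story; the storytelling lies in recognizing that the list is worth having.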
We will be looking at Data in its interesting forms over the next four to six months; all in plain English, and not as a doctoral thesis. Next up: Data in Wilderness.
The author has managed large IT organizations for global players like MasterCard and Reliance, as well as lean IT organizations for startups, with experience in financial and retail technologies.