BIG Data & Analytics, Data Science, Machine Learning - Random Insights!
#1: What really is BIG Data?
Today, we can store and process so much data that we have nearly captured reality: no more sampling biases/errors or related issues. This is my definition of Big Data, not tera- or petabytes! If you have measured the entire population (or close to it) rather than sampling just a small fraction, the resulting data is BIG Data!
#2: Analytics - what is in a name?
Analytics, Machine Learning, Data Science, pattern recognition, statistical modeling, data mining, knowledge discovery, predictive analytics, adaptive systems, self-organizing systems, . . . There ARE subtle technical differences but let us just call it “Analytics”, at least in business applications!
#3: Dynamics & Context Sensitivity:
When you hear “dynamics”, time always comes to mind first but it is only one of the many possibilities. Dynamics could be over any independent variable! Context-sensitivity to better identify a word within a sentence is also an example of dynamics.
#4: Artificial Intelligence & Intelligence Augmentation:
As a card-carrying Neuroscientist, I can say that creating Artificial Intelligence by mimicking the human brain seems to be a fool’s errand. If you think about neuronal axons along which electrical spikes travel, they are like conducting wires with insulation scraped off every few millimeters! THEN, the whole jumble of these billions of wires is dunked in a salt solution! Try sending your TCP/IP packets along such a network . . . I believe that the electric fields that these leaky wires produce have a major role to play in how the brain works. However, in silicon implementations of artificial brains today, electric fields are actively suppressed by careful physical layout and ground planes!
Now, there is a glimmer of hope for Artificial Intelligence of this sort. Our ancestors saw flying birds and dreamt of flying; our planes don’t flap their wings but they fly really, really well! Similarly, there may be a pair of Wright Brothers who find that last twist in the story and create Artificial Intelligence that works like the human brain but NOT by “flapping the wings” . . .
Instead of AI, I am a believer in IA or “Intelligence Augmentation”! When Doug Engelbart wrote to ARPA about IA in the 1960s, he could not have foreseen what Big Data can do. Engelbart’s “Mother of All Demos” was really about Communication augmentation: computer mouse, video conferencing, hypertext, etc. With Big Data & Analytics, NOW we can truly do Intelligence augmentation! This is why all of us Data Scientists ought to be excited to be working in this field.
#5: UNIFIED Machine Learning:
Pattern Recognition and Classification is an excellent unifying perspective for Machine Learning (ML). In particular, the classic textbook of Duda & Hart, “Pattern Classification & Scene Analysis”, published in 1973, is a great starting point!
Given labelled samples, obtain a class description consisting of either a distance metric (Euclidean, intra-class, etc.) or a probability density function and then derive a decision rule (Maximum A-posteriori Probability, Bayes, etc.) from the description. The decision rule specifies a decision boundary in feature space among classes. Alternatively, a decision surface can be derived directly from labelled samples, in which case it is called a “Discriminant Function”.
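To make the "class description, then decision rule" recipe concrete, here is a minimal sketch. It assumes a single real-valued feature and a Gaussian density per class; the toy samples, the function names and the Gaussian choice are all my illustrative assumptions, not something prescribed by Duda & Hart.

```python
import math
from statistics import mean, pvariance

def fit_class_descriptions(samples):
    """samples: dict mapping label -> list of 1-D feature values.
    Returns dict mapping label -> (prior, mean, variance)."""
    total = sum(len(values) for values in samples.values())
    return {
        label: (len(values) / total, mean(values), pvariance(values))
        for label, values in samples.items()
    }

def log_gaussian(x, mu, var):
    """Log of the Gaussian density N(x; mu, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

def map_decide(x, descriptions):
    """MAP rule: pick the class with the largest log prior + log likelihood."""
    return max(
        descriptions,
        key=lambda label: math.log(descriptions[label][0])
        + log_gaussian(x, descriptions[label][1], descriptions[label][2]),
    )

# Hypothetical labelled samples for two classes.
labelled = {
    "low": [1.0, 1.2, 0.8, 1.1, 0.9],
    "high": [3.0, 3.3, 2.7, 3.1, 2.9],
}
descriptions = fit_class_descriptions(labelled)
print(map_decide(1.0, descriptions), map_decide(3.2, descriptions))
```

The decision boundary is implicit here: it is the feature value where the two log posteriors are equal.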
Discriminant Functions can be linear or nonlinear (neural network with back-propagation, deep learning, support vector machines, kernel PCA, etc.) and outputs can be binary, integer or real valued. Various learning algorithms can be seen as belonging to the family of iterative/ recursive/ adaptive learning algorithms (Least Mean Square being a great old standby!) that update the parameters of the Discriminant Function as new data arrive.
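As a rough illustration of the "update the parameters as new data arrive" idea, here is a sketch of the Least Mean Square update for a linear Discriminant Function. The synthetic stream (targets generated by 2*x1 - x2), the step size and the function names are assumptions made up for this example.

```python
import random

def lms_update(weights, features, target, step=0.1):
    """One LMS step: w <- w + step * error * x."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = target - prediction
    new_weights = [w + step * error * x for w, x in zip(weights, features)]
    return new_weights, error

def train(stream, n_features, step=0.1):
    """Run LMS over a stream of (features, target) pairs, one sample at a time."""
    weights = [0.0] * n_features
    for features, target in stream:
        weights, _ = lms_update(weights, features, target, step)
    return weights

# Hypothetical noiseless data stream: target = 2*x1 - 1*x2.
rng = random.Random(0)
stream = []
for _ in range(2000):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    stream.append((x, 2.0 * x[0] - 1.0 * x[1]))

weights = train(stream, n_features=2)
print(weights)  # should end up close to [2.0, -1.0]
```

Because each sample is used once and then discarded, the same loop keeps working when the underlying relationship drifts slowly, which is the "tracking" property that matters later in this article.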
Context-sensitivity (for identifying a word within a sentence as an example) or Dynamics can be added to improve classification by incorporating Markov models (or Hidden Markov Models for tractable computations). A Markov model is a special case of State Space Models, which are well-studied in Systems Theory.
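Here is a small sketch of how context can overrule an item-by-item decision, using a Hidden Markov Model decoded with the Viterbi algorithm. The two hidden classes "A" and "B", the observations "x" and "y" and all the probabilities are toy values assumed purely for illustration.

```python
import math

def viterbi(observations, states, start, trans, emit):
    """Most likely hidden state sequence for the observations (log domain)."""
    # best[s] = (log-probability of the best path ending in s, that path)
    best = {
        s: (math.log(start[s]) + math.log(emit[s][observations[0]]), [s])
        for s in states
    }
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            prev = max(states, key=lambda p: best[p][0] + math.log(trans[p][s]))
            score = best[prev][0] + math.log(trans[prev][s]) + math.log(emit[s][obs])
            new_best[s] = (score, best[prev][1] + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]

states = ["A", "B"]
start = {"A": 0.5, "B": 0.5}
# "Sticky" transitions: neighbouring items tend to share a class (the context).
trans = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}}
# Weak, noisy evidence from each item taken on its own.
emit = {"A": {"x": 0.6, "y": 0.4}, "B": {"x": 0.4, "y": 0.6}}

observations = ["x", "x", "y", "x", "x"]
context_free = [max(states, key=lambda s: emit[s][o]) for o in observations]
decoded = viterbi(observations, states, start, trans, emit)
print(context_free)  # item-by-item decisions
print(decoded)       # decisions using the Markov context
```

Taken alone, the middle observation looks like class "B"; with the Markov context, the decoder treats it as a noisy "A" because a lone switch of class is unlikely under these transition probabilities.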
Unsupervised Learning is very useful in transforming basic features into more meaningful ones. One usually brings in some overall desirable property to guide unsupervised learning. For example, Mutual Information among classes can be minimized as a learning process in the belief that the “best” classification happens when the classes have least overlapping information (better “efficiency” in representation).
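For readers who want to see the quantity being minimized, here is a sketch that computes Mutual Information (in bits) between two discrete variables from a table of joint counts. The count tables are invented for illustration, and the minimization procedure itself is not shown, only the measure that would guide it.

```python
import math

def mutual_information(joint_counts):
    """Mutual information in bits from a 2-D table of joint counts."""
    total = sum(sum(row) for row in joint_counts)
    row_p = [sum(row) / total for row in joint_counts]
    col_p = [sum(col) / total for col in zip(*joint_counts)]
    mi = 0.0
    for i, row in enumerate(joint_counts):
        for j, count in enumerate(row):
            if count:  # zero cells contribute nothing
                p = count / total
                mi += p * math.log2(p / (row_p[i] * col_p[j]))
    return mi

# Hypothetical tables: no shared information vs. fully shared information.
independent = [[25, 25], [25, 25]]
redundant = [[50, 0], [0, 50]]
print(mutual_information(independent), mutual_information(redundant))
```

The first table gives zero (no overlap in information) and the second gives one full bit (complete overlap), which are the two extremes a learning process would be steering between.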
In all of the existing ML bags of tricks, we are still staying at the surface level! We are modeling the attributes or data DIRECTLY. What if we went one level deeper? Model the SYSTEM that generates the data! Syzen Analytics, Inc., takes such an explicit approach in what is called “SYSTEMS Analytics”.
Once the patterns have been recognized and classes identified, the resulting classes can be used for all sorts of applications such as Recommendation Engines, Language Translation, Fraud Detection and many others. For deeper insight, see “Machine Learning – a unifying perspective & new paths”.
#6: Business Needs, Prescriptive Analytics & SYSTEMS Analytics:
There is hardly ever a business solution that is “one and done”! What do I mean by that? One-shot solution that will, FOREVER, solve the problem . . . like life-long immunity from chickenpox (even that is not perfect – shingles can show up later in life). Almost all business problems I have seen are more like needing yearly “flu shots” - monitor the outcomes of the first solution, tweak the mix, administer again after a while and so on. Continuous Intelligence Augmentation . . .
After many customer interactions, I am convinced of the following simple “syllogism” in Analytics:
Businesses need actionable insights from data – “PRESCRIPTIVE” Analytics.
Prescriptive Analytics business solutions are like “flu shots” . . . treat, monitor, update & treat again over time.
Businesses need closed-loop Analytics at the right time-intervals!
That is Analytics as a system or “SYSTEMS Analytics”.
As Data Scientists, we have a duty to educate business clients to subscribe to this view, which I call the “goal-seeking” or tracking solution concept. The first solution may be just 80% of the way there, but it is tracked and improved over time. With such realistic expectations, your customer will be delighted if the trajectory is good and fast. For a data science company, this is good too: recurring revenue! Clearly, a win-win situation. So, collect data over time . . . and provide “Goal-seeking and Tracking solutions” within a Systems framework.
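The "treat, monitor, update & treat again" loop can be sketched as a simple feedback controller. Everything specific here is assumed for illustration: the made-up linear `respond` function standing in for the business, the goal of 50, and the proportional gain. A real solution would learn the response from data.

```python
def track_goal(goal, respond, lever=0.0, gain=0.5, periods=20):
    """Closed loop: apply the lever, monitor the outcome, update the lever."""
    history = []
    for _ in range(periods):
        outcome = respond(lever)   # treat and monitor
        error = goal - outcome     # how far are we from the goal?
        lever += gain * error      # update before the next "flu shot"
        history.append(outcome)
    return lever, history

def respond(lever):
    """Hypothetical business response to the lever setting."""
    return 0.8 * lever + 10.0

final_lever, history = track_goal(goal=50.0, respond=respond)
print(history[0], history[-1])
```

The first outcome is well short of the goal, and each later period closes a fixed fraction of the remaining gap, which is the kind of "good and fast" trajectory the customer should be promised instead of a one-shot fix.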
#7: SYSTEMS Analytics technology roadmap:
We have developed what is known as SYSTEMS Analytics as an effective Machine Learning tracking solution framework to address the dynamics of Retail Commerce. Details are explained in a YouTube video: “Future of Analytics – a roadmap”. https://youtu.be/1TAYLQw3u9s
As guidance for your IA implementations for Retail or other business problems where “dynamics” (tracking solution) is important:
We have sketched out a roadmap of increasingly complex tools that can be brought to bear on Analytics or Machine Learning of today.
These tools provide high-value features in terms of system parameters, a framework for closed-loop real-time Analytics and ways to possibly accommodate the networked nature of data sources.
Theories of all the techniques discussed in the video are fully or partially developed but will require additional development to reach their full potential for Analytics applications.
Breakthrough business applications of the later milestones will require significantly more development in collaboration with business domain experts.
#8: Why is Predictive Analytics important to business?
A prerequisite for performance at a high level in business is the ability to understand and manage complexity. Managing complex systems properly requires a ton of data at the right time. BIG Data provides the data we need. To put these data to work, so that we can reach the high levels of complexity required while still managing it, we have to anticipate what is about to happen and react when it happens in a closed-loop manner. Predictive Analytics will allow us to push our “system” to the edge (without “falling over”) in a managed fashion. This is why businesses embrace Predictive Analytics: to manage businesses at a high level of performance at the edge of complexity overload.
#9: eCommerce or “brick-and-mortar”?
You may ask, “Why not address e-commerce with Analytics?!” Look at 2014 data for the US – Total In-store Retail: $4 TRILLION; e-commerce: $300 Billion or 6.4%; you want to address the MUCH bigger business opportunity! E-commerce growth is levelling off and the e-commerce plus m-commerce total in 2018 is projected to approach 11%. (Source: eMarketer, 2014). By then, the OMNI channel movement will make the distinctions among in-store, e-commerce and m-commerce irrelevant – sales attribution to any single entity will not make sense anymore!
Whenever there are constraints such as display space, warehousing cost, advertising dollars, personalization challenge, etc., there is a “product density” problem. Analytics can be put to work (“relevance” engine, for example) to optimize the solution! Here is a retail grocery example . . .
#10: Behavioral Segmentation Usage – depends on Purchase Funnel phase!
The state-of-the-art in Commerce is “behavioral” segmentation where the market is divided into segments based on pre-selected personal/ demographic characteristics.
Clearly, bucketizing shoppers into convenient segments such as “Price sensitive” or “Families” allows one level of meaningful abstraction. Instead of addressing millions of shoppers individually, one can tailor marketing, merchandising and loyalty efforts to a handful of labelled groups.
However, what is helpful at one level can be a flawed approach for some applications. Consider a case where a particular shopper, per behavioral segmentation, ended up in the Price Sensitive bucket. While this may be true in general for her, she may have specific preferences in certain product categories; for example, while Price Sensitive in general, her wine choice may be expensive brands! Such misallocations when multiplied by millions of shoppers lead to flawed product assortment decisions in the case of Behavioral Segmentation applied to Merchandising.
Syzen Analytics takes a better approach, using shopper big data and Machine Learning (ML) to create and identify “segments” for Merchandising. In ML segmentation, the shoppers (whatever their behavioral characteristics may be) fall into N Preference Groups based on what they actually buy (actual purchase pattern is a great proxy for true product preference). In essence, each product category is its own unique market. Improvement due to the ML data-driven method was demonstrated to be FIVE times higher than that due to Behavioral Segmentation.
The ML method has several advantages when it comes to Merchandising. Whenever the data itself determines the groups, rather than the groups being externally imposed, data analysis history has shown that results will be superior. Another nice feature is that the human labor and subjectivity involved in “bucketizing” can be avoided, which makes the analysis fast, inexpensive and repeatable. The fact that separate preference groups are generated for every product category, and that the ML method acts on what people purchase rather than why, has led to breakout applications of this ML method in Retail Merchandising product assortment optimization.
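To illustrate grouping shoppers by what they actually buy within one product category, here is a sketch that uses plain k-means as a stand-in. The article does not say which algorithm Syzen Analytics uses, so k-means, the two-tier wine category and the purchase shares below are all assumptions for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assign each shopper to the nearest group centre.
        labels = [
            min(range(k), key=lambda i: sq_dist(p, centroids[i])) for p in points
        ]
        # Move each centre to the mean purchase pattern of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical wine-category purchase shares per shopper: [budget, premium].
# The third shopper may sit in a "Price Sensitive" bucket overall,
# yet her wine purchases are mostly premium brands.
wine_shares = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels, centroids = kmeans(wine_shares, k=2)
print(labels)
```

Running the same grouping separately for every product category is what lets a shopper land in a "premium" group for wine and a "budget" group elsewhere.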
In summary, while Behavioral Segmentation may be valuable at the “top end of the funnel” where *why* people buy can be a meaningful basis for awareness campaigns such as TV advertising, further down at the Desire and Action phases of the Purchase Funnel, *what* they buy or shopper “purchase propensity” is more relevant for Merchandising optimization.
#11: Canonical Retail Commerce SYSTEM:
Remember “Goal-seeking and Tracking solutions” as a System (see #6). A complete characterization of the essential elements of retail dynamics is captured in this canonical diagram. The business objective of ANY retail commerce operation is to increase customer acquisition and retention; then all good business results follow. Business owners have three levers to effect change: Marketing, Loyalty and Merchandising.
#12: Better Analytics - Spatio-temporal data:
As businesses push to higher levels of performance (see #8), higher fidelity models are going to be necessary to produce more accurate and hence valuable predictions and recommendations for business operations.
ALL data are spatio-temporal! From the simplest to the more complex levels:
Data can be considered isolated at the simplest level – a “snap shot”.
Then we realize that data exist in a network with mutual interactions.
In reality, data exist in *embedded* forms in “influence” networks of one type or the other which are distributed in time and space – a “video”!
The spatial extent of data (distance) can be folded into time if we assume a certain information diffusion speed. Graph-theoretic methods do not account for the time dimension. For accurate analysis, there is no escaping Dynamics over Time, meaning the use of differential (or difference) equations . . . and Systems Theory!
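As a toy sketch of "folding distance into time", the difference equation below lets an influence from a distant source arrive after a delay equal to distance divided by diffusion speed. The first-order model x[t] = a*x[t-1] + b*u[t-delay] and all the parameter values are assumptions chosen for illustration.

```python
def simulate(inputs, a=0.5, b=1.0, distance=3.0, speed=1.0):
    """First-order difference equation driven by a delayed input:
    x[t] = a * x[t-1] + b * u[t - delay], with delay = distance / speed."""
    delay = round(distance / speed)  # spatial distance folded into time steps
    x = 0.0
    states = []
    for t in range(len(inputs)):
        u = inputs[t - delay] if t >= delay else 0.0
        x = a * x + b * u
        states.append(x)
    return states

# A single "event" at the source at time 0 (an impulse).
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
states = simulate(impulse)
print(states)
```

Nothing happens at the receiving node until the influence has had time to travel; after that the effect decays geometrically. A static graph view would record only that the two nodes are connected and miss both the delay and the decay.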
Systems Theory + Analytics = “SYSTEMS Analytics”. Some example business applications . . .
Some opinions expressed in this article may be those of a guest author and not necessarily Analytikus. Staff authors are listed http://www.datasciencecentral.com/profiles/blogs/big-data-analytics-data-science-machine-learning-random-insights