Machine Learning – Can We Please Just Agree What This Means
As a profession we do a pretty poor job of agreeing on good naming conventions for really important parts of our professional lives. “Machine Learning” is just the most recent case in point. It’s had a perfectly good definition for a very long time, but now the deep learning folks are trying to hijack the term. Come on folks. Let’s make up our minds.
How about ‘Big Data’? Terrible. It’s not just about size, although if you asked most non-DS practitioners that’s what they’d say. Or how about ‘Data Scientist’? Nope. Can’t really agree on that one either.
Now we come to ‘Machine Learning’. If you asked the 95 out of 100 data scientists who are not doing deep learning, they would unanimously agree that this definition hasn’t changed over at least the last 15 years:
The application of any computer-enabled algorithm that can be applied against a data set to find a pattern in the data. This encompasses basically all types of data science algorithms: supervised and unsupervised, segmentation, classification, and regression, including deep learning.
Increasingly though there are more and more articles written that hijack this term to mean only deep learning.
It’s natural that the most press is given to the newest and most exciting frontier developments but this is an unnecessary source of confusion. Deep learning specialists have variously argued that machine learning means only unsupervised systems (not unique to deep learning), or systems which automatically discover all features (not unique to deep learning), or simply that it’s synonymous with deep neural nets, or more specifically convolutional neural nets or recurrent neural nets (including LSTM).
Personally, I think the traditional, more inclusive definition is more descriptive and more valuable in describing what we do to non-practitioners.
This is a much-used graphic going around these days, and while I might pick some nits with the labeling, it’s broadly inclusive of traditional predictive analytics, data viz, and what we think of today as AI, which is broader than just deep learning. In other words, it’s the full scope of data science.
What’s Wrong with Calling Deep Learning Machine Learning?
Well, aside from the fact that it takes a much broader definition and makes it unnecessarily narrow, let’s look at some of the individual arguments.
Deep Learning is Different from Traditional Predictive Analytics
This is true in some respects, but let’s look at the origins of deep learning, which lie in deep neural nets. Neural nets have been part of the predictive analytics toolset for decades, and we’ve used them to solve complex regression and classification problems in supervised learning.
Not too many years ago our hardware simply couldn’t keep up with the computational complexity of NNs especially when we started to add hidden layers. But eventually hardware did catch up and we found we could answer both traditional supervised questions and some neat new unsupervised questions by adding more and more hidden layers.
In general, an NN architecture with more than two or three hidden layers is called a ‘deep neural net’, and this is the origin of the phrase ‘deep learning’.
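To make the ‘hidden layers’ point concrete, here is a minimal sketch in plain Python of a feed-forward net whose depth is just the number of layers between input and output. The function names, layer sizes, and activation choices are assumptions made for illustration, and the weights are random rather than trained.

```python
import math
import random

def init_layers(sizes, seed=0):
    """Create random weights and zero biases for consecutive layer sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(layers, x):
    """Pass one input vector through every layer (tanh hidden, sigmoid output)."""
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        x = [1 / (1 + math.exp(-v)) for v in z] if is_output else [math.tanh(v) for v in z]
    return x

def hidden_layer_count(layers):
    """Every layer except the final output layer counts as hidden."""
    return len(layers) - 1

# Hypothetical architectures: 4 inputs, 1 output, 8 units per hidden layer.
shallow = init_layers([4, 8, 1])            # one hidden layer
deep = init_layers([4, 8, 8, 8, 8, 1])      # four hidden layers: 'deep' by the rule of thumb above

print(hidden_layer_count(shallow), hidden_layer_count(deep))
print(forward(deep, [0.1, 0.2, 0.3, 0.4]))
```

The same forward pass serves both nets. The only difference between the ‘traditional’ NN and the ‘deep’ one here is how many hidden layers sit in the list.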
What about some of the other claims?
Deep Learning is Unsupervised
Not true. In fact the problem holding back many image recognition projects is the lack of labeled training datasets. The original success of the cat/not-a-cat CNN image system required millions of pictures of cats and not-cats all of which had to be labeled. Entire businesses like CrowdFlower have grown up around providing human-in-the-loop labeling strategies for just such problems.
Speech processing, which is basically time series analysis using RNN/LSTM deep neural nets, also has to be trained on known ‘good speech’. For chatbots, for example, one strategy is to train the RNN not only for content but also for simple conversational English, using existing datasets of old film scripts.
Deep Learning Requires No Predefined Features
One of the strengths of deep learning is indeed its ability to find feature patterns that humans could probably not predefine. It’s also mostly true that you’re free of the columns of labeled features. However, this is not true of all deep learning. Take, for example, one of the most commercially successful applications, facial recognition. The developers of facial recognition systems predefine from 50 to 400 measurements and ratios found on the human face (e.g. ratios involving the distance from eye to nose or from nose to lips). In fact, facial recognition is completely dependent on these predefined features to work.
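As a rough illustration of what a predefined facial feature might look like, the sketch below computes two distance ratios from a handful of landmark points. The landmark names, coordinates, and the particular ratios are hypothetical and chosen only for this example; they are not taken from any real facial recognition system.

```python
import math

def ratio_features(landmarks):
    """Compute hand-defined distance ratios from (x, y) facial landmarks."""
    eye_to_nose = math.dist(landmarks["left_eye"], landmarks["nose_tip"])
    nose_to_lips = math.dist(landmarks["nose_tip"], landmarks["upper_lip"])
    eye_span = math.dist(landmarks["left_eye"], landmarks["right_eye"])
    return {
        "eye_nose_over_nose_lips": eye_to_nose / nose_to_lips,
        "eye_span_over_eye_nose": eye_span / eye_to_nose,
    }

# Hypothetical pixel coordinates for one face.
face = {
    "left_eye": (30.0, 40.0),
    "right_eye": (70.0, 40.0),
    "nose_tip": (50.0, 65.0),
    "upper_lip": (50.0, 80.0),
}

# The same face photographed at twice the resolution.
face_2x = {name: (2 * x, 2 * y) for name, (x, y) in face.items()}

features = ratio_features(face)
features_2x = ratio_features(face_2x)
print(features)
```

Ratios are a natural choice for this kind of feature because they do not change when the image is scaled, which is why the two feature sets above agree.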
Deep Learning is Unique Because it Uses Deep Neural Nets
Right now the only two elements of what we call AI or deep learning are image processing and text/speech processing. Because these approximate the eyes, ears, and mouths of our theoretical AI robot this is also sometimes called ‘Cognitive AI’. Hundreds if not thousands of startups are at work as we speak implementing image and text/speech capabilities into virtually every application imaginable.
But to equate Machine Learning with Deep Learning and restrict it to deep neural nets sells both machine learning and the near future of AI much too short.
Generative Adversarial deep neural nets are indeed another type of deep learning that isn’t quite commercially ready but will be in the near future. But the other elements of machine learning that support AI do not belong to deep neural nets.
Reinforcement learning is a field of machine learning unto itself and involves training systems by trial and error toward a predefined goal. Actually, reinforcement learning is more like a group of similar problems in search of a solution. There are currently two primary tools in use in RL, Q-learning and TD-learning (Q-learning is itself a temporal-difference method), but this field is new enough that it’s likely other algorithmic approaches will evolve.
Another area of AI machine learning that hasn’t been in the press much lately is Question Answering Machines like IBM’s Watson. QAMs, which may use text, image, or speech deep learning front ends and back ends for input or output communication, are actually modified search engines that use weighted evidence algorithms to find the single best answer to a question. Deep neural nets and deep learning are not their defining feature.
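For readers who haven’t seen it, trial-and-error training toward a predefined goal can be sketched with tabular Q-learning in a few lines of plain Python. Everything specific here is an assumption made for illustration: the five-cell corridor environment, the reward of 1 for reaching the last cell, the function name, and the hyperparameter values. Note that no neural net is involved.

```python
import random

def train_q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Learn action values for a corridor whose goal is the right-most cell."""
    rng = random.Random(seed)
    moves = (-1, +1)                        # action 0 = step left, action 1 = step right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Trial and error: sometimes explore, otherwise take the best-known action.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # Q-learning update: move the estimate toward reward plus the
            # discounted value of the best action in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train_q_learning()
for state, (left, right) in enumerate(q_table):
    print(state, round(left, 3), round(right, 3))
```

After training, stepping right should score higher than stepping left in every non-goal cell, which is the learned policy for reaching the goal.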
Is Deep Learning Really So Unique?
I understand that the new practitioners of CNN and RNN deep learning see their experience of data science as quite different from the 95% of us who still work with predictive analytics, data lakes, recommenders, dynamic pricing, time series forecasting, IoT and all the rest.
They have had to immerse themselves in TensorFlow or one of the other emerging deep learning platforms and struggle with the fact that in this new environment none of it is easy. The hyperparameters associated with our regression and classification techniques are reasonably well understood. In deep learning, not only are these hyperparameters not yet well understood, but it’s likely that many have not yet even been defined.
It’s still possible to run a deep learning training set for weeks and have it fail to train altogether.
It’s more likely that deep learning is not so unique as it is simply young. Every day there are hundreds of deep learning specialists working to make this simple and reliable, and the first innovators to achieve that will be richly rewarded.
Gartner forecasts that by 2018, deep neural nets will be a standard component of 80% of data scientists’ toolboxes. Though I appreciate Gartner’s forecasts, I’ll bet you a bunch that this one is premature. However, I think we’re only two or three years off from a time when deep neural nets have become sufficiently simplified that we can all use them when appropriate. Remember that Gartner’s not saying we’ll use DNNs 80% of the time, only that they will be an available tool.
I also think that equating Machine Learning with deep learning is selling the future of data science in general, and AI specifically, too short. Not only are there other approaches like generative adversarial nets (GANs) and Reinforcement Learning on the near horizon, but I’d make another strong bet that some of what we think of as cutting edge today will be replaced by even newer developments.
Capsule networks, proposed as a replacement for conventional neural networks, are already establishing a place in commercial research. I would also bet that we will see a rebirth of evolutionary or genetic algorithms, either as standalone capabilities or as components in rapidly tuning DNNs. And what will we say about quantum computing in a few years when we apply it to these same problems?
What’s a Better Naming Convention for these New Capabilities?
Right now deep learning is the tip of the spear driving AI. But as we already pointed out, machine learning is broader than deep learning, and the technologies supporting AI are broader than just deep learning.
While it’s tempting to equate deep learning with AI, this would be just as inaccurate as equating deep learning with machine learning. My suggestion: let’s keep it simple and call it what it is, Deep Learning.
Some opinions expressed in this article may be those of a guest author and not necessarily Analytikus. Staff authors are listed in https://www.datasciencecentral.com/profiles/blogs/machine-learning-can-we-please-just-agree-what-this-means