Heard about self-driving cars? Or seen Facebook suggesting tags on a picture of someone you know? Ever considered whether these two could be even remotely related? Well, yes, they are, related through a breakthrough technology known as Artificial Intelligence (AI), the field of research that focuses on making machines independent in decision-making and other cognitive functions usually associated with the human brain. The field also encompasses Machine Learning and, within it, Deep Learning.
Let us look into the basics of machine learning and deep learning and tackle the key concepts involved in them.
What is Machine Learning?
As quoted in Wikipedia, “Machine learning is a field of computer science that uses statistical techniques to give computer systems the ability to “learn” (e.g., progressively improve performance on a specific task) with data, without being explicitly programmed”. Simply speaking, machine learning is the process of making machines independent in decision-making through examples and experience. It is the process of teaching a computer to carry out a task, rather than programming it step by step to carry that task out.
Let us consider an example to simplify how machine learning works. Human babies learn to identify things over time. Their understanding of their surroundings grows with experience and examples. We constantly observe parents talking to children and helping them identify nearby objects time and again. No rules, just plain old ‘this is this’ and ‘that is that’. In simple terms, that is how machine learning works. Leaving aside the complex maths behind any machine learning algorithm, the general idea is that we make a program (called a learning algorithm) learn from examples that we have provided (a process called training). The algorithm goes over the examples repeatedly, making predictions in each iteration with an eye to decreasing errors and polishing outputs. Finally, the trained model is saved for evaluation and for future predictions on new data.
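The predict, measure, correct loop described above can be sketched in a few lines of plain Python. This is only a toy illustration, not an algorithm taken from any library: the example data, the one-parameter model (prediction = weight * x), the learning rate and the names mean_squared_error and train are all assumptions made for the sketch.

```python
# Toy examples: each input x comes with the answer we want (here, 2 * x).
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def mean_squared_error(weight, data):
    """Average squared gap between the predictions and the known answers."""
    return sum((weight * x - y) ** 2 for x, y in data) / len(data)


def train(data, learning_rate=0.01, iterations=200):
    """Repeatedly predict, measure the error and nudge the weight to reduce it."""
    weight = 0.0  # the model starts out knowing nothing
    history = []
    for _ in range(iterations):
        history.append(mean_squared_error(weight, data))
        # Gradient of the mean squared error with respect to the weight.
        gradient = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= learning_rate * gradient
    return weight, history


weight, history = train(examples)
print("learned weight:", round(weight, 3))
print("error before / after training:", history[0], history[-1])
```

The error recorded in each iteration shrinks as the weight approaches the value that best explains the examples, which is the "decreasing errors and polishing outputs" step in miniature.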
Given the nature of the inputs and the expected outcomes of a machine learning initiative, machine learning algorithms or tasks can fundamentally be categorized into two types:
i) Supervised Learning
ii) Unsupervised Learning.
Machine learning tasks where a supervisor feeds both the input and the expected output to the learning algorithm during the training process are known as supervised learning. Here, the learning algorithm uses the sample inputs (called features) and outputs (called labels) to learn a mapping from input to output. This learned mapping is then used to predict values for new data.
On the other hand, in unsupervised learning, the learning algorithm is not provided with any labels and is left on its own to identify patterns in the data. This approach is normally taken when the human supervisor is not sure what aspects to look for in the data.
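To make features and labels concrete, here is a minimal sketch of a supervised learner in plain Python. It uses a nearest-neighbour rule, chosen here only because it is short; the article does not prescribe any particular algorithm. The measurements, the labels and the function names are invented for illustration.

```python
# Training data supplied by the "supervisor": features (height in cm, weight
# in kg) paired with the expected label. The numbers are made up.
training_features = [(25.0, 4.0), (23.0, 3.5), (60.0, 25.0), (55.0, 20.0)]
training_labels = ["cat", "cat", "dog", "dog"]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def predict(features):
    """Return the label of the closest training example."""
    distances = [squared_distance(features, known) for known in training_features]
    nearest = distances.index(min(distances))
    return training_labels[nearest]


print(predict((24.0, 4.2)))   # a small, light animal
print(predict((58.0, 23.0)))  # a larger, heavier animal
```

The labels are what make this supervised: without training_labels the program would have nothing to map the features to.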
Machine learning algorithms can also be categorized based on the type of output they are expected to predict. Some of the tasks associated with machine learning include:
When the aim of the learning algorithm is to assign new data to one of a list of predefined categories/labels on which the model was trained, the task is known as classification. Questions such as ‘Is this email spam or not?’ or ‘Is this a picture of a cat or a dog?’ fall under this category. If the output is Boolean, meaning one of two possible values, the task is known as binary classification. When there are more than two classes, it is called multi-class classification. Both fall under supervised learning, where the learning algorithm is trained on labeled data.
Similarly, regression is associated with problems where we need to predict a continuous numeric value. How many goals will there be in a football match? What will real estate or stock prices be in the future? These all fall under this supervised learning task. In summary, classification is used to separate data into categories, whereas regression is used to estimate a numeric value, filling in the missing pieces.
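As a rough sketch of regression, the snippet below fits a straight line to a handful of points using ordinary least squares and then predicts a value for an input it has not seen. The house sizes and prices are invented numbers, and fit_line is a hypothetical helper written for this example rather than a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: house size in square metres and price in thousands.
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [155.0, 235.0, 310.0, 350.0]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90.0 + intercept  # a size that is not in the data
print("predicted price for 90 square metres:", round(predicted_price, 1))
```

Unlike classification, the prediction is not picked from a fixed list of labels; it can be any number along the fitted line.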
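On the other hand, clustering is associated with grouping similar entities. Here, the goal is not to separate data into known categories, but to identify patterns within the training data. It is similar to classification, however no labels are provided during the training phase, which makes it an unsupervised learning task. Sentiment analysis can fall under this type of task when no labeled examples are available, although it is more commonly treated as a classification problem.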
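A minimal sketch of clustering, using the k-means procedure as one common choice (the article does not name a specific algorithm). Note that no labels appear anywhere: the program only sees points and decides for itself which ones belong together. The points, the number of clusters and the starting centroids are assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, centroids, iterations=10):
    """Alternate between assigning each point to its nearest centroid and
    moving each centroid to the mean of the points assigned to it."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        assignments = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
# Assumption: two of the points themselves serve as the starting centroids.
assignments, centroids = k_means(points, centroids=[points[0], points[3]])
print("cluster of each point:", assignments)
print("cluster centres:", centroids)
```

The cluster numbers carry no meaning of their own; it is up to a human to look at the groups afterwards and decide what they represent.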
Now that we are familiar with machine learning and its basics, we can move on to the cutting-edge technology known as deep learning.
We constantly notice terms such as deep learning and neural networks coming into the mainstream, with lots of tech giants investing heavily in them. Deep learning is used by Google in its voice and image recognition algorithms, and by Netflix and Amazon to decide what you want to watch or buy next. It is also used by banks to identify fraudulent activities. These buzzwords seem complicated at first; however, once we get familiar with some of the basics, we will realize that deep learning is easy to understand and that the past procrastination and worry have been for nothing.
Firstly, deep learning is a sub-field of AI, more precisely of machine learning, that aims to reproduce the way the human brain identifies patterns and makes decisions based on the patterns it has learned. It aims to capture the thinking process taking place each and every second in the brain and use it to learn patterns in data.
Technically speaking, deep learning is a collection of machine learning algorithms used to model high-level abstractions in data through model architectures composed of multiple nonlinear transformations. In other words, deep learning uses neural networks to carry out the tasks of machine learning. Deep learning is inspired by the structure and functioning of the human brain.
So what is the difference between machine learning and deep learning, you ask?
The core difference between machine learning and deep learning is the use of neural networks. Unlike standard machine learning algorithms, which break problems down into parts and solve them individually, deep learning solves the problem end to end. Better yet, the more data and training time you give a deep learning algorithm, the better it generally gets at solving a task. It is quite easy to comprehend how this works by taking a human being as an example. When we come across a new topic, it takes considerable time and many examples (resources) to grasp the idea and understand the principles involved. By the same analogy, a neural network comes to understand the problem at hand through time and experience over the course of its training phase. The more data that is provided to the network, the better chance it has of identifying patterns in it.
Furthermore, unlike standard machine learning approaches, where we need to extract features from the data-set ourselves, the neural networks in deep learning abstract away feature engineering and make it easier for us to write programs to solve the problem at hand. Feature engineering is the task of identifying the features of an example that make it fall under a specific category. Because the neural network has to learn these features itself, deep learning takes longer to train and requires more powerful hardware, but it often provides a higher level of accuracy than conventional machine learning approaches, particularly when plenty of data is available.
Neural networks are mathematical models whose structure is inspired by that of the brain, with the aim of achieving the cognitive functioning and decision making of the brain. A neural network consists of neurons, each of which takes input from the previous layer of neurons, performs some operation on it and then provides the output to the next layer. A simple representation of the neural network is shown below.
The layers between the input layer and the output layer are known as hidden layers. A neural network is said to be deep if it has more than one hidden layer.
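The flow of information from layer to layer can be sketched as follows. This is an illustrative forward pass only: the weights are hand-picked rather than learned, the sigmoid activation is one common choice among several, and the names layer_forward and network_forward are made up for this example.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Each neuron takes a weighted sum of the previous layer's outputs,
    adds its bias and passes the result through a nonlinear activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the inputs through every layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# 2 inputs -> hidden layer of 3 neurons -> hidden layer of 2 neurons -> 1 output.
# Each entry is (weights, biases); the values are arbitrary, not trained.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.3, -0.6, 0.2], [0.7, 0.1, -0.4]], [0.05, -0.05]),
    ([[1.0, -1.0]], [0.0]),
]

output = network_forward([1.0, 2.0], layers)
hidden_layer_count = len(layers) - 1  # every layer except the output layer
print("hidden layers:", hidden_layer_count)
print("network output:", output)
```

With two hidden layers, this small network already counts as deep under the definition above.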
There are many types of neural networks. An MLP (multilayer perceptron) is the simplest type of feed-forward neural network, where information moves from the input layer to the output layer through the hidden layers. Backpropagation, a supervised learning technique, is used to calculate the gradient of the loss function with respect to the weights in each epoch (one pass through the entire dataset). The weights of the connections between the neurons are then updated using an optimization algorithm such as gradient descent, in order to reduce the loss during training and increase the accuracy of the network.
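The interplay of backpropagation and gradient descent can be sketched with a tiny MLP written in plain Python. Treat it as a teaching sketch under several assumptions rather than a reference implementation: one hidden layer of three sigmoid neurons, a squared-error loss, the XOR truth table as training data, and a learning rate, epoch count and random seed picked arbitrarily. All function names are invented for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    """Return the hidden activations and the network output for one input."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output


def loss(data, params):
    """Mean of 0.5 * (output - target)^2 over the dataset."""
    return sum(0.5 * (forward(x, params)[1] - t) ** 2 for x, t in data) / len(data)


def gradients(data, params):
    """Backpropagation: push the output error back through the layers to get
    the gradient of the loss with respect to every weight and bias."""
    w_hidden, b_hidden, w_out, b_out = params
    n = len(data)
    g_w_hidden = [[0.0] * len(ws) for ws in w_hidden]
    g_b_hidden = [0.0] * len(b_hidden)
    g_w_out = [0.0] * len(w_out)
    g_b_out = 0.0
    for x, t in data:
        hidden, output = forward(x, params)
        delta_out = (output - t) * output * (1.0 - output)
        g_b_out += delta_out / n
        for j, h in enumerate(hidden):
            g_w_out[j] += delta_out * h / n
            delta_hidden = delta_out * w_out[j] * h * (1.0 - h)
            g_b_hidden[j] += delta_hidden / n
            for i, xi in enumerate(x):
                g_w_hidden[j][i] += delta_hidden * xi / n
    return g_w_hidden, g_b_hidden, g_w_out, g_b_out


def train(data, params, learning_rate=0.5, epochs=2000):
    """Gradient descent: one weight update per pass (epoch) over the dataset."""
    w_hidden, b_hidden, w_out, b_out = params
    for _ in range(epochs):
        g_wh, g_bh, g_wo, g_bo = gradients(data, (w_hidden, b_hidden, w_out, b_out))
        w_hidden = [[w - learning_rate * g for w, g in zip(ws, gs)]
                    for ws, gs in zip(w_hidden, g_wh)]
        b_hidden = [b - learning_rate * g for b, g in zip(b_hidden, g_bh)]
        w_out = [w - learning_rate * g for w, g in zip(w_out, g_wo)]
        b_out -= learning_rate * g_bo
    return w_hidden, b_hidden, w_out, b_out


random.seed(0)  # fixed seed so the run is repeatable
hidden_units = 3
initial = (
    [[random.uniform(-1, 1) for _ in range(2)] for _ in range(hidden_units)],
    [0.0] * hidden_units,
    [random.uniform(-1, 1) for _ in range(hidden_units)],
    0.0,
)
xor_data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

trained = train(xor_data, initial)
print("loss before training:", loss(xor_data, initial))
print("loss after training: ", loss(xor_data, trained))
```

Note the division of labour: gradients only computes how the loss changes with each weight, while train decides how far to move the weights in response. Libraries such as Keras perform both steps automatically.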
Python is a great programming language for working with machine learning and deep learning. There are tons of libraries available with which we can start implementing the above concepts.
For machine learning, we have the Python libraries SciPy and scikit-learn. For deep learning, we can explore libraries such as TensorFlow and Keras. While TensorFlow has a huge and supportive community, it is not easy for first-timers to grasp. Keras, on the other hand, is a high-level deep learning API built on top of the TensorFlow/Theano mathematical libraries. Keras provides a high level of abstraction, is more user friendly and makes it easier to develop neural networks. However, since it runs on top of TensorFlow/Theano, it can consume more memory and be slower than the native libraries.
It is always a daunting task to start working with a new technology, and never more so than in the case of machine/deep learning. However, the concepts and mathematics have been made lighter by the various resources and examples readily available.