Deep learning (DL) is a branch of artificial intelligence and a machine learning (ML) technique based on artificial neural networks. Deep learning algorithms use multiple layers (hence the name) to process and analyse information. This makes them suitable for tasks such as image and speech recognition and natural language processing, that is, for processes that humans perform intuitively and that cannot be captured in explicit formulae. The necessary complexity is achieved through a layered digital model: in DL, complex learning effects, and the decisions based on them, emerge from the combination of many small, simple decisions and learning steps.
Deep learning has greatly improved the results of machine learning in many areas, but it is also much more resource-intensive than non-deep ML methods, such as single-layer neural networks or algorithms that do not use neural networks at all.
Deep learning is based on the idea that machines can learn to perform tasks independently by analysing large amounts of data. This is achieved using multiple layers of artificial neurons that process input data and make predictions based on what they have learnt. In the process, DL models constantly combine what they have learnt with new content, learning as they go. At a certain point, the human is no longer involved in this learning process and the analysis is left to the machine. This is a key difference from classic machine learning methods, where humans retain more control over data analysis and the actual decision-making process.
A key strength of DL is its ability to recognise features and patterns in data without prior specialist programming by humans. Building on this, a previously trained deep learning model can be applied to new data and new tasks, an approach known as transfer learning. This significantly reduces the amount of high-quality data required to train a new model while improving its performance, making deep learning a powerful tool for solving complex problems.
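The idea behind transfer learning can be sketched in plain Python: a "pretrained" feature extractor is kept frozen, and only a small new output layer is trained on the new task's data. The frozen weights and the tiny dataset here are hypothetical stand-ins for a real pretrained network and a real corpus.

```python
import math
import random

random.seed(0)

# Hypothetical "pretrained" feature extractor: its weights stay frozen.
FROZEN_WEIGHTS = [[0.9, -0.4], [-0.2, 0.7]]

def extract_features(x):
    # In real transfer learning this would be a deep network trained
    # on a large, generic dataset; tanh keeps the features bounded.
    return [math.tanh(sum(w * xi for w, xi in zip(ws, x)))
            for ws in FROZEN_WEIGHTS]

# Small labelled dataset for the new task.
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0),
        ([0.8, 0.1], 1.0), ([0.1, 0.9], 0.0)]

# Only this new output layer ("head") is trained.
head = [random.uniform(-0.1, 0.1) for _ in range(2)]
lr = 0.5

def predict(x):
    f = extract_features(x)
    z = sum(w * fi for w, fi in zip(head, f))
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid output in (0, 1)

# Gradient descent on the head only; the extractor is never updated.
for _ in range(200):
    for x, y in data:
        f = extract_features(x)
        err = predict(x) - y
        for i in range(len(head)):
            head[i] -= lr * err * f[i]

print([round(predict(x), 2) for x, _ in data])
```

Because only the small head is trained, far fewer labelled examples are needed than for training the whole network from scratch, which is the practical appeal of transfer learning.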
One of the biggest challenges in deep learning is finding the right architecture for a given problem and the right hyperparameters, the parameters set before training, to improve the accuracy and training speed of DL models.
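Hyperparameter tuning is often done by simply trying candidate values and keeping the best one. A minimal sketch, using a toy one-parameter model (fit w so that w * x approximates y) and the learning rate as the only hyperparameter:

```python
# Toy dataset where the true relationship is y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

def train(lr, epochs=50):
    # Train a single weight w by gradient descent on squared error.
    w = 0.0
    for _ in range(epochs):
        for x, y in data:
            w -= lr * (w * x - y) * x  # gradient step
    loss = sum((w * x - y) ** 2 for x, y in data)
    return w, loss

# Try several candidate learning rates and keep the one
# that reaches the lowest training loss.
candidates = [0.001, 0.01, 0.05]
best_lr, (best_w, best_loss) = min(
    ((lr, train(lr)) for lr in candidates),
    key=lambda item: item[1][1])

print(best_lr, round(best_w, 3))
```

Real hyperparameter search works the same way in principle, but over many more knobs (layer count, layer width, batch size, learning rate schedule) and with the candidates evaluated on held-out validation data rather than the training loss.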
Another important aspect of deep learning is scalability, i.e. the ability of DL models to process large amounts of data. One approach is distributed data processing, where the training workload is spread across multiple computers and specialised accelerators such as GPUs and TPUs (tensor processing units). This allows a model to be trained on much larger amounts of data and significantly reduces training time. There are also many cloud-based deep learning platforms that provide access to powerful GPUs and other hardware accelerators, making it easier for organisations to adopt deep learning.
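One common form of distributed training is data parallelism: the dataset is split into shards, each shard computes a gradient as if it sat on its own device, and the gradients are averaged into a single update. A minimal single-process sketch of that idea (the shards are computed sequentially here; in practice each would run on its own GPU or TPU):

```python
# Toy dataset where the true relationship is y = 3 * x.
data = [(x / 10.0, 3.0 * x / 10.0) for x in range(1, 9)]

def shard_gradient(w, shard):
    # Average gradient of squared error over one shard's examples.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

w, lr, n_shards = 0.0, 0.1, 4
shard_size = len(data) // n_shards

for _ in range(200):
    # Split the data into equally sized shards, one per (simulated) device.
    shards = [data[i:i + shard_size]
              for i in range(0, len(data), shard_size)]
    # Each shard computes its gradient; in real systems this happens
    # in parallel and the results are exchanged over the network.
    grads = [shard_gradient(w, s) for s in shards]
    # Apply one update using the averaged gradient.
    w -= lr * sum(grads) / len(grads)

print(round(w, 3))  # approaches 3.0
```

Because the shards are equally sized, the averaged gradient equals the full-batch gradient, which is why data parallelism can scale training without changing what the model learns per step.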
How are deep learning algorithms constructed?
In most cases, these algorithms form deep neural networks, which consist of multiple layers of artificial neurons modelled loosely on the human brain. These linear and non-linear units process information and can learn from it. The term “deep” refers to this layered structure; there is no precise definition of how many layers, or how many neurons per layer, a network needs in order to count as “deep learning”.
An artificial neural network used for deep learning consists of three types of layer: the input layer, one or more middle layers (hidden layers) and the output layer. The input layer receives information and passes it on, weighted, to the next layer. The middle section may consist of many layers of neurons that re-weight the information and pass it on to further neurons. Exactly how this multi-layered process unfolds cannot be observed from the outside, which is why this area is known as the “hidden layer”. In effect, there is a black box at the heart of an artificial neural network. The last hidden layer is connected directly to the output layer, which receives the repeatedly weighted and processed information as a finished decision and holds it ready for further processing.
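The layer structure described above can be sketched in plain Python. The weights and biases here are arbitrary illustrative values, not trained ones; the point is only the flow from input layer through a hidden layer to the output layer.

```python
import math

def sigmoid(x):
    # Non-linear activation: squashes any value into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # Each neuron computes a weighted sum of all its inputs plus a bias,
    # then applies the non-linear activation.
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Input layer: two input values (features).
inputs = [0.5, -1.2]

# Hidden layer: three neurons, each with one weight per input.
hidden = layer(inputs,
               weights=[[0.4, -0.6], [0.1, 0.8], [-0.5, 0.3]],
               biases=[0.0, 0.1, -0.2])

# Output layer: one neuron combining the three hidden activations
# into a finished decision.
output = layer(hidden,
               weights=[[1.2, -0.7, 0.5]],
               biases=[0.05])

print(output)  # a single value in (0, 1)
```

Training would adjust the weights and biases based on data; deep networks repeat the hidden step many times, which is exactly the multi-layered "black box" described above.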
Where is DL used?
One application of DL is in Natural Language Processing (NLP). DL models such as transformer networks have revolutionised the field of NLP, achieving very good results in tasks such as language translation, text classification and question answering. These models are able to process large amounts of textual data and learn the relationships between words and sentences, making them well suited to NLP tasks.
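At the core of transformer networks is an attention mechanism that scores how strongly each token relates to every other token. A minimal sketch of scaled dot-product attention over toy vectors (the 2-dimensional embeddings are illustrative values, not learned ones):

```python
import math

def softmax(scores):
    # Normalise scores into weights that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention: each query is compared with every key,
    # the scores are softmax-normalised, and the values are mixed
    # according to the resulting weights.
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs

# Toy 2-dimensional embeddings for one query and three tokens.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out = attention(q, k, v)
print(out)
```

Because the attention weights sum to 1, each output is a weighted mixture of the value vectors, which is how transformers let every word's representation draw on its relationships with the other words in a sentence.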
An exciting area of application for deep learning is the production of content that includes images, speech and video as well as text.
How is deep learning different from the human brain?
In DL, a computer mimics the structure and function of the human brain as closely as possible. The software creates a network of artificial nerve cells (neurons) and synapses (points of contact between nerve cells) that learns by processing lots of data, such as images or text.
However, AI neural networks only partially mimic the structure and function of the brain. In the human brain, several multi-layered brain areas are responsible for individual functions and differ in the organisation of their neurons. Brain research is far from complete: fully understanding the interplay and the many complex connections between brain areas, and then reproducing them in artificial neural networks, remains a long-term goal.
Despite its immense potential for solving complex problems, deep learning is therefore not yet classified as “strong AI”. This is because it would require artificial neural networks to develop a similar level of awareness and abstraction to humans, and to be able to accumulate knowledge about the world.
Limitations of deep learning
Despite its many advantages, deep learning has some limitations, chief among them the need for large amounts of labelled data to build good training corpora and train the models. The training data also needs to be diverse enough to avoid biased results.
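One simple sanity check on a labelled training set is its label distribution, since a heavily skewed corpus is a common source of biased models. A minimal sketch (the 20% threshold is an illustrative choice, not a standard value):

```python
from collections import Counter

def label_distribution(labels):
    # Fraction of examples per class in a labelled dataset.
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}

# Toy labelled corpus, heavily skewed towards one class.
labels = ["cat"] * 90 + ["dog"] * 10
dist = label_distribution(labels)
print(dist)  # {'cat': 0.9, 'dog': 0.1}

# Flag classes that make up less than 20% of the data
# (illustrative threshold).
underrepresented = [l for l, p in dist.items() if p < 0.2]
print(underrepresented)  # ['dog']
```

Checks like this do not guarantee unbiased results, but they catch the most obvious imbalance before expensive training begins.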