Lexicon

    adversarial learning

    Besides defensive distillation, adversarial learning (adversarial training) is one of the few techniques for defending AI systems against adversarial attacks, i.e. attacks with so-called “adversarial examples”; there is currently no other established way to defend against such attacks.
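    The idea can be sketched with a toy example (not part of the original entry): crafting an adversarial example for a hypothetical linear classifier. The weights, inputs and the eps value are invented for illustration.

```python
# Toy linear classifier: score(x) = w . x, label +1 if the score is
# positive, otherwise -1. The gradient of the score with respect to x
# is simply w, so nudging each feature against the true label (an
# FGSM-style perturbation) pushes the input toward misclassification.

def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def sign(v):
    return 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)

def adversarial_example(w, x, label, eps=0.5):
    """Perturb x so that the score moves against the true label."""
    return [xi - label * eps * sign(wi) for wi, xi in zip(w, x)]

w = [1.0, -2.0]
x = [2.0, 0.5]                    # score = 1.0 -> classified as +1
x_adv = adversarial_example(w, x, label=1, eps=0.6)
# score(w, x_adv) is now negative: the perturbed input is misclassified.
# Adversarial training would add (x_adv, +1) back into the training data.
```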


    artificial intelligence (AI)

    Artificial Intelligence (AI) is a branch of computer science. The term is commonly used to describe the underlying technologies (see Machine Learning). The models generated by machine learning are not actually “intelligent”; they solve narrowly defined problems using complex statistical calculations. AI is the attempt to program machines to acquire human-like abilities.



    annotation

    An annotation is a short note or comment added to a text, image or other document. In linguistics, it mostly refers to the manual marking of certain features of natural language in texts. Example: determining the gender of proper names by marking individual names in texts with labels such as female, diverse, neutral, etc.


    automatic text generation (Natural Language Generation)

    Automatic text generation, also known as Natural Language Generation (NLG), is a branch of artificial intelligence. It is part of Natural Language Processing (NLP) and refers to the generation of natural language text using software. Automatic text generation programs are increasingly used to create content for websites and other online applications. NLG software often uses a combination of machine learning and linguistic algorithms to generate text that follows a particular pattern or template. This requires structured data.


    classification

    Classification describes the task of assigning a data object to one of several previously defined classes. Example: assigning a text to a genre based on its content.
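    A minimal sketch of the genre example, assuming a simple keyword-overlap rule; the class names and keyword lists are invented for illustration.

```python
# Toy classifier: assign a text to one of several predefined classes
# (here: genres) by counting how many genre keywords appear in it.

GENRE_KEYWORDS = {
    "sports": {"match", "goal", "team", "score"},
    "finance": {"market", "stock", "shares", "profit"},
}

def classify(text):
    words = set(text.lower().split())
    # Pick the genre whose keyword set overlaps most with the text.
    return max(GENRE_KEYWORDS, key=lambda g: len(words & GENRE_KEYWORDS[g]))

classify("The team scored a late goal to win the match")  # -> "sports"
```

    Real classifiers learn such associations from annotated training examples instead of a hand-written keyword list.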

    cluster analysis (clustering)

    Cluster analysis or clustering is a machine learning technique that sorts data points into similar groups. It uses unsupervised machine learning algorithms, which require no prior information about the data and rely solely on the similarities between the data points.


    The basic idea behind this method is to group similar data together and place different data in separate groups. In this way, patterns and structures in the data become apparent. The similarities between the data points are calculated using different metrics.
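    The grouping idea can be sketched with a tiny k-means on one-dimensional data (a minimal sketch, not from the original entry; the data points and starting centres are invented).

```python
# Minimal k-means: alternate between assigning each point to its
# nearest centre and moving each centre to the mean of its group.
def kmeans_1d(points, centers, iterations=10):
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        # Update step: each centre moves to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

centers, clusters = kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 10.0], [0.0, 5.0])
# The two groups around 1 and around 9.5 emerge automatically.
```

    Here the "metric" is simply the absolute distance between two numbers; for richer data, other distance measures are used.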

    computational linguistics

    Computational linguistics is the interface between computer science and linguistics. It involves the use of computers to process natural language (both text and audio), such as speech recognition and synthesis, machine translation and dialogue systems. It is therefore an interdisciplinary field concerned with the application of computer technology to language.

    One of its main goals is to enable computers to perform natural, human-like language processing, including comprehension and production. This may require hardware, such as input and output devices, as well as software programs.

    content marketing

    Content marketing is a strategic approach to creating and distributing valuable and relevant content to attract and retain users. The aim is to attract new users and retain existing ones by providing them with informative, useful and/or entertaining content. It is therefore, among other things, a tool for increasing traffic. Content marketing can also improve a company’s image and increase awareness of a brand, product or person. Content can be delivered in a variety of formats, such as blog articles on your own website, videos, podcasts or infographics.


    corpus

    A collection of texts that usually share a context in terms of content or structure. For example, a corpus may consist of texts from a single source.


    crawler

    A crawler is a program that extracts data from web pages and writes it to a database. Crawlers are also known as robots or spiders because they search automatically and their path through the web resembles a spider’s web.

    Spiders usually visit websites via hyperlinks embedded in websites that have already been indexed. The retrieved content is then cached, analysed and, if necessary, indexed. Indexing is based on the search engine’s algorithm. The indexed data then appears in the search engine results.

    Using special web analysis tools, web crawlers can analyse information such as page views and links and compile or compare data in the sense of data mining. Websites that contain no links or are not linked to cannot be detected by crawlers and therefore cannot be found by search engines.
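    The traversal described above can be sketched as a breadth-first walk over hyperlinks. A tiny in-memory "web" (page name to outgoing links) stands in for real HTTP requests; the page names are invented.

```python
# Sketch of how a crawler discovers pages by following hyperlinks.
from collections import deque

WEB = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["post1", "post2"],
    "post1": [],
    "post2": ["home"],
    "orphan": [],        # not linked from anywhere -> never found
}

def crawl(start):
    seen, queue = set(), deque([start])
    while queue:
        page = queue.popleft()
        if page in seen or page not in WEB:
            continue
        seen.add(page)                 # "index" the retrieved page
        queue.extend(WEB[page])        # follow its hyperlinks
    return seen

crawl("home")  # reaches every page except the unlinked "orphan"
```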


    dashboard

    A dashboard is an interactive visualisation of certain data (e.g. business figures) on a separate user interface. The user can change the time period of the data displayed or zoom into a chart to look at something in more detail. Dashboards can be used to transform data into information from which knowledge can be generated.




    data mining

    Data mining refers to the computer-aided evaluation and analysis of large volumes of data to identify patterns and correlations. It uses automated processes for pattern recognition, methods from artificial intelligence, statistics and data analysis.

    Companies and organisations can use data mining to gain valuable knowledge that helps them make better decisions. It involves using historical data to predict likely future developments and identify possible trends or anomalies.

    One example is the analysis of customer data in an online shop. By analysing purchasing behaviour, search queries and demographic information, targeted marketing campaigns can be developed.

    Data mining is also used in the context of automatic text generation. Text mining uses techniques such as natural language processing (NLP) and machine learning (ML) to identify patterns in text. Using algorithms and machine learning, it is possible to generate new coherent and meaningful sentences or paragraphs, which can be used to automatically generate articles or stories.





    deep learning

    Deep learning (DL) is a branch of artificial intelligence and a machine learning (ML) technique based on artificial neural networks. Deep learning algorithms use multiple layers (hence the name) to process and analyse information. This can be used for tasks such as image and speech recognition and natural language processing, i.e. for processes that humans perform intuitively and that cannot be calculated using formulae. The necessary complexity is achieved using a digital layer model. In DL, complex learning effects, and the decisions based on them, arise from the combination of many small, simple decisions and learning effects.

    Deep learning has greatly improved the results of machine learning in many areas, but it is also much more resource-intensive than non-deep ML methods, i.e. single-layer neural networks, or other algorithms that do not require neural networks at all.

    defensive distillation

    Defensive distillation is a defence technique against adversarial attacks in machine learning, specifically in the context of deep learning. The technique protects neural networks from adversarial attacks by making an algorithm’s classification process more flexible, so the model is less vulnerable to exploitation.

    entity extraction

    Entity extraction or entity recognition is a process in which specific information is filtered out of unstructured or semi-structured digital data. Individual, clearly identifiable entities such as people, places, things, terms, etc. are extracted and stored in a machine-readable format. Entity recognition makes it possible to gain insight into unknown datasets by making it immediately clear who and what the information is focused on. This enables more efficient decision making and optimised workflows.



    GPT-2

    GPT-2 is a well-known deep learning language model developed by OpenAI for text generation. It is open source and, with around 1.5 billion parameters, has been trained to generate the next sequence of text for a given sentence.



    GPT-3

    A further development of GPT-2 (see above). Both models use approximately the same architecture, but GPT-3 has more layers (see neural networks) and has been trained with more data.


    GPU

    Graphics Processing Unit / Graphics Card. Graphics cards offer very high computing power and are therefore used for training deep learning models.

    language models

    Language models exist for all areas of computational linguistics. Besides text generation, they are used, for example, for speech recognition, handwriting recognition, and information retrieval and extraction.

    The following types of language models are used by Ella:

    Sequence-to-sequence language model: This is a type of model used in natural language processing (NLP) where the input and output are both sequences of words or tokens. It’s commonly used for tasks like machine translation, where the model takes a sequence in one language and generates a sequence in another language.

    BERT-Variant language model: BERT stands for “Bidirectional Encoder Representations from Transformers”. A BERT-variant language model is a model that is based on the BERT architecture but might have some modifications or improvements, such as different training data, model size, or downstream task fine-tuning.

    Large language model (LLM): This refers to a type of neural network-based language model that is designed to understand and generate human language. “Large” indicates that the model has a high number of parameters (weights and connections) in its architecture. These models are capable of performing a wide range of NLP tasks and often require significant computational resources for training and inference.

    machine learning

    Machine learning refers to a process in which computers learn on their own without being explicitly programmed for each use case. The term covers technologies in which computer programs process a large number of examples, derive patterns from them and apply these to new data points. In the process, a statistical model is built from the examples, for instance a model of language. Deep learning is a variant of machine learning. In everyday usage, “AI” often serves as a buzzword for machine learning.


    metric

    A predefined measurement value that indicates quality in relation to a specific criterion.

    model architecture

    The sequence of all procedures used to build the model. Neural networks are usually described by the number and function of their layers. A model is created from the combination of architecture and corpus.


    morphology

    Morphology, the study of forms, is a branch of linguistics. It is the science of how word forms change in a language. Words are not fixed entities and can change their form depending on the context: for example, write becomes writes and run becomes runs.

    Named Entity Recognition (NER)

    The automatic detection and labeling of proper names (entities) in texts. Example: Angela Merkel and Frau Merkel refer to the same person in two sentences.
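    A very rough sketch of the idea: mark sequences of capitalised words as candidate proper names. Real NER systems use trained models; this regex heuristic is only illustrative (it would, for instance, also catch capitalised sentence starts).

```python
# Toy named entity recogniser: find runs of two or more capitalised
# words, e.g. "Angela Merkel" or "Frau Merkel".
import re

def find_entities(text):
    return re.findall(r"\b(?:[A-Z][a-z]+\s)+[A-Z][a-z]+\b", text)

find_entities("Angela Merkel met Frau Merkel in Berlin.")
# -> ['Angela Merkel', 'Frau Merkel']
```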

    Natural Language Generation (NLG)

    The generation of natural language text using machine learning.

    Natural Language Processing (NLP)

    Natural language processing is the automatic processing of natural language. It uses methods from computational linguistics, artificial intelligence and statistics to recognise, understand, interpret and generate language. These insights can then be used to translate or rewrite texts, for example.

    Natural Language Understanding (NLU)

    Natural Language Understanding describes the ability of machines to understand natural language. This includes both reading and writing natural language and analyzing meaning and context.

    neural networks

    Neural networks are artificial intelligence models inspired by the human brain. They consist of a series of interconnected processing units.

    An artificial neural network consists of layers of interconnected units (neurons) that pass information to each other under certain conditions. Each unit processes a specific signal and passes it on to the next unit. The network learns by processing signals and adjusting the connections between units.

    In Deep Learning, there are usually many such layers, each with a very large number of neurons, which makes training very computationally intensive.
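    The unit-and-layer mechanism described above can be sketched in a few lines (a minimal illustration; the weights, biases and the sigmoid activation are invented for the example).

```python
# One unit (neuron): combine the weighted inputs, add a bias, and pass
# the sum through an activation function before handing the signal on.
import math

def neuron(inputs, weights, bias):
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))   # sigmoid activation

def layer(inputs, weight_rows, biases):
    # A layer is several neurons processing the same inputs in parallel.
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

out = layer([1.0, 0.5], [[0.4, -0.2], [0.1, 0.8]], [0.0, -0.1])
# Learning means adjusting the weights and biases based on the error.
```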

    normalization of texts

    Normalization of texts describes the standardization of text structure and punctuation. For example, all quotation marks and dashes are normalized to one character each, as are section markers such as lines or markers for chapter headings.
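    The quotation-mark example can be sketched as a simple character mapping (a minimal illustration; the particular set of characters chosen here is just a sample).

```python
# Map the many variants of quotation marks and dashes to one
# character each.
QUOTES = "\u201c\u201d\u201e\u00ab\u00bb"   # " (left/right), „ « »
DASHES = "\u2013\u2014"                     # en dash, em dash

def normalize(text):
    table = {ord(q): '"' for q in QUOTES}
    table.update({ord(d): "-" for d in DASHES})
    return text.translate(table)

normalize("\u201eHello\u201c \u2013 she said")  # -> '"Hello" - she said'
```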


    ontology

    In computer science, an ontology is the formalisation of a field of knowledge that describes complex facts in a machine-readable form. A certain structure of objects and their relationships is specified so that computer software can process them. In computer science, ontologies are mainly used for semantic procedures, where one attempts to extract and model knowledge from an unstructured data stock with the help of machine learning. In this way, complex queries can also be made against a data stock, for example.


    preprocessing

    Before a corpus can be passed to an AI model for training, some preprocessing steps have to be performed. Unwanted content is removed, normalisations are performed, and texts are adapted to the specifics of the model. If, for instance, a model has only learned one type of quotation mark in pretraining, the same quotation marks should be used in the training corpus for finetuning, so that they are recognised correctly right away.

    pretraining, pretrained model

    The initial training of a model. Pretraining involves passing very large amounts of data to build a robust statistical model of language and knowledge.

    recommender systems

    A model that suggests additional items based on the user behavior of similar users. Example: ‘Users who viewed this item also viewed the following other items…’

    reinforcement learning

    Reinforcement learning (RL) is a machine learning training method based on rewarding desired behaviours and punishing undesired ones. “Reward” and “punishment” are to be understood in this context as merely numerical values which help the algorithm to find the “best” way to a solution. This approach allows an agent to learn to navigate the complex demands of the specific environment for which it was created so that over time, the agent optimises its behaviours.
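    A toy sketch of the reward idea (not from the original entry): an agent on positions 0 to 3 learns to walk right toward a goal using tabular Q-learning. The environment, rewards and learning parameters are all invented for illustration.

```python
# "Reward" and "punishment" are just the numbers returned by step().
def step(state, action):                 # action: +1 (right) or -1 (left)
    nxt = max(0, min(3, state + action))
    reward = 1.0 if nxt == 3 else -0.1   # goal reward vs. small step cost
    return nxt, reward

def train(episodes=200, alpha=0.5, gamma=0.9):
    q = {(s, a): 0.0 for s in range(4) for a in (-1, 1)}
    for _ in range(episodes):
        state = 0
        while state != 3:
            # Pick the currently best-valued action (greedy policy).
            action = max((-1, 1), key=lambda a: q[(state, a)])
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, -1)], q[(nxt, 1)])
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = nxt
    return q

q = train()
# After training, moving right is valued higher than left in each state.
```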

    robotic journalism

    Robotic journalism is a form of journalism that uses computer-controlled programs to create journalistic content. This content can be news reports, sports scores, weather reports, financial reports, stock market reports, and other forms of journalism.

    search engine marketing (sem)

    Search engine marketing (SEM) is one of the most important types of online marketing. It can be divided into two areas. The first is search engine advertising (SEA), where advertisers pay for their own website to be listed above other websites; this paid advertising is displayed, for example, in special areas of the Google search results page (SERP, Search Engine Results Page) and marked as such. The second is search engine optimization (SEO), where one tries to get one’s own website displayed as high up as possible in the organic search results.

    statistical model

    A statistical model makes predictions about input data based on learned patterns. Language models, for example, predict the next word in an input sentence. A model has an architecture and must be trained to build the statistical model.
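    The next-word example can be sketched with a bigram model, a minimal statistical language model that predicts the word most often seen after the current one (the training sentences are invented).

```python
# Count which word follows which in the examples, then predict the
# most frequent follower.
from collections import Counter, defaultdict

def train_bigrams(sentences):
    counts = defaultdict(Counter)
    for s in sentences:
        words = s.lower().split()
        for a, b in zip(words, words[1:]):
            counts[a][b] += 1
    return counts

def predict_next(counts, word):
    followers = counts[word.lower()]
    return followers.most_common(1)[0][0] if followers else None

model = train_bigrams([
    "the cat sat on the mat",
    "the cat sat down",
    "the dog ran",
])
predict_next(model, "cat")  # -> "sat"
```

    Modern language models replace the raw counts with neural networks, but the prediction task is the same.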

    text mining

    Text mining is data mining specifically for written data in natural language. The text mining process involves the use of algorithms and methods to extract valuable information from unstructured or semi-structured text data, identify new patterns, confirm existing patterns, or make predictions. The insights gained can be applied in many fields, such as science, marketing, customer service or finance.
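    A basic text mining step can be sketched as extracting the most frequent content words from raw text as candidate patterns (a minimal illustration; the tiny stop-word list is an invented sample).

```python
# Count word frequencies, ignoring common function words.
from collections import Counter
import re

STOPWORDS = {"the", "a", "and", "is", "of", "to", "in"}

def keywords(text, n=3):
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [w for w, _ in counts.most_common(n)]

keywords("The mining of text data is the mining of patterns in text")
# the most frequent content words come first: "mining", "text", ...
```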

    text spinning

    Text spinning or article spinning is a technique aimed at changing a text to make it more appealing to a specific target audience. It involves replacing words, changing sentence structures, and inserting new words. The actual content of the text remains unchanged. This is relevant, for example, when creating new texts for search engine marketing (SEM), especially for search engine optimization (SEO).


    training

    During training, a model learns from examples. Based on the examples, the model tries to predict an outcome (for example, filling in a cloze correctly) and compares its results with the real values at the end of each cycle. If the result is wrong, the underlying statistical model is adjusted and a new attempt is started. Usually, training runs until the statistical model hardly changes any more, i.e. until the results become stable. This can take a few minutes (classic machine learning) or weeks to months (deep learning on very large data sets).
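    The predict-compare-adjust cycle can be sketched with the simplest possible model, a single parameter fitted by gradient descent until it hardly changes (a toy illustration; the examples and learning rate are invented).

```python
# Fit y = w * x to examples by repeatedly correcting w until stable.
def train(examples, lr=0.05, tolerance=1e-6):
    w = 0.0
    while True:
        # Compare predictions with the real values (squared-error gradient).
        grad = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
        new_w = w - lr * grad            # adjust the model
        if abs(new_w - w) < tolerance:   # hardly changes -> stop training
            return new_w
        w = new_w

w = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
# w converges to about 2.0, the pattern underlying the examples.
```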

    training corpus

    A corpus used for training a model.