Language models are used in all areas of computational linguistics. Besides text generation, applications include speech recognition, handwriting recognition, and information recognition and extraction.
Ella uses the following types of language models:
Sequence-to-sequence language model: This is a type of model used in natural language processing (NLP) where the input and output are both sequences of words or tokens. It’s commonly used for tasks like machine translation, where the model takes a sequence in one language and generates a sequence in another language.
BERT-Variant language model: BERT stands for “Bidirectional Encoder Representations from Transformers”. A BERT-variant language model is a model that is based on the BERT architecture but might have some modifications or improvements, such as different training data, model size, or downstream task fine-tuning.
Large language model (LLM): This refers to a neural network-based language model that is designed to understand and generate human language. “Large” indicates that the model has a high number of parameters (the learned weights and biases of its network). These models can perform a wide range of NLP tasks and often require significant computational resources for training and inference.
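To make the idea of a "number of parameters" more concrete, the following minimal sketch counts the weights and biases of a small fully connected network. The function name `count_parameters` and the layer sizes are hypothetical and chosen only for illustration. They do not describe any model used by Ella, and real LLMs are built from transformer layers that contain further kinds of parameters.

```python
def count_parameters(layer_sizes):
    """Count the weights and biases of a fully connected network.

    layer_sizes lists the number of units in each layer, input layer first.
    (Hypothetical helper for illustration only.)
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection between the two layers
        total += n_out         # one bias per unit in the receiving layer
    return total


# Illustrative layer sizes (an assumption, not taken from any real model).
print(count_parameters([512, 2048, 512]))
```

Even this three-layer example has roughly two million parameters. Models described as "large" typically have many orders of magnitude more, which is why training and inference demand so much compute and memory.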