A Mid-Sized Language Model (MLM) is a generative language model comprising at most 10 billion parameters, organized into multiple layers with attention mechanisms. These layers process and interpret vast amounts of text data, while the attention mechanisms allow the model to focus on the relevant parts of the input. This architecture enables the model to understand and generate human-like language: answering complex questions, writing detailed texts, and engaging in sophisticated conversations.
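To make the attention idea concrete, here is a minimal sketch of scaled dot-product attention in plain NumPy. The matrices, dimensions, and values are illustrative toys, not taken from any particular model:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends to all keys; the softmax weights decide
    which parts of the input each output position focuses on."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of queries to keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V  # weighted mix of value vectors

# Toy example: 3 tokens with 4-dimensional representations
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```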
Essentially, you have three options:
fine-tuning (every training step updates billions of parameters)
training a LoRA (instead of updating billions of parameters, you update just a few million; see the sketch after this list)
Retrieval-Augmented Generation (RAG)
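As a sketch of the LoRA option, here is a minimal setup using the Hugging Face transformers and peft libraries; the checkpoint name and hyperparameters are illustrative assumptions, not a prescription:

```python
# Minimal LoRA setup sketch -- assumes the `transformers` and `peft`
# libraries; "mistralai/Mistral-7B-v0.1" is an illustrative checkpoint.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
# Only the small adapter matrices are trainable -- millions, not billions:
model.print_trainable_parameters()
```

The frozen base model stays intact; only the tiny adapter matrices are saved and trained, which is why a LoRA fits on modest hardware.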
"I"-Avatarization is the process whereby a living human H consciously creates, develops, fine tunes and optimizes (his|her) own generative AI avatar datasets & models.
That is, using datasets (mails, chat transcripts etc.) to create a generative AI copy of one's self (an "I-Avatar") which could provide information in situation when H (her|him)self is not alive anymore.
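In practice, the first step is turning such personal data into a training dataset. Here is a minimal sketch; the file names and the prompt/response JSONL schema are hypothetical, chosen only to illustrate the idea:

```python
# Sketch: turn a chat transcript into JSONL training pairs for an I-Avatar.
# "transcripts.txt" and the prompt/response schema are hypothetical examples.
import json

with open("transcripts.txt", encoding="utf-8") as f:
    lines = [ln.strip() for ln in f if ln.strip()]

# Assume alternating turns: someone else speaks, then H replies.
pairs = [{"prompt": other, "response": reply}
         for other, reply in zip(lines[0::2], lines[1::2])]

with open("avatar_dataset.jsonl", "w", encoding="utf-8") as f:
    for p in pairs:
        f.write(json.dumps(p, ensure_ascii=False) + "\n")
```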
On the machine amsel.udk.ai (running somewhere in this room), many useful Generative AI tools are installed, including:
text-generation-webui, a web-based interface for working with language models
the Training PRO extension, for training LoRAs for Mistral-architecture models
superbooga-v2, for easy RAG prototyping (see the sketch below)
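superbooga-v2 handles retrieval inside text-generation-webui; the standalone sketch below only illustrates the core RAG idea: embed documents, retrieve the one most similar to the query, and prepend it to the prompt. It assumes the sentence-transformers library; the documents, query, and model name are illustrative.

```python
# Core RAG idea in miniature -- assumes `sentence-transformers` and `numpy`;
# the documents, query, and model name are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "H was born in 1970 and grew up in Berlin.",
    "H's favourite instrument is the cello.",
    "H wrote a dissertation on generative grammar.",
]
query = "Which instrument did H like most?"

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = model.encode(docs, normalize_embeddings=True)
q_vec = model.encode([query], normalize_embeddings=True)[0]

# Cosine similarity (vectors are already normalized); pick the best match.
best = int(np.argmax(doc_vecs @ q_vec))
prompt = f"Context: {docs[best]}\n\nQuestion: {query}\nAnswer:"
print(prompt)  # this augmented prompt is what the language model would see
```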