Reinventing Language Models: The Rise of 'Small Language Models'
After large language models, AI research is turning to "small language models." The Mixtral 8x7B model, developed by the French AI startup Mistral, matches the quality of GPT-3.5 on some benchmarks despite its relatively small size. Mixtral 8x7B uses a "sparse mixture of experts" architecture, which routes each input token to a small subset of specialized expert networks, so only a fraction of the model's parameters are active at any time and inference runs more efficiently. Meanwhile, Microsoft researchers have released the latest version of their home-grown model, Phi-2, which at just 2.7 billion parameters is small enough to run on a mobile phone. Phi-2 sets a new standard for performance among base language models with fewer than 13 billion parameters, leveraging carefully curated, high-quality training data to achieve strong results.
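To make the routing idea concrete, here is a minimal NumPy sketch of sparse mixture-of-experts routing with top-k gating. The dimensions, weight initialization, and the `moe_forward` helper are illustrative assumptions, not Mistral's code; Mixtral's real experts are full transformer feed-forward blocks, with 8 experts per layer and 2 selected per token.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions, far smaller than any real model.
d_model, d_hidden = 8, 16
num_experts, top_k = 4, 2  # Mixtral routes each token to 2 of its 8 experts

# Each expert is a small two-layer feed-forward network.
experts = [
    (rng.standard_normal((d_model, d_hidden)) * 0.1,
     rng.standard_normal((d_hidden, d_model)) * 0.1)
    for _ in range(num_experts)
]
gate_w = rng.standard_normal((d_model, num_experts)) * 0.1  # router weights

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(x):
    """Route a token vector to its top-k experts and mix their outputs."""
    scores = x @ gate_w                # one router logit per expert
    top = np.argsort(scores)[-top_k:]  # indices of the k highest-scoring experts
    weights = softmax(scores[top])     # normalize only over the chosen experts
    out = np.zeros_like(x)
    for w, i in zip(weights, top):
        w1, w2 = experts[i]
        out += w * (np.maximum(x @ w1, 0.0) @ w2)  # ReLU feed-forward expert
    return out  # only top_k experts ever ran: the "sparse" in sparse MoE

token = rng.standard_normal(d_model)
print(moe_forward(token))
```

Because only `top_k` of the `num_experts` expert networks run per token, the model can hold many more total parameters than it spends compute on for any single input, which is the efficiency trade-off the article describes.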