Microsoft Releases Phi-3 “Small Language Model”
- By Paul Mah
- May 01, 2024
Microsoft last week introduced Phi-3, the latest iteration of its “Phi” family of small language models, or SLMs. The software giant says Phi-3 outperforms models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks.
SLMs are AI models designed specifically to be smaller in size. These compact models require fewer computational resources to run, making them ideal for resource-constrained environments such as standalone appliances, or for offline inference.
They are also well suited to situations where very fast responses are required, or to simpler tasks where cost is a consideration. Moreover, the smaller size of Phi-3 models makes fine-tuning and customization easier and more affordable.
Phi-3
Phi-3 builds on prior work with Phi models, trained on a selection of high-quality data from the web and synthetically generated data from GPT-3.5. The models are then further improved with extensive safety post-training, including reinforcement learning from human feedback (RLHF), automated testing and evaluations across various harm categories, and manual red-teaming.
According to Microsoft, its Phi-3 models significantly outperform language models of the same and larger sizes on key benchmarks. Specifically, Phi-3-mini does better than models twice its size, and Phi-3-small and Phi-3-medium outperform much larger models, including GPT-3.5 Turbo.
Customers are already building solutions with Phi-3. One area where Phi-3 is demonstrating its value is in agriculture, where Internet access might not be readily available. The smaller footprint of Phi-3 makes it usable by farmers at the point of need, with the additional benefit of running at a reduced cost for greater accessibility.
Microsoft says a leading business conglomerate based in India is currently leveraging Phi-3 in collaboration with Microsoft for Krishi Mitra, a farmer-facing app that reaches over a million farmers.
The 3.8-billion-parameter Phi-3-mini language model is currently available on Microsoft Azure AI Studio, Hugging Face, and Ollama. It comes in two context-length variants (4K and 128K tokens), and is instruction-tuned and ready for use out of the box.
The larger Phi-3-small (7B) and Phi-3-medium (14B) models will be available in the weeks ahead.
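For readers who want to try Phi-3-mini from Hugging Face, a minimal sketch using the `transformers` library might look like the following. The model id `microsoft/Phi-3-mini-4k-instruct` corresponds to the 4K-context instruct variant; the chat-style prompt template shown here follows the published model card and should be double-checked against the current documentation, as it is an assumption of this sketch rather than something stated in the article.

```python
# Sketch: querying Phi-3-mini via Hugging Face transformers.
# Assumes the `transformers` and `torch` packages are installed, and that
# the model id below matches the current Hugging Face listing.

def format_phi3_prompt(user_message: str) -> str:
    """Wrap a user message in Phi-3's instruct chat template
    (per the model card; verify against current documentation)."""
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>\n"

if __name__ == "__main__":
    # Heavyweight import kept inside the guard so the helper above
    # can be used without downloading the model.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="microsoft/Phi-3-mini-4k-instruct",
        trust_remote_code=True,  # Phi-3 shipped with custom model code at release
    )
    prompt = format_phi3_prompt("Explain what a small language model is.")
    result = generator(prompt, max_new_tokens=128, return_full_text=False)
    print(result[0]["generated_text"])
```

On Ollama, the same model can be pulled and run from the command line (e.g. `ollama run phi3`, assuming that is the name it is published under in the Ollama library).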
Paul Mah
Paul Mah is the editor of DSAITrends, where he reports on the latest developments in data science and AI. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose.