How Small Language Models Are Learning to Think
There’s a quiet shift happening in AI. For a while, bigger always seemed better. More parameters meant more power, and the race was on to build the biggest language model. But now, something interesting is happening: smaller models are learning how to reason, and they’re getting really good at it.
Recently, Microsoft released a new family of models called Phi-4-reasoning and Phi-4-reasoning-plus. These are relatively small models (just 14 billion parameters), but they’re able to tackle tough reasoning tasks in math, coding, planning, and even logic puzzles, sometimes beating models that are five times their size.
Teaching Models to Think, Step by Step
How do these small models punch above their weight? The answer lies in a combination of techniques: careful data curation, supervised fine-tuning (SFT), and a bit of reinforcement learning (RL). Together, these allow a smaller model to “think out loud” more effectively, mimicking how we might solve a problem step by step.
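To make the “think out loud” idea concrete, here is a minimal sketch of how a chain-of-thought SFT example might be assembled. The `<think>` delimiter and the field names are illustrative assumptions, not the actual Phi-4 training format:

```python
# Minimal sketch: packing a question, a step-by-step reasoning trace,
# and a final answer into one SFT training string. The <think> tags
# and overall layout are assumptions for illustration only.

def format_sft_example(question: str, reasoning: str, answer: str) -> str:
    """Combine a question, a reasoning trace, and the final answer
    so the model learns to produce its reasoning before answering."""
    return (
        f"Question: {question}\n"
        f"<think>\n{reasoning}\n</think>\n"
        f"Answer: {answer}"
    )

example = format_sft_example(
    question="What is 17 * 6?",
    reasoning="17 * 6 = (10 * 6) + (7 * 6) = 60 + 42 = 102",
    answer="102",
)
```

Training on many examples in this shape teaches the model that a reasoning trace comes first and the answer comes last, which is the behavior these reasoning models exhibit at inference time.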
The process starts with a base model — Phi-4 — and a broad set of prompts from different domains like math, code, and safety. But instead of just throwing all this data into training, the team filtered it carefully, picking problems that are just hard enough to teach the model something new.
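One common way to operationalize “just hard enough” is to keep only the prompts the base model solves some of the time, but not reliably. The sketch below illustrates that idea; the solve-rate thresholds (0.2–0.8) and the `solve_rate` stub are assumptions for illustration, not Microsoft’s actual curation pipeline:

```python
import random

def solve_rate(prompt: str, attempts: int = 8) -> float:
    """Stand-in for sampling the base model `attempts` times on a
    prompt and scoring the answers. Here we just derive a stable
    pseudo-random rate from the prompt text for demonstration."""
    random.seed(sum(ord(c) for c in prompt))
    return random.random()

def curate(prompts, low: float = 0.2, high: float = 0.8):
    """Keep prompts that are 'just hard enough': not already solved
    reliably (rate above `high`), not hopelessly difficult (rate
    below `low`)."""
    return [p for p in prompts if low <= solve_rate(p) <= high]

kept = curate([f"problem-{i}" for i in range(100)])
```

The intuition is the same as in human tutoring: problems a student always gets right teach nothing, and problems they never get right mostly teach frustration, so curation targets the band in between.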