Phi Silica SLM coming to Windows Runtime and Copilot+ PCs

At CES 2025 in Las Vegas this week, Microsoft’s head of Windows devices, Pavan Davuluri, announced that Phi Silica, a Small Language Model (SLM), will be integrated into the Windows runtime as part of Copilot in the first quarter of 2025. The integration will bring offline use and performance boosts, whilst also paving the way for additional features and privacy enhancements made possible through local processing.

What’s a Language Model?

Before diving into the details, it’s important to understand what a language model is. Trained on vast amounts of data, language models are designed to comprehend and generate human-like language and to perform a wide range of language tasks. However, not all language models are the same – they come in different sizes, large and small, each with unique strengths and weaknesses tailored to specific requirements.

The main differences between small and large language models lie in their size, capabilities, and resource requirements.

  • LLMs are ideal for applications needing high accuracy and versatility, such as advanced search, chatbots and content generation.
  • SLMs are generally better suited to specific, lightweight applications, like mobile apps, edge devices, and laptops with local NPUs, such as Copilot+ PCs.

SLMs are coming to Windows 11

The Phi Silica SLM, which was first showcased at Microsoft Build in Seattle in May 2024, is designed to complement the Large Language Model (LLM) that runs in the cloud, allowing specific AI workloads to be processed locally, or handed over to and run in parallel with the cloud-based LLMs.

Small, but mighty, on-device SLM (image: Microsoft)
Why? Well, whilst LLMs are typically faster and more accurate, they require cloud-based operations, can be costly to run, and often incur subscription fees (think Microsoft 365 Copilot). SLMs, on the other hand, can run many AI-driven applications and tasks locally on the PC, preserving privacy and preventing data leakage to the cloud. However, SLMs are less sophisticated and require dedicated Neural Processing Units (NPUs) to deliver these local AI capabilities. Hello, Copilot+ PCs.
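The local-versus-cloud split described above can be sketched as a simple routing decision. This is an illustrative sketch only, not a real Windows API: the function names, the prompt-length threshold, and the routing criteria are all assumptions made for the example.

```python
# Hypothetical sketch of the hybrid pattern the article describes:
# lightweight or privacy-sensitive requests stay on-device with the SLM,
# while heavier requests fall back to a cloud-hosted LLM.
# All names below are illustrative, not a real Windows or Copilot API.

def run_on_device_slm(prompt: str) -> str:
    # Stand-in for local NPU inference (e.g. a Phi-class model).
    return f"[local SLM] {prompt[:40]}"

def run_cloud_llm(prompt: str) -> str:
    # Stand-in for a cloud LLM call.
    return f"[cloud LLM] {prompt[:40]}"

def route(prompt: str, online: bool, privacy_sensitive: bool) -> str:
    # Prefer local processing when offline, when data must stay on-device,
    # or when the task is small enough for the SLM (threshold is arbitrary).
    if not online or privacy_sensitive or len(prompt) < 200:
        return run_on_device_slm(prompt)
    return run_cloud_llm(prompt)

print(route("Summarise this note", online=False, privacy_sensitive=False))
```

In a real system the routing criteria would be richer (task type, model availability, latency budget), but the shape of the decision is the same: keep what you can on the NPU, escalate the rest.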

Copilot+ PCs and AI PCs

The NPUs (Neural Processing Units) in Copilot+ PCs are designed to be highly power-efficient, capable of performing trillions of operations per second (TOPS) while consuming very little power. Specifically, on devices with Snapdragon X Elite processors, the Phi Silica model’s context processing uses only 4.8 milliwatt-hours (mWh) of energy on the NPU.

Additionally, the token iterator stage of the model shows a 56% improvement in power consumption compared to running on the CPU. This efficiency allows Phi Silica to operate without overloading the CPU and GPU, ensuring smooth performance and minimal impact on other applications.

Microsoft said that features like Windows Recall, Click-to-Do and other AI functionalities will soon be able to leverage these SLMs. Phi Silica uses a 3.3-billion-parameter model, fine-tuned by Microsoft for both accuracy and speed, and will improve performance, enhance privacy, and enable more “offline” usage.
