Copilot Runtime: Building AI into Windows

Build 2024 had a clear focus on artificial intelligence (AI). From the introduction of Copilot+ PCs to the keynote addresses by Satya Nadella and Scott Guthrie, AI took center stage. Even the Azure CTO’s presentation on Azure hardware innovations highlighted support for AI.

During the early years of his tenure as CEO, Nadella often emphasized the concept of “the intelligent cloud and the intelligent edge,” combining big data, machine learning, and edge computing. This laid the foundation for Microsoft’s AI strategy, which uses Azure’s supercomputing capabilities to train AI models and run inference in the cloud, regardless of model size.

Bringing AI to the Edge

Microsoft’s key announcements at Build revolved around moving endpoint AI functionality from Azure to users’ PCs, using local AI accelerators to run inference with a range of algorithms. This move toward decentralized AI depends on the neural processing units (NPUs) in modern desktop silicon.

Hardware acceleration is a proven approach that has evolved from early vector processing hardware to today’s NPUs, which are optimized for neural networks. Microsoft showcased several NPU-based applications running on existing hardware, with developer access through the DirectML APIs and support for the ONNX Runtime inference engine.
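As a rough illustration of how an application might prefer hardware acceleration and still run on machines without it, the sketch below picks ONNX Runtime execution providers in priority order. The provider names "DmlExecutionProvider" (DirectML) and "CPUExecutionProvider" are real ONNX Runtime identifiers; the helper pick_providers and the preference order are assumptions made for this example, not something Microsoft prescribes.

```python
from typing import List, Sequence

# Real ONNX Runtime provider names; the ordering is an assumption for this sketch.
PREFERRED_PROVIDERS = ("DmlExecutionProvider", "CPUExecutionProvider")


def pick_providers(
    available: Sequence[str],
    preferred: Sequence[str] = PREFERRED_PROVIDERS,
) -> List[str]:
    """Return the preferred providers that are available, keeping priority order.

    The CPU provider is always appended as a last resort so that inference
    can still run on hardware without an accelerator.
    """
    chosen = [name for name in preferred if name in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


# With ONNX Runtime installed, `available` would typically come from
# onnxruntime.get_available_providers(), and the result would be passed as
# the `providers` argument of onnxruntime.InferenceSession.
```

Keeping the selection logic separate from session creation makes the fallback behavior easy to test on machines that have no accelerator at all.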

The introduction of the Windows Copilot Runtime at Build 2024 signaled a new era of endpoint-hosted AI services, providing developer libraries and more than 40 machine learning models, including Phi Silica, an NPU-focused small language model.

AI Development Stack for Windows

The Windows Copilot Runtime is a platform for working with AI tools on Windows, layered on top of the new silicon capabilities and a set of libraries and models. Key components include the DiskANN local vector store, which supports retrieval-augmented generation (RAG) applications, and the Windows Copilot Library APIs.
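To make the role of a local vector store in a RAG application more concrete, here is a minimal sketch of the retrieval step. It is not DiskANN: DiskANN is an approximate nearest-neighbor index, while this example does an exact brute-force cosine-similarity scan over an in-memory list. The class TinyVectorStore and the hand-written vectors are hypothetical; a real application would get vectors from an embedding model.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store: exact search, no index, no persistence."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, List[float]]] = []

    def add(self, text: str, vector: Sequence[float]) -> None:
        self._items.append((text, list(vector)))

    def search(self, query: Sequence[float], k: int = 2) -> List[str]:
        """Return the k stored texts most similar to the query vector."""
        scored = [(cosine(query, vec), text) for text, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]
```

A brute-force scan like this is only practical for small collections; an approximate index trades a little recall for much faster lookups over large ones.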

The runtime’s models cover both generative and computer vision tasks, from text recognition to image processing, and developers can switch between local and cloud APIs depending on their needs.
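One way to picture switching between local and cloud APIs is a thin wrapper that tries the on-device model first and falls back to a cloud endpoint when the local one cannot serve the request. Everything named here (LocalModelUnavailable, run_with_fallback, and the two callables) is hypothetical and stands in for whatever local and cloud clients an application actually uses.

```python
from typing import Callable, Tuple


class LocalModelUnavailable(Exception):
    """Raised by the (hypothetical) local client when no on-device model can run."""


def run_with_fallback(
    prompt: str,
    local: Callable[[str], str],
    cloud: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the local model first; fall back to the cloud model if it is unavailable.

    Returns a (source, response) pair so callers can tell which path was used.
    """
    try:
        return "local", local(prompt)
    except LocalModelUnavailable:
        return "cloud", cloud(prompt)
```

Returning the source alongside the response lets the application log or display where a result came from, which matters when local and cloud models differ in quality or privacy characteristics.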

Phi Silica: Enhancing NPU Performance

Phi Silica, a new NPU-optimized small language model in the Windows Copilot Runtime, returns text responses to prompt inputs. Although its processing speed is limited, Phi Silica offers reliability and can be combined with local agent orchestration, using RAG techniques backed by the DiskANN vector store.
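The RAG pattern mentioned above usually comes down to placing retrieved passages in the prompt ahead of the user’s question. The sketch below shows that assembly step with a simple size limit, since small models tend to have limited context. The function build_grounded_prompt, the prompt wording, and the character-based budget are all assumptions for illustration; real models count tokens rather than characters, and this is not the Phi Silica API.

```python
from typing import List, Sequence


def build_grounded_prompt(
    question: str,
    passages: Sequence[str],
    max_chars: int = 2000,
) -> str:
    """Assemble a prompt from retrieved passages, stopping when the budget runs out.

    Passages are assumed to arrive in relevance order, so the first one that
    does not fit ends the loop and lower-ranked passages are dropped.
    """
    header = "Answer using only the context below.\n\nContext:\n"
    footer = f"\nQuestion: {question}\nAnswer:"
    budget = max_chars - len(header) - len(footer)

    kept: List[str] = []
    for passage in passages:
        line = f"- {passage}\n"
        if len(line) > budget:
            break
        kept.append(line)
        budget -= len(line)

    return header + "".join(kept) + footer
```

The resulting string would then be passed to whichever local or cloud model the application uses.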

Microsoft’s strategic integration of the Windows Copilot Runtime into the Windows developer stack signifies a substantial commitment to AI and semantic computing as integral parts of the Windows ecosystem.

Building Tools for Windows AI

As Microsoft continues to enhance its AI development tools, such as the AI Toolkit for Visual Studio Code, developers can expect robust support for model tuning and experimentation. The toolkit’s playground feature enables developers to test and refine models before deploying them in Copilots, ensuring seamless integration with Copilot+ PCs and future Windows releases.

With the upcoming release of the Windows App SDK and Copilot+ PC hardware, Microsoft aims to make AI integration in Windows applications widely accessible, offering users secure and privacy-focused AI features while easing demand on Azure’s infrastructure.

Copyright © 2024 IDG Communications, Inc.