Today we learned that Intel has decided to publish its NPU acceleration library, the Intel NPU Acceleration Library, as open source. The library works on Windows as well as Linux. In short, it lets you use the Intel AI Boost NPU to run lightweight large language models (LLMs) such as TinyLlama. TinyLlama is a compact model with only 1.1 billion parameters, a size small enough to suit the many applications that have limited compute and memory budgets.
For now, the library is clearly meant to be paired with the new Intel Core Ultra processors, the company's first to integrate an NPU for handling AI-related workloads. The move is aimed squarely at developers, but ordinary users who own one of these machines and have some programming experience can also put the NPU to work on their own AI projects.
The Intel NPU Acceleration Library is now available on GitHub
It was Tony Mongkolsmai, a software engineer and technical evangelist at Intel, who made the announcement on his official X account. He is also the one who demoed the TinyLlama LLM running on an MSI Prestige 16 AI Evo laptop equipped with an Intel Meteor Lake processor.
The open source NPU acceleration library is aimed primarily at developers, but ordinary users with some programming experience can use it too, for example to run their own AI-powered chatbot on a Meteor Lake machine.
For developers who have been asking, check out the newly open-sourced Intel NPU Acceleration library. I've just tested it on my MSI Prestige 16 AI Evo (Windows this time, but the library supports Linux as well), and following the GitHub documentation I was able to get TinyLlama and Gemma-2b-it running without problems.
This is for developers tinkering with models on the NPU, not really for production pipelines… that's what you want DirectML/OpenVINO for.
Intel will speak about it officially soon, but this was too good not to share.
– Tony Mongkolsmai (@tonymongkolsmai), March 1, 2024
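For anyone wondering what "following the GitHub documentation" amounts to, the core of it fits in a few lines of Python. The sketch below is a minimal example, assuming the intel-npu-acceleration-library and transformers packages are installed on a Core Ultra machine: it loads TinyLlama from Hugging Face and hands it to the library's compile() entry point, which offloads supported layers to the NPU. The checkpoint name and generation settings are illustrative, and exact API details may change as the library evolves.

    # Minimal sketch: run TinyLlama on the Intel AI Boost NPU.
    # Assumes: pip install intel-npu-acceleration-library transformers torch
    # and an Intel Core Ultra (Meteor Lake) system; API details may change.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    import intel_npu_acceleration_library

    model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # illustrative checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # compile() rewrites supported layers so they execute on the NPU
    model = intel_npu_acceleration_library.compile(model)

    prompt = "Explain in one sentence what an NPU is."
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=48)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Everything around the compile() call, from tokenization to generation, is standard Transformers code, which is what makes the library approachable for hobbyists as well as developers.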
Because the NPU Acceleration Library is designed specifically for Intel NPUs, it can only run on Intel Meteor Lake (Core Ultra) processors at this time. It is reasonable to expect that next-generation processors such as Arrow Lake and Lunar Lake, also equipped with an NPU, will benefit from it as well. Those processors won't arrive until the end of the year, but they could make things even more interesting, since these CPUs are expected to triple the AI performance of Meteor Lake. That means they should be able to run larger LLMs on both laptops and desktops.
Finally, it is important to note that the library is still a work in progress, with less than half of its planned features implemented. Among what is missing are mixed-precision inference running on the NPU itself, the BFloat16 format for AI workloads, and heterogeneous NPU-GPU computing.