Speech to Intent

27th September 2023: NXP® Semiconductors today announced the VIT Speech to Intent engine, a natural language understanding engine that leverages edge computing to enable local voice control. Designed to rival cloud-based systems’ performance, VIT Speech to Intent does not require a cloud connection, supporting improved user privacy. VIT Speech to Intent is part of NXP’s Voice Intelligent Technology (VIT) software suite and allows people to speak naturally to smart machines across IoT, industrial and automotive applications, rather than memorize precise commands or phrases to operate the devices.

Whether in the smart home, smart factory or smart city, voice has become one of the primary user interfaces for smart devices. However, many of these smart devices require precise phrasing to execute the desired action or cloud connections to translate the user’s speech into device actions. Natural language understanding delivered by VIT Speech to Intent allows devices to understand users’ intent, without requiring exact phrasing or cloud connectivity. This opens new possibilities for innovation, particularly in the smart home and places where users’ hands may not be free such as hospitals or factory floors. From advanced AI-driven devices to context-aware voice commands using VIT Speech to Intent, NXP helps simplify the development of voice-first devices with software optimized for its MCUs and MPUs.

“As we move towards smart devices that can better anticipate and automate based on our needs, particularly in the smart home, voice has emerged as one of the preferred ways to communicate our preferences to devices,” said Rafael Sotomayor, Executive Vice President and General Manager, Secure Connected Edge, NXP. “VIT Speech to Intent allows people to interact with smart devices seamlessly, without needing to rely on specific keywords, delivering convenience and ease of use, reducing complexity and alleviating user frustration. This further enables the transition from a smart home to an autonomous one.”

With a small memory footprint and modest computational requirements, the VIT Speech to Intent engine is available and well suited for use on NXP devices including i.MX RT Crossover MCUs and RW61x MCUs, as well as i.MX 8M Mini, i.MX 8M Plus, and i.MX 9x applications processors. Because the engine runs locally, it eliminates the need for a cloud connection, which supports improved user privacy, lower latency, reduced power consumption and reduced costs. The engine can support natural speech interfaces in a wide variety of applications, including smart watches, smart HVAC, and more.

Today, VIT Speech to Intent supports English language interactions, with support for Mandarin coming later this year. Additional support is planned for Spanish, German, Korean, French and Japanese in 2024.

The free VIT Wake Word and Voice Command engines, available through the MCUXpresso SDK and an online model tool, allow developers to start developing with NXP’s Voice Intelligent Technology portfolio immediately. Developers whose applications require natural language understanding can get started with VIT Speech to Intent by contacting their local sales team or visiting NXP.com/SpeechToIntent.

Voice Intelligent Technology Suite

VIT Speech to Intent is part of the Voice Intelligent Technology (VIT) software suite, a comprehensive, local voice control software package. Based on advanced deep learning, VIT comprises an always-on Wake Word engine, a Voice Command engine, and a Speech to Intent engine. Developers can get started with our free, ready-to-use Wake Word and Voice Command engines, available through the MCUXpresso SDK and supported by an online model creation tool. In addition, developers can upgrade to the newly released Speech to Intent engine.