
Building speech technology for European languages just got twice as fast after NVIDIA shared a massive audio collection with developers worldwide.
The tech giant released Granary on August 15, containing one million hours of audio across 25 European languages. The dataset includes 650,000 hours for speech recognition and 350,000 hours for translation tasks.
Reaching target accuracy levels with Granary required roughly half as much training data as with competing datasets. This efficiency gain could dramatically reduce costs for companies building multilingual artificial intelligence (AI) applications.
“These tools will enable developers to more easily scale AI applications to support global users,” stated Jonathan Cohen from NVIDIA’s blog team.
The dataset covers nearly all of the 24 official European Union languages, plus Russian and Ukrainian. Languages like Croatian, Estonian, and Maltese previously had limited AI support due to data scarcity.
NVIDIA developed Granary with researchers from Carnegie Mellon University and Fondazione Bruno Kessler. The team processed unlabeled public audio using NVIDIA’s NeMo Speech Data Processor toolkit, avoiding expensive human annotation.
Two AI models accompany the dataset release. Canary-1b-v2, a one-billion-parameter model, handles complex transcription tasks. It delivers quality comparable to systems three times larger while running inference up to ten times faster.
Parakeet-tdt-0.6b-v3 focuses on high-speed transcription. This 600-million-parameter model can process 24-minute audio segments in a single pass while automatically detecting the input language.
Both models provide accurate punctuation, capitalization, and word-level timestamps. They’re available under permissive licensing for commercial and research use.
The release addresses a critical gap in AI language support. Less than one percent of the world’s 7,000 languages currently have robust AI backing.
European businesses could benefit significantly from reduced development costs. Companies building customer service chatbots or translation services previously needed extensive datasets for each target language.
The efficiency gains extend beyond cost savings. Faster training times mean quicker deployment of multilingual AI services across European markets. Small companies with limited computing resources can now compete with larger rivals in developing language-specific AI tools.
NVIDIA will present its research paper at the Interspeech conference in the Netherlands from August 17-21. The dataset and models are now available on Hugging Face for immediate download.
As European regulators push for more inclusive AI systems, the dataset’s open-source nature will allow developers to customize models for specific regional dialects and use cases.
Future applications could include real-time translation devices, enhanced virtual assistants, and automated transcription services for European Parliament proceedings and business meetings.













