Deploy Voxtral Small 24B privately with full control
Run Mistral's multimodal AI model with advanced audio capabilities on our cloud infrastructure. Get fixed monthly pricing, complete data privacy, and unlimited usage.

Why Voxtral Small revolutionizes multimodal AI
Advanced audio understanding
Process speech, audio, and text together, with a dedicated transcription mode and a 32k-token context window for long-form audio.
Complete privacy control
Your audio and text data never leaves our secure cloud infrastructure. Perfect for sensitive communications and confidential business applications.
Predictable costs
Pay a fixed monthly GPU rental fee instead of per-API-call costs. Scale usage without worrying about a bill that grows with every request.
Built for voice-enabled enterprise applications

Speech transcription
High-accuracy speech-to-text conversion in 8 languages, with a specialized transcription mode for optimal results.
Audio translation
Real-time translation between languages directly from audio input, maintaining context and nuance across multilingual conversations.
Function calling from voice
Execute commands and trigger actions directly from voice input, enabling natural voice-controlled applications and workflows.
Long-form context
Handle extended audio conversations and documents with a 32k-token context window.
Built-in Q&A and summarization
Extract insights and create summaries from audio content automatically, perfect for meeting transcripts and voice notes.
Global deployment
Deploy across 210+ points of presence worldwide with smart routing to the nearest GPU for optimal performance and compliance.
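As a rough illustration of what routing to the nearest location can mean, the sketch below picks the closest point of presence by great-circle distance. It is a minimal sketch under stated assumptions: the location names and coordinates are hypothetical, and a production router would likely also weigh load, measured latency, and compliance constraints rather than distance alone.

```python
import math

# Hypothetical points of presence (name -> latitude, longitude in degrees).
# These are illustrative placeholders, not an actual location list.
POPS = {
    "frankfurt": (50.11, 8.68),
    "new_york": (40.71, -74.01),
    "singapore": (1.35, 103.82),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_pop(client: tuple[float, float], pops: dict[str, tuple[float, float]] = POPS) -> str:
    """Return the name of the point of presence closest to the client."""
    return min(pops, key=lambda name: haversine_km(client, pops[name]))


# A client near Paris is routed to the Frankfurt entry in this toy table.
print(nearest_pop((48.86, 2.35)))
```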
Industries ready for voice-enabled AI
Customer service
Voice-enabled support systems
- Deploy AI-powered customer service that understands voice queries, transcribes calls automatically, and provides multilingual support. Handle customer interactions with natural voice processing while maintaining complete privacy.
Healthcare
Medical voice transcription
- Transcribe medical consultations, voice notes, and patient interviews with high accuracy. Process sensitive healthcare audio while maintaining HIPAA compliance and complete data privacy.
Legal
Confidential audio analysis
- Transcribe depositions, client meetings, and legal recordings with full attorney-client privilege protection. Process sensitive legal audio without data leaving your controlled environment.
Media & Content
Content creation and analysis
- Automatically transcribe interviews, podcasts, and video content. Generate summaries and extract insights from audio content for faster content production and analysis workflows.
How Everywhere Inference works
AI infrastructure built for performance and flexibility with Voxtral Small 24B
01
Choose your configuration
Select from pre-configured Voxtral Small 24B instances or customize your deployment based on performance and budget requirements.
02
Deploy in 3 clicks
Launch your private Voxtral Small 24B instance across our global infrastructure with smart routing to optimize performance and compliance.
03
Scale without limits
Use your model with unlimited requests at a fixed monthly cost. Scale your voice and text applications without worrying about per-call API fees.
With Everywhere Inference, you get enterprise-grade infrastructure management while maintaining complete control over your multimodal AI deployment.
Ready-to-use voice solutions
Voice assistant platform
Build intelligent voice assistants with speech recognition, natural language understanding, and multilingual support.

Meeting transcription suite
Automatically transcribe meetings, generate summaries, and extract action items from voice conversations.

Customer service automation
Deploy voice-enabled customer support that handles queries, transcribes calls, and provides multilingual assistance.

Frequently asked questions
What makes Voxtral Small different from other multimodal models?
Voxtral Small 24B combines state-of-the-art audio processing with best-in-class text performance. It offers a dedicated transcription mode, a 32k-token context length, and native multilingual support across 8 languages, all while maintaining complete transparency and control.
Which languages does Voxtral Small support?
Voxtral Small provides native multilingual support for 8 major languages: English, Spanish, French, Portuguese, Hindi, German, Dutch, and Italian. This covers speech transcription, translation, and understanding across diverse global applications without compromising accuracy.
How does pricing work compared to API-based voice services?
Instead of paying per minute or per API call, you rent GPU capacity at a fixed monthly rate. This eliminates usage-based billing surprises and can be significantly more cost-effective for high-volume voice applications.
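To judge when a fixed fee beats usage-based billing, it helps to compute the break-even volume. The sketch below does that arithmetic; the monthly fee and per-minute price are hypothetical figures chosen only for illustration, not actual prices for this or any other service.

```python
def break_even_minutes(monthly_fee: float, price_per_minute: float) -> float:
    """Audio minutes per month at which a fixed fee equals per-minute billing."""
    if price_per_minute <= 0:
        raise ValueError("price_per_minute must be positive")
    return monthly_fee / price_per_minute


def monthly_costs(minutes: float, monthly_fee: float, price_per_minute: float) -> tuple[float, float]:
    """Return (fixed_cost, usage_based_cost) for a given monthly volume."""
    return monthly_fee, minutes * price_per_minute


# Hypothetical figures for illustration only.
MONTHLY_FEE = 2000.0      # assumed fixed GPU rental fee per month
PRICE_PER_MINUTE = 0.01   # assumed per-minute price of a usage-billed API

threshold = break_even_minutes(MONTHLY_FEE, PRICE_PER_MINUTE)
print(f"Break-even volume: {threshold:,.0f} minutes per month")

for minutes in (50_000, 500_000):
    fixed, usage = monthly_costs(minutes, MONTHLY_FEE, PRICE_PER_MINUTE)
    cheaper = "fixed fee" if fixed < usage else "usage-based"
    print(f"{minutes:>7,} min: fixed={fixed:,.2f} usage={usage:,.2f} -> {cheaper}")
```

Below the break-even volume the usage-based option costs less; above it, the fixed fee does. The gap widens linearly with volume.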
Is my audio data really private with Everywhere Inference?
Yes. Your audio and text data never leaves our secure infrastructure. Unlike with SaaS AI services, your voice inputs and outputs stay within your controlled environment, which makes the deployment well suited to sensitive communications and regulatory compliance.
Can I use function calling directly from voice input?
Yes. Voxtral Small supports native function calling from voice, so users can execute commands and trigger actions through natural speech. This enables voice-controlled applications and workflows.
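On the application side, a function call produced from a voice request still has to be mapped to real code. The sketch below shows one way to do that dispatch. It assumes the model returns a tool call as a function name plus JSON-encoded arguments, a shape modeled on common chat-completion tool-call formats and not confirmed by this page. The functions `set_thermostat` and `create_ticket` are hypothetical examples.

```python
import json
from typing import Any, Callable


# Hypothetical application functions that a voice command may trigger.
def set_thermostat(room: str, celsius: float) -> str:
    return f"Set {room} to {celsius:.1f} C"


def create_ticket(summary: str) -> str:
    return f"Ticket created: {summary}"


# Registry of functions the model is allowed to call, keyed by name.
TOOLS: dict[str, Callable[..., str]] = {
    "set_thermostat": set_thermostat,
    "create_ticket": create_ticket,
}


def dispatch(tool_call: dict[str, Any]) -> str:
    """Run the registered function named in an (assumed) tool-call payload."""
    name = tool_call["name"]
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    # Arguments are assumed to arrive as a JSON-encoded object.
    arguments = json.loads(tool_call["arguments"])
    return TOOLS[name](**arguments)


# Example payload, as it might be returned for the spoken request
# "set the office to twenty-one and a half degrees".
example_call = {
    "name": "set_thermostat",
    "arguments": '{"room": "office", "celsius": 21.5}',
}
print(dispatch(example_call))  # prints "Set office to 21.5 C"
```

Keeping an explicit registry means the model can only trigger functions the application has chosen to expose.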
Deploy Voxtral Small 24B today
Transform your applications with advanced voice and text capabilities. Get started with predictable pricing and unlimited usage.