Voice API
Build interactive voice applications with text-to-speech, DTMF input, call recording, and real-time transcription.
Text-to-Speech
Natural-sounding voices in 40+ languages and accents.
Call Recording
Record calls with automatic transcription and storage.
IVR Builder
Create interactive voice menus with DTMF and speech input.
Quick Start
Make your first outbound call with text-to-speech. The API returns immediately while the call is placed asynchronously.
curl -X POST https://api.canarymsg.dev/v1/voice/calls \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "to": "+15551234567",
    "from": "+15559876543",
    "tts": {
      "text": "Hello! This is a call from Canary. Press 1 to confirm.",
      "voice": "en-US-Neural2-F"
    }
  }'
Response
{
  "id": "call_abc123xyz",
  "to": "+15551234567",
  "from": "+15559876543",
  "status": "queued",
  "direction": "outbound",
  "created_at": "2024-01-01T12:00:00Z"
}
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| to | string | Required | Recipient phone number in E.164 format |
| from | string | Required | Caller ID (must be a verified Canary number) |
| tts | object | * | Text-to-speech configuration |
| audio_url | string | * | URL of audio file to play (MP3, WAV) |
| record | boolean | Optional | Enable call recording |
| transcribe | boolean | Optional | Enable real-time transcription |
| gather | object | Optional | Gather DTMF or speech input |
| webhook_url | string | Optional | URL for call status webhooks |
| timeout | integer | Optional | Ring timeout in seconds (default: 30) |
* Either tts or audio_url is required.
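The rules in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Canary SDK: the helper name `build_call_payload` is hypothetical, and the E.164 pattern is the commonly used form (a plus sign, a non-zero first digit, at most 15 digits in total).

```python
import re

# Common E.164 shape: "+", a non-zero first digit, up to 15 digits in total.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def build_call_payload(to, from_, tts=None, audio_url=None, **optional):
    """Return a request body for POST /v1/voice/calls, or raise ValueError.

    Hypothetical helper for illustration; it only mirrors the parameter table.
    """
    for name, number in (("to", to), ("from", from_)):
        if not E164_PATTERN.fullmatch(number):
            raise ValueError(f"{name} must be in E.164 format, got {number!r}")

    # Footnote of the table: either tts or audio_url is required.
    if tts is None and audio_url is None:
        raise ValueError("either tts or audio_url is required")

    payload = {"to": to, "from": from_}
    if tts is not None:
        payload["tts"] = tts
    if audio_url is not None:
        payload["audio_url"] = audio_url

    # Optional parameters (record, transcribe, gather, webhook_url, timeout)
    # are passed through unchanged.
    payload.update(optional)
    return payload
```

For example, `build_call_payload("+15551234567", "+15559876543", tts={"text": "Hi"}, record=True)` returns a dictionary that can be serialized as the request body, while a number without the leading plus sign raises `ValueError`.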
Text-to-Speech
Convert text to natural-sounding speech with support for multiple voices, languages, and SSML markup.
TTS Configuration
{
  "tts": {
    "text": "Hello! Your verification code is <say-as interpret-as='digits'>123456</say-as>",
    "voice": "en-US-Neural2-F",
    "speed": 1.0,
    "pitch": 0
  }
}
Available Voices
| Voice ID | Language | Gender | Type |
|---|---|---|---|
| en-US-Neural2-F | English (US) | Female | Neural |
| en-US-Neural2-D | English (US) | Male | Neural |
| en-GB-Neural2-A | English (UK) | Female | Neural |
| es-ES-Neural2-A | Spanish (Spain) | Female | Neural |
| fr-FR-Neural2-A | French | Female | Neural |
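A TTS configuration like the one above can be assembled in code so that the voice ID and the SSML markup stay consistent. This is a rough sketch under a few assumptions: the helper names `say_digits` and `build_tts` are hypothetical, the voice set contains only the five IDs listed in the table (extend it if more are available to you), and escaping `&`, `<`, and `>` in the free text is assumed to be appropriate because the `text` field carries SSML markup.

```python
from xml.sax.saxutils import escape

# Voice IDs copied from the Available Voices table; not an exhaustive list.
AVAILABLE_VOICES = {
    "en-US-Neural2-F",
    "en-US-Neural2-D",
    "en-GB-Neural2-A",
    "es-ES-Neural2-A",
    "fr-FR-Neural2-A",
}


def say_digits(code):
    """Wrap a numeric code so it is read out digit by digit."""
    if not (code.isascii() and code.isdigit()):
        raise ValueError("code must contain ASCII digits only")
    return f"<say-as interpret-as='digits'>{code}</say-as>"


def build_tts(prefix, code, voice="en-US-Neural2-F", speed=1.0, pitch=0):
    """Build the "tts" object: plain prefix text followed by a spoken code."""
    if voice not in AVAILABLE_VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    # Assumption: free text must be XML-escaped so it cannot break the markup.
    text = f"{escape(prefix)} {say_digits(code)}"
    return {"text": text, "voice": voice, "speed": speed, "pitch": pitch}
```

Calling `build_tts("Hello! Your verification code is", "123456")` reproduces the `tts` object from the configuration example.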
Gather DTMF Input
Collect keypad input from callers to build interactive voice menus. You can gather a specific number of digits or wait for a terminating key.
{
  "to": "+15551234567",
  "from": "+15559876543",
  "tts": {
    "text": "Please enter your 4-digit PIN followed by the pound key."
  },
  "gather": {
    "type": "dtmf",
    "num_digits": 4,
    "finish_on_key": "#",
    "timeout": 10,
    "webhook_url": "https://yourapp.com/webhooks/dtmf"
  }
}
Webhook Payload
{
  "call_id": "call_abc123xyz",
  "type": "gather.completed",
  "digits": "1234",
  "finished_on_key": "#"
}
Call Recording
Record calls for quality assurance, compliance, or training. Recordings are stored securely and can be transcribed automatically.
{
  "to": "+15551234567",
  "from": "+15559876543",
  "tts": {
    "text": "This call may be recorded for quality purposes."
  },
  "record": true,
  "transcribe": true,
  "recording_channels": "dual"
}
Get Recording
{
  "id": "rec_xyz789",
  "call_id": "call_abc123xyz",
  "duration": 45,
  "url": "https://recordings.canarymsg.dev/rec_xyz789.mp3",
  "transcription": {
    "text": "Hello, this is a recorded message...",
    "confidence": 0.95
  }
}
Call Status
| Status | Description |
|---|---|
| queued | Call is queued to be placed |
| ringing | Call is ringing at destination |
| in-progress | Call is connected and in progress |
| completed | Call ended normally |
| busy | Destination was busy |
| no-answer | No answer after timeout |
| failed | Call failed to connect |
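Application code often needs to know whether a call is still in flight or has reached a final state. The sketch below groups the statuses from the table above. The grouping is an assumption drawn from the descriptions, not something the API states, and the retry policy (retry on `busy` and `no-answer` only) is purely illustrative. The names `is_terminal` and `should_retry` are hypothetical.

```python
# Statuses copied from the Call Status table.
# Assumption: the first three are transitional, the last four are final.
IN_FLIGHT_STATUSES = {"queued", "ringing", "in-progress"}
TERMINAL_STATUSES = {"completed", "busy", "no-answer", "failed"}

# Illustrative policy only: the callee may be reachable later.
RETRYABLE_STATUSES = {"busy", "no-answer"}


def is_terminal(status):
    """Return True once no further status change is expected for the call."""
    if status in TERMINAL_STATUSES:
        return True
    if status in IN_FLIGHT_STATUSES:
        return False
    raise ValueError(f"unknown call status: {status!r}")


def should_retry(status):
    """Return True if placing the call again later is likely to help."""
    return status in RETRYABLE_STATUSES
```

Raising on unknown values, rather than treating them as final, keeps a newly introduced status from being silently misclassified.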
SDK Examples
Node.js
import Canary from '@canary/node';
const canary = new Canary('YOUR_API_KEY');
const call = await canary.voice.call({
  to: '+15551234567',
  from: '+15559876543',
  tts: {
    text: 'Hello from Canary!',
    voice: 'en-US-Neural2-F'
  },
  record: true
});
console.log('Call initiated:', call.id);
Python
from canary import Canary
canary = Canary("YOUR_API_KEY")
call = canary.voice.call(
    to="+15551234567",
    from_="+15559876543",
    tts={
        "text": "Hello from Canary!",
        "voice": "en-US-Neural2-F"
    },
    record=True
)
print(f"Call initiated: {call.id}")
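Where installing the SDK is not an option, the Quick Start request can also be sent with Python's standard library. This is a sketch only: the endpoint and headers are taken from the curl example, while the function names `build_call_request` and `place_call` and the 10-second HTTP timeout are assumptions made for illustration.

```python
import json
import urllib.request

# Endpoint from the Quick Start curl example.
API_URL = "https://api.canarymsg.dev/v1/voice/calls"


def build_call_request(api_key, payload):
    """Build (but do not send) the POST request for a new call."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def place_call(api_key, payload, timeout=10):
    """Send the request and return the decoded JSON response body."""
    request = build_call_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending makes the headers and body easy to inspect without touching the network. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so callers of `place_call` should be prepared to handle it.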