Voice API

Build interactive voice applications with text-to-speech, DTMF input, call recording, and real-time transcription.

Text-to-Speech

Natural-sounding voices in 40+ languages and accents.

Call Recording

Record calls with automatic transcription and storage.

IVR Builder

Create interactive voice menus with DTMF and speech input.

Quick Start

Make your first outbound call with text-to-speech. The API responds immediately with a call object in the `queued` status; the call itself is placed asynchronously.

POST /v1/voice/calls

curl -X POST https://api.canarymsg.dev/v1/voice/calls \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '
  {
    "to": "+15551234567",
    "from": "+15559876543",
    "tts": {
      "text": "Hello! This is a call from Canary. Press 1 to confirm.",
      "voice": "en-US-Neural2-F"
    }
  }
'

Response

{
  "id": "call_abc123xyz",
  "to": "+15551234567",
  "from": "+15559876543",
  "status": "queued",
  "direction": "outbound",
  "created_at": "2024-01-01T12:00:00Z"
}
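
For readers who prefer not to shell out to curl, the same request can be sketched with the Python standard library. This is a minimal illustration, not an official client: the helper names build_call_request and place_call are hypothetical, and only the URL, headers, and body fields are taken from the curl example above.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.canarymsg.dev/v1/voice/calls"


def build_call_request(api_key: str, to: str, from_: str, text: str,
                       voice: str = "en-US-Neural2-F") -> urllib.request.Request:
    """Build (but do not send) the POST request from the Quick Start.

    Hypothetical helper: the name and signature are illustrative only.
    """
    payload = {"to": to, "from": from_, "tts": {"text": text, "voice": voice}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def place_call(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling place_call(build_call_request("YOUR_API_KEY", "+15551234567", "+15559876543", "Hello!")) should return a dictionary shaped like the response above. Keeping request construction separate from sending makes the payload easy to inspect or unit test without network access.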

Request Parameters

Parameter	Type	Required	Description
`to`	string	Required	Recipient phone number in E.164 format
`from`	string	Required	Caller ID (must be a verified Canary number)
`tts`	object	*	Text-to-speech configuration
`audio_url`	string	*	URL of audio file to play (MP3, WAV)
`record`	boolean	Optional	Enable call recording
`transcribe`	boolean	Optional	Enable real-time transcription
`gather`	object	Optional	Gather DTMF or speech input
`webhook_url`	string	Optional	URL for call status webhooks
`timeout`	integer	Optional	Ring timeout in seconds (default: 30)

* Either tts or audio_url is required
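
The rules in the table can be checked on the client before a request is sent. The sketch below is an assumption-laden illustration: validate_call_payload is a hypothetical helper, the E.164 pattern is the commonly used "plus sign, then up to 15 digits" form, and the check does not reject a payload containing both tts and audio_url because this page does not say whether that combination is allowed.

```python
import re

# Common E.164 shape: "+", a non-zero digit, then up to 14 more digits.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def validate_call_payload(payload: dict) -> list[str]:
    """Return a list of problems found in a call request payload.

    Hypothetical client-side check; the server remains the authority.
    """
    errors = []

    # `to` and `from` must both be E.164 phone numbers.
    for field in ("to", "from"):
        value = payload.get(field)
        if not isinstance(value, str) or not E164_PATTERN.fullmatch(value):
            errors.append(f"'{field}' must be a phone number in E.164 format")

    # At least one audio source must be present.
    if "tts" not in payload and "audio_url" not in payload:
        errors.append("either 'tts' or 'audio_url' is required")

    # `timeout` is optional; the documented default is 30 seconds.
    timeout = payload.get("timeout", 30)
    if isinstance(timeout, bool) or not isinstance(timeout, int) or timeout <= 0:
        errors.append("'timeout' must be a positive integer number of seconds")

    return errors
```

An empty list means the payload passed these local checks; it does not guarantee the API will accept it (for example, the from number must still be a verified Canary number).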

Text-to-Speech

Convert text to natural-sounding speech with support for multiple voices, languages, and SSML markup.

TTS Configuration

{
  "tts": {
    "text": "Hello! Your verification code is <say-as interpret-as='digits'>123456</say-as>",
    "voice": "en-US-Neural2-F",
    "speed": 1.0,
    "pitch": 0
  }
}
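
When the spoken text is assembled from user data, it helps to build the markup programmatically so that stray characters do not break the SSML. The following is a minimal sketch under stated assumptions: spoken_digits and build_tts are hypothetical helpers, and whether the API expects plain text outside tags to be XML-escaped is not stated on this page, so the escaping step is a cautious guess.

```python
from xml.sax.saxutils import escape


def spoken_digits(code: str) -> str:
    """Wrap a numeric code in the say-as tag used in the example above."""
    if not (code.isascii() and code.isdigit()):
        raise ValueError("code must contain only the digits 0-9")
    return f"<say-as interpret-as='digits'>{code}</say-as>"


def build_tts(message: str, code: str, voice: str = "en-US-Neural2-F",
              speed: float = 1.0, pitch: int = 0) -> dict:
    """Build a `tts` object whose text reads `code` one digit at a time.

    Assumption: characters such as & and < in `message` should be
    XML-escaped so they are not mistaken for markup.
    """
    text = f"{escape(message)} {spoken_digits(code)}"
    return {"tts": {"text": text, "voice": voice, "speed": speed, "pitch": pitch}}
```

For instance, build_tts("Hello! Your verification code is", "123456") produces the same tts object as the configuration shown above.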

Available Voices

Voice ID	Language	Gender	Type
`en-US-Neural2-F`	English (US)	Female	Neural
`en-US-Neural2-D`	English (US)	Male	Neural
`en-GB-Neural2-A`	English (UK)	Female	Neural
`es-ES-Neural2-A`	Spanish (Spain)	Female	Neural
`fr-FR-Neural2-A`	French	Female	Neural

Gather DTMF Input

Collect keypad input from callers to build interactive voice menus. You can gather a specific number of digits or wait for a terminating key.

{
  "to": "+15551234567",
  "from": "+15559876543",
  "tts": {
    "text": "Please enter your 4-digit PIN followed by the pound key."
  },
  "gather": {
    "type": "dtmf",
    "num_digits": 4,
    "finish_on_key": "#",
    "timeout": 10,
    "webhook_url": "https://yourapp.com/webhooks/dtmf"
  }
}

Webhook Payload

{
  "call_id": "call_abc123xyz",
  "type": "gather.completed",
  "digits": "1234",
  "finished_on_key": "#"
}
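
On the receiving side, a webhook endpoint needs to parse this payload and act on the digits. The sketch below shows only the parsing and comparison step, independent of any web framework. The function name handle_gather_webhook and the shape of its return value are hypothetical; the field names come from the payload above. Webhook signature verification is not covered on this page and is therefore omitted here.

```python
import hmac
import json


def handle_gather_webhook(raw_body: bytes, expected_pin: str) -> dict:
    """Parse a gather webhook body and check the entered digits.

    Hypothetical handler: returns a small summary dict rather than an
    HTTP response so it can be reused from any web framework.
    """
    event = json.loads(raw_body)
    call_id = event.get("call_id")

    # Ignore event types this handler does not understand.
    if event.get("type") != "gather.completed":
        return {"call_id": call_id, "handled": False, "pin_ok": False}

    digits = str(event.get("digits", ""))
    # Constant-time comparison avoids leaking how many digits matched.
    pin_ok = hmac.compare_digest(digits.encode("utf-8"), expected_pin.encode("utf-8"))
    return {"call_id": call_id, "handled": True, "pin_ok": pin_ok}
```

A web framework route would pass the raw request body to this function and then decide what to do with the call based on pin_ok.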

Call Recording

Record calls for quality assurance, compliance, or training. Recordings are stored securely and can be transcribed automatically.

{
  "to": "+15551234567",
  "from": "+15559876543",
  "tts": {
    "text": "This call may be recorded for quality purposes."
  },
  "record": true,
  "transcribe": true,
  "recording_channels": "dual"
}

Get Recording

GET /v1/voice/calls/:id/recording

{
  "id": "rec_xyz789",
  "call_id": "call_abc123xyz",
  "duration": 45,
  "url": "https://recordings.canarymsg.dev/rec_xyz789.mp3",
  "transcription": {
    "text": "Hello, this is a recorded message...",
    "confidence": 0.95
  }
}
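
A recording response like the one above can be mapped onto a small typed structure before it is stored or reviewed. This is an illustrative sketch only: the Recording class, parse_recording, and needs_review are hypothetical names, the 0.8 review threshold is an arbitrary example value, and the code assumes that transcription may be absent when transcribe was not enabled, which this page does not confirm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recording:
    id: str
    call_id: str
    duration: int                 # as returned by the API; unit not stated on this page
    url: str
    transcript: Optional[str]     # None when no transcription is present
    confidence: Optional[float]   # None when no transcription is present


def parse_recording(data: dict) -> Recording:
    """Convert a recording response dict into a Recording object."""
    transcription = data.get("transcription") or {}
    return Recording(
        id=data["id"],
        call_id=data["call_id"],
        duration=data["duration"],
        url=data["url"],
        transcript=transcription.get("text"),
        confidence=transcription.get("confidence"),
    )


def needs_review(recording: Recording, threshold: float = 0.8) -> bool:
    """Flag recordings with a missing or low-confidence transcript."""
    return recording.confidence is None or recording.confidence < threshold
```

With the example response above, needs_review returns False because the confidence of 0.95 is above the example threshold.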

Call Status

Status	Description
queued	Call is queued to be placed
ringing	Call is ringing at destination
in-progress	Call is connected and in progress
completed	Call ended normally
busy	Destination was busy
no-answer	No answer after timeout
failed	Call failed to connect
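
Client code often needs to know when a call has reached a final state. The sketch below treats completed, busy, no-answer, and failed as terminal, which is an inference from the descriptions in the table rather than something this page states. The wait_for_call helper is hypothetical, and because no endpoint for fetching a call's status is documented here, the caller supplies fetch_status (it could equally be fed by status webhooks).

```python
import time
from typing import Callable

# Assumption: these statuses are final; the rest may still change.
TERMINAL_STATUSES = {"completed", "busy", "no-answer", "failed"}


def wait_for_call(fetch_status: Callable[[], str],
                  interval: float = 2.0,
                  max_polls: int = 30,
                  sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the call reaches a terminal status or polls run out.

    `fetch_status` is supplied by the caller and returns the current
    status string. Returns the last status seen, terminal or not.
    """
    if max_polls < 1:
        raise ValueError("max_polls must be at least 1")

    status = ""
    for attempt in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Do not sleep after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval)
    return status
```

Passing sleep as a parameter keeps the helper easy to test without real delays. For production use, the webhook_url parameter is likely a better fit than polling.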

SDK Examples

Node.js

import Canary from '@canary/node';

const canary = new Canary('YOUR_API_KEY');

const call = await canary.voice.call({
  to: '+15551234567',
  from: '+15559876543',
  tts: {
    text: 'Hello from Canary!',
    voice: 'en-US-Neural2-F'
  },
  record: true
});

console.log('Call initiated:', call.id);

Python

from canary import Canary

canary = Canary("YOUR_API_KEY")

call = canary.voice.call(
    to="+15551234567",
    from_="+15559876543",
    tts={
        "text": "Hello from Canary!",
        "voice": "en-US-Neural2-F"
    },
    record=True
)

print(f"Call initiated: {call.id}")

Next Steps

OTP & Verification

WhatsApp API