
Voice Agent Setup

Configure how your AI agent handles phone calls, including voice selection, speech patterns, and custom pronunciations.

Access voice settings from Agent Settings → Call Settings.

Basic Settings

Voice Selection

Choose from a variety of Inworld voices for your agent. Each voice has distinct characteristics suited for different brand personalities. Use the Preview Voice button to hear how your selected voice sounds before saving.

Speaking Speed

Adjust how fast the agent speaks using the slider:

| Speed | Description |
|-------|-------------|
| 0.5x | Slower pace, helpful for complex information |
| 1.0x | Normal conversational speed (default) |
| 1.5x | Faster pace, suitable for quick interactions |

Escalation Phone Number

Enter the phone number (with country code, e.g., +12345678901) where calls should be transferred when the agent escalates to a human representative.
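
The "+ and country code" format in the example above matches the E.164 numbering convention. As a minimal sketch, assuming you want to check a number before pasting it into the field, the helper below validates that shape. The function names are hypothetical, and the product's own validation rules may differ.

```python
import re

# E.164 shape: a leading "+", then 2 to 15 digits, the first of which is non-zero.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def normalize_phone(raw: str) -> str:
    """Strip common separators (spaces, dashes, dots, parentheses)."""
    return re.sub(r"[\s\-().]", "", raw)


def is_e164(number: str) -> bool:
    """Return True if the whole string has the E.164 shape."""
    return E164_PATTERN.fullmatch(number) is not None


# Hypothetical usage: clean up a hand-typed number, then check it.
candidate = normalize_phone("+1 (234) 567-8901")
print(candidate, is_e164(candidate))
```

This only checks the format. It does not confirm that the number is in service or can accept transfers, so a test call is still needed.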

Messages

Configure what the agent says in different situations:

| Message | Purpose |
|---------|---------|
| Greeting Message | First thing the agent says when answering a call |
| Farewell Message | Said when ending the call |
| Unavailable Message | Played when no agents are available |
| Out of Hours Message | Played when calling outside business hours |

Advanced Settings

Custom Keyterms

Add terms to improve speech recognition accuracy. This is useful for:

  • Brand names
  • Product codes or SKUs
  • Industry-specific terminology
  • Uncommon proper nouns

The agent will be biased toward recognizing these terms when customers speak them.

Max Call Duration

Set a maximum call length of up to 20 minutes; the default is 10 minutes. Calls end automatically when the limit is reached.

IVR / Screening Support

Enable this for outbound calls to detect:

  • Voicemail greetings
  • Call screening prompts
  • Phone trees (IVR menus)

The system will navigate these before handing control back to the agent.

Pronunciation Settings

Use pronunciation mappings to correct how the agent pronounces specific words. This is essential for brand names, product names, or technical terms that text-to-speech engines might mispronounce.

Word-to-Word Mapping

Enter the word to match and the phonetic spelling the agent should use instead.

Example:

| Word to Match | Pronunciation |
|---------------|---------------|
| Acme | Ack-mee |
| GIF | Jiff |
| SQL | Sequel |

Using IPA for Precise Pronunciation

For more precise control, you can use the International Phonetic Alphabet (IPA) in the pronunciation field. IPA provides an exact specification of how words should be pronounced, eliminating ambiguity.

Common IPA examples:

| Word | IPA Pronunciation | Description |
|------|-------------------|-------------|
| Nike | ˈnaɪki | Rhymes with “spiky” |
| Adidas | ˈædɪdæs | Emphasis on first syllable |
| Porsche | ˈpɔːrʃə | Two syllables, not one |
| Giphy | ˈdʒɪfi | Soft G sound |

To use IPA:

  1. Find the correct IPA transcription for your word (resources like Wiktionary provide IPA for many words)
  2. Enter the IPA string in the “Pronunciation” field
  3. Use the Preview button to verify the pronunciation sounds correct

⚠️ IPA support depends on the underlying TTS engine. Test pronunciations before deploying to ensure they render correctly.

Tips for Pronunciation Mappings

  • Preview before saving: Always use the play button to hear how the pronunciation sounds
  • Case-insensitive matching: “Acme”, “ACME”, and “acme” will all be matched
  • Whole word matching: Only complete words are replaced, not partial matches
  • Keep it simple: Start with phonetic spellings before trying IPA
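
To make the case-insensitive and whole-word rules concrete, here is a minimal sketch of how such a substitution could behave. It illustrates the matching rules described above and is not the product's actual implementation; the function name `apply_pronunciations` and the regex-based approach are assumptions.

```python
import re


def apply_pronunciations(text: str, mappings: dict[str, str]) -> str:
    """Replace mapped words with their pronunciations.

    Matching is case-insensitive and limited to whole words, mirroring
    the tips above. Hypothetical helper, for illustration only.
    """
    if not mappings:
        return text
    # Lowercase the keys so lookups ignore case.
    lookup = {word.lower(): spoken for word, spoken in mappings.items()}
    # Try longer terms first so they win over shorter ones in the alternation.
    alternatives = "|".join(
        re.escape(word) for word in sorted(lookup, key=len, reverse=True)
    )
    # The lookarounds reject matches that sit inside a longer word.
    pattern = re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)
    return pattern.sub(
        lambda m: lookup.get(m.group(0).lower(), m.group(0)), text
    )


mappings = {"Acme": "Ack-mee", "GIF": "Jiff", "SQL": "Sequel"}
print(apply_pronunciations("ACME stores gifs in a SQL database.", mappings))
# prints: Ack-mee stores gifs in a Sequel database.
```

In this sketch "ACME" is replaced despite the different casing, while "gifs" is left alone because "gif" is only part of the word. If plurals or other variants matter, they would need their own mappings.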

Verify Your Setup

After configuring voice settings:

  • Make a test call to hear the greeting message
  • Verify the agent’s voice and speed match your brand
  • Test pronunciation of key brand terms and product names
  • Confirm escalation transfers work correctly
  • Test out-of-hours and unavailable scenarios
