Skip to main content

Voice Message Bot Integration

This guide explains how to integrate voice message functionality into your chat bot using OpenAI for transcription and response generation.

Workflow Overview

  1. User uploads a voice/audio message in the chat
  2. System detects the audio file upload and triggers voice processing workflow
  3. Audio is transcribed to text using OpenAI's Whisper API
  4. Transcribed text is sent to OpenAI's Chat Completion API for response
  5. Bot responds to the user's voice message with text

Implementation Steps

Voice message triggers workflow

  1. When an audio file is uploaded, the system automatically detects it:

Audio file upload detection

Important Trigger Configuration Note: In the trigger setup, you'll see two conditional triggers. For the first trigger that checks for audio files, the "If conditions are NOT met" field must be left empty. For the final trigger in the chain, you should select "default for unknown message" to handle all other message types.

  1. The audio is transcribed and processed:

Text transcription result

Quick Setup

Download and import these pre-configured components:

Technical Requirements

  • OpenAI API access for Whisper (transcription) and GPT (completion)
  • Supported audio formats: mp3, wav, ogg, m4a (max 25MB)