Skip to main content

Completions generic api integration

This integration was tested with https://github.com/ggml-org/llama.cpp

I was running during test P4 Tesla Nvidia card with following

./llama.cpp/build/bin/llama-server \
-m Qwen3.5-4B-Q4_K_M.gguf \
--mmproj mmproj-F16.gguf \
--host 0.0.0.0 \
--port 8080 \
--api-key "your-api-key" \
-ngl 99 \
-fa on \
-t 8 \
-c 16384 \
-b 512 \
-ub 512 \
--mlock \
--reasoning-budget 512 \
--jinja \
--cache-type-k q8_0 \
--cache-type-v q8_0

This integration uses the Chat Completions API.

Rest API

Bot

Flow with Tool Call Support

The main difference from the legacy flow is the support for tool calls.

REST API

  • Set a Bearer token.
  • Modify the system prompt.

Bot

  • Import a bot and configure the correct triggers and API calls as shown in the video.

Calling a Trigger Based on a Defined Function in ChatGPT

  1. Note the defined function in Gemini, transfer_operator.
  2. Add an event to your trigger with the Type set to Custom text matching. The Should include any of these words value should be transfer_operator.

For example:

transfer_operator

Limiting the Knowledge Base to Uploaded Documents

Here are my System instructions for the bot used on the documentation page:

You are a helpful Live Helper Chat Bot. You answer questions based on file search. If you don't know the answer, respond with "I can only help with Live Helper Chat related questions." Provide the most relevant answer to the visitor's question, not exceeding 100 words. Include a link for more information about your answer.