Basic flow of AI integrations and possible scenarios

All these observations are my personal opinion, so take it with a grain of salt :)

In this article, I'll describe possible AI integration flows and their purposes.

  • Remote AI services
  • Local AI servers

Remote AI services

These solutions include:

OpenAI, the creator of ChatGPT

The platform can be found at https://platform.openai.com

Advantages

  • You do not need to rent a server, maintain it, train it, or anything like that.
  • Vector storage (a RAG solution) is available out of the box.
  • Function calling is also available, and it is relatively cheap using the gpt-4o-mini model (see the sketch after this list).
  • Easy integration with Live Helper Chat.
  • Quite complex scenarios can be integrated using function calling.
  • OpenAI never makes direct API calls to the internal services LHC hosts on your server; it only decides which function should be called, and your LHC install performs the actual request against your internal API servers.
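
To give a feel for how this looks in practice, here is a minimal function-calling sketch using the official openai Python SDK. The get_order_status function and its parameters are made-up placeholders for illustration, not part of LHC or OpenAI:

```python
# Minimal function-calling sketch with the official openai Python SDK.
# get_order_status is a hypothetical example, not an LHC or OpenAI function.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the status of a customer order by its number",
        "parameters": {
            "type": "object",
            "properties": {
                "order_number": {"type": "string", "description": "Order number, e.g. A-1001"},
            },
            "required": ["order_number"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Where is my order A-1001?"}],
    tools=tools,
)

# When the model decides a function should be called, it returns the function
# name and JSON arguments; your own code (LHC in this setup) performs the real
# API call and feeds the result back in a follow-up message.
print(response.choices[0].message.tool_calls)
```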

Integration samples

Disadvantages

  • Data is hosted remotely

Gemini integration

More information can be found here https://doc.livehelperchat.com/docs/bot/gemini-integration

I have not used it personally for RAG or function calling, so it is hard to say how easy it is to integrate in real life beyond the few samples I did.

Advantages

  • You do not need to rent a server and maintain it, train it or anything like that.
  • Reliable provider, I would say.

Disadvantages

Open question

What is the best way to have RAG and function calling at the same time?

For testing purposes we tried a system prompt that asks the model to call a specific fallback function if the question is not related to any of the defined functions. But I guess it's not the best way to do it.
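
To make that idea a bit more concrete, here is a minimal sketch using the google-generativeai Python SDK. Both functions and the system instruction are made-up illustrations of the fallback approach, not a recommended pattern:

```python
# Sketch of the "fallback function" idea with the google-generativeai SDK.
# Both functions below are hypothetical examples; the SDK builds function
# declarations from plain Python functions and their docstrings.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

def get_opening_hours(location: str) -> str:
    """Return the opening hours for a given store location."""
    return "9:00-18:00"

def search_knowledge_base(query: str) -> str:
    """Fallback: search the knowledge base (RAG) for anything the other functions do not cover."""
    return "...top matching document chunks..."

model = genai.GenerativeModel(
    "gemini-1.5-flash",
    tools=[get_opening_hours, search_knowledge_base],
    system_instruction=(
        "If the user's question does not match any other defined function, "
        "call search_knowledge_base with the question as the query."
    ),
)

# Automatic function calling runs the matched Python function and sends its
# result back to the model before producing the final answer.
chat = model.start_chat(enable_automatic_function_calling=True)
print(chat.send_message("What is your refund policy?").text)
```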

https://ai.google.dev/gemini-api/docs/models/gemini

A few YouTube videos I found to help understand the workflow

DeepSeek

More information can be found here https://doc.livehelperchat.com/docs/bot/deepseek-integration

groq.com

This service can run open-source LLM models, so even if the service ever goes dark, you will still have the model itself: https://groq.com

At first glance it looks like a very easy setup and a drop-in replacement for OpenAI. The RAG part you would need to implement yourself :)
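
Since Groq exposes an OpenAI-compatible endpoint, switching can be as small as changing the base URL and the model name in an existing OpenAI client. A minimal sketch (the model name is just an example and may change over time):

```python
# Groq exposes an OpenAI-compatible API, so the openai SDK works as-is.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="YOUR_GROQ_API_KEY",
)

response = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # example open-source model; check groq.com for current names
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```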

Services which can run your fine-tuned models

These are next-level setups in terms of complexity: services that can run your own fine-tuned models. The models themselves can be pulled from https://huggingface.co/, which is like a hub for all AI models.
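
As a small taste of that, a model from the hub can be run locally with the transformers library. The model ID below is just a small public example; a fine-tuned model would be loaded the same way by its own repository ID:

```python
# Minimal sketch: run a Hugging Face model locally with transformers.
# The model ID is an example; a fine-tuned model is loaded by its own repo ID.
from transformers import pipeline

generator = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
print(generator("Hello, how can I help", max_new_tokens=40)[0]["generated_text"])
```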

Completely self-hosted solutions

RAG Flow

RAG consists of a few things: vector storage and text chunking. I won't go into much detail here; I just wanted to have this as a reference for myself as well.

Preparing content for the RAG flow includes (a sketch follows this list):

  • Prepare the content as a text file
  • Chunk the content into smaller parts
  • Upload the chunks to vector storage
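
A minimal sketch of the preparation side, assuming OpenAI embeddings and a plain text file; knowledge.txt is a placeholder, and the actual upload call depends on which vector storage engine you pick:

```python
# Sketch of RAG content preparation: naive fixed-size chunking with overlap,
# then embedding each chunk. The vector storage upload step depends on the
# engine you choose, so it is only indicated here.
from openai import OpenAI

client = OpenAI()

def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character chunks (naive; real chunkers split on sentences or headings)."""
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
    return chunks

text = open("knowledge.txt", encoding="utf-8").read()
chunks = chunk_text(text)

# One embedding vector per chunk; these (vector, chunk) pairs are what
# you upload to the vector storage engine of your choice.
embeddings = client.embeddings.create(model="text-embedding-3-small", input=chunks)
vectors = [item.embedding for item in embeddings.data]
```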

Search flow

  • The user asks a question
  • The question is chunked
  • Each chunk is sent to vector storage
  • The vector storage returns possible answers
  • The answers are combined and sent to the LLM server
  • The LLM server returns the generated response to the user (a sketch follows this list)
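
And a matching sketch of the search side, with the simplification that the whole question is embedded as a single chunk and the vectors from the previous step are ranked in memory with cosine similarity; a real setup would query a vector storage engine instead:

```python
# Sketch of the RAG search flow: embed the question, rank stored chunks by
# cosine similarity, and hand the best ones to the LLM as context.
import numpy as np
from openai import OpenAI

client = OpenAI()

def answer(question: str, chunks: list[str], vectors: list[list[float]]) -> str:
    # Embed the question with the same model used for the stored chunks.
    q = client.embeddings.create(model="text-embedding-3-small", input=[question])
    q_vec = np.array(q.data[0].embedding)
    matrix = np.array(vectors)

    # Cosine similarity between the question and every stored chunk.
    scores = matrix @ q_vec / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(q_vec))
    top = [chunks[i] for i in scores.argsort()[::-1][:3]]

    # Combine the best chunks into the prompt and let the LLM generate the reply.
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": "Context:\n" + "\n---\n".join(top) + "\n\nQuestion: " + question},
        ],
    )
    return completion.choices[0].message.content
```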

Text preparation tools

Vector storage engines