Build Your Own RAG - Fully On-Premises

April 2, 2025

There are many scenarios where indexing data in the cloud is not allowed, whether because of corporate regulations or because of the kind of data involved. So how do you give the best answers without reaching out to the cloud?

Applications which leverage Retrieval Augmented Generation (RAG) comprise the following components.

  • A vector search engine as the application’s heart.

  • Enterprise search connectors which index data into the search engine.

  • Two large language models, which are needed for

    • generating embeddings, i.e., the vectors,

    • generating completions, i.e., answers in natural language and a certain amount of reasoning.

  • A search interface or bot integration so that (authenticated) users can interact with the RAG.
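To make the interplay of these components more concrete, here is a minimal in-memory sketch of the retrieve-then-generate flow. It is an illustration under loud assumptions, not a production design: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, the `DOCUMENTS` list stands in for what the connectors would index, and the final prompt would be handed to a completion model.

```python
import math
import re
from collections import Counter

def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a term-count vector."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Role of the vector search engine: return the k most similar documents."""
    q = toy_embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, toy_embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble the prompt that would be sent to the completion model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Made-up sample content; in a real setup the connectors index this data.
DOCUMENTS = [
    "The vacation policy grants 30 days of paid leave per year.",
    "The cafeteria opens at 11:30 and closes at 14:00.",
    "Expense reports must be submitted within 60 days.",
]

if __name__ == "__main__":
    question = "How many days of paid leave per year?"
    print(build_prompt(question, retrieve(question, DOCUMENTS, k=1)))
```

In a real deployment, the embedding model replaces `toy_embed`, the search engine replaces `retrieve`, and the completion model answers the assembled prompt.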

Operating Your RAG Fully On-Premises

Besides the user experience, there are only two components which you need to move from the cloud to on-premises so that your RAG is fully on-premises: the search engine and the large language models.

On-Premises Search Engines

Scalable, state-of-the-art search engines that support vector search on-premises include Elasticsearch and Apache Solr. Commercial search engines such as the Squirro Insight Engine can also be operated on-premises.
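As a rough sketch of what vector search looks like in practice, the following builds an index mapping and a kNN query body in the shape used by Elasticsearch 8.x (`dense_vector` fields and the `knn` search option). The field names `content` and `embedding`, the dimension 768 and the candidate factor are assumptions for illustration. The dimension has to match whatever embedding model you run.

```python
def build_index_mapping(dims: int = 768) -> dict:
    """Mapping with a text field and an indexed dense_vector field.

    Field names and the default dimension are illustrative assumptions.
    """
    return {
        "mappings": {
            "properties": {
                "content": {"type": "text"},
                "embedding": {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }

def build_knn_query(query_vector: list[float], k: int = 5) -> dict:
    """Search body for an approximate nearest-neighbour lookup."""
    return {
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            # Number of candidates per shard; must be at least k.
            "num_candidates": max(10 * k, 50),
        },
        "_source": ["content"],
    }
```

The mapping would be sent once when the index is created, and the query body to the index's `_search` endpoint, with `query_vector` coming from the same embedding model that was used at indexing time.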

On-Premises LLMs

When it comes to large language models, you can use Ollama and run common language models directly on-premises. The only requirement is sufficiently powerful hardware, as otherwise delays during indexing and, even worse, at search time will be far too long.
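For illustration, here is a minimal sketch of how both model roles could be called through Ollama's local REST API, using only the Python standard library. It assumes Ollama listens on its default address `http://localhost:11434`. The model names `nomic-embed-text` and `llama3.1` are example choices, not recommendations, and would need to be pulled first.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address

def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Create a JSON POST request against the local Ollama server."""
    return urllib.request.Request(
        OLLAMA_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed_request(text: str, model: str = "nomic-embed-text") -> urllib.request.Request:
    """Request an embedding vector for the given text (model name is an example)."""
    return build_request("/api/embed", {"model": model, "input": text})

def generate_request(prompt: str, model: str = "llama3.1") -> urllib.request.Request:
    """Request a non-streamed completion (model name is an example)."""
    return build_request(
        "/api/generate", {"model": model, "prompt": prompt, "stream": False}
    )

def call(request: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response. Needs a running Ollama."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running Ollama instance, `call(embed_request("some text"))["embeddings"][0]` should yield the vector to index, and `call(generate_request("some prompt"))["response"]` the generated answer. The generous timeout reflects the point above: on weak hardware, completions can take a long time.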

The enterprise search connectors and the search interface can be provided by the RheinInsights Retrieval Suite.

An Illustration of Retrieval Augmented Generation On-Premises

RheinInsights Retrieval Suite and Data Privacy

Our RheinInsights Retrieval Suite is a perfect fit for scenarios where content is not allowed to leave your premises. The Suite is not a managed service or cloud offering; it is operated fully under your sole governance. Even our Azure offering runs as an Azure Kubernetes Service (AKS) solution in your resource group, not in ours.
