Deploying Defog On-Prem for the Enterprise

Architecture

Defog is designed to be set up with minimal effort. You will not have to install anything manually on your machine except for Docker. The entire setup process is automated and can be completed in under 30 minutes.

There are three Docker images that you will need to deploy. These can all be deployed on the same machine or on separate machines. The images are:

  • defog-docker-end-user: This service handles user management and acts as a UI and authentication wrapper around the defog-backend.
  • defog-backend: This is the backend service that handles your metadata, golden queries, and user feedback. It also generates the prompts that are sent to the LLM and performs the final processing of the LLM output.
  • defog-vllm-onprem: This is the LLM service that generates the SQL queries from the prompts that are sent to it.
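
A single-host deployment might look like the following sketch, written with the Docker SDK for Python. The network name and host ports here are illustrative assumptions; the actual setup is handled by Defog's automated scripts.

    import docker

    client = docker.from_env()

    # One bridge network so the three services can reach each other by name.
    client.networks.create("defog-net", driver="bridge")

    # Image names come from the list above; host ports are assumed for illustration.
    services = [
        ("defog-vllm-onprem", {"8000/tcp": 8000}),
        ("defog-backend", {"8080/tcp": 8080}),
        ("defog-docker-end-user", {"80/tcp": 80}),
    ]

    for image, ports in services:
        client.containers.run(
            image,
            detach=True,
            name=image,
            network="defog-net",
            ports=ports,
        )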

The relationship between the images is outlined in the diagram below; architectural details about each image are included in the sections that follow.

[Diagram: Docker On-Prem Architecture]

defog-docker-end-user

This image is most relevant for the non-technical end users who interact with Defog. It serves the UI, handles authentication, and executes the SQL queries generated by the LLM. It also provides the frontend and the API layer that sits in front of the defog-backend.

Functionality

Through this image, general users can:

  • Log in and log out
  • Query data in plain English and visualize the results with semantic plotting (see the sketch after this list)

Additionally, admin users can:

  • Manage users
  • Connect to databases, and manage database connections
  • Define and manage database schemas
  • Align the model with instructions and golden queries
  • View and manage user feedback
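
As an illustration of the flows above, a client-side interaction with this service could look like the sketch below. The host, endpoint paths, and payload fields are assumptions made for illustration only; the actual API is defined by your deployment.

    import requests

    BASE = "http://localhost:80"  # assumed host and port

    session = requests.Session()

    # Log in (hypothetical endpoint and payload)
    session.post(f"{BASE}/login", json={"username": "analyst", "password": "..."})

    # Ask a question in plain English (hypothetical endpoint)
    resp = session.post(
        f"{BASE}/query",
        json={"question": "What were total sales last month?"},
    )
    print(resp.json())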

Tech Stack

This image uses a Python webserver built with the FastAPI framework. It comes with a built-in PostgreSQL database to store user information and uses SQLAlchemy as an ORM. However, users can choose to connect to any SQL database that SQLAlchemy supports (including SQL Server, MySQL, Oracle, and SQLite), as in the sketch below.
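
Because storage goes through SQLAlchemy, switching databases is a matter of changing the connection URL. The URLs below are illustrative examples of the supported dialects, not Defog's actual configuration values.

    from sqlalchemy import create_engine

    # Default: the built-in PostgreSQL database (credentials assumed for illustration)
    engine = create_engine("postgresql+psycopg2://defog:password@localhost:5432/defog")

    # Alternatives via other SQLAlchemy dialects:
    # engine = create_engine("mssql+pyodbc://user:pass@my_dsn")        # SQL Server
    # engine = create_engine("mysql+pymysql://user:pass@host/defog")   # MySQL
    # engine = create_engine("oracle+cx_oracle://user:pass@host/db")   # Oracle
    # engine = create_engine("sqlite:///defog.db")                     # SQLite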

The frontend is built using React.

defog-backend

Functionality

This image does the heavy lifting of converting a user's question into a fully formed prompt that can be sent to an LLM (a simplified sketch follows this list). It is responsible for:

  • Storing the metadata, instructions, and golden queries associated with a given API key
  • Selecting the appropriate subset of metadata, instructions, and golden queries for a given question
  • Generating prompts that can be sent to the LLM
  • Performing post-processing on the LLM output to generate the final SQL query
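
Conceptually, the first three responsibilities reduce to assembling a context-rich prompt around the user's question. The sketch below is a simplified illustration; the function name and prompt template are assumptions, not the actual implementation.

    def build_prompt(question: str, metadata: list[str],
                     instructions: list[str], golden_queries: list[str]) -> str:
        # The inputs are assumed to already be the relevant subset for this
        # question (selected via embedding similarity; see Tech Stack below).
        schema = "\n".join(metadata)
        rules = "\n".join(instructions)
        examples = "\n\n".join(golden_queries)
        return (
            f"### Database schema\n{schema}\n\n"
            f"### Instructions\n{rules}\n\n"
            f"### Example queries\n{examples}\n\n"
            f"### Question\n{question}\n\n"
            "Return a single SQL query that answers the question."
        )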

Tech Stack

Like defog-docker-end-user, this image uses a Python webserver built with the FastAPI framework. It comes with a built-in PostgreSQL database to store metadata, instructions, and golden queries, and uses SQLAlchemy as an ORM.

The pgvector extension is used to store the embeddings of the metadata, instructions, and golden queries. This allows for fast and efficient retrieval of the most relevant metadata, instructions, and golden queries for a given question – while eliminating the overhead of using a separate vector database.
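
For illustration, the kind of nearest-neighbour lookup that pgvector enables looks roughly like the sketch below, issued through SQLAlchemy. The table and column names are assumptions.

    from sqlalchemy import create_engine, text

    engine = create_engine("postgresql+psycopg2://defog:password@localhost:5432/defog")

    def most_relevant(query_embedding: list[float], k: int = 5) -> list[str]:
        # "<->" is pgvector's L2-distance operator; smaller means more similar.
        stmt = text(
            "SELECT content FROM golden_queries "
            "ORDER BY embedding <-> CAST(:q AS vector) LIMIT :k"
        )
        with engine.connect() as conn:
            rows = conn.execute(stmt, {"q": str(query_embedding), "k": k})
            return [row[0] for row in rows]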

However, the backend can be modified to connect to any other SQL database, and any other vector database. If you would like to use alternatives, please let your Defog support engineers know and they will modify the code accordingly.

defog-vllm-onprem

Functionality

This image hosts the LLM that generates SQL queries from the prompts sent to it. It is a modified version of vLLM, and contains the CUDA and cuDNN libraries as well as a modified version of vLLM's async server for handling multiple concurrent requests. It takes a prompt as input and generates a SQL query as output.
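
A client-side sketch of that contract follows. The endpoint path and JSON shape are assumptions modeled on vLLM's reference API server, not necessarily Defog's modified interface.

    import requests

    resp = requests.post(
        "http://localhost:8000/generate",  # assumed host, port, and path
        json={
            "prompt": "-- prompt assembled by defog-backend --",  # placeholder
            "max_tokens": 512,
        },
    )
    print(resp.json())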

Tech Stack

This image is built on top of the PyTorch base image and includes an optimized installation of CUDA and cuDNN by default. It uses the PyTorch backend for the LLM and the FastAPI framework for the webserver.
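
To make the stack concrete, a minimal FastAPI wrapper around vLLM's async engine might look like the sketch below. The endpoint, request shape, and model name are assumptions for illustration, not Defog's actual server code.

    import uuid

    from fastapi import FastAPI
    from pydantic import BaseModel
    from vllm import AsyncEngineArgs, AsyncLLMEngine, SamplingParams

    app = FastAPI()
    # Model name assumed for illustration.
    engine = AsyncLLMEngine.from_engine_args(AsyncEngineArgs(model="defog/sqlcoder-7b-2"))

    class GenerateRequest(BaseModel):
        prompt: str
        max_tokens: int = 512

    @app.post("/generate")
    async def generate(req: GenerateRequest):
        params = SamplingParams(temperature=0.0, max_tokens=req.max_tokens)
        request_id = str(uuid.uuid4())
        final = None
        # engine.generate yields partial outputs; keep the last (complete) one.
        async for output in engine.generate(req.prompt, params, request_id):
            final = output
        return {"sql": final.outputs[0].text}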