RAG4j/p

Workshop JFall 2024

2024-09-27T04:00:00+00:00

On 7 November 2024, Daniel and Jettro will visit Pathe in Ede for the annual JFall conference. In the morning, we will deliver our workshop "Build the best knowledge retriever for your Large Language Model". Generative AI is here to stay. Tools to generate text, images, or data are now commonplace. Large Language Models (LLMs) only have the knowledge they acquired during training, and even that knowledge does not include all the details. To overcome this knowledge problem, the Retrieval Augmented Generation (RAG) pattern arose. An essential part of RAG is retrieval. Retrieval is not new: the search domain is rich with tools, metrics and research. The new kid on the block is semantic search using vectors, which got a jump start with the rise of LLMs and RAG. In this workshop you build a high-quality retriever, integrate it into your LLM solution and measure the overall quality of your RAG system.

The workshop uses our Rag4j/Rag4p framework, which we created especially for workshops. It is easy to learn, so you can focus on understanding and building the details of the components during the workshop. You experiment with different chunking mechanisms (sentence, max tokens, semantic). After that, you use various strategies to construct the context for the LLM (TopN, Window, Document, Hierarchical). To find the optimum combination, you’ll use quality metrics for the retriever as well as the other components of the RAG system. You can do the workshop using Python or Java (21). We provide access to a remote LLM (OpenAI). You can also run an open-source LLM on Ollama on your local machine.
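To make the idea of a retriever quality metric concrete, here is a minimal sketch in plain Python. It does not use the Rag4p API; the function names `precision_at_k` and `hit_rate` and the chunk ids are assumptions made for illustration, and the workshop itself may use different metrics.

```python
def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the first k retrieved chunk ids that are relevant."""
    top_k = retrieved_ids[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for chunk_id in top_k if chunk_id in relevant_ids)
    return hits / len(top_k)


def hit_rate(results, k):
    """Share of questions with at least one relevant chunk in the top k.

    results: list of (retrieved_ids, relevant_ids) pairs, one per question.
    """
    if not results:
        return 0.0
    hits = sum(
        1
        for retrieved, relevant in results
        if any(chunk_id in relevant for chunk_id in retrieved[:k])
    )
    return hits / len(results)


# Hypothetical evaluation data: retrieved chunk ids and the expected ids per question.
results = [
    (["c1", "c7", "c3"], {"c3"}),
    (["c2", "c4", "c9"], {"c5"}),
]

for k in (2, 3):
    print(f"hit rate at {k}: {hit_rate(results, k)}")
```

Running the same evaluation for each combination of splitter and retrieval strategy gives numbers you can compare, which is the kind of experiment the workshop is about.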

We hope to see you in Pathe Ede and that you will enjoy the workshop. If you have any questions, feel free to contact us.

7th of November 2024 10:35 - 12:30 Space for Hands-on labs

Preparing for the workshop

First, choose your programming language (Java or Python). Then follow the instructions below:

Java: RAG4j

Python: RAG4p

The repositories contain a README.md file with instructions on setting up your environment.

The presentation used during the workshop is available here.

Introducing Rag4p GUI

2024-07-04T04:00:00+00:00

Yesterday was the first public appearance of my latest project, Rag4p-GUI. This project is a graphical interface for the Rag4p library. The goal of Rag4p is to provide a basic library or framework that we can use during workshops. You can understand the complete framework in an hour, which is essential for our workshops. The problem was that the project lacked a GUI, so I always used the command line during presentations, which distracted from the information I wanted to share. Therefore, I started working on a GUI. It may have gotten a bit out of hand. In this post, I want to share the first version of the GUI. I’ll discuss the features that it has right now.

The GUI consists of three parts. The first part is indexing the content into a store. Rag4p provides access to Weaviate, OpenSearch and a custom in-memory database. The second part deals with retrieving the context for the Large Language Model. Again, you can use Weaviate, OpenSearch, or a custom in-memory database. The third part is the generation of the answer. Currently, the Rag4p project provides access to OpenAI, Amazon Bedrock and Ollama for LLMs.

Indexing

The project can handle different datasets. It comes with three datasets, most of them in a small and a larger variant, and a reader class for each dataset because their layouts are not uniform. Another essential component is the splitter. The project offers three options: the sentence splitter, the max token splitter, and the single chunk splitter. You can test drive the different splitters using the Chunking tab, found after choosing the Indexing tab.
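The three splitter ideas can be sketched in a few lines of plain Python. This is not the Rag4p implementation: the function names are hypothetical, the sentence splitter uses a simple regular expression, and the max token splitter counts whitespace-separated words as a stand-in for real tokenizer tokens.

```python
import re


def split_sentences(text):
    """Split after sentence-ending punctuation that is followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def split_max_tokens(text, max_tokens):
    """Group whitespace-separated tokens into chunks of at most max_tokens."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    tokens = text.split()
    return [
        " ".join(tokens[start:start + max_tokens])
        for start in range(0, len(tokens), max_tokens)
    ]


def split_single_chunk(text):
    """Keep the whole text as one chunk."""
    return [text.strip()]


text = (
    "Retrieval is not new. Vector search is the new kid on the block. "
    "It got a jump start with LLMs."
)

for chunk in split_sentences(text):
    print(chunk)
```

Smaller chunks tend to match a question more precisely but carry less context, which is why it helps to try several splitters on the same dataset.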

With the proper chunks, it is time to talk about embeddings. When working with semantic search and vectors to capture the semantics of your chunk, you have to choose the right embedder. The Rag4p project offers four embedding providers: OpenAI, Bedrock, Ollama and Onnx. Some providers offer different models. You can select the model you want to use. The Embedding tab allows you to test the different embedders. By selecting one of the backups of the internal data stores, you can visualise the embeddings and enter your query to locate the query embedding next to the embeddings in your store. The following image shows a sample of the screen.
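To illustrate what it means for a query embedding to sit close to a chunk embedding, here is a minimal cosine similarity sketch. The three-dimensional vectors and chunk ids are made up for illustration; real embedders return vectors with hundreds of dimensions, and this code does not call any of the providers mentioned above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_vector, store):
    """Return (chunk_id, score) pairs sorted by descending similarity."""
    scored = [
        (chunk_id, cosine_similarity(query_vector, vector))
        for chunk_id, vector in store.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings.
store = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [0.7, 0.7, 0.0],
}
query = [0.9, 0.1, 0.0]

ranking = nearest(query, store)
for chunk_id, score in ranking:
    print(chunk_id, round(score, 3))
```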

Creating embeddings using services like OpenAI and Bedrock costs money and time. Therefore, the GUI provides an interface to store embeddings in a vector store and back up the in-memory store. You can manage the collections in Weaviate and OpenSearch through the interface. The following table presents an overview of the different collections and how they were realised, including what splitter, embedding, and dataset were used.

Retrieving

The retrieval part of the GUI is the most straightforward part. You choose the store to use: the internal store, Weaviate or OpenSearch. For each store, you select the collection to use. For Weaviate and OpenSearch, you can choose hybrid search. Finally, you select the retrieval strategy together with the number of relevant chunks to retrieve. The following image shows the retrieval screen.

At the moment, Rag4p supports three retrieval strategies. The first strategy is the TopN strategy. This strategy retrieves the top N chunks based on the similarity between the query and the chunks. The second strategy is the Window strategy. This strategy also retrieves the top N chunks based on similarity, and for each of them adds the chunks before and after it. The third strategy is the Document strategy. This strategy retrieves the complete document that the matching chunk belongs to. The document retrieval includes the metadata of the document. This way, you can return information about the document’s title and author. The following image shows the retrieval screen.
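The difference between the three strategies is easiest to see in code. The sketch below is not the Rag4p API: the `Chunk` class and the function names are hypothetical, `ranked` is assumed to be a list of chunks already sorted by similarity to the query, and document metadata is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    document_id: str
    position: int  # index of the chunk within its document
    text: str


def top_n(ranked, n):
    """TopN: the n best-matching chunks, each used as-is."""
    return [chunk.text for chunk in ranked[:n]]


def window(ranked, n, documents, size=1):
    """Window: each of the n best chunks plus `size` neighbours on both sides."""
    contexts = []
    for chunk in ranked[:n]:
        chunks = documents[chunk.document_id]
        start = max(0, chunk.position - size)
        end = min(len(chunks), chunk.position + size + 1)
        contexts.append(" ".join(c.text for c in chunks[start:end]))
    return contexts


def document(ranked, n, documents):
    """Document: the complete document for each of the n best chunks."""
    return [
        " ".join(c.text for c in documents[chunk.document_id])
        for chunk in ranked[:n]
    ]


# Hypothetical document with four chunks; the third one matched the query.
documents = {
    "doc-1": [Chunk("doc-1", i, t) for i, t in enumerate(["A.", "B.", "C.", "D."])]
}
ranked = [documents["doc-1"][2]]

print(top_n(ranked, 1))
print(window(ranked, 1, documents))
print(document(ranked, 1, documents))
```

Each step from TopN to Window to Document hands the LLM more surrounding text for the same match, at the cost of a longer prompt.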

Generating

Here we combine everything we have done so far. You select the content store and the collection to use. Then you select the retrieval strategy and the number of relevant chunks to retrieve. Finally, you select the generator to use. The generator can be OpenAI, Bedrock or Ollama. The following image shows the generation screen. We used the document retrieval strategy in combination with hybrid search. With hybrid search, the speakers field is also searched. Therefore, our talk at TeqNation is found.

The GUI is available on GitHub. The project is still in its early stages.

Workshop Teqnation 2024

2024-04-04T04:00:00+00:00

On 22 May 2024, Daniel and Jettro will visit DeFabrique in Utrecht for the annual Teqnation conference. In the morning, we will deliver our workshop "The Art of Questions: Creating a Semantic Search-Based Question-Answering System with LLMs". This workshop makes use of our RAG4j/p project to teach a group of enthusiasts about the different aspects of Retrieval Augmented Generation. We will use the project to show the different parts of a RAG system. Time is short, but we still want participants to experience the essential parts of a RAG system. As the project is small, little time is lost learning it.

People attending the workshop will start from the beginning and perform all the steps needed to build and evaluate a RAG system. To give you an idea, here is a list of those steps:

Work with text; use a splitter to split the text into sentences or other chunks.
Use an embedder to create vectors from the chunks
Use a retriever to find the most relevant chunks for a given question
Use a generator to generate an answer based on the chunks and the question
Use a quality component to determine the quality of the generated text and the relevance of the retrieved chunks
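The steps above can be sketched end to end in plain Python. This is only an illustration under loud assumptions: it does not use RAG4j or RAG4p, the embedder is a toy word-count vector instead of a learned embedding, the generator step only builds the prompt an LLM would receive, and the quality check merely tests whether an expected chunk was retrieved.

```python
import math
import re
from collections import Counter


def split(text):
    """Step 1: split the text into sentence chunks."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def embed(text):
    """Step 2 (toy embedder): lowercase word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, index, n=1):
    """Step 3: return the n chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(index, key=lambda item: similarity(query, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:n]]


def build_prompt(question, context):
    """Step 4 (stand-in for the generator): the prompt an LLM would receive."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


def retrieved_expected(context, expected_chunk):
    """Step 5 (toy quality check): was the expected chunk retrieved?"""
    return expected_chunk in context


text = (
    "A splitter turns text into chunks. An embedder turns chunks into vectors. "
    "A retriever finds relevant chunks for a question."
)
index = [(chunk, embed(chunk)) for chunk in split(text)]
question = "What does an embedder do?"
context = retrieve(question, index)
prompt = build_prompt(question, context)
print(prompt)
```

In a real system each function would be replaced by a proper component, but the data flow between the steps stays the same.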

We hope to see you in DeFabrique and that you will enjoy the workshop. If you have any questions, feel free to contact us.

22nd of May 2024 11:00 - 12:45 Room KALVERMELK 2C

Preparing for the workshop

First, choose your programming language (Java or Python). Then follow the instructions below:

Java: RAG4j

Python: RAG4p

The repositories contain a README.md file with instructions on how to set up your environment.

Workshop jFokus 2024

2024-01-24T04:00:00+00:00

From 5 to 7 February 2024, Daniel and Jettro will visit Stockholm for the annual jFokus conference. On Monday, we will deliver our workshop "Creating a Semantic Search-Based Question-Answering System with LLMs". This session is the first occasion where we use our RAG4j/p project to teach a group of enthusiasts about the different aspects of Retrieval Augmented Generation. We will use the project to show the different parts of a RAG system. There is plenty of time to experiment with the different parts. As the project is small, little time is lost learning it.

Work with text; use a splitter to split the text into sentences or other chunks.
Use an embedder to create vectors from the chunks
Use a retriever to find the most relevant chunks for a given question
Use a generator to generate an answer based on the chunks and the question
Use a quality component to determine the quality of the generated text and the relevance of the retrieved chunks

We hope to see you in Stockholm and that you will enjoy the workshop. If you have any questions, feel free to contact us.

5 February 09:00 - 12:30 Room 26

Starting the workshop

First, choose your programming language (Java or Python). Then follow the instructions below:

(Repositories become available on the day of the workshop)

Java: RAG4j

Python: RAG4p

The repositories contain a README.md file with instructions on how to set up your environment.

The presentation used during the workshop is available here.

Rag4p now available on PyPI

2024-01-24T04:00:00+00:00

The Rag4p project was started to facilitate workshops. When working on other projects, I kept copying code from Rag4p into the project I was working on. This was not very efficient. Therefore, I decided to make the project available on PyPI. This way, I can use it as a dependency in other projects.

The project is available on PyPI

The source code of the project is available on GitHub

DSPy - Programming Language Models using Python

This is a sample application to learn about DSPy. DSPy separates the flow of your program, expressed as modules, from its parameters, such as prompts and weights, so that those parameters can be optimised. It is a nice framework for creating RAG, ReAct, and ChainOfThought programs. The framework is available on GitHub.

You can find the Documentation here.

The project is available on GitHub. At the moment, this project is under development. The project demonstrates how to import content from a WordPress blog into Weaviate and use DSPy to create a RAG and a multi-step RAG solution.

Welcome to Rag4j!

2024-01-14T12:35:54+00:00

And we are live! This website is the homepage of RAG4j, a small Java framework for working with RAG. Initially, it is meant for people who want to learn the different parts of a RAG system. If you are not familiar with RAG, head over to the Documentation section. There, we explain the basic concepts.

The main project will be available at rag4j. We are still working on JavaDoc, testing, and preparing the code for the public. The project will first appear at jFokus 2024. We might make small changes based on the conference or the customer for whom we do the workshop. We will always add a blog post for each conference, to highlight what we will do at that specific event.

Welcome to Rag4p!

2024-01-14T12:35:54+00:00

When we started with this project, we wanted to create an easy way to learn how to build RAG systems on the JVM. However, we did promise to give people a choice. Therefore, we decided to create RAG4p next to RAG4j. RAG4p is a small Python framework for working with RAG. Initially, it is meant for people who want to learn the different parts of a RAG system. If you are not familiar with RAG, head over to the Documentation section. There, we explain the basic concepts.

The main project is available at rag4p. We are still working on documentation, testing, and preparing the code for the public. The project will first appear at jFokus 2024. We might make small changes based on the conference or the customer for whom we do the workshop. We will always add a blog post for each conference, to highlight what we will do at that specific event.