Meet Verba: An open-source tool for building your own RAG (Retrieval-Augmented Generation) pipeline and using LLMs for source-grounded output

Verba is an open-source project that provides a simplified, user-friendly interface for RAG applications, letting users dive into their data and quickly start relevant conversations.

Verba is more of a companion than just a tool when it comes to querying and working with data. Whether summarizing documents, comparing and contrasting multiple sets of figures, or analyzing data, Verba makes it possible through Weaviate and Large Language Models (LLMs).


Built on Weaviate’s sophisticated generative search engine, Verba automatically pulls the necessary background information from documents whenever a search is performed, and uses the processing power of LLMs to provide comprehensive, context-aware answers. Verba’s straightforward layout makes all this information easy to retrieve, and its data-import features support various file formats such as .txt, .md, and others. Verba automatically chunks and vectorizes data before feeding it into Weaviate, making it more suitable for search and retrieval.
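As a rough illustration of the chunking step, the sketch below splits a document into overlapping word-based windows before vectorization. The window and overlap sizes are assumptions for illustration, not Verba's actual defaults:

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into overlapping word-based chunks ready for embedding."""
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(words), step):
        chunk = " ".join(words[start:start + chunk_size])
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(words):
            break  # last window already covers the end of the text
    return chunks
```

The overlap ensures that a sentence straddling a chunk boundary still appears intact in at least one chunk, which helps retrieval quality.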

Use the Generate module and hybrid search options available in Weaviate to your advantage when working with Verba. These sophisticated search methods scan through documents for relevant reference passages, which are then passed to large language models to produce in-depth responses to queries.
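Conceptually, hybrid search blends a keyword-matching score with a vector-similarity score. The toy sketch below illustrates that blending; it is not Weaviate's implementation, though the alpha parameter mirrors how Weaviate weights the two signals:

```python
import math

def keyword_score(query: str, doc: str) -> float:
    """Fraction of query terms present in the document (toy BM25 stand-in)."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_score(kw: float, vec: float, alpha: float = 0.5) -> float:
    """Blend the two signals: alpha=1.0 is pure vector, alpha=0.0 pure keyword."""
    return alpha * vec + (1 - alpha) * kw
```

Tuning alpha lets you favor exact term matches for jargon-heavy queries or semantic similarity for paraphrased ones.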

To improve the speed of future searches, Verba embeds both generated results and queries in Weaviate’s semantic cache. Before answering a question, Verba will look in its semantic cache to see if a similar answer has already been given.

An OpenAI API key is required regardless of deployment method to enable data import and query capabilities. Add the API key to the system environment variables or create a .env file when cloning the project.
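For example, a minimal .env file in the project root might look like this (the key value is a placeholder):

```shell
# .env — loaded when Verba starts
OPENAI_API_KEY=sk-your-key-here
```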

Verba allows one to connect to Weaviate instances in a variety of ways, depending on the specific use case. If the VERBA_URL and VERBA_API_KEY environment variables are not set, Verba falls back to Weaviate Embedded. For prototyping and testing, this local deployment is the easiest way to launch the Weaviate database.
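The fallback described above amounts to a simple environment-variable check. The sketch below mirrors only that selection logic; the returned settings dictionary is a hypothetical stand-in for the actual client construction:

```python
import os

def choose_weaviate_target() -> dict:
    """Pick a remote Weaviate instance when VERBA_URL and VERBA_API_KEY
    are both set; otherwise fall back to Weaviate Embedded."""
    url = os.environ.get("VERBA_URL")
    api_key = os.environ.get("VERBA_API_KEY")
    if url and api_key:
        return {"mode": "remote", "url": url, "api_key": api_key}
    return {"mode": "embedded"}  # local database, good for prototyping
```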

Verba provides simple instructions for importing data for further processing. Before importing, note that Verba currently uses OpenAI models only, and that using them incurs charges against the configured API key. Data embedding and answer generation are the primary cost drivers.
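As a back-of-the-envelope aid, embedding cost scales linearly with token count. The helper below takes the per-1K-token price as a parameter rather than hard-coding any current OpenAI rate, since pricing changes over time:

```python
def embedding_cost(num_tokens: int, price_per_1k_tokens: float) -> float:
    """Estimate the dollar cost of embedding a corpus of num_tokens tokens."""
    return (num_tokens / 1000) * price_per_1k_tokens
```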

You can give it a shot.

Verba has three main parts:

  • A Weaviate database, which anyone can host on Weaviate Cloud Service (WCS) or on their own servers.
  • A FastAPI endpoint that mediates between the Large Language Model provider and the Weaviate data store.
  • A React frontend (served as a static app via FastAPI) that provides a dynamic user interface for data exploration and manipulation.

Check out the GitHub repository and give it a try. All credit for this research goes to the researchers on this project. Also, don’t forget to join our 30k+ ML SubReddit, 40k+ Facebook community, Discord channel, and email newsletter, where we share the latest AI research news, cool AI projects, and more.


Dhanashree Shenwai is a Computer Science Engineer with a keen interest in the applications of AI and has good experience in FinTech companies covering Financial, Cards & Payments and Banking domains. She is passionate about discovering new technologies and advancements that make everyone’s life easier in today’s developing world.

