Developing a RAG chatbot on NextJS: Firebase Genkit and Firestore Tutorial

We will build an AI bot that answers questions about documentation using the new Firebase Genkit and Firestore Vector DB.

Samyak Jain


Many of us are adding generative AI workflows to our apps. A common problem is getting LLMs to answer based on a dataset that we define.

We may want to provide information like:

  • Our internal documents like PDFs or reports.

  • Data provided by the user like notes, forms, appointment history etc.

  • Or the latest information, like news articles and Wikipedia pages.

Building a RAG pipeline:

To solve this, we will build a RAG pipeline that performs a semantic search over a corpus of thousands of documents and extracts the data most relevant to the user's query.

For this article, we will pick a website and create a chatbot that answers questions about its content.
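The retrieve-then-generate shape of a RAG pipeline can be sketched without any library. In the sketch below, `Retriever`, `Generator`, and `answerWithRag` are hypothetical names used only for illustration, and both steps are synchronous toy stand-ins; the Genkit and Firestore versions of these steps come later in the article and are asynchronous.

```typescript
// Hypothetical shapes for the two steps of a RAG pipeline (not Genkit APIs).
type Retriever = (query: string, limit: number) => string[];
type Generator = (prompt: string) => string;

// Retrieve the most relevant passages, then ask the model to answer from them.
function answerWithRag(
  query: string,
  retrieve: Retriever,
  generate: Generator,
  limit = 3,
): string {
  const passages = retrieve(query, limit);
  const prompt = `Question: ${query}\nContext:\n${passages.join("\n\n")}`;
  return generate(prompt);
}

// Toy stand-ins so the sketch runs without a vector store or an LLM.
const corpus = ["Genkit flows wrap AI logic.", "Firestore stores documents."];

// Keyword matching here is only a placeholder for real semantic search.
const keywordRetriever: Retriever = (query, limit) => {
  const words = query.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  return corpus
    .filter((p) => words.some((w) => p.toLowerCase().includes(w)))
    .slice(0, limit);
};

// Echoes the prompt size instead of calling a model.
const echoGenerator: Generator = (prompt) => `PROMPT(${prompt.length} chars)`;

console.log(answerWithRag("firestore", keywordRetriever, echoGenerator));
```

The point of the sketch is the order of operations: the retriever narrows the corpus first, and only the selected passages reach the model.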

Getting Help: Dash Agent for Firebase Genkit

Building with a new package is difficult because it requires learning its concepts first. To save ourselves time reading the Firebase Genkit documentation, we will use the @Firebase Genkit agent in CommandDash.

Dash agents stay up to date with the latest docs, examples and GitHub issues, and assist us from the IDE.

We can install it from the CommandDash marketplace in VS Code.

Firebase Genkit agent in commanddash marketplace

Now, with the Firebase Genkit agent in place, we can directly ask it to help us build our use case. Any interactions with the agent are quoted like below:

Hi Firebase Genkit! Let's start building 👋🏼

Getting Started

How to set up a Next.js project to use Firebase Genkit and Firestore

Here's a comprehensive guide on setting up a Next.js project to use Firebase Genkit with Firestore.

Prerequisites

  • Firebase Project: You need a Firebase project. If you don't have one, create one at https://console.firebase.google.com.

  • Node.js and npm: Ensure you have Node.js and npm installed on your machine.

  • Firebase CLI: Install the Firebase CLI using npm install -g firebase-tools.

Steps

  1. Create a Next.js Project

    npx create-next-app@latest my-genkit-nextjs-app
    cd my-genkit-nextjs-app

    Choose TypeScript as the language during setup.

  2. Initialize Genkit

    npm i -g genkit
    genkit init
    • Select Next.js as your deployment platform.

    • Select your preferred model provider (e.g., Gemini, Vertex AI).

    • Accept the defaults for other prompts.

    This will set up your project with sample Genkit files.

  3. Install the Firebase Plugin

    npm install --save @genkit-ai/firebase
  4. Configure the Firebase Plugin

    • In your genkit.ts file:

      import { firebase } from '@genkit-ai/firebase';
      import { configureGenkit } from '@genkit-ai/core';
       
      configureGenkit({
          plugins: [firebase({ projectId: 'your-firebase-project-id' })], // Replace 'your-firebase-project-id'
          logLevel: 'debug',
          enableTracingAndMetrics: true
      });
  5. Set up Firestore

    • Initialize the Firebase Admin SDK in your genkit.ts file:

      import * as admin from 'firebase-admin';
      import { getFirestore } from 'firebase-admin/firestore';
       
      // ... other imports
       
      // Initialize the Admin SDK with a service account key
      const app = admin.initializeApp({
        credential: admin.credential.cert('path/to/your/firebase-service-account-key.json'),
      });
      const firestore = getFirestore(app);
       
      // ... rest of your Genkit code
    • Replace path/to/your/firebase-service-account-key.json with the path to your Firebase service account key file. You can download this key file from the Firebase console (Project Settings > Service Accounts).

Populating Data in Firestore

For this article, we will pick the Firebase Genkit documentation and index it to our Firestore DB.

Indexing means converting text into embeddings, which can then be used to rank pieces of text by their semantic similarity. To make it easy, I've extracted the Genkit documentation into this JSON file.
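To make "ranking by semantic similarity" concrete, here is a minimal sketch that ranks a few texts against a query using cosine similarity. The three-dimensional vectors are made up for illustration (real embeddings have hundreds of dimensions), `cosineSimilarity` and `rankBySimilarity` are hypothetical helpers, and cosine similarity is only one common measure; the Firestore retriever used later in this article may be configured with a different one.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface EmbeddedText {
  text: string;
  embedding: number[];
}

// Return the `limit` items whose embeddings are closest to the query embedding.
function rankBySimilarity(
  query: number[],
  items: EmbeddedText[],
  limit: number,
): EmbeddedText[] {
  return [...items]
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.embedding) -
        cosineSimilarity(query, x.embedding),
    )
    .slice(0, limit);
}

// Made-up embeddings, for illustration only.
const items: EmbeddedText[] = [
  { text: "Deploying flows", embedding: [0.9, 0.1, 0.0] },
  { text: "Vector search", embedding: [0.1, 0.9, 0.2] },
  { text: "Billing", embedding: [0.0, 0.1, 0.9] },
];

const top = rankBySimilarity([0.2, 0.8, 0.1], items, 2);
console.log(top.map((item) => item.text)); // [ 'Vector search', 'Deploying flows' ]
```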

Now let's ask the Dash Agent:

Write me a method that extracts data from the json file with schema [{title, description, content}] and saves the indexed data to the "documents" collection in firestore.

Chunk the data appropriately within the embedding model limits.

// app/genkit.ts
import { configureGenkit } from "@genkit-ai/core";
import { defineFirestoreRetriever, firebase } from "@genkit-ai/firebase";
import { googleAI } from "@genkit-ai/googleai";
import * as admin from 'firebase-admin';
import { getFirestore, FieldValue } from "firebase-admin/firestore";
import { textEmbeddingGecko001 } from "@genkit-ai/googleai";
import { embed } from "@genkit-ai/ai/embedder";
import * as fs from 'fs';
 
// your existing code
const firestore = getFirestore(app);
 
const indexConfig = {
  collection: "documents",
  contentField: "content",
  vectorField: "embedding",
  embedder: textEmbeddingGecko001,
};
 
// Exported so that the API route below can import it
export async function indexWebsiteContent() {
  const filePath = 'path/to/your/downloaded/file.json';
  const data = JSON.parse(fs.readFileSync(filePath, 'utf-8'));
 
  for (const document of data) {
    // Handle embedding length limits
    const chunks = chunkText(document.content, 4500);
    for (const chunk of chunks) {
      const embedding = await embed({
        embedder: indexConfig.embedder,
        content: chunk,
      });
 
      await firestore.collection(indexConfig.collection).add({
        [indexConfig.vectorField]: FieldValue.vector(embedding),
        [indexConfig.contentField]: chunk,
        title: document.title,
        description: document.description,
      });
 
      console.log(`Indexed chunk of document: ${document.title}`);
    }
  }
}
 
function chunkText(text: string, chunkSize: number): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}
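One limitation of `chunkText` is that it cuts at fixed character offsets, so a sentence can be split between two chunks and lose its context in both. A common variation, sketched below, lets consecutive chunks overlap. `chunkTextWithOverlap` is a hypothetical helper and the overlap size is an assumption to tune, not a value taken from the Genkit docs.

```typescript
// Split text into chunks of at most `chunkSize` characters, where each chunk
// repeats the last `overlap` characters of the previous one.
function chunkTextWithOverlap(
  text: string,
  chunkSize: number,
  overlap: number,
): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be at least 0 and smaller than chunkSize");
  }

  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a chunk has reached the end of the text, so the tail is
    // not emitted again as a shorter duplicate.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

console.log(chunkTextWithOverlap("abcdefghij", 4, 2));
// [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

With an overlap of 0 this behaves like the original `chunkText`. A larger overlap stores more duplicated text and requires more embedding calls, so it is a trade-off rather than a free improvement.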

Now, we will run indexWebsiteContent as a one-time API call to save our data to Firestore for querying later.

// pages/api/index-website-content.ts
import { NextApiHandler, NextApiRequest, NextApiResponse } from 'next';
import { indexWebsiteContent } from '../../app/genkit';
 
const handler: NextApiHandler = async (req: NextApiRequest, res: NextApiResponse) => {
  try {
    await indexWebsiteContent();
    res.status(200).json({ message: 'Website content indexed successfully' });
  } catch (error) {
    console.error(error);
    res.status(500).json({ message: 'Error indexing website content' });
  }
};
 
export default handler;

Upon running:

npm run dev
curl http://localhost:3000/api/index-website-content

The Firestore DB is now populated 🙌🏼 with embeddings in vector format!
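Because the indexing function writes with `add()`, which generates a fresh document ID on every call, hitting the endpoint a second time would store every chunk twice. One way to make re-runs safe is to derive the document ID from the chunk itself. The sketch below is an assumption about how this could be done, not part of the original code; `chunkDocId` is a hypothetical helper.

```typescript
import { createHash } from "node:crypto";

// Derive a stable document ID from the source title, the chunk's position,
// and its text, so that indexing the same chunk twice targets the same document.
function chunkDocId(title: string, chunkIndex: number, chunk: string): string {
  return createHash("sha256")
    // A NUL separator keeps ("ab", 1) and ("a", "b1") style inputs distinct.
    .update(`${title}\u0000${chunkIndex}\u0000${chunk}`)
    .digest("hex")
    .slice(0, 32); // 32 hex characters is plenty for uniqueness here
}

const id = chunkDocId("Get started", 0, "Genkit is a framework for AI apps.");
console.log(id.length); // 32
```

With an ID like this, the write in the indexing loop could use `firestore.collection(indexConfig.collection).doc(id).set({...})` instead of `add({...})`, so a second run overwrites existing chunks rather than duplicating them.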

Firestore DB populated with Vector indexes for RAG

Asking questions on this data

Moving back to Firebase Genkit Agent:

Create a chatRequest method that retrieves the data from the collection and creates a model request to answer user's query.

Use top 3 documents with Gemini Pro model.

import { configureGenkit } from "@genkit-ai/core";
import { defineFirestoreRetriever, firebase } from "@genkit-ai/firebase";
import { geminiPro, googleAI } from "@genkit-ai/googleai";
import * as admin from 'firebase-admin';
import { getFirestore, FieldValue } from "firebase-admin/firestore";
import { textEmbeddingGecko001 } from "@genkit-ai/googleai";
import { embed } from "@genkit-ai/ai/embedder";
import { generate } from "@genkit-ai/ai";
import * as fs from 'fs';
 
/// your existing code
 
const firestoreRetriever = defineFirestoreRetriever(
  {
    name: 'firestoreRetriever',
    firestore: firestore,
    collection: indexConfig.collection,
    vectorField: indexConfig.vectorField,
    contentField: indexConfig.contentField,
    embedder: indexConfig.embedder,
  }
)
export async function indexWebsiteContent() {
    // no change here
}
 
export async function chatRequest(query: string) {
  // Retrieve the top 3 documents most similar to the query
  const docs = await firestoreRetriever({
    query: {
      content: [{ text: query }],
    },
    options: { limit: 3 },
  });
 
  // Join the retrieved chunks into a single context string
  const context = docs.documents.map(doc => doc.content[0].text).join('\n\n');
 
  const response = await generate({
    model: geminiPro,
    prompt: `
      You are a helpful and informative AI assistant.
      Answer the user's question.
 
      Question: ${query}
      Context: ${context}
    `,
  });
 
  return response.text();
}
 
// your existing code
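The prompt in `chatRequest` passes the retrieved text to the model but does not say how the model should use it. A possible refinement, sketched below, numbers the passages and asks the model to stay within them. `buildPrompt` is a hypothetical helper and the instruction wording is an assumption, not a recommendation from the Genkit docs.

```typescript
// Build a prompt that separates the question from numbered context passages
// and tells the model what to do when the context is not enough.
function buildPrompt(query: string, passages: string[]): string {
  const context =
    passages.length > 0
      ? passages.map((p, i) => `[${i + 1}] ${p.trim()}`).join("\n\n")
      : "(no relevant passages found)";

  return [
    "You are a helpful and informative AI assistant.",
    "Answer the question using only the context below.",
    "If the context does not contain the answer, say that you don't know.",
    "",
    `Question: ${query}`,
    "Context:",
    context,
  ].join("\n");
}

console.log(
  buildPrompt("What is a flow?", [
    "Flows are functions with extra features.",
    "Flows can be deployed as HTTP endpoints.",
  ]),
);
```

In `chatRequest`, the `passages` argument would be the texts extracted from `docs.documents`, and the returned string would replace the inline template passed as `prompt`.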

Done! The entire flow is ready for us.

Now, we can first test the chatRequest method as an API, as we did before.

// pages/api/chat-request.ts
import { NextApiHandler, NextApiRequest, NextApiResponse } from 'next';
import { chatRequest } from '../../app/genkit';
 
const handler: NextApiHandler = async (req: NextApiRequest, res: NextApiResponse) => {
    try {
        // Assuming the query is passed in the request body
        const { query } = req.body;
 
        // If no query is provided, send an error response
        if (!query) {
            return res.status(400).json({ message: 'Please provide a query' });
        }
 
        let result = await chatRequest(query);
        res.status(200).json({ message: result });
    } catch (error) {
        console.error(error);
        res.status(500).json({ message: 'Error processing chat request' });
    }
};
 
export default handler;
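The `if (!query)` check in the handler accepts any truthy value, including objects and whitespace-only strings. A stricter check could look like the sketch below. `parseQuery` is a hypothetical helper, and the length limit is an arbitrary assumption.

```typescript
// Assumed upper bound on the question length; adjust to your needs.
const MAX_QUERY_LENGTH = 2000;

// Return the trimmed query if the request body contains a usable one,
// otherwise null so the handler can respond with a 400.
function parseQuery(body: unknown): string | null {
  if (typeof body !== "object" || body === null) return null;

  const query = (body as Record<string, unknown>).query;
  if (typeof query !== "string") return null;

  const trimmed = query.trim();
  if (trimmed.length === 0 || trimmed.length > MAX_QUERY_LENGTH) return null;

  return trimmed;
}

console.log(parseQuery({ query: "  What is Genkit?  " })); // What is Genkit?
console.log(parseQuery({ query: "   " })); // null
```

In the handler, `const query = parseQuery(req.body)` followed by a null check would replace the destructuring and the `!query` test.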

Running this request returns an error about a missing vector index.

Error: 9 FAILED_PRECONDITION: Missing vector index configuration. Please create the required index with the following gcloud command: gcloud alpha firestore indexes composite create --project={your project name} --collection-group=documents --query-scope=COLLECTION --field-config=vector-config='{"dimension":"768","flat": "{}"}',field-path=embedding

This indicates we need to first set the Vector Index configuration for the Firestore DB using gcloud CLI. Once the CLI is installed, run this command:

gcloud alpha firestore indexes composite create \
  --project={project} \
  --collection-group=documents \
  --query-scope=COLLECTION \
  --field-config=vector-config='{"dimension":"768","flat": "{}"}',field-path=embedding

You can copy the values specific to your project from the error message itself. Once the index is created, you get its ID.
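The index in the command above is created for 768-dimensional vectors, so the vectors written to the `embedding` field need to have that length. A small guard, sketched below, can catch a mismatch (for example after switching to a different embedding model) before anything is written. `assertEmbeddingDimension` is a hypothetical helper; the value 768 is taken from the error message, not from the embedder's documentation.

```typescript
// Dimension taken from the vector-config in the gcloud command.
const INDEX_DIMENSION = 768;

// Throw if the embedding does not match the index configuration;
// otherwise return it unchanged so the call can be used inline.
function assertEmbeddingDimension(
  embedding: number[],
  expected: number = INDEX_DIMENSION,
): number[] {
  if (embedding.length !== expected) {
    throw new Error(
      `Expected a ${expected}-dimensional embedding, got ${embedding.length}`,
    );
  }
  if (!embedding.every((value) => Number.isFinite(value))) {
    throw new Error("Embedding contains non-finite values");
  }
  return embedding;
}

const sample: number[] = new Array(768).fill(0.1);
console.log(assertEmbeddingDimension(sample).length); // 768
```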

Now, we re-run the same API request, and this time it succeeds 🙌🏼

Successful Postman request

Building the UI

We will ask CommandDash to create just a basic UI that calls the chatRequest method.

Select the snippets below, then use Attach Snippets to Dash in the menu bar to provide them as context.

[page.tsx] [genkit.ts] [pages/api/chat-request.ts] update my next JS page to have a textfield and button that calls that chat request method and shows the response in the text box.

// app/page.tsx
"use client";
 
import { useState } from "react";
 
export default function Home() {
    const [query, setQuery] = useState("");
    const [response, setResponse] = useState("");
 
    const handleSubmit = async (event: React.FormEvent<HTMLFormElement>) => {
        // Stop the browser from reloading the page on submit
        event.preventDefault();
 
        try {
            const result = await fetch("/api/chat-request", {
                method: "POST",
                headers: { "Content-Type": "application/json" },
                body: JSON.stringify({ query }),
            });
 
            const data = await result.json();
            setResponse(data.message);
        } catch (error) {
            console.error("Error:", error);
            setResponse("Error processing chat request.");
        }
    };
 
    return (
    <main className="flex min-h-screen flex-col items-center justify-between p-24">
        {/* ... Existing HTML content ... */}
 
        <div className="mt-16 w-full max-w-5xl">
        <form onSubmit={handleSubmit}>
            <div className="mb-4">
            <label htmlFor="query" className="block text-gray-700 text-sm font-bold mb-2">
                Your question:
            </label>
            <input
                type="text"
                id="query"
                className="shadow appearance-none border rounded w-full py-2 px-3 text-gray-700 leading-tight focus:outline-none focus:shadow-outline"
                value={query}
                onChange={(e) => setQuery(e.target.value)}
            />
            </div>
            <button
            type="submit"
            className="bg-blue-500 hover:bg-blue-700 text-white font-bold py-2 px-4 rounded focus:outline-none focus:shadow-outline"
            >
            Ask
            </button>
        </form>
 
        {response && (
            <div className="mt-4 p-4 border border-gray-300 rounded">
            <p>{response}</p>
            </div>
        )}
        </div>
    </main>
    );
}
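The submit handler above reads `data.message` without checking whether the request succeeded. One way to separate that concern from the component is a small helper that takes the fetch function as a parameter, which also makes it testable without a browser. `askQuestion`, `FetchLike`, and `MinimalResponse` are hypothetical names introduced for this sketch.

```typescript
// The minimal parts of a fetch response that the helper relies on.
interface MinimalResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<MinimalResponse>;

// Send the query to the chat endpoint and return the answer text.
// Throws with the server's message when the request fails.
async function askQuestion(query: string, fetchImpl: FetchLike): Promise<string> {
  const res = await fetchImpl("/api/chat-request", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  });

  const data = (await res.json()) as { message?: unknown };

  if (!res.ok) {
    throw new Error(
      typeof data.message === "string"
        ? data.message
        : `Request failed with status ${res.status}`,
    );
  }
  if (typeof data.message !== "string") {
    throw new Error("Unexpected response shape");
  }
  return data.message;
}

// A fake fetch standing in for the real endpoint.
const fakeFetch: FetchLike = async () => ({
  ok: true,
  status: 200,
  json: async () => ({ message: "Genkit is a framework." }),
});

askQuestion("What is Genkit?", fakeFetch).then((answer) => console.log(answer));
```

In the component, the handler could call `askQuestion(query, fetch)` inside its `try` block and show the thrown error's message in the `catch` branch.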

Great, let's run our Next.js app and see the results:

NextJS RAG app using Firebase Genkit and Firestore DB

Congratulations! We've built our first RAG app with Genkit!

Now adapt this to your use case and ship useful AI-powered apps to your users.