Build Your First RAG Agent

This 30-minute tutorial walks you through building a RAG (Retrieval-Augmented Generation) agent on NFYio. You’ll create a bucket, upload documents, configure the agent, wait for indexing, query it through the API, and embed it in a React component.

Prerequisites

  • NFYio instance running (see Quick Start or Deploy with Docker)
  • API access key with agent permissions
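  • Node.js 18+ (for the React example)
  • AWS CLI with credentials your NFYio storage accepts (for Steps 1 and 2), plus curl and jq (for Steps 3–5)
  • A few documents to index (PDF, DOCX, TXT, or Markdown)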

Step 1: Create a Bucket (2 min)

Create a bucket on NFYio’s S3-compatible storage to hold your documents:

# Set your NFYio storage endpoint
export NFYIO_ENDPOINT="http://localhost:7007"  # or https://storage.yourdomain.com

# Create bucket
aws s3 mb s3://my-knowledge-base \
  --endpoint-url $NFYIO_ENDPOINT

Step 2: Upload Documents (3 min)

Upload your documents. Supported formats: PDF, DOCX, TXT, Markdown, and images (for vision models).

# Upload a folder of documents
aws s3 cp ./my-docs/ s3://my-knowledge-base/documents/ --recursive \
  --endpoint-url $NFYIO_ENDPOINT

# Or upload individual files
aws s3 cp product-guide.pdf s3://my-knowledge-base/documents/ \
  --endpoint-url $NFYIO_ENDPOINT
aws s3 cp faq.md s3://my-knowledge-base/documents/ \
  --endpoint-url $NFYIO_ENDPOINT

Verify uploads:

aws s3 ls s3://my-knowledge-base/documents/ --endpoint-url $NFYIO_ENDPOINT
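
Before uploading a large folder, it can help to check which files fall into the supported formats listed in this step. The following is a minimal sketch, not part of NFYio: partitionBySupport is a hypothetical helper, and the extension list is an assumption inferred from the format names (the image extensions in particular are guesses).

```typescript
// Assumed mapping from the formats named in this step to file extensions.
// The image extensions are guesses; adjust the set to match your deployment.
const SUPPORTED_EXTENSIONS: ReadonlySet<string> = new Set([
  "pdf", "docx", "txt", "md", "markdown", "png", "jpg", "jpeg",
]);

// Returns the lowercased extension without the dot, or "" if there is none.
function extensionOf(filename: string): string {
  const base = filename.slice(filename.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  return dot > 0 ? base.slice(dot + 1).toLowerCase() : "";
}

// Splits filenames into those that match a supported extension and the rest.
function partitionBySupport(
  filenames: string[],
): { supported: string[]; unsupported: string[] } {
  const supported: string[] = [];
  const unsupported: string[] = [];
  for (const name of filenames) {
    if (SUPPORTED_EXTENSIONS.has(extensionOf(name))) {
      supported.push(name);
    } else {
      unsupported.push(name);
    }
  }
  return { supported, unsupported };
}
```

Files in the unsupported list can still be uploaded to the bucket; the sketch only flags what the agent is unlikely to index.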

Step 3: Create the RAG Agent (5 min)

Create the agent via the NFYio API. You’ll need your API base URL and an access token.

Agent JSON Config

{
  "name": "Product Support Agent",
  "type": "rag",
  "config": {
    "bucket": "my-knowledge-base",
    "prefix": "documents/",
    "embedding": {
      "model": "text-embedding-3-small",
      "chunkSize": 512,
      "chunkOverlap": 64
    },
    "llm": {
      "model": "gpt-4o",
      "temperature": 0.2,
      "maxTokens": 1024
    },
    "retrieval": {
      "topK": 5,
      "similarityThreshold": 0.7
    }
  }
}
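
If you build this config in application code, typing it and running a few sanity checks before sending it can catch mistakes early. The sketch below mirrors the fields shown above as TypeScript types. The validation rules are assumptions for illustration (for example, that similarityThreshold lies between 0 and 1 and that chunkOverlap must be smaller than chunkSize); the NFYio API may enforce different limits.

```typescript
// Shape of the agent config shown above, expressed as TypeScript types.
interface RagAgentConfig {
  name: string;
  type: "rag";
  config: {
    bucket: string;
    prefix: string;
    embedding: { model: string; chunkSize: number; chunkOverlap: number };
    llm: { model: string; temperature: number; maxTokens: number };
    retrieval: { topK: number; similarityThreshold: number };
  };
}

// Returns a list of human-readable problems. An empty list means the config
// passed these assumed, client-side checks; the server may still reject it.
function validateAgentConfig(agent: RagAgentConfig): string[] {
  const problems: string[] = [];
  const { bucket, embedding, llm, retrieval } = agent.config;

  if (bucket.trim() === "") problems.push("bucket must not be empty");
  if (!Number.isInteger(embedding.chunkSize) || embedding.chunkSize <= 0) {
    problems.push("chunkSize must be a positive integer");
  }
  if (embedding.chunkOverlap < 0 || embedding.chunkOverlap >= embedding.chunkSize) {
    problems.push("chunkOverlap must be >= 0 and smaller than chunkSize");
  }
  if (!Number.isInteger(retrieval.topK) || retrieval.topK < 1) {
    problems.push("topK must be an integer >= 1");
  }
  if (retrieval.similarityThreshold < 0 || retrieval.similarityThreshold > 1) {
    problems.push("similarityThreshold must be between 0 and 1");
  }
  if (llm.temperature < 0) problems.push("temperature must not be negative");
  return problems;
}

// The same values as the JSON config above.
const exampleAgent: RagAgentConfig = {
  name: "Product Support Agent",
  type: "rag",
  config: {
    bucket: "my-knowledge-base",
    prefix: "documents/",
    embedding: { model: "text-embedding-3-small", chunkSize: 512, chunkOverlap: 64 },
    llm: { model: "gpt-4o", temperature: 0.2, maxTokens: 1024 },
    retrieval: { topK: 5, similarityThreshold: 0.7 },
  },
};

console.log(validateAgentConfig(exampleAgent)); // prints []
```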

Create via API

# Set your gateway URL and token
export NFYIO_API="http://localhost:3000"  # or https://api.yourdomain.com
export NFYIO_TOKEN="your-access-token"

curl -X POST "$NFYIO_API/api/agents" \
  -H "Authorization: Bearer $NFYIO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Product Support Agent",
    "type": "rag",
    "config": {
      "bucket": "my-knowledge-base",
      "prefix": "documents/",
      "embedding": {
        "model": "text-embedding-3-small",
        "chunkSize": 512,
        "chunkOverlap": 64
      },
      "llm": {
        "model": "gpt-4o",
        "temperature": 0.2,
        "maxTokens": 1024
      },
      "retrieval": {
        "topK": 5,
        "similarityThreshold": 0.7
      }
    }
  }'

Response:

{
  "id": "agt_abc123xyz",
  "name": "Product Support Agent",
  "type": "rag",
  "status": "indexing",
  "createdAt": "2026-03-01T12:00:00Z"
}

Save the id. You’ll need it to poll the indexing status and to send queries.
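
The same request can be made from TypeScript. This is a sketch that assumes only the endpoint, headers, and response fields shown above. createAgent and HttpClient are illustrative names, and the HTTP client is passed in as a parameter so the function can be exercised without a running NFYio instance.

```typescript
// Fields from the creation response shown above.
interface CreatedAgent {
  id: string;
  name: string;
  type: string;
  status: string;
  createdAt: string;
}

// Minimal structural type for the HTTP call. The global fetch
// (Node.js 18+ or a browser) satisfies it, and so does a test fake.
type HttpClient = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// POSTs the agent definition to /api/agents and returns the parsed response.
async function createAgent(
  http: HttpClient,
  apiBase: string,
  token: string,
  agent: object,
): Promise<CreatedAgent> {
  const res = await http(`${apiBase}/api/agents`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(agent),
  });
  if (!res.ok) {
    throw new Error(`Agent creation failed with status ${res.status}`);
  }
  const data = (await res.json()) as CreatedAgent;
  // Guard against an unexpected response shape before trusting the id.
  if (typeof data.id !== "string") {
    throw new Error("Response did not include an agent id");
  }
  return data;
}
```

In an application you would call createAgent(fetch, apiBase, token, agentConfig) and keep the returned id for the following steps.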

Step 4: Wait for Indexing (5–15 min)

The agent indexes documents asynchronously. Poll its status until it changes from "indexing" to "ready":

# Use the id returned in Step 3
export AGENT_ID="agt_abc123xyz"

curl -s "$NFYIO_API/api/agents/$AGENT_ID" \
  -H "Authorization: Bearer $NFYIO_TOKEN" | jq .

When status is "ready", indexing is complete:

{
  "id": "agt_abc123xyz",
  "status": "ready",
  "indexedChunks": 142,
  "indexedAt": "2026-03-01T12:15:00Z"
}
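
Rather than re-running curl by hand, the polling can be scripted. The sketch below is one way to do it, assuming only the status endpoint and the "indexing" and "ready" values shown above. waitForReady, GetJson, the 10-second interval, and the 20-minute timeout are illustrative choices, and treating any other status as an error is an assumption, since this tutorial does not list other status values.

```typescript
interface AgentStatus {
  id: string;
  status: string; // "indexing" and "ready" are the values shown in this tutorial
  indexedChunks?: number;
  indexedAt?: string;
}

// Fetches a URL with the given headers and resolves to the parsed JSON body.
type GetJson = (url: string, headers: Record<string, string>) => Promise<unknown>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls the agent until its status is "ready", or throws on timeout
// or on a status other than "indexing".
async function waitForReady(
  getJson: GetJson,
  apiBase: string,
  agentId: string,
  token: string,
  { intervalMs = 10000, timeoutMs = 20 * 60 * 1000 } = {},
): Promise<AgentStatus> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const agent = (await getJson(`${apiBase}/api/agents/${agentId}`, {
      Authorization: `Bearer ${token}`,
    })) as AgentStatus;
    if (agent.status === "ready") return agent;
    if (agent.status !== "indexing") {
      throw new Error(`Unexpected agent status: ${agent.status}`);
    }
    // Stop if the next poll would land after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error("Timed out waiting for indexing");
    }
    await sleep(intervalMs);
  }
}

// One possible GetJson built on the global fetch (Node.js 18+ or a browser).
const fetchJson: GetJson = async (url, headers) => {
  const res = await fetch(url, { headers });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
};
```

A call such as waitForReady(fetchJson, apiBase, agentId, token) resolves with the final status object, including indexedChunks when the API returns it.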

Step 5: Query the Agent (2 min)

Send a question and get an answer:

curl -X POST "$NFYIO_API/api/agents/$AGENT_ID/query" \
  -H "Authorization: Bearer $NFYIO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'

Response:

{
  "answer": "Our return policy allows returns within 30 days of purchase...",
  "sources": [
    {
      "objectKey": "documents/faq.md",
      "chunkIndex": 2,
      "score": 0.89
    }
  ]
}
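
When consuming this response in code, it is worth checking its shape at runtime and collapsing several chunks from the same file into one citation. The sketch below assumes only the fields shown in the response above; isQueryResponse and bestScorePerDocument are hypothetical helpers, not part of an NFYio SDK.

```typescript
interface Source {
  objectKey: string;
  chunkIndex: number;
  score: number;
}

interface QueryResponse {
  answer: string;
  sources: Source[];
}

// Runtime check that parsed JSON has the shape shown in the response above.
function isQueryResponse(value: unknown): value is QueryResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.answer !== "string" || !Array.isArray(v.sources)) return false;
  return v.sources.every((s: unknown) => {
    if (typeof s !== "object" || s === null) return false;
    const src = s as Record<string, unknown>;
    return (
      typeof src.objectKey === "string" &&
      typeof src.chunkIndex === "number" &&
      typeof src.score === "number"
    );
  });
}

// Keeps one entry per document with its best chunk score,
// sorted from most to least relevant.
function bestScorePerDocument(
  sources: Source[],
): { objectKey: string; score: number }[] {
  const best = new Map<string, number>();
  for (const { objectKey, score } of sources) {
    best.set(objectKey, Math.max(score, best.get(objectKey) ?? -Infinity));
  }
  return Array.from(best, ([objectKey, score]) => ({ objectKey, score })).sort(
    (a, b) => b.score - a.score,
  );
}
```

A UI can then list each source file once, ordered by its strongest match, instead of repeating a file name for every retrieved chunk.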

Step 6: React Component Integration (10 min)

Here’s a minimal React component that integrates with your RAG agent:

// RAGChat.tsx
'use client';

import { useState } from 'react';

const API_BASE = process.env.NEXT_PUBLIC_NFYIO_API || 'http://localhost:3000';
const AGENT_ID = process.env.NEXT_PUBLIC_RAG_AGENT_ID || 'agt_abc123xyz';

interface Message {
  role: 'user' | 'assistant';
  content: string;
  sources?: { objectKey: string; score: number }[];
}

export function RAGChat() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState('');
  const [loading, setLoading] = useState(false);

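  const sendQuery = async () => {
    // Capture the trimmed text once so the request does not depend on state
    // that is cleared below.
    const query = input.trim();
    if (!query || loading) return;

    const userMessage: Message = { role: 'user', content: query };
    setMessages((prev) => [...prev, userMessage]);
    setInput('');
    setLoading(true);

    try {
      // Demo only: replace with your app's auth. Fail fast on a missing token
      // instead of sending "Bearer null".
      const token = localStorage.getItem('nfyio_token');
      if (!token) throw new Error('Missing access token');

      const res = await fetch(`${API_BASE}/api/agents/${AGENT_ID}/query`, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          Authorization: `Bearer ${token}`,
        },
        body: JSON.stringify({ query }),
      });

      // statusText can be empty (for example over HTTP/2), so report the code.
      if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
      const data = await res.json();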

      setMessages((prev) => [
        ...prev,
        {
          role: 'assistant',
          content: data.answer,
          sources: data.sources,
        },
      ]);
    } catch (err) {
      setMessages((prev) => [
        ...prev,
        {
          role: 'assistant',
          content: `Error: ${err instanceof Error ? err.message : 'Unknown error'}`,
        },
      ]);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="flex flex-col h-[400px] border rounded-lg p-4">
      <div className="flex-1 overflow-y-auto space-y-4 mb-4">
        {messages.map((m, i) => (
          <div
            key={i}
            className={`p-3 rounded-lg ${
              m.role === 'user'
                ? 'bg-blue-100 ml-8'
                : 'bg-gray-100 mr-8'
            }`}
          >
            <p>{m.content}</p>
            {m.sources && m.sources.length > 0 && (
              <p className="text-xs text-gray-500 mt-2">
                Sources: {m.sources.map((s) => s.objectKey).join(', ')}
              </p>
            )}
          </div>
        ))}
        {loading && (
          <div className="text-gray-500 italic">Thinking...</div>
        )}
      </div>
      <div className="flex gap-2">
        <input
          type="text"
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={(e) => e.key === 'Enter' && sendQuery()}
          placeholder="Ask a question..."
          className="flex-1 border rounded px-3 py-2"
        />
        <button
          onClick={sendQuery}
          disabled={loading}
          className="px-4 py-2 bg-blue-600 text-white rounded hover:bg-blue-700 disabled:opacity-50"
        >
          Send
        </button>
      </div>
    </div>
  );
}

Environment Variables

# .env.local
NEXT_PUBLIC_NFYIO_API=https://api.yourdomain.com
NEXT_PUBLIC_RAG_AGENT_ID=agt_abc123xyz

Usage

// app/page.tsx
import { RAGChat } from '@/components/RAGChat';

export default function Page() {
  return (
    <main className="p-8">
      <h1>Product Support</h1>
      <RAGChat />
    </main>
  );
}

Troubleshooting

Agent Stuck in “indexing”

  • Cause: Large documents, slow embedding API, or errors during chunking.
  • Fix: Check agent service logs: docker compose logs nfyio-agent. Ensure OPENAI_API_KEY (or your embedding provider) is set and valid.

Empty or Irrelevant Answers

  • Cause: Low topK, high similarityThreshold, or chunks too small/large.
  • Fix: Increase topK to 7–10, lower similarityThreshold to 0.5, or adjust chunkSize (try 256 or 1024).
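
To see why these two settings matter, consider how a threshold and a top-K limit might combine during retrieval. The sketch below is an illustration, not NFYio’s actual implementation: it assumes chunks scoring below similarityThreshold are dropped first and the topK best of the remainder are kept. The chunk ids and scores are made up.

```typescript
interface ScoredChunk {
  id: string;
  score: number;
}

// Drops chunks below the threshold, then keeps the topK highest-scoring ones.
// The order of these two steps is an assumption made for illustration.
function selectChunks(
  candidates: ScoredChunk[],
  topK: number,
  similarityThreshold: number,
): ScoredChunk[] {
  return candidates
    .filter((c) => c.score >= similarityThreshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Hypothetical similarity scores for one question.
const candidates: ScoredChunk[] = [
  { id: "faq.md#2", score: 0.89 },
  { id: "guide.pdf#7", score: 0.66 },
  { id: "guide.pdf#1", score: 0.58 },
  { id: "faq.md#5", score: 0.41 },
];

console.log(selectChunks(candidates, 5, 0.7).length); // prints 1
console.log(selectChunks(candidates, 5, 0.5).length); // prints 3
```

With the threshold at 0.7, only one chunk reaches the LLM even though topK allows five. Lowering it to 0.5 lets three chunks through, which is why a high threshold can produce thin or empty answers.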

“Bucket not found” or 404

  • Cause: Bucket name typo or wrong prefix.
  • Fix: Verify bucket exists: aws s3 ls s3://my-knowledge-base --endpoint-url $NFYIO_ENDPOINT. Ensure prefix in config matches your object keys.

CORS Errors in Browser

  • Cause: NFYio API not allowing your frontend origin.
  • Fix: Set ALLOWED_ORIGINS in NFYio to include your app URL (e.g., https://app.yourdomain.com).

Rate Limits

  • Cause: Too many requests to embedding or LLM APIs.
  • Fix: Add client-side debouncing, or increase rate limits in NFYio configuration.
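
As a sketch of the client-side debouncing mentioned above, the helper below delays a call until the input has been quiet for waitMs milliseconds, so a burst of triggers results in a single request. debounce is a generic utility written for this example, not an NFYio or React API, and a suitable delay is something to tune for your app.

```typescript
// Returns a wrapper that postpones fn until waitMs has passed without
// another call. Only the arguments of the last call in a burst are used.
function debounce<Args extends unknown[]>(
  fn: (...args: Args) => void,
  waitMs: number,
): (...args: Args) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: Args) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };
}
```

Debouncing suits inputs that fire often, such as search-as-you-type. For an explicit Send button, the loading guard already in the component prevents overlapping requests.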

What’s Next

  • RAG Agents — Deep dive into chunking, embedding models, and advanced config
  • Embeddings — Embedding models and vector search
  • Agents API — Full API reference for agents