Semantic Search Over Your Files with nfyio and OpenAI
Index your stored objects with OpenAI embeddings and query them by meaning using nfyio's built-in semantic search API powered by pgvector.
nfyio Team
Talya Smart & Technoplatz JV
nfyio ships with a built-in embedding pipeline. When you upload a document, nfyio can automatically:
- Extract the text content
- Chunk it into segments
- Send each chunk to OpenAI or Voyage AI for embedding
- Store the vectors in pgvector
- Let you query by semantic meaning via REST API
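The chunking step above can be illustrated with a short sketch. nfyio doesn't document its internal chunking strategy, so this fixed-size splitter with overlap is just a common default, not nfyio's actual implementation:

```python
# Illustrative chunker (hypothetical; nfyio's real strategy may differ).
# Splits text into fixed-size windows with a small overlap so that a
# sentence cut at a boundary still appears intact in one of the chunks.
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

print(len(chunk_text("x" * 1200, size=500, overlap=50)))  # 3
```

Each resulting chunk would then be embedded individually and stored as one pgvector row.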
Prerequisites
- nfyio running locally or on a server (see installation guide)
- OpenAI API key
1. Configure the Embedding Provider
In your .env file:
# Enable embeddings
EMBEDDINGS_ENABLED=true
EMBEDDING_PROVIDER=openai
# OpenAI config
OPENAI_API_KEY=sk-...
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
Or use Voyage AI:
EMBEDDING_PROVIDER=voyage
VOYAGE_API_KEY=pa-...
VOYAGE_EMBEDDING_MODEL=voyage-large-2-instruct
2. Upload a Document with Embedding
# Upload a file with embedding enabled
curl -X PUT "http://localhost:7007/your-bucket/report-q1-2026.pdf" \
-H "x-nfyio-embed: true" \
-H "Authorization: Bearer $YOUR_ACCESS_KEY" \
--data-binary @report-q1-2026.pdf
The response includes an embedding job ID:
{
"key": "report-q1-2026.pdf",
"size": 82340,
"embeddingJobId": "emb_7XkV9mNpQR",
"status": "queued"
}
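If you are scripting uploads rather than using curl, the same request can be built in Python. The header and field names below mirror the example above; the helpers themselves are illustrative, not part of an official nfyio client:

```python
import json

# Hypothetical helper: headers for an upload with embedding enabled,
# mirroring the curl flags in the example above.
def upload_headers(access_key: str) -> dict:
    return {
        "x-nfyio-embed": "true",  # ask nfyio to queue an embedding job
        "Authorization": f"Bearer {access_key}",
    }

# Pull the embedding job ID out of the upload response body.
def embedding_job_id(response_body: str) -> str:
    return json.loads(response_body)["embeddingJobId"]

# Sample response from the article:
sample = ('{"key": "report-q1-2026.pdf", "size": 82340, '
          '"embeddingJobId": "emb_7XkV9mNpQR", "status": "queued"}')
print(embedding_job_id(sample))  # emb_7XkV9mNpQR
```

You would send the PUT itself with any HTTP client; the job ID is what you need for the next step.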
3. Check Embedding Status
curl http://localhost:3000/api/jobs/emb_7XkV9mNpQR \
-H "Authorization: Bearer $YOUR_JWT"
{
"id": "emb_7XkV9mNpQR",
"status": "completed",
"chunks": 24,
"model": "text-embedding-3-small",
"dimensions": 1536,
"completedAt": "2026-02-22T10:15:32Z"
}
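Since embedding runs asynchronously, a script typically polls this endpoint until the job finishes. A minimal polling loop, sketched with the status values shown above (`fetch_status` stands in for a GET on `/api/jobs/<id>`; the `"failed"` status is an assumption, not documented here):

```python
import time

# Poll a job until it completes. `fetch_status` is any callable returning
# the parsed job JSON, so a real implementation could wrap urllib/requests.
def wait_for_embedding(fetch_status, interval_s: float = 2.0,
                       max_attempts: int = 30) -> dict:
    for _ in range(max_attempts):
        job = fetch_status()
        if job["status"] == "completed":
            return job
        if job["status"] == "failed":  # assumed failure status
            raise RuntimeError(f"embedding job failed: {job}")
        time.sleep(interval_s)
    raise TimeoutError("embedding job did not complete in time")

# Stubbed fetcher standing in for GET /api/jobs/emb_7XkV9mNpQR:
responses = iter([{"status": "queued"}, {"status": "processing"},
                  {"status": "completed", "chunks": 24}])
job = wait_for_embedding(lambda: next(responses), interval_s=0.0)
print(job["chunks"])  # 24
```

Once the job reports `completed`, the document's chunks are queryable.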
4. Query by Meaning
curl -X POST http://localhost:3000/api/search \
-H "Authorization: Bearer $YOUR_JWT" \
-H "Content-Type: application/json" \
-d '{
"query": "What were the revenue highlights in Q1 2026?",
"bucketId": "your-bucket-id",
"limit": 5,
"threshold": 0.78
}'
Response:
{
"results": [
{
"chunkId": "chunk_001",
"objectKey": "report-q1-2026.pdf",
"score": 0.934,
"content": "Revenue grew 42% year-over-year, reaching $4.2M ARR...",
"metadata": {
"page": 3,
"chunkIndex": 2
}
}
],
"totalResults": 5,
"queryTime": 12
}
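The request body and the threshold semantics can be sketched in a few lines. Field names come from the example above; the bucket ID is a placeholder, and the filtering helper just restates client-side what the server's `threshold` parameter already does:

```python
import json

# Build the search request body from the example above.
def search_payload(query: str, bucket_id: str,
                   limit: int = 5, threshold: float = 0.78) -> str:
    return json.dumps({
        "query": query,
        "bucketId": bucket_id,
        "limit": limit,
        "threshold": threshold,
    })

# Keep only hits at or above the similarity threshold, best match first.
def top_hits(results: list[dict], threshold: float) -> list[dict]:
    return sorted((r for r in results if r["score"] >= threshold),
                  key=lambda r: r["score"], reverse=True)

hits = top_hits([{"score": 0.934, "objectKey": "report-q1-2026.pdf"},
                 {"score": 0.610, "objectKey": "notes.txt"}], 0.78)
print([h["objectKey"] for h in hits])  # ['report-q1-2026.pdf']
```

A higher `threshold` trades recall for precision: 0.78 is a reasonable starting point for cosine similarity, but tune it against your own corpus.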
5. Use the Agentic RAG Runtime
For multi-step retrieval with GPT-4o:
curl -X POST http://localhost:7010/api/agents/run \
-H "Authorization: Bearer $YOUR_JWT" \
-H "Content-Type: application/json" \
-d '{
"agentId": "rag-qa",
"input": "Summarize the key risks mentioned in the Q1 2026 report",
"context": {
"bucketId": "your-bucket-id"
}
}'
The agent will:
- Retrieve relevant chunks via semantic search
- Pass them as context to GPT-4o
- Generate a grounded answer
- Log the full execution trace
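Conceptually, the four steps above reduce to a retrieve-then-generate loop. The sketch below stubs out the search and LLM calls to show the shape of that loop; nfyio's runtime performs the real versions server-side behind `/api/agents/run`:

```python
# Conceptual RAG loop (illustrative only; `search` and `llm` are stubs
# standing in for nfyio's semantic search and the GPT-4o call).
def run_rag_agent(question: str, search, llm):
    chunks = search(question)                       # 1. semantic retrieval
    context = "\n\n".join(c["content"] for c in chunks)
    prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
    answer = llm(prompt)                            # 2-3. grounded generation
    trace = {"question": question, "chunks": len(chunks), "answer": answer}
    return answer, trace                            # 4. execution trace

answer, trace = run_rag_agent(
    "Key risks in Q1 2026?",
    search=lambda q: [{"content": "Risk: customer concentration..."}],
    llm=lambda p: "The main risk cited is customer concentration.",
)
print(trace["chunks"])  # 1
```

Grounding the prompt in retrieved chunks is what keeps the answer tied to your stored documents rather than the model's general knowledge.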
Summary
With nfyio, semantic search over your stored files is a 3-step operation:
- Upload with the x-nfyio-embed: true header
- Wait for the embedding job to complete
- Query via /api/search
No vector database to configure. No LangChain setup. Just self-hosted infrastructure that handles extraction, chunking, embedding, and retrieval end to end.