Semantic Search Over Your Files with nfyio and OpenAI
Index your stored objects with OpenAI embeddings and query them by meaning using nfyio's built-in semantic search API powered by pgvector.
nfyio Team
Talya Smart & Technoplatz JV
nfyio ships with a built-in embedding pipeline. When you upload a document, nfyio can automatically:
- Extract the text content
- Chunk it into segments
- Send each chunk to OpenAI or Voyage AI for embedding
- Store the vectors in pgvector
- Let you query by semantic meaning via REST API
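The chunking step above can be illustrated with a short sketch. nfyio doesn't document its internal chunking strategy, so this fixed-size splitter with overlap is just a common default, not nfyio's actual implementation:

```python
# Illustrative chunker (hypothetical; nfyio's real strategy may differ).
# Splits text into fixed-size windows with a small overlap so that a
# sentence cut at a boundary still appears intact in one of the chunks.
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

print(len(chunk_text("x" * 1200, size=500, overlap=50)))  # 3
```

Each resulting chunk would then be embedded individually and stored as one pgvector row.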
Prerequisites
- nfyio running locally or on a server (see installation guide)
- OpenAI API key
1. Configure the Embedding Provider
In your .env file:
# Enable embeddings
EMBEDDINGS_ENABLED=true
EMBEDDING_PROVIDER=openai
# OpenAI config
OPENAI_API_KEY=sk-...
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
Or use Voyage AI:
EMBEDDING_PROVIDER=voyage
VOYAGE_API_KEY=pa-...
VOYAGE_EMBEDDING_MODEL=voyage-large-2-instruct
2. Upload a Document with Embedding
# Upload a file with embedding enabled
curl -X PUT "http://localhost:7007/your-bucket/report-q1-2026.pdf" \
-H "x-nfyio-embed: true" \
-H "Authorization: Bearer $YOUR_ACCESS_KEY" \
--data-binary @report-q1-2026.pdf
The response includes an embedding job ID:
{
"key": "report-q1-2026.pdf",
"size": 82340,
"embeddingJobId": "emb_7XkV9mNpQR",
"status": "queued"
}
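If you are scripting uploads rather than using curl, the same request can be built in Python. The header and field names below mirror the example above; the helpers themselves are illustrative, not part of an official nfyio client:

```python
import json

# Hypothetical helper: headers for an upload with embedding enabled,
# mirroring the curl flags in the example above.
def upload_headers(access_key: str) -> dict:
    return {
        "x-nfyio-embed": "true",  # ask nfyio to queue an embedding job
        "Authorization": f"Bearer {access_key}",
    }

# Pull the embedding job ID out of the upload response body.
def embedding_job_id(response_body: str) -> str:
    return json.loads(response_body)["embeddingJobId"]

# Sample response from the article:
sample = ('{"key": "report-q1-2026.pdf", "size": 82340, '
          '"embeddingJobId": "emb_7XkV9mNpQR", "status": "queued"}')
print(embedding_job_id(sample))  # emb_7XkV9mNpQR
```

You would send the PUT itself with any HTTP client; the job ID is what you need for the next step.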
3. Check Embedding Status
curl http://localhost:3000/api/jobs/emb_7XkV9mNpQR \
-H "Authorization: Bearer $YOUR_JWT"
{
"id": "emb_7XkV9mNpQR",
"status": "completed",
"chunks": 24,
"model": "text-embedding-3-small",
"dimensions": 1536,
"completedAt": "2026-02-22T10:15:32Z"
}
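Since embedding runs asynchronously, a script typically polls this endpoint until the job finishes. A minimal polling loop, sketched with the status values shown above (`fetch_status` stands in for a GET on `/api/jobs/<id>`; the `"failed"` status is an assumption, not documented here):

```python
import time

# Poll a job until it completes. `fetch_status` is any callable returning
# the parsed job JSON, so a real implementation could wrap urllib/requests.
def wait_for_embedding(fetch_status, interval_s: float = 2.0,
                       max_attempts: int = 30) -> dict:
    for _ in range(max_attempts):
        job = fetch_status()
        if job["status"] == "completed":
            return job
        if job["status"] == "failed":  # assumed failure status
            raise RuntimeError(f"embedding job failed: {job}")
        time.sleep(interval_s)
    raise TimeoutError("embedding job did not complete in time")

# Stubbed fetcher standing in for GET /api/jobs/emb_7XkV9mNpQR:
responses = iter([{"status": "queued"}, {"status": "processing"},
                  {"status": "completed", "chunks": 24}])
job = wait_for_embedding(lambda: next(responses), interval_s=0.0)
print(job["chunks"])  # 24
```

Once the job reports `completed`, the document's chunks are queryable.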
4. Query by Meaning
curl -X POST http://localhost:3000/api/search \
-H "Authorization: Bearer $YOUR_JWT" \
-H "Content-Type: application/json" \
-d '{
"query": "What were the revenue highlights in Q1 2026?",
"bucketId": "your-bucket-id",
"limit": 5,
"threshold": 0.78
}'
Response:
{
"results": [
{
"chunkId": "chunk_001",
"objectKey": "report-q1-2026.pdf",
"score": 0.934,
"content": "Revenue grew 42% year-over-year, reaching $4.2M ARR...",
"metadata": {
"page": 3,
"chunkIndex": 2
}
}
],
"totalResults": 5,
"queryTime": 12
}
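The request body and the threshold semantics can be sketched in a few lines. Field names come from the example above; the bucket ID is a placeholder, and the filtering helper just restates client-side what the server's `threshold` parameter already does:

```python
import json

# Build the search request body from the example above.
def search_payload(query: str, bucket_id: str,
                   limit: int = 5, threshold: float = 0.78) -> str:
    return json.dumps({
        "query": query,
        "bucketId": bucket_id,
        "limit": limit,
        "threshold": threshold,
    })

# Keep only hits at or above the similarity threshold, best match first.
def top_hits(results: list[dict], threshold: float) -> list[dict]:
    return sorted((r for r in results if r["score"] >= threshold),
                  key=lambda r: r["score"], reverse=True)

hits = top_hits([{"score": 0.934, "objectKey": "report-q1-2026.pdf"},
                 {"score": 0.610, "objectKey": "notes.txt"}], 0.78)
print([h["objectKey"] for h in hits])  # ['report-q1-2026.pdf']
```

A higher `threshold` trades recall for precision: 0.78 is a reasonable starting point for cosine similarity, but tune it against your own corpus.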
5. Use the Agentic RAG Runtime
For multi-step retrieval with GPT-4o:
curl -X POST http://localhost:7010/api/agents/run \
-H "Authorization: Bearer $YOUR_JWT" \
-H "Content-Type: application/json" \
-d '{
"agentId": "rag-qa",
"input": "Summarize the key risks mentioned in the Q1 2026 report",
"context": {
"bucketId": "your-bucket-id"
}
}'
The agent will:
- Retrieve relevant chunks via semantic search
- Pass them as context to GPT-4o
- Generate a grounded answer
- Log the full execution trace
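Conceptually, the four steps above reduce to a retrieve-then-generate loop. The sketch below stubs out the search and LLM calls to show the shape of that loop; nfyio's runtime performs the real versions server-side behind `/api/agents/run`:

```python
# Conceptual RAG loop (illustrative only; `search` and `llm` are stubs
# standing in for nfyio's semantic search and the GPT-4o call).
def run_rag_agent(question: str, search, llm):
    chunks = search(question)                       # 1. semantic retrieval
    context = "\n\n".join(c["content"] for c in chunks)
    prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
    answer = llm(prompt)                            # 2-3. grounded generation
    trace = {"question": question, "chunks": len(chunks), "answer": answer}
    return answer, trace                            # 4. execution trace

answer, trace = run_rag_agent(
    "Key risks in Q1 2026?",
    search=lambda q: [{"content": "Risk: customer concentration..."}],
    llm=lambda p: "The main risk cited is customer concentration.",
)
print(trace["chunks"])  # 1
```

Grounding the prompt in retrieved chunks is what keeps the answer tied to your stored documents rather than the model's general knowledge.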
Summary
With nfyio, semantic search over your stored files is a 3-step operation:
- Upload with the x-nfyio-embed: true header
- Wait for the embedding job to complete
- Query via /api/search
No vector database to configure. No LangChain setup. Just self-hosted infrastructure that handles extraction, chunking, embedding, and retrieval end to end.