Performance Tuning nfyio: PostgreSQL, Redis, and SeaweedFS
Optimize every layer of the nfyio stack — PostgreSQL query performance, Redis memory management, SeaweedFS throughput, and embedding pipeline speed.
nfyio Team
Talya Smart & Technoplatz JV
A fresh nfyio deployment works well out of the box. At scale — millions of objects, thousands of embeddings per hour, concurrent RAG queries — you need to tune. This guide covers the high-impact optimizations for each layer of the stack.
PostgreSQL Tuning
PostgreSQL handles metadata, RLS policies, and pgvector embeddings. It’s the most critical component to tune.
Connection Pooling
The default max_connections = 100 is too low for production, but each PostgreSQL connection is a full backend process, so raising the limit directly is expensive. Put PgBouncer in front instead:
# pgbouncer.ini
[databases]
nfyio = host=localhost port=5432 dbname=nfyio
[pgbouncer]
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = md5
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 50
reserve_pool_size = 10
Connect your nfyio gateway to PgBouncer:
DATABASE_URL=postgresql://nfyio:password@localhost:6432/nfyio
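A quick sanity check on the settings above: with transaction pooling, many clients share a small set of server connections. A sketch of the multiplexing math, using the values from the pgbouncer.ini above:

```shell
# Multiplexing ratio implied by the pgbouncer.ini above:
# up to 1000 clients share 50 pooled server connections.
max_client_conn=1000
default_pool_size=50
ratio=$((max_client_conn / default_pool_size))
echo "${ratio}:1 client-to-server multiplexing"
```

With pool_mode = transaction, a server connection is held only for the duration of each transaction, which is what makes a ratio like this workable.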
Memory Configuration
For an 8 GB RAM server dedicated to PostgreSQL:
# postgresql.conf
shared_buffers = 2GB
effective_cache_size = 6GB
work_mem = 64MB
maintenance_work_mem = 512MB
wal_buffers = 64MB
For a 32 GB RAM server:
shared_buffers = 8GB
effective_cache_size = 24GB
work_mem = 256MB
maintenance_work_mem = 2GB
wal_buffers = 128MB
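Both profiles follow the same rule of thumb: roughly 25% of RAM for shared_buffers and 75% for effective_cache_size on a dedicated database host. A sketch that derives the values for any machine size (the percentages are guidelines, not hard rules):

```shell
ram_gb=32                               # total RAM on the database host
shared_buffers_gb=$((ram_gb / 4))       # ~25% of RAM
effective_cache_gb=$((ram_gb * 3 / 4))  # ~75% of RAM (counts the OS page cache)
echo "shared_buffers = ${shared_buffers_gb}GB"
echo "effective_cache_size = ${effective_cache_gb}GB"
```

Setting ram_gb=32 reproduces the 8 GB / 24 GB values above; effective_cache_size is a planner hint, not an allocation, so overshooting it is far less harmful than overshooting shared_buffers.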
pgvector Index Tuning
Embedding search is the most expensive query. Tune the HNSW index:
-- Check current index
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'embeddings';
-- Drop and recreate with optimized parameters
DROP INDEX IF EXISTS embeddings_vector_idx;
CREATE INDEX embeddings_vector_idx
ON embeddings
USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 200);
-- Set search-time parameters
SET hnsw.ef_search = 100;
| Parameter | Default | Tuned | Effect |
|---|---|---|---|
| m | 16 | 24 | More connections per node, better recall |
| ef_construction | 64 | 200 | Better index quality, slower build |
| ef_search | 40 | 100 | Better search recall, slightly slower queries |
Benchmark before and after:
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, 1 - (embedding <=> '[0.1, 0.2, ...]'::vector) AS similarity
FROM embeddings
WHERE bucket_id = 'bucket_abc123'
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;
Vacuum and Analyze
-- Check bloat
SELECT schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) AS total_size,
n_dead_tup,
last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
-- Aggressive vacuum on embeddings table
VACUUM (VERBOSE, ANALYZE) embeddings;
Autovacuum tuning for write-heavy tables:
ALTER TABLE embeddings SET (
autovacuum_vacuum_scale_factor = 0.02,
autovacuum_analyze_scale_factor = 0.01,
autovacuum_vacuum_cost_delay = 10
);
Redis Tuning
Redis handles job queues (embedding pipeline, agent tasks) and caching.
Memory Policy
# redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru
Pipeline Optimization
For batch embedding jobs, use Redis pipelines to reduce round trips. Monitor queue depth and command throughput to see whether batching is paying off:
# Check queue lengths
redis-cli LLEN nfyio:embedding:queue
redis-cli LLEN nfyio:agent:queue
# Monitor commands per second
redis-cli INFO stats | grep instantaneous_ops_per_sec
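To batch the enqueue side itself, feed commands to redis-cli --pipe so the whole batch travels in one round trip instead of one network hop per command. A sketch (the queue name matches the examples above; the job IDs are illustrative):

```shell
# Generate 100 enqueue commands, then ship them in a single round trip.
jobs_file=$(mktemp)
for i in $(seq 1 100); do
  printf 'LPUSH nfyio:embedding:queue job-%s\n' "$i"
done > "$jobs_file"
batch_size=$(grep -c '^LPUSH' "$jobs_file")
echo "prepared $batch_size commands"
# With Redis running, send the whole batch at once:
# redis-cli --pipe < "$jobs_file"
```

Pipelining matters most when the client and Redis are on different hosts, where per-command latency dominates.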
Persistence Tuning
For job queues where data loss is tolerable:
# Disable AOF, use infrequent RDB snapshots
appendonly no
save 900 1
save 300 10
For critical queues:
appendonly yes
appendfsync everysec
no-appendfsync-on-rewrite yes
Connection Limits
maxclients 10000
tcp-backlog 511
timeout 300
tcp-keepalive 60
SeaweedFS Tuning
SeaweedFS handles the actual object storage. Tuning focuses on throughput and replication.
Volume Size
Volumes default to 30 GB. For large deployments, raise the limit on the master (the flag takes megabytes, so 100000 is roughly 100 GB):
weed master -volumeSizeLimitMB=100000
Concurrent Uploads
Increase filer concurrency:
weed filer -maxMB=256 -concurrentUploadLimitMB=512
Compaction
Deleted objects leave garbage in SeaweedFS volume files until compaction reclaims the space. Schedule compaction regularly:
# Check volume status
curl -s http://localhost:9333/vol/status | jq '.Volumes[] | select(.DeleteCount > 100)'
# Compact a volume
curl "http://localhost:9333/vol/vacuum?garbageThreshold=0.3"
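Compaction is safe to automate. A cron entry that triggers the same vacuum call nightly might look like this (the file path and schedule are illustrative; the master address matches the examples above):

```
# /etc/cron.d/seaweedfs-vacuum (illustrative path)
# Nightly at 03:00, compact volumes with more than 30% reclaimable space.
0 3 * * * root curl -s "http://localhost:9333/vol/vacuum?garbageThreshold=0.3" >/dev/null
```

Vacuuming copies live needles to a fresh volume file, so schedule it for off-peak hours on I/O-constrained hosts.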
Read Cache
Enable filer read cache for frequently accessed objects:
weed filer -cacheDir=/tmp/seaweedfs-cache -cacheSizeMB=4096
Embedding Pipeline Tuning
Batch Size
Process embeddings in batches instead of one-by-one:
# Check current setting
curl -s http://localhost:7010/config | jq '.embedding'
# Update batch size
curl -X PATCH http://localhost:7010/config \
-H "Authorization: Bearer $JWT" \
-H "Content-Type: application/json" \
-d '{"embedding": {"batch_size": 100, "max_concurrent": 10}}'
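As a back-of-envelope check on those numbers: batching divides the request count by the batch size, and max_concurrent caps how many of those requests are in flight at once. A sketch with an assumed workload of 10,000 pending items:

```shell
items=10000; batch_size=100; max_concurrent=10   # workload size is an assumption
requests=$((items / batch_size))                 # API calls instead of 10000
waves=$(( (requests + max_concurrent - 1) / max_concurrent ))  # ceiling division
echo "$requests requests in $waves sequential waves"
```

Going from 10,000 single-item calls to 100 batched calls is where most of the throughput gain comes from; concurrency then overlaps the remaining latency.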
Model Selection
| Model | Dimensions | Speed | Cost | Quality |
|---|---|---|---|---|
| text-embedding-3-small | 1536 | Fast | $0.02/1M tokens | Good |
| text-embedding-3-large | 3072 | Medium | $0.13/1M tokens | Best |
| text-embedding-ada-002 | 1536 | Fast | $0.10/1M tokens | Legacy |
For most use cases, text-embedding-3-small offers the best balance of speed, cost, and quality.
Chunk Size
Optimal chunk size depends on content type:
| Content Type | Chunk Size | Overlap |
|---|---|---|
| Technical docs | 512 tokens | 50 tokens |
| Blog posts | 800 tokens | 100 tokens |
| Legal documents | 1024 tokens | 200 tokens |
| Code files | 256 tokens | 25 tokens |
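Chunk size and overlap together determine how many embeddings a document generates: each chunk advances by (size - overlap) tokens, so a T-token document yields roughly ceil((T - overlap) / (size - overlap)) chunks. A sketch using the technical-docs row above (the 4096-token document length is an assumption):

```shell
tokens=4096; size=512; overlap=50   # technical-docs profile; doc length assumed
step=$((size - overlap))            # each chunk advances by 462 tokens
chunks=$(( (tokens - overlap + step - 1) / step ))   # ceiling division
echo "$chunks chunks of up to $size tokens"
```

Larger overlap improves retrieval continuity across chunk boundaries but multiplies the number of stored vectors, which feeds straight back into pgvector index size and build time.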
Benchmarking
Run a load test against your tuned nfyio instance:
# Upload throughput
for i in $(seq 1 100); do
curl -s -X PUT http://localhost:7007/bucket/test-file-$i.txt \
-H "Authorization: Bearer $JWT" \
-d "test content $i" &
done
wait
echo "100 uploads completed"
# Search latency
time curl -s -X POST http://localhost:3000/api/v1/search \
-H "Authorization: Bearer $JWT" \
-H "Content-Type: application/json" \
-d '{
"query": "how to deploy kubernetes",
"bucket": "production-data",
"limit": 10
}' | jq '.results | length'
Quick Reference
| Component | Key Setting | Default | Recommended |
|---|---|---|---|
| PostgreSQL | shared_buffers | 128 MB | 25% of RAM |
| PostgreSQL | work_mem | 4 MB | 64-256 MB |
| PostgreSQL | hnsw.ef_search | 40 | 100 |
| Redis | maxmemory | No limit | 2-4 GB |
| Redis | maxmemory-policy | noeviction | allkeys-lru |
| SeaweedFS | Volume size | 30 GB | 100 GB |
| SeaweedFS | Cache size | 0 | 4 GB |
| Embeddings | Batch size | 10 | 100 |
Key Takeaways
- PgBouncer in front of PostgreSQL handles connection multiplexing far better than raising max_connections
- HNSW index parameters (m, ef_construction, ef_search) have the biggest impact on semantic search performance
- Redis memory policy should be allkeys-lru for cache workloads; never let it hit OOM
- SeaweedFS volume compaction reclaims disk space after deletions
- Batch embedding processing (100 items at a time) reduces API round trips and improves throughput
- Always benchmark before and after tuning; measure, don't guess
For monitoring your tuned setup, see the Prometheus & Grafana guide.