Performance Tuning nfyio: PostgreSQL, Redis, and SeaweedFS
Optimize every layer of the nfyio stack — PostgreSQL query performance, Redis memory management, SeaweedFS throughput, and embedding pipeline speed.
nfyio Team
Talya Smart & Technoplatz JV
A fresh nfyio deployment works well out of the box. At scale — millions of objects, thousands of embeddings per hour, concurrent RAG queries — you need to tune. This guide covers the high-impact optimizations for each layer of the stack.
PostgreSQL Tuning
PostgreSQL handles metadata, RLS policies, and pgvector embeddings. It’s the most critical component to tune.
Connection Pooling
The default max_connections = 100 is too low for production, but each PostgreSQL connection is a full backend process, so raising the limit directly is expensive. Put PgBouncer in front instead:
# pgbouncer.ini
[databases]
nfyio = host=localhost port=5432 dbname=nfyio
[pgbouncer]
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = md5
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 50
reserve_pool_size = 10
Connect your nfyio gateway to PgBouncer:
DATABASE_URL=postgresql://nfyio:password@localhost:6432/nfyio
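A quick sanity check on the settings above: with transaction pooling, many clients share a small set of server connections. A sketch of the multiplexing math, using the values from the pgbouncer.ini above:

```shell
# Multiplexing ratio implied by the pgbouncer.ini above:
# up to 1000 clients share 50 pooled server connections.
max_client_conn=1000
default_pool_size=50
ratio=$((max_client_conn / default_pool_size))
echo "${ratio}:1 client-to-server multiplexing"
```

With pool_mode = transaction, a server connection is held only for the duration of each transaction, which is what makes a ratio like this workable.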
Memory Configuration
For an 8 GB RAM server dedicated to PostgreSQL:
# postgresql.conf
shared_buffers = 2GB
effective_cache_size = 6GB
work_mem = 64MB
maintenance_work_mem = 512MB
wal_buffers = 64MB
For a 32 GB RAM server:
shared_buffers = 8GB
effective_cache_size = 24GB
work_mem = 256MB
maintenance_work_mem = 2GB
wal_buffers = 128MB
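Both profiles follow the same rule of thumb: roughly 25% of RAM for shared_buffers and 75% for effective_cache_size on a dedicated database host. A sketch that derives the values for any machine size (the percentages are guidelines, not hard rules):

```shell
ram_gb=32                               # total RAM on the database host
shared_buffers_gb=$((ram_gb / 4))       # ~25% of RAM
effective_cache_gb=$((ram_gb * 3 / 4))  # ~75% of RAM (counts the OS page cache)
echo "shared_buffers = ${shared_buffers_gb}GB"
echo "effective_cache_size = ${effective_cache_gb}GB"
```

Setting ram_gb=32 reproduces the 8 GB / 24 GB values above; effective_cache_size is a planner hint, not an allocation, so overshooting it is far less harmful than overshooting shared_buffers.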
pgvector Index Tuning
Embedding search is the most expensive query. Tune the HNSW index:
-- Check current index
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'embeddings';
-- Drop and recreate with optimized parameters
DROP INDEX IF EXISTS embeddings_vector_idx;
CREATE INDEX embeddings_vector_idx
ON embeddings
USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 200);
-- Set search-time parameters
SET hnsw.ef_search = 100;
| Parameter | Default | Tuned | Effect |
|---|---|---|---|
| m | 16 | 24 | More connections per node, better recall |
| ef_construction | 64 | 200 | Better index quality, slower build |
| ef_search | 40 | 100 | Better search recall, slightly slower queries |
Benchmark before and after:
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, 1 - (embedding <=> '[0.1, 0.2, ...]'::vector) AS similarity
FROM embeddings
WHERE bucket_id = 'bucket_abc123'
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;
Vacuum and Analyze
-- Check bloat
SELECT schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) AS total_size,
n_dead_tup,
last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
-- Aggressive vacuum on embeddings table
VACUUM (VERBOSE, ANALYZE) embeddings;
Autovacuum tuning for write-heavy tables:
ALTER TABLE embeddings SET (
autovacuum_vacuum_scale_factor = 0.02,
autovacuum_analyze_scale_factor = 0.01,
autovacuum_vacuum_cost_delay = 10
);
Redis Tuning
Redis handles job queues (embedding pipeline, agent tasks) and caching.
Memory Policy
# redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru
Pipeline Optimization
For batch embedding jobs, use Redis pipelines to reduce round trips. Monitor queue depth and command throughput to see whether batching is paying off:
# Check queue lengths
redis-cli LLEN nfyio:embedding:queue
redis-cli LLEN nfyio:agent:queue
# Monitor commands per second
redis-cli INFO stats | grep instantaneous_ops_per_sec
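To batch the enqueue side itself, feed commands to redis-cli --pipe so the whole batch travels in one round trip instead of one network hop per command. A sketch (the queue name matches the examples above; the job IDs are illustrative):

```shell
# Generate 100 enqueue commands, then ship them in a single round trip.
jobs_file=$(mktemp)
for i in $(seq 1 100); do
  printf 'LPUSH nfyio:embedding:queue job-%s\n' "$i"
done > "$jobs_file"
batch_size=$(grep -c '^LPUSH' "$jobs_file")
echo "prepared $batch_size commands"
# With Redis running, send the whole batch at once:
# redis-cli --pipe < "$jobs_file"
```

Pipelining matters most when the client and Redis are on different hosts, where per-command latency dominates.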
Persistence Tuning
For job queues where data loss is tolerable:
# Disable AOF, use infrequent RDB snapshots
appendonly no
save 900 1
save 300 10
For critical queues:
appendonly yes
appendfsync everysec
no-appendfsync-on-rewrite yes
Connection Limits
maxclients 10000
tcp-backlog 511
timeout 300
tcp-keepalive 60
SeaweedFS Tuning
SeaweedFS handles the actual object storage. Tuning focuses on throughput and replication.
Volume Size
Volumes default to 30 GB. For large deployments, raise the limit on the master (the flag takes megabytes, so 100000 is roughly 100 GB):
weed master -volumeSizeLimitMB=100000
Concurrent Uploads
Increase filer concurrency:
weed filer -maxMB=256 -concurrentUploadLimitMB=512
Compaction
Deleted objects leave garbage in SeaweedFS volume files until compaction reclaims the space. Schedule compaction regularly:
# Check volume status
curl -s http://localhost:9333/vol/status | jq '.Volumes[] | select(.DeleteCount > 100)'
# Compact a volume
curl "http://localhost:9333/vol/vacuum?garbageThreshold=0.3"
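Compaction is safe to automate. A cron entry that triggers the same vacuum call nightly might look like this (the file path and schedule are illustrative; the master address matches the examples above):

```
# /etc/cron.d/seaweedfs-vacuum (illustrative path)
# Nightly at 03:00, compact volumes with more than 30% reclaimable space.
0 3 * * * root curl -s "http://localhost:9333/vol/vacuum?garbageThreshold=0.3" >/dev/null
```

Vacuuming copies live needles to a fresh volume file, so schedule it for off-peak hours on I/O-constrained hosts.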
Read Cache
Enable filer read cache for frequently accessed objects:
weed filer -cacheDir=/tmp/seaweedfs-cache -cacheSizeMB=4096
Embedding Pipeline Tuning
Batch Size
Process embeddings in batches instead of one-by-one:
# Check current setting
curl -s http://localhost:7010/config | jq '.embedding'
# Update batch size
curl -X PATCH http://localhost:7010/config \
-H "Authorization: Bearer $JWT" \
-H "Content-Type: application/json" \
-d '{"embedding": {"batch_size": 100, "max_concurrent": 10}}'
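As a back-of-envelope check on those numbers: batching divides the request count by the batch size, and max_concurrent caps how many of those requests are in flight at once. A sketch with an assumed workload of 10,000 pending items:

```shell
items=10000; batch_size=100; max_concurrent=10   # workload size is an assumption
requests=$((items / batch_size))                 # API calls instead of 10000
waves=$(( (requests + max_concurrent - 1) / max_concurrent ))  # ceiling division
echo "$requests requests in $waves sequential waves"
```

Going from 10,000 single-item calls to 100 batched calls is where most of the throughput gain comes from; concurrency then overlaps the remaining latency.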
Model Selection
| Model | Dimensions | Speed | Cost | Quality |
|---|---|---|---|---|
| text-embedding-3-small | 1536 | Fast | $0.02/1M tokens | Good |
| text-embedding-3-large | 3072 | Medium | $0.13/1M tokens | Best |
| text-embedding-ada-002 | 1536 | Fast | $0.10/1M tokens | Legacy |
For most use cases, text-embedding-3-small offers the best balance of speed, cost, and quality.
Chunk Size
Optimal chunk size depends on content type:
| Content Type | Chunk Size | Overlap |
|---|---|---|
| Technical docs | 512 tokens | 50 tokens |
| Blog posts | 800 tokens | 100 tokens |
| Legal documents | 1024 tokens | 200 tokens |
| Code files | 256 tokens | 25 tokens |
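Chunk size and overlap together determine how many embeddings a document generates: each chunk advances by (size - overlap) tokens, so a T-token document yields roughly ceil((T - overlap) / (size - overlap)) chunks. A sketch using the technical-docs row above (the 4096-token document length is an assumption):

```shell
tokens=4096; size=512; overlap=50   # technical-docs profile; doc length assumed
step=$((size - overlap))            # each chunk advances by 462 tokens
chunks=$(( (tokens - overlap + step - 1) / step ))   # ceiling division
echo "$chunks chunks of up to $size tokens"
```

Larger overlap improves retrieval continuity across chunk boundaries but multiplies the number of stored vectors, which feeds straight back into pgvector index size and build time.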
Benchmarking
Run a load test against your tuned nfyio instance:
# Upload throughput
for i in $(seq 1 100); do
curl -s -X PUT http://localhost:7007/bucket/test-file-$i.txt \
-H "Authorization: Bearer $JWT" \
-d "test content $i" &
done
wait
echo "100 uploads completed"
# Search latency
time curl -s -X POST http://localhost:3000/api/v1/search \
-H "Authorization: Bearer $JWT" \
-H "Content-Type: application/json" \
-d '{
"query": "how to deploy kubernetes",
"bucket": "production-data",
"limit": 10
}' | jq '.results | length'
Quick Reference
| Component | Key Setting | Default | Recommended |
|---|---|---|---|
| PostgreSQL | shared_buffers | 128 MB | 25% of RAM |
| PostgreSQL | work_mem | 4 MB | 64-256 MB |
| PostgreSQL | hnsw.ef_search | 40 | 100 |
| Redis | maxmemory | No limit | 2-4 GB |
| Redis | maxmemory-policy | noeviction | allkeys-lru |
| SeaweedFS | Volume size | 30 GB | 100 GB |
| SeaweedFS | Cache size | 0 | 4 GB |
| Embeddings | Batch size | 10 | 100 |
Key Takeaways
- PgBouncer in front of PostgreSQL handles connection multiplexing far better than raising max_connections
- HNSW index parameters (m, ef_construction, ef_search) have the biggest impact on semantic search performance
- Redis memory policy should be allkeys-lru for cache workloads; never let it hit OOM
- SeaweedFS volume compaction reclaims disk space after deletions
- Batch embedding processing (100 items at a time) reduces API round trips and improves throughput
- Always benchmark before and after tuning; measure, don't guess
For monitoring your tuned setup, see the Prometheus & Grafana guide.