Core Concepts

This document introduces the core concepts in NFYio: storage primitives (buckets, objects, access keys), AI agents, embeddings, networking (VPCs, subnets, security groups), and access control (workspaces, projects, teams, roles).

Buckets

A bucket is a top-level container for objects. In S3-compatible storage it plays the role of a top-level folder or namespace.

  • Bucket names must be unique across your NFYio deployment
  • You can create buckets via the API, AWS CLI, or dashboard
  • Buckets support versioning, lifecycle policies, and access control

# Create a bucket
aws --endpoint-url http://localhost:7007 s3 mb s3://my-documents

# List buckets
aws --endpoint-url http://localhost:7007 s3 ls
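
Because bucket names share one namespace per deployment, it can help to validate a name on the client before calling `mb`. The sketch below assumes NFYio accepts the general AWS S3 naming rules (3 to 63 characters; lowercase letters, digits, hyphens, and dots; starting and ending with a letter or digit). That is an assumption based on S3 compatibility, not something this document states, and `is_valid_bucket_name` is a hypothetical helper.

```python
import re

# Assumption: NFYio follows the general AWS S3 bucket naming rules.
# 3-63 characters, lowercase letters/digits/hyphens/dots,
# starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes a basic S3-style naming check."""
    if not _BUCKET_NAME.match(name):
        return False
    # S3 also rejects two adjacent dots.
    return ".." not in name


print(is_valid_bucket_name("my-documents"))
print(is_valid_bucket_name("My_Documents"))
```

The check is deliberately conservative; the server remains the authority on which names are accepted.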

Objects

An object is a file stored in a bucket. Each object has:

  • Key — Path-like identifier (e.g., documents/report.pdf)
  • Size — Size in bytes
  • Content-Type — MIME type
  • Metadata — Custom key-value pairs
  • Version ID — If versioning is enabled

Object data is stored in SeaweedFS; object metadata is stored in PostgreSQL.

# Upload an object
aws --endpoint-url http://localhost:7007 s3 cp report.pdf s3://my-documents/reports/

# Download an object
aws --endpoint-url http://localhost:7007 s3 cp s3://my-documents/reports/report.pdf ./
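
In S3-style storage a key such as `reports/report.pdf` is a single string, and "folders" are a convention built on shared prefixes. The following minimal sketch groups keys by their first-level prefix to illustrate that idea; `group_by_prefix` is a hypothetical helper and does not call any NFYio API.

```python
from collections import defaultdict


def group_by_prefix(keys: list[str], delimiter: str = "/") -> dict[str, list[str]]:
    """Group object keys by the part before the first delimiter."""
    groups: dict[str, list[str]] = defaultdict(list)
    for key in keys:
        prefix, sep, _rest = key.partition(delimiter)
        # Keys without a delimiter sit at the bucket root (empty prefix).
        groups[prefix + sep if sep else ""].append(key)
    return dict(groups)


keys = ["reports/report.pdf", "reports/2024/q1.pdf", "readme.txt"]
print(group_by_prefix(keys))
```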

Access Keys

Access keys are credentials for programmatic access to NFYio storage. Each key has:

  • Access Key ID — Public identifier
  • Secret Access Key — Private secret (store securely)

Use them with AWS SDKs or the aws CLI by configuring the endpoint and credentials. Keys are scoped to a workspace or project and are subject to role-based access control (RBAC).

# Configure AWS CLI for NFYio
aws configure set aws_access_key_id YOUR_ACCESS_KEY
aws configure set aws_secret_access_key YOUR_SECRET_KEY
aws configure set default.region us-east-1

# Use with endpoint
aws --endpoint-url http://localhost:7007 s3 ls
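
The same settings can be passed to an AWS SDK. The sketch below builds the keyword arguments for boto3's `boto3.client("s3", ...)`; `endpoint_url`, `aws_access_key_id`, `aws_secret_access_key`, and `region_name` are real boto3 parameters, while the `NFYIO_ENDPOINT` variable name and both helper functions are assumptions made for illustration.

```python
import os


def s3_client_kwargs(env=None) -> dict:
    """Build keyword arguments for boto3.client("s3", ...) from environment variables."""
    env = os.environ if env is None else env
    return {
        # NFYIO_ENDPOINT is a hypothetical variable name; the default is the
        # local endpoint used in the CLI examples.
        "endpoint_url": env.get("NFYIO_ENDPOINT", "http://localhost:7007"),
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        "region_name": env.get("AWS_DEFAULT_REGION", "us-east-1"),
    }


def make_s3_client(env=None):
    """Create an S3 client pointed at NFYio (requires `pip install boto3`)."""
    import boto3  # imported lazily so the helper above works without boto3

    return boto3.client("s3", **s3_client_kwargs(env))
```

With credentials exported in the environment, `make_s3_client().list_buckets()` would be the SDK counterpart of `s3 ls`. Reading the secret from the environment keeps it out of source code.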

Agents

Agents are AI-powered components that process data and respond to queries.

RAG Agent

Retrieval-Augmented Generation — Ingest documents, generate embeddings, and answer questions using your data.

  • Documents are chunked and embedded
  • Queries are embedded and matched via vector similarity
  • Relevant chunks are passed to the LLM as context
  • The LLM generates answers grounded in your corpus
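
The four steps above can be sketched in a few lines. This is a toy illustration, not NFYio's implementation: `embed` uses word counts as a stand-in for a real embedding model, chunking is a fixed word window, and no LLM is called; the sketch stops at building the prompt. All function names are hypothetical.

```python
import math
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real system would call an embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


docs = ["invoices are due in 30 days", "the office is closed on sunday"]
question = "when are invoices due"
print(build_prompt(question, retrieve(question, docs, k=1)))
```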

LLM Agent

Large Language Model — Direct interaction with models (e.g., GPT-4o) for chat, summarization, or generation. Can be combined with RAG for knowledge-grounded responses.

Workflow Agent

Multi-step workflows — Chain multiple steps (retrieve → reason → act) using LangChain. Supports tools, policy checks, and branching logic.
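
To make the retrieve → reason → act chain concrete, here is a minimal sketch in plain Python. It does not use LangChain, and every step is a stub: the retrieval result, the keyword-based "reasoning", and the `allowed_actions` policy are all assumptions chosen for illustration.

```python
from typing import Callable

Step = Callable[[dict], dict]


def retrieve(state: dict) -> dict:
    # Stand-in for a retrieval tool call.
    return {**state, "context": ["refund window is 30 days"]}


def reason(state: dict) -> dict:
    # Stand-in for an LLM call that picks the next action.
    action = "refund" if "refund" in state["query"].lower() else "answer"
    return {**state, "action": action}


def policy_check(state: dict) -> dict:
    # Branch: actions outside the allow-list are replaced by an escalation.
    allowed = state.get("allowed_actions", {"answer"})
    if state["action"] not in allowed:
        return {**state, "action": "escalate"}
    return state


def act(state: dict) -> dict:
    return {**state, "result": f"performed:{state['action']}"}


def run_workflow(state: dict, steps: list[Step]) -> dict:
    """Run each step in order, passing the state along."""
    for step in steps:
        state = step(state)
    return state


PIPELINE = [retrieve, reason, policy_check, act]
print(run_workflow({"query": "I want a refund"}, PIPELINE)["result"])
```

Passing state as an immutable-style dictionary keeps each step independently testable; a real workflow framework adds tool schemas, retries, and tracing on top of the same shape.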

Embeddings

Embeddings are dense vector representations of text. NFYio uses embedding models from OpenAI or Voyage AI to convert:

  • Document chunks → vectors stored in pgvector
  • User queries → vectors used for similarity search

Vector search finds the chunks most similar to a query vector, using cosine similarity or L2 distance in pgvector. This enables semantic search: “find documents about X” without exact keyword matches.

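-- Conceptual: find similar chunks (actual API differs)
-- <=> is pgvector's cosine distance operator; use <-> for L2 distance.
-- query_embedding stands for the embedded user query.
SELECT id, content, embedding <=> query_embedding AS distance
FROM document_chunks
ORDER BY distance
LIMIT 10;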
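
The two metrics can rank the same candidates differently, which the following self-contained sketch shows with two-dimensional vectors. Cosine distance here is 1 minus cosine similarity, which is what pgvector's `<=>` operator returns; the vectors are made up for illustration.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; ignores vector length, compares direction only."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 0.0]
same_direction = [3.0, 0.0]   # same direction as the query, but longer
orthogonal = [0.0, 1.0]       # unrelated direction, but nearby in space

for name, vec in [("same_direction", same_direction), ("orthogonal", orthogonal)]:
    print(name, cosine_distance(query, vec), l2_distance(query, vec))
```

Cosine distance ranks `same_direction` first because it ignores magnitude, while L2 distance ranks `orthogonal` first because it is closer in absolute terms. For embeddings normalized to unit length the two metrics produce the same ordering.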

VPCs, Subnets, Security Groups

NFYio supports virtual networking for multi-tenant isolation:

VPC (Virtual Private Cloud)

A VPC is an isolated network segment. Resources in a VPC can communicate privately; traffic to and from the internet is controlled.

Subnets

Subnets are subdivisions of a VPC. They define IP ranges and can be public or private.
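
As an illustration of how a VPC range is subdivided, the sketch below uses Python's standard `ipaddress` module. The `10.0.0.0/16` VPC range, the `/24` subnet size, and the public/private labels are hypothetical values, not NFYio defaults.

```python
import ipaddress

# Hypothetical VPC range, split into /24 subnets.
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = list(vpc.subnets(new_prefix=24))

# Label the first two subnets for illustration.
public_subnet, private_subnet = subnets[0], subnets[1]


def subnet_for(address: str):
    """Return the labeled subnet containing `address`, or None."""
    ip = ipaddress.ip_address(address)
    return next((s for s in (public_subnet, private_subnet) if ip in s), None)


print(len(subnets), public_subnet, private_subnet)
print(subnet_for("10.0.1.25"))
```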

Security Groups

Security groups act as firewalls. They define which inbound and outbound traffic is allowed for resources (e.g., agent runtimes, storage access).
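
A security group can be pictured as a list of allow rules that each connection is matched against. The sketch below assumes a default-deny model (traffic is blocked unless a rule allows it), which is common for security groups but is not confirmed by this document; the `Rule` fields and `is_allowed` function are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    direction: str               # "inbound" or "outbound"
    protocol: str                # e.g. "tcp"
    port_range: tuple[int, int]  # inclusive (low, high)
    cidr: str                    # peer address range the rule applies to


def is_allowed(rules: list[Rule], direction: str, protocol: str, port: int, peer_ip: str) -> bool:
    """Return True if any rule permits the connection."""
    ip = ipaddress.ip_address(peer_ip)
    for rule in rules:
        low, high = rule.port_range
        if (rule.direction == direction
                and rule.protocol == protocol
                and low <= port <= high
                and ip in ipaddress.ip_network(rule.cidr)):
            return True
    return False  # Assumption: default deny when no rule matches.


rules = [Rule("inbound", "tcp", (443, 443), "10.0.0.0/16")]
print(is_allowed(rules, "inbound", "tcp", 443, "10.0.1.25"))
print(is_allowed(rules, "inbound", "tcp", 22, "10.0.1.25"))
```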

Workspaces, Projects, Teams

NFYio organizes resources in a hierarchy:

Workspace

A workspace is the top-level tenant. It typically represents an organization or customer. All projects and teams belong to a workspace.

Project

A project groups related resources (buckets, agents, datasets) within a workspace. Use projects to separate environments (e.g., dev, staging, prod) or product lines.

Team

A team is a group of users with shared access to projects. Teams have roles that determine what members can do (read, write, admin).

Workspace (Acme Corp)
├── Project (Document AI)
│   ├── Bucket: documents
│   ├── RAG Agent: doc-qa
│   └── Team: doc-team (read/write)
└── Project (Analytics)
    ├── Bucket: raw-data
    └── Team: analytics-team (read)
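
The same hierarchy can be modeled as nested data, which makes questions such as "which buckets can this team reach?" easy to answer. The classes and the `buckets_for_team` method below are a hypothetical sketch mirroring the tree above, not NFYio's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    buckets: list[str] = field(default_factory=list)
    agents: list[str] = field(default_factory=list)
    teams: dict[str, str] = field(default_factory=dict)  # team name -> access level


@dataclass
class Workspace:
    name: str
    projects: list[Project] = field(default_factory=list)

    def buckets_for_team(self, team: str) -> dict[str, str]:
        """Map each bucket the team can reach to the team's access level."""
        return {
            bucket: project.teams[team]
            for project in self.projects if team in project.teams
            for bucket in project.buckets
        }


acme = Workspace("Acme Corp", [
    Project("Document AI", ["documents"], ["doc-qa"], {"doc-team": "read/write"}),
    Project("Analytics", ["raw-data"], [], {"analytics-team": "read"}),
])
print(acme.buckets_for_team("doc-team"))
```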

Roles and Permissions

Roles

Roles define what a user or team can do:

Role     Typical Permissions
viewer   Read buckets, objects, agents; run queries
editor   Viewer + upload, delete objects; create agents
admin    Editor + manage buckets, teams, settings
owner    Full control over workspace/project
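
Each role in the table builds on the one before it, so effective permissions can be computed by accumulation. In the sketch below the permission strings (such as `object:upload`) are illustrative names, not NFYio identifiers, and treating owner as a superset of admin is an assumption drawn from "full control".

```python
ROLE_ORDER = ["viewer", "editor", "admin", "owner"]

# Permissions each role adds on top of the previous one (names are illustrative).
ADDED_PERMISSIONS = {
    "viewer": {"bucket:read", "object:read", "agent:read", "query:run"},
    "editor": {"object:upload", "object:delete", "agent:create"},
    "admin": {"bucket:manage", "team:manage", "settings:manage"},
    "owner": {"workspace:full_control"},
}


def permissions_for(role: str) -> set[str]:
    """Return the cumulative permission set for a role."""
    index = ROLE_ORDER.index(role)  # raises ValueError for unknown roles
    permissions: set[str] = set()
    for lower_role in ROLE_ORDER[: index + 1]:
        permissions |= ADDED_PERMISSIONS[lower_role]
    return permissions


def can(role: str, permission: str) -> bool:
    return permission in permissions_for(role)


print(can("editor", "object:read"))
print(can("viewer", "object:upload"))
```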

Permissions

Permissions are enforced at multiple layers:

  • Keycloak — Authentication and user attributes
  • NFYio Gateway — JWT validation, workspace/project membership
  • Storage Proxy — Bucket and object access checks
  • Agent Service — Workspace-scoped agent access
  • PostgreSQL RLS — Row-level security for metadata
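
Layered enforcement means a request succeeds only if every layer allows it. The sketch below models two of the layers as simple checks and reports the first one that denies; the request fields (`token_valid`, `memberships`, `allowed_buckets`) and the check logic are assumptions made for illustration, not NFYio's actual request format.

```python
from typing import Callable, Optional


def gateway_check(req: dict) -> bool:
    # Stand-in for JWT validation plus workspace membership.
    return req.get("token_valid", False) and req.get("workspace") in req.get("memberships", ())


def storage_check(req: dict) -> bool:
    # Stand-in for the bucket-level access check.
    return req.get("bucket") in req.get("allowed_buckets", ())


LAYERS: list[tuple[str, Callable[[dict], bool]]] = [
    ("NFYio Gateway", gateway_check),
    ("Storage Proxy", storage_check),
]


def first_denial(req: dict) -> Optional[str]:
    """Return the name of the first layer that denies the request, or None."""
    for name, check in LAYERS:
        if not check(req):
            return name
    return None


request = {
    "token_valid": True,
    "workspace": "acme",
    "memberships": ["acme"],
    "bucket": "documents",
    "allowed_buckets": ["raw-data"],
}
print(first_denial(request))
```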

Best Practices

  • Use the principle of least privilege: assign the minimum role needed
  • Prefer team-based access over individual grants
  • Rotate access keys periodically
  • Enable audit logging for sensitive operations
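
Periodic key rotation is easier to enforce with an automated age check. The sketch below flags keys older than a threshold; the 90-day default, the key IDs, and the way creation timestamps are obtained are all assumptions, since this document does not describe a key-listing API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def keys_due_for_rotation(
    created_at: dict[str, datetime],
    max_age_days: int = 90,  # Assumed policy; choose a value that fits your organization.
    now: Optional[datetime] = None,
) -> list[str]:
    """Return the IDs of access keys created more than `max_age_days` ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return sorted(key_id for key_id, created in created_at.items() if created < cutoff)


# Hypothetical key IDs and creation dates.
keys = {
    "KEY_OLD": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "KEY_NEW": datetime(2024, 5, 20, tzinfo=timezone.utc),
}
print(keys_due_for_rotation(keys, now=datetime(2024, 6, 1, tzinfo=timezone.utc)))
```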