
Operations

Configuration, tuning, and deployment guidance for running YAMS in production.

Data Directory

YAMS persists content, indexes, and metadata in a single data directory.

Recommended locations:

  • Linux: /var/lib/yams
  • macOS: /usr/local/var/yams or /opt/yams/data
  • Containers: bind-mount a host path to /var/lib/yams

Permissions:

sudo useradd --system --home /var/lib/yams --shell /usr/sbin/nologin yams
sudo mkdir -p /var/lib/yams
sudo chown -R yams:yams /var/lib/yams
sudo chmod 0750 /var/lib/yams
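
To confirm the directory ended up with the intended mode, a small check like the following can run after provisioning. This is a minimal sketch: check_data_dir is a hypothetical helper name, and stat -c is the GNU coreutils form (macOS and BSD would need stat -f '%Lp' instead).

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that a data directory exists and has the
# expected octal mode. Relies on GNU stat (-c '%a').
check_data_dir() {
  local dir="$1" want="${2:-750}" mode
  [ -d "$dir" ] || { echo "missing: $dir" >&2; return 1; }
  mode="$(stat -c '%a' "$dir")"
  if [ "$mode" != "$want" ]; then
    echo "mode $mode on $dir, expected $want" >&2
    return 1
  fi
  echo "ok: $dir ($mode)"
}
```

For example, check_data_dir /var/lib/yams 750 prints an ok line, or returns non-zero and reports the mode it found.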

Deployment

Batch/Automation (cron/systemd timer)

#!/usr/bin/env bash
set -euo pipefail
export YAMS_STORAGE="/var/lib/yams"
yams add /srv/docs --recursive --include="*.md" --tags "docs,import"

Systemd Service

[Unit]
Description=YAMS Service
Wants=network-online.target
After=network-online.target

[Service]
User=yams
Group=yams
Environment="YAMS_STORAGE=/var/lib/yams"
WorkingDirectory=/var/lib/yams
ExecStart=/usr/local/bin/yams serve
Restart=on-failure
NoNewPrivileges=true
ProtectSystem=full
ProtectHome=true
LimitNOFILE=16384

[Install]
WantedBy=multi-user.target

Containers

# The host directory must be writable by the container user (UID 10001 here).
mkdir -p "$HOME/yams-data"
docker run --rm -it \
  -u 10001:10001 \
  -v "$HOME/yams-data:/var/lib/yams" \
  ghcr.io/trvon/yams:latest yams init --non-interactive

Resource controls:

  • Set --cpus and --memory limits
  • Mount code and config read-only; mount data directories read-write
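
The resource controls above could be combined roughly as follows. This is a sketch under assumptions: the 2-CPU and 2 GiB limits are placeholders, the /etc/yams config mount target is hypothetical, build_yams_run_cmd is an illustrative name, and yams serve is borrowed from the systemd unit earlier in this page.

```shell
#!/usr/bin/env bash
# Sketch: assemble a `docker run` command with CPU and memory limits,
# a read-only config mount, and a read-write data mount.
# The limits and the /etc/yams mount target are illustrative assumptions.
build_yams_run_cmd() {
  local data_dir="$1" config_dir="$2"
  YAMS_RUN_CMD=(
    docker run --rm
    -u 10001:10001
    --cpus 2
    --memory 2g
    -v "$config_dir:/etc/yams:ro"
    -v "$data_dir:/var/lib/yams"
    ghcr.io/trvon/yams:latest yams serve
  )
}
```

Building the command as an array keeps paths with spaces intact. To run it: build_yams_run_cmd "$HOME/yams-data" "$HOME/yams-config" followed by "${YAMS_RUN_CMD[@]}".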

Configuration

Discover options: yams --help

Guidelines:

  • Prefer explicit CLI flags for clarity
  • Use environment variables for secrets and deployment paths
  • Store site-specific values in EnvironmentFile (systemd) or .env (containers)

Key environment variables:

  • YAMS_STORAGE: data directory path
  • YAMS_CONFIG: config file override
  • YAMS_TUNING_PROFILE: efficient|balanced|aggressive
  • YAMS_MAX_ACTIVE_CONN: connection limit
  • YAMS_KEEPALIVE_MS: daemon keepalive interval
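
One way to keep site-specific values in a single file is sketched below. The helper names and the file location are assumptions; the variable names are the ones listed above, and the values are examples only.

```shell
#!/usr/bin/env bash
# Sketch: write site-specific values to an env file and load them into
# the current shell. File location and values are illustrative.
write_env_file() {
  local path="$1"
  # Subshell so the restrictive umask does not leak into the caller.
  (
    umask 077
    cat > "$path" <<'EOF'
YAMS_STORAGE=/var/lib/yams
YAMS_TUNING_PROFILE=balanced
EOF
  )
}

load_env_file() {
  # With `set -a`, every variable assigned while sourcing is exported.
  set -a
  # shellcheck disable=SC1090
  . "$1"
  set +a
}
```

The same KEY=value file can be referenced from a systemd unit with EnvironmentFile= in the [Service] section, or passed to docker run with --env-file.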

Performance Tuning

Auto-Tuning (Daemon)

YAMS includes centralized TuneAdvisor and ResourceTuner components for adaptive behavior:

  • Metrics: CPU utilization (/proc deltas), memory (PSS/RSS), connection counts
  • Worker pools: Shrink when idle, grow under pressure
  • Auto-embed policy: Idle (default), Never, or Always
      • Idle: embed only when the daemon is idle; otherwise defer to background repair
      • Never: always defer to yams repair --embeddings
      • Always: embed immediately (increases ingest CPU)

Embedding Batch Controls

The adaptive DynamicBatcher ships with conservative defaults:

  • Safety factor: 0.90 (reserves token-budget headroom)
  • Inter-batch pause: 0 ms (add a delay to reduce CPU spikes)
  • Advisory doc cap: unset (limits documents per batch)

Tuning Profiles

Profiles bundle multiple tuning heuristics:

| Profile | Use Case | Behavior |
|---|---|---|
| efficient | Resource-constrained hosts | Slower pool growth, higher thresholds |
| balanced | General purpose (default) | Moderate ramp-up |
| aggressive | High throughput under load | Faster growth, lower thresholds |

Set profile:

yams config set tuning.profile aggressive
yams daemon restart

Or via environment:

export YAMS_TUNING_PROFILE=aggressive
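
When the profile comes from a deployment script, a guard against typos may be worthwhile. The helper below is a hypothetical sketch; the accepted values are the three profile names documented above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: export YAMS_TUNING_PROFILE only if the value is
# one of the documented profile names.
set_tuning_profile() {
  case "$1" in
    efficient|balanced|aggressive)
      export YAMS_TUNING_PROFILE="$1"
      ;;
    *)
      echo "unknown tuning profile: $1" >&2
      return 1
      ;;
  esac
}
```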

Quick Wins

  • Storage: NVMe SSD, ext4/xfs with relatime
  • Concurrency: ~physical core count for workers
  • Compression: zstd (balanced), LZMA (cold archives)
  • Chunking: 8-16 KiB (text), 32-64 KiB (binaries)
  • SQLite: WAL enabled, synchronous=NORMAL (bulk ingest), FULL (steady-state)
  • FTS5: Index only queried fields, run optimize after bulk ingest
  • OS limits: nofile >= 16384
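
The nofile recommendation is easy to verify from the shell that launches YAMS. A minimal sketch follows; check_nofile is a hypothetical name, and it inspects the soft limit of the current shell, which may differ from the limit systemd applies through LimitNOFILE.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check that the open-file soft limit of the
# current shell meets a minimum (default 16384).
check_nofile() {
  local want="${1:-16384}" have
  have="$(ulimit -n)"
  if [ "$have" = "unlimited" ] || [ "$have" -ge "$want" ]; then
    echo "ok: nofile=$have"
  else
    echo "nofile=$have is below $want" >&2
    return 1
  fi
}
```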

Workload-Specific

Ingest-heavy:

  • Batch files (100-1000 docs)
  • Lower fsync cost (synchronous=NORMAL)
  • Defer FTS optimize until after ingest

Query-heavy:

  • Higher fsync levels (synchronous=FULL)
  • Increase caches (OS + DB)
  • Optimize query shapes (prefix/suffix, filters)

Mixed:

  • Schedule maintenance during off-peak hours
  • Use conservative concurrency and throttle ingest during peak query load

Configuration Reference

See include/yams/config/config_defaults.h for available keys.

| Area | Config Key | Effect | When to Adjust |
|---|---|---|---|
| Worker pools | [performance].num_worker_threads | CPU parallelism | Increase for bursts, decrease on constrained hosts |
| Request backlog | [performance].max_concurrent_operations | Task limit | Lower for latency-sensitive workloads, raise for long-running ingest |
| Chunking | [chunking].min/max/average_chunk_size | Dedupe granularity | Smaller for text diffs, larger for binaries |
| Compression | [compression].algorithm, zstd_level | CPU vs storage | Lower levels for real-time, raise for archival |
| Embeddings | [embeddings].auto_on_add, batch_size | Embedding cost | Disable auto for cold ingest, trim batch if memory is tight |
| Vector index | [vector_database].index_type, num_partitions | Recall vs latency | Start with defaults, tune after baseline data |
| WAL | [wal].sync_interval, enable_group_commit | Fsync cadence | Relax during bulk, tighten for steady-state |

Vector Search Tuning

Embeddings:

  • Precompute offline to reduce ingest latency
  • Normalize vectors if the metric requires it (cosine)

Index parameters:

  • HNSW: tune M/efConstruction (build), efSearch (query)
  • IVF/PQ: select nlist/nprobe, quantization settings

Hybrid search:

  • Filter with keyword first to reduce vector candidates

Dimension Changes

When switching embedding models, align the vector schema with the new model's dimensions.

Convergence order (the first available source wins):

  1. Config: embeddings.embedding_dim (preferred)
  2. Sentinel: $YAMS_STORAGE/vectors_sentinel.json
  3. Model: reported dimensions (fallback)
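
The precedence above can be pictured as a first-non-empty lookup. The function below only illustrates that ordering as described here; it is not YAMS's actual resolution code, and resolve_embedding_dim and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Illustration only: pick the embedding dimension from the first source
# that provides one, in the order config, sentinel, model.
# Pass an empty string for a source that has no value.
resolve_embedding_dim() {
  local config_dim="${1:-}" sentinel_dim="${2:-}" model_dim="${3:-}"
  local candidate
  for candidate in "$config_dim" "$sentinel_dim" "$model_dim"; do
    if [ -n "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  echo "no embedding dimension available" >&2
  return 1
}
```

With a config value of 768 and a sentinel value of 384, the config value wins; with no config value, the sentinel's 384 is used.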

Recommended workflow:

yams model download <name> --apply-config  # Updates config + recreates vectors.db
yams repair --embeddings                   # Regenerate vectors

Or use doctor:

yams doctor  # Accept "Recreate vectors.db" → "Restart daemon"
yams repair --embeddings

Maintenance

  • FTS optimize: After bulk ingest
  • WAL checkpoint: During low traffic
  • Integrity checks: Periodic verification on sample sets
  • Backups: Snapshot entire data directory at consistent points

Monitoring

Observability:

yams status --json  # Check pool sizes, connections
yams stats -v       # Detailed metrics

Health checks:

  • Implement periodic read/search probes
  • Alert on failures, slow queries, error rates

Key metrics:

  • Throughput: docs/s, bytes/s
  • Latency: p50/p95/p99
  • Resources: CPU, RSS, IO wait, disk latency (p99)
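
A periodic probe can be as simple as timing a command and flagging failures or slow responses. The sketch below assumes GNU date (for %N nanoseconds); probe is a hypothetical helper, and the latency budget is whatever threshold suits the deployment.

```shell
#!/usr/bin/env bash
# probe MAX_MS CMD...: run CMD and classify the result.
#   returns 0 if CMD succeeded within MAX_MS milliseconds
#   returns 1 if CMD exited non-zero
#   returns 2 if CMD succeeded but took longer than MAX_MS
# Requires GNU date for nanosecond timestamps (%N).
probe() {
  local max_ms="$1"; shift
  local start end elapsed_ms
  start="$(date +%s%N)"
  if ! "$@" >/dev/null 2>&1; then
    echo "FAIL: $* exited non-zero" >&2
    return 1
  fi
  end="$(date +%s%N)"
  elapsed_ms=$(( (end - start) / 1000000 ))
  if [ "$elapsed_ms" -gt "$max_ms" ]; then
    echo "SLOW: $* took ${elapsed_ms}ms (limit ${max_ms}ms)" >&2
    return 2
  fi
  echo "OK: $* took ${elapsed_ms}ms"
}
```

For instance, probe 500 yams status --json could run from a cron job or systemd timer, with the exit code feeding whatever alerting is already in place.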

See src/daemon/components/DaemonMetrics.cpp for implementation.

Security

  • Run as non-privileged user
  • Restrict data directory permissions (0750)
  • Avoid secrets in command lines; use environment or secret managers
  • For network endpoints: bind to private interface, use reverse proxy, enforce TLS upstream

Backups

  • Snapshot entire data directory
  • Quiesce writes during backup (low-traffic windows)
  • Store backups off-site with integrity checks
  • Test restore procedures periodically
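
A snapshot with an integrity check might look like the following. This is a sketch under assumptions: the helper names are hypothetical, sha256sum is the GNU coreutils tool (macOS offers shasum -a 256), and writes should be quiesced first, for example by stopping the service.

```shell
#!/usr/bin/env bash
# Sketch: archive a data directory and record a SHA-256 checksum next
# to the archive. Prints the archive path on success.
snapshot_data_dir() {
  local data_dir="$1" dest_dir="$2" stamp archive
  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  archive="$dest_dir/yams-$stamp.tar.gz"
  # Store paths relative to the parent so the archive restores cleanly.
  tar -czf "$archive" -C "$(dirname "$data_dir")" "$(basename "$data_dir")"
  ( cd "$dest_dir" && sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
  echo "$archive"
}

# Verify an archive against the checksum file written by snapshot_data_dir.
verify_snapshot() {
  local archive="$1"
  ( cd "$(dirname "$archive")" && sha256sum -c "$(basename "$archive").sha256" >/dev/null )
}
```

Running verify_snapshot again after copying the archive off-site confirms that the copy still matches the recorded checksum.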

Troubleshooting

Slow ingest:

  • Reduce fsync cost (synchronous=NORMAL)
  • Increase batch and transaction size
  • Check disk latency (NVMe performance)

Slow queries:

  • Run FTS optimize
  • Check cache sizing (hot set in memory)
  • For vector search: increase search params or pre-filtering

WAL growth:

  • Schedule checkpoints
  • Ensure consumers aren’t holding readers open

Memory pressure:

  • Reduce caches and concurrency
  • Lower the embedding batch safety factor and increase inter-batch pauses
  • Use the Idle auto-embed policy
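
To notice WAL growth before it becomes a problem, a size check on the WAL file can feed an alert. SQLite names the WAL file by appending -wal to the database file name; the helper name and the threshold below are assumptions, and the actual database file name inside the data directory is not specified here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) when the given WAL file exists
# and is larger than LIMIT_BYTES.
wal_too_large() {
  local wal="$1" limit_bytes="$2" size
  [ -f "$wal" ] || return 1
  # Arithmetic expansion strips the padding some `wc` builds emit.
  size=$(( $(wc -c < "$wal") ))
  [ "$size" -gt "$limit_bytes" ]
}
```

A cron job could call wal_too_large with the path to the -wal file and a byte limit, then trigger a checkpoint or an alert when it succeeds.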

Upgrades

  • Pin container tags or package versions
  • Back up before upgrades
  • Review release notes for migrations
  • Validate in staging first

References:

  • User Guide: CLI
  • Developer: Build System
  • Architecture: Daemon