Operations¶
Configuration, tuning, and deployment guidance for production YAMS deployments.
Data Directory¶
YAMS persists content, indexes, and metadata in a single data directory.
Recommended locations:
- Linux: /var/lib/yams
- macOS: /usr/local/var/yams or /opt/yams/data
- Containers: bind mount host path to /var/lib/yams
Permissions:
sudo useradd --system --home /var/lib/yams --shell /usr/sbin/nologin yams
sudo mkdir -p /var/lib/yams
sudo chown -R yams:yams /var/lib/yams
sudo chmod 0750 /var/lib/yams
Deployment¶
Batch/Automation (cron/systemd timer)¶
#!/usr/bin/env bash
set -euo pipefail
export YAMS_STORAGE="/var/lib/yams"
yams add /srv/docs --recursive --include="*.md" --tags "docs,import"
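The heading mentions systemd timers as an alternative to cron. A minimal sketch, assuming the script above is saved as /usr/local/bin/yams-import.sh (a hypothetical path):

```ini
# /etc/systemd/system/yams-import.service
[Unit]
Description=YAMS bulk import

[Service]
Type=oneshot
User=yams
ExecStart=/usr/local/bin/yams-import.sh

# /etc/systemd/system/yams-import.timer
[Unit]
Description=Run YAMS import nightly

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable with `sudo systemctl enable --now yams-import.timer`.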
Systemd Service¶
[Unit]
Description=YAMS Service
After=network-online.target
[Service]
User=yams
Group=yams
Environment="YAMS_STORAGE=/var/lib/yams"
WorkingDirectory=/var/lib/yams
ExecStart=/usr/local/bin/yams serve
Restart=on-failure
NoNewPrivileges=true
ProtectSystem=full
ProtectHome=true
LimitNOFILE=16384
[Install]
WantedBy=multi-user.target
Containers¶
mkdir -p $HOME/yams-data
docker run --rm -it \
-u 10001:10001 \
-v $HOME/yams-data:/var/lib/yams \
ghcr.io/trvon/yams:latest yams init --non-interactive
Resource controls:
- Use --cpus and --memory limits
- Mount code/config read-only; keep data directories writable
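These controls can be combined in a single invocation. A sketch only; the 2-CPU/2-GiB limits are illustrative values, not recommendations:

```shell
docker run --rm -d \
  -u 10001:10001 \
  --cpus 2 --memory 2g \
  --read-only --tmpfs /tmp \
  -v "$HOME/yams-data":/var/lib/yams \
  ghcr.io/trvon/yams:latest yams serve
```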
Configuration¶
Discover options: yams --help
Guidelines:
- Prefer explicit CLI flags for clarity
- Use environment variables for secrets and deployment paths
- Store site-specific values in EnvironmentFile (systemd) or .env (containers)
Key environment variables:
- YAMS_STORAGE: data directory path
- YAMS_CONFIG: config file override
- YAMS_TUNING_PROFILE: efficient|balanced|aggressive
- YAMS_MAX_ACTIVE_CONN: connection limit
- YAMS_KEEPALIVE_MS: daemon keepalive interval
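With systemd, these variables can live in an EnvironmentFile rather than the unit itself. A sketch; the file path and values are illustrative assumptions:

```ini
# /etc/yams/yams.env
YAMS_STORAGE=/var/lib/yams
YAMS_TUNING_PROFILE=balanced
YAMS_MAX_ACTIVE_CONN=256

# Then, in the [Service] section of the unit:
# EnvironmentFile=/etc/yams/yams.env
```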
Performance Tuning¶
Auto-Tuning (Daemon)¶
YAMS includes centralized TuneAdvisor and ResourceTuner for adaptive behavior:
- Metrics: CPU utilization (/proc deltas), memory (PSS/RSS), connection counts
- Worker pools: Shrink when idle, grow under pressure
- Auto-embed policy: Idle (default), Never, Always
- Idle: embed only when the daemon is idle; otherwise defer to background repair
- Never: always defer to yams repair --embeddings
- Always: embed immediately (increases ingest CPU)
Embedding Batch Controls¶
Adaptive DynamicBatcher with conservative defaults:
- Safety factor: 0.90 (reserves token budget headroom)
- Inter-batch pause: 0 ms (add delay to reduce CPU spikes)
- Advisory doc cap: unset (when set, limits documents per batch)
Tuning Profiles¶
Profiles bundle multiple tuning heuristics:
| Profile | Use Case | Behavior |
|---|---|---|
| efficient | Resource-constrained hosts | Slower pool growth, higher thresholds |
| balanced | General purpose (default) | Moderate ramp-up |
| aggressive | High throughput under load | Faster growth, lower thresholds |
Set profile:
yams config set tuning.profile aggressive
yams daemon restart
Or via environment:
export YAMS_TUNING_PROFILE=aggressive
Quick Wins¶
- Storage: NVMe SSD, ext4/xfs with relatime
- Concurrency: ~physical core count for workers
- Compression: zstd (balanced), LZMA (cold archives)
- Chunking: 8-16 KiB (text), 32-64 KiB (binaries)
- SQLite: WAL enabled, synchronous=NORMAL (bulk ingest), FULL (steady-state)
- FTS5: Index only queried fields, run optimize after bulk ingest
- OS limits: nofile >= 16384
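WAL mode can be verified (or enabled) from the shell. Note the asymmetry: journal_mode=WAL persists in the database file, while PRAGMA synchronous is per-connection, so it must be set by YAMS itself rather than from an external shell. A sketch, assuming the metadata database is named yams.db under the data directory (the file name is an assumption):

```shell
# Print (and persist, if not already set) the journal mode of the metadata DB.
# journal_mode is stored in the file; synchronous set here would have no
# lasting effect on the daemon's own connections.
DB="${YAMS_STORAGE:-/var/lib/yams}/yams.db"
if [ -f "$DB" ]; then
  sqlite3 "$DB" 'PRAGMA journal_mode=WAL;'   # prints "wal" when enabled
fi
```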
Workload-Specific¶
Ingest-heavy:
- Batch files (100-1000 docs)
- Lower fsync cost (synchronous=NORMAL)
- Defer FTS optimize until after ingest
Query-heavy:
- Higher fsync levels (synchronous=FULL)
- Increase caches (OS + DB)
- Optimize query shapes (prefix/suffix, filters)
Mixed:
- Schedule maintenance during off-peak
- Conservative concurrency; throttle ingest during peak queries
Configuration Reference¶
See include/yams/config/config_defaults.h for available keys.
| Area | Config Key | Effect | When to Adjust |
|---|---|---|---|
| Worker pools | [performance].num_worker_threads | CPU parallelism | Increase for bursts, decrease on constrained hosts |
| Request backlog | [performance].max_concurrent_operations | Task limit | Lower for latency-sensitive, raise for long-running ingest |
| Chunking | [chunking].min/max/average_chunk_size | Dedupe granularity | Smaller for text diffs, larger for binaries |
| Compression | [compression].algorithm, zstd_level | CPU vs storage | Lower levels for real-time, raise for archival |
| Embeddings | [embeddings].auto_on_add, batch_size | Embedding cost | Disable auto for cold ingest, trim batch if memory tight |
| Vector index | [vector_database].index_type, num_partitions | Recall vs latency | Start with defaults, tune after baseline data |
| WAL | [wal].sync_interval, enable_group_commit | Fsync cadence | Relax during bulk, tighten for steady-state |
Vector Search Tuning¶
Embeddings:
- Precompute offline to reduce ingest latency
- Normalize vectors if the metric requires it (cosine)
Index parameters:
- HNSW: tune M/efConstruction (build), efSearch (query)
- IVF/PQ: select nlist/nprobe, quantization settings
Hybrid search:
- Filter with keyword search first to reduce vector candidates
Dimension Changes¶
When switching embedding models, align vector schema:
Convergence order:
1. Config: embeddings.embedding_dim (preferred)
2. Sentinel: $YAMS_STORAGE/vectors_sentinel.json
3. Model: reported dimensions (fallback)
Recommended workflow:
yams model download <name> --apply-config # Updates config + recreates vectors.db
yams repair --embeddings # Regenerate vectors
Or use doctor:
yams doctor # Accept "Recreate vectors.db" → "Restart daemon"
yams repair --embeddings
Maintenance¶
- FTS optimize: After bulk ingest
- WAL checkpoint: During low traffic
- Integrity checks: Periodic verification on sample sets
- Backups: Snapshot entire data directory at consistent points
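A crontab sketch for the cadence above. The database file name and the backup script path are assumptions, and YAMS may expose its own maintenance commands; the checkpoint shown uses SQLite's standard wal_checkpoint pragma:

```
# m  h  dom mon dow  command
0  3   *   *   *    sqlite3 /var/lib/yams/yams.db 'PRAGMA wal_checkpoint(TRUNCATE);'  # WAL checkpoint off-peak
30 3   *   *   0    /usr/local/bin/yams-backup.sh                                     # weekly data-directory snapshot
```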
Monitoring¶
Observability:
yams status --json # Check pool sizes, connections
yams stats -v # Detailed metrics
Health checks:
- Implement periodic read/search probes
- Alert on failures, slow queries, error rates
Key metrics:
- Throughput: docs/s, bytes/s
- Latency: p50/p95/p99
- Resources: CPU, RSS, IO wait, disk latency (p99)
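A minimal probe sketch: it only checks that a status command answers within a timeout and returns output. Field-level checks depend on the JSON schema of yams status --json, which is not assumed here:

```shell
#!/usr/bin/env bash
# check_yams: run a status command with a timeout and fail loudly if it
# hangs, errors, or returns nothing. Usage: check_yams yams status --json
check_yams() {
  local out
  if ! out="$(timeout 10 "$@" 2>/dev/null)" || [ -z "$out" ]; then
    echo "CRITICAL: status probe failed: $*"
    return 2
  fi
  echo "OK: $* (${#out} bytes)"
}
```

Wire the function into cron or a systemd timer and route the non-zero exit code to your alerting.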
See src/daemon/components/DaemonMetrics.cpp for implementation.
Security¶
- Run as non-privileged user
- Restrict data directory permissions (0750)
- Avoid secrets in command lines; use environment or secret managers
- For network endpoints: bind to private interface, use reverse proxy, enforce TLS upstream
Backups¶
- Snapshot entire data directory
- Quiesce writes during backup (low-traffic windows)
- Store backups off-site with integrity checks
- Test restore procedures periodically
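The points above can be sketched as a small script. Paths are illustrative, and quiescing writes (e.g., pausing ingest during the snapshot) is left to the operator:

```shell
#!/usr/bin/env bash
set -euo pipefail

# backup_yams: archive the whole data directory at a consistent point and
# record a checksum for off-site integrity verification.
# Usage: backup_yams /var/lib/yams /srv/backups
backup_yams() {
  local data_dir="$1" backup_dir="$2" stamp archive
  stamp="$(date +%Y%m%dT%H%M%S)"
  archive="$backup_dir/yams-$stamp.tar.gz"
  mkdir -p "$backup_dir"
  # Snapshot content, indexes, and metadata together so they stay consistent
  tar -czf "$archive" -C "$(dirname "$data_dir")" "$(basename "$data_dir")"
  sha256sum "$archive" > "$archive.sha256"
  sha256sum -c "$archive.sha256" >/dev/null   # abort on a corrupt archive
  echo "$archive"
}
```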
Troubleshooting¶
Slow ingest:
- Reduce fsync cost (synchronous=NORMAL)
- Increase batch and transaction size
- Check disk latency (NVMe performance)
Slow queries:
- Run FTS optimize
- Check cache sizing (keep the hot set in memory)
- For vector search: increase search params or pre-filtering
WAL growth:
- Schedule checkpoints
- Ensure consumers aren't holding readers open
Memory pressure:
- Reduce caches and concurrency
- Lower embedding batch safety factor; increase inter-batch pauses
- Use the Idle auto-embed policy
Upgrades¶
- Pin container tags or package versions
- Backup before upgrades
- Review release notes for migrations
- Validate in staging first
References:
- User Guide: CLI
- Developer: Build System
- Architecture: Daemon