Architecture: Plugin and Content Extraction¶
This overview reflects the current implementation of the plugin stack and content extraction pipeline. Every item below maps directly to production code.
For the user-facing plugin guide (writing and configuring plugins), see Plugins.
Manifests and Capability Negotiation¶
Source: src/daemon/resource/plugin_loader.cpp
PluginLoader::loadPluginopens a shared object, probes exported functions such asgetProviderName,createOnnxProvider, andcreateGraphAdapterV1, and records which interfaces the plugin implements.- Manifests declared in
docs/spec/plugin_metadata.schema.jsondescribe versioning and required capabilities; loader logic validates the manifest against this schema before registering providers. - Graph adapter factories are tracked in an in-process registry via
registerGraphAdapter/createGraphAdapterso knowledge-graph integrations can be selected at runtime.
Trust and Admission¶
Source: src/daemon/resource/plugin_host.cpp
PluginHost::trust/untrustmaintain a canonicalized allow-list persisted alongside the daemon data directory.PluginHost::enumerateTrustedfilters discovered libraries against that list, whichServiceManager::autoloadPluginsNowuses to decide which plugins can be materialized.
Native ABI Loader¶
Source: src/daemon/resource/abi_plugin_loader.cpp
- Resolves ABI entry points, maps them to C++ adapter classes, and converts plugin callbacks into daemon-facing interfaces (storage, model providers, etc.).
- Handles symbol version mismatches and throws structured errors that bubble back to the caller.
Content Extraction Pipeline¶
Content extraction is driven by plugins implementing the content_extractor_v1 interface. The daemon coordinates extraction during ingestion:
- Plugin discovery: The daemon scans for extractors via
PluginLoaderat startup, matching file types to capable plugins. - Extraction dispatch: During
add/ingest, thePostIngestQueuedispatches documents to registered extractors based on MIME type (e.g., PDF extractor, tree-sitter symbol extractor). - Symbol extraction: The tree-sitter plugin (
plugin-symbols) parses source files and emits function/class/import definitions into the knowledge graph, enabling symbol-aware search. - PDF extraction: The PDF extractor plugin uses qpdf/poppler to extract text content, which is then indexed for full-text and vector search.
Plugin Interfaces¶
| Interface | Purpose | Entry points |
|---|---|---|
| Model provider | Embedding generation (ONNX) | createOnnxProvider, getProviderName |
| Graph adapter | Knowledge graph backend | createGraphAdapterV1 |
| Content extractor | Text/symbol extraction | content_extractor_v1 ABI |
| Search provider | Plugin-contributed search results | search_provider_v1 ABI |
| Storage plugin | Object storage backends (S3/R2) | See Storage Plugin v1 |