# enrich

```
fialr enrich <target> [options]
```

Extract text from files and run local LLM inference via Ollama to generate structured metadata: filename tokens, semantic tags, a one-sentence summary, and a confidence score. Tier 1 files are always skipped.
## Arguments

| Argument | Description |
|---|---|
| `target` | Directory to enrich (required) |
## Options

| Option | Description |
|---|---|
| `--execute` | Apply enrichment metadata (not just report) |
| `--jobs-dir PATH` | Directory for job artifacts (default: `.fialr/jobs`) |
| `--sensitivity-rules PATH` | Path to `sensitivity.yaml` (default: `config/sensitivity.yaml`) |
## Prerequisites

Enrichment requires:

- Ollama running locally at `http://localhost:11434` (configurable in `fialr.toml`)
- A pulled model (default: `llama3.2`, configurable under `[enrichment].model`)
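The settings above live in `fialr.toml`. A minimal sketch of that section, assuming the two key names this page names (`model` and `confidence_threshold`); the key for changing the Ollama endpoint is not shown here:

```toml
[enrichment]
# Model must already be pulled in Ollama, e.g.: ollama pull llama3.2
model = "llama3.2"
# Results scoring below this go to the review queue (see "Confidence routing")
confidence_threshold = 0.7
```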
## What it does

### Tier restrictions

Sensitivity tiers gate access to the enrichment pipeline:
| Tier | Access |
|---|---|
| 1 (RESTRICTED) | Never enters the pipeline. This is enforced, not advisory. |
| 2 (SENSITIVE) | Local LLM on extracted text only. No raw file content sent to inference. |
| 3 (INTERNAL) | Full local enrichment. |
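The gate in the table above can be sketched in Python. The function name and return values are illustrative, not fialr's actual API; the point is that tier 1 is rejected before any extraction or inference runs:

```python
def enrichment_access(tier: int) -> str:
    """Map a sensitivity tier to its allowed level of enrichment.

    Tier 1 is hard-blocked up front: this is an enforcement point,
    not an advisory warning.
    """
    if tier == 1:   # RESTRICTED: never enters the pipeline
        return "blocked"
    if tier == 2:   # SENSITIVE: local LLM sees extracted text only
        return "extracted-text-only"
    if tier == 3:   # INTERNAL: full local enrichment
        return "full"
    raise ValueError(f"unknown sensitivity tier: {tier}")
```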
### Text extraction

fialr extracts text from files using format-specific tools:
| Format | Extraction method |
|---|---|
| Scanned PDF | ocrmypdf + Tesseract OCR |
| Native PDF | pypdfium2 |
| Photos | piexif (EXIF metadata) |
| Audio | mutagen (ID3 tags) |
| Office documents | python-docx, openpyxl |
### Inference

Extracted text is sent to Ollama running on localhost. The inference layer is abstracted behind `inference.py`; cloud endpoints cannot be configured. The LLM returns structured JSON:
- Date — document subject date
- Entity — primary subject or organization
- Descriptor — semantic description
- Tags — semantic tags
- Summary — one-sentence summary
- Confidence — 0.0 to 1.0 score
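A sketch of validating that response shape before trusting it downstream. The field names follow the list above (lowercased JSON keys are an assumption), and the function is illustrative rather than fialr's actual parser:

```python
import json

# Fields the structured reply is expected to carry, per the list above.
REQUIRED_FIELDS = {"date", "entity", "descriptor", "tags", "summary", "confidence"}

def parse_enrichment(raw: str) -> dict:
    """Parse the model's JSON reply and sanity-check the expected contract."""
    data = json.loads(raw)
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"model reply missing fields: {sorted(missing)}")
    score = float(data["confidence"])
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence out of range: {score}")
    return data
```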
### Confidence routing

The confidence threshold (default: 0.7, configurable in `fialr.toml` under `[enrichment].confidence_threshold`) determines what happens with inference results:
- Above threshold — metadata is auto-applied to XATTRs and SQLite
- Below threshold — file is sent to the review queue with the LLM suggestion attached as a hint for manual review
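The routing rule above reduces to a single comparison. A sketch with the default threshold; the function name and return values are illustrative, and the tie-breaking behavior exactly at the threshold is an assumption (this page only says "above" and "below"):

```python
def route(confidence: float, threshold: float = 0.7) -> str:
    """Auto-apply metadata above the threshold; otherwise queue for review.

    Scores exactly at the threshold are treated as passing here,
    which is an assumption about fialr's behavior.
    """
    return "apply" if confidence >= threshold else "review"
```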
## Output

Dry-run:

```
fialr enrich ~/Documents
────────────────────────────────────────────────────────
enriched 623   review 89   skipped 135 (tier 1: 12, no text: 123)   errors 0
```

## Examples
```
# Dry-run enrichment
fialr enrich ~/Documents

# Apply enrichment metadata
fialr enrich ~/Documents --execute
```

## See also
- Enrichment guide — walkthrough of the enrichment process
- Sensitivity Tiers — how tiers control enrichment access
- classify — check sensitivity tiers before enrichment