Module Map
Every module in fialr has a defined location and a single responsibility. The module map below is the source of truth for the codebase structure.
Module tree
```
fialr/
├── __main__.py          # python -m fialr support
├── cli.py               # unified CLI entry point
├── core/
│   ├── inventory.py     # filesystem traversal, manifest generation
│   ├── classifier.py    # sensitivity tiering, category suggestion
│   ├── planner.py       # dry-run move plan generator
│   ├── executor.py      # approved plan execution engine
│   ├── deduplicate.py   # hash-based and near-duplicate detection
│   ├── rename.py        # template-driven naming engine
│   ├── organize.py      # schema-driven reorganization
│   └── validate.py      # integrity verification against manifests
├── enrichment/
│   ├── inference.py     # abstracted local inference interface (Ollama)
│   ├── extractor.py     # text extraction: OCR, PDF, EXIF, ID3
│   └── enrich.py        # enrichment orchestrator
├── metadata/
│   ├── xattr.py         # extended attribute read/write (platform-aware)
│   ├── db.py            # SQLite operations and schema
│   └── export.py        # on-demand sidecar export (JSON / YAML)
├── platform/
│   ├── base.py          # platform adapter interface
│   ├── macos.py         # iCloud sync, APFS vault, com.fialr.* XATTRs
│   ├── linux.py         # user.fialr.* XATTRs, VeraCrypt default vault
│   └── windows.py       # NTFS ADS, VeraCrypt default vault
├── reporting/
│   ├── reports.py       # job summary generation
│   └── exporters.py     # output format adapters (JSON, Markdown, CSV)
├── plugins/
│   └── base.py          # plugin interface and hook registry
└── utils/
    ├── hashing.py       # BLAKE3 primary, SHA256 secondary
    ├── logging.py       # structured job logging
    ├── config.py        # TOML config loader and validator
    ├── output.py        # ANSI-colored CLI output (Bronze/Ash palette)
    └── help.py          # custom CLI help renderer (gh/cargo-style)
```
Module descriptions
cli.py
Unified CLI entry point. Parses arguments via argparse but renders help through a custom HelpRenderer that bypasses argparse’s default display. All user-facing output goes through the Output class to stderr. Machine-readable data goes to stdout.
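The stream split described above can be sketched as follows. This is a minimal illustration, not fialr's actual code: the `Output.info` method, the `main` signature, and the JSON payload are assumptions, and the one-line usage string stands in for whatever the custom help renderer produces.

```python
import argparse
import json
import sys


class Output:
    """Hypothetical stand-in for the Output class: human-facing text goes to stderr."""

    def __init__(self, stream=None):
        self.stream = stream if stream is not None else sys.stderr

    def info(self, message: str) -> None:
        print(message, file=self.stream)


def build_parser() -> argparse.ArgumentParser:
    # add_help=False removes argparse's built-in -h/--help action so that
    # help display can be handled by custom code instead.
    parser = argparse.ArgumentParser(prog="fialr", add_help=False)
    parser.add_argument("-h", "--help", action="store_true")
    parser.add_argument("command", nargs="?")
    return parser


def main(argv: list[str], out: Output, data_stream=None) -> int:
    data_stream = data_stream if data_stream is not None else sys.stdout
    args = build_parser().parse_args(argv)
    if args.help or args.command is None:
        # Placeholder for a custom help renderer; still a human-facing message.
        out.info("usage: fialr <command>")
        return 0
    out.info(f"running {args.command}")
    # Machine-readable result goes to the data stream (stdout by default).
    json.dump({"command": args.command, "status": "ok"}, data_stream)
    data_stream.write("\n")
    return 0
```

Keeping stdout free of status messages means the data stream can be piped into another tool without filtering.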
core/
The pipeline modules. Each handles one phase of the workflow.
| Module | Responsibility |
|---|---|
| inventory.py | Traverse a directory, hash every file (BLAKE3 + SHA256), detect MIME types, apply the four-layer exclusion system, produce a manifest.json. Read-only. |
| classifier.py | Apply sensitivity rules to assign tiers (1/2/3) and suggest categories. Uses structural signals only; never reads file content for Tier 1. |
| planner.py | Read classifier output and schema, produce plan.csv with source/destination paths, proposed names, operation types, and conflict flags. Read-only. |
| executor.py | Execute an approved plan. Pre-move hash verification, move/rename, post-move hash verification, checkpoint after every N operations. Refuses to run without reviewed=true. |
| deduplicate.py | Group files by BLAKE3 hash. Select canonical copy per retention strategy. Move non-canonical copies to _dupes/. No deletions. |
| rename.py | Apply the naming convention (YYYY-MM-DD_[entity]_[descriptor]_[version].[ext]). Derive tokens from metadata, reject generic names. |
| organize.py | Schema-driven reorganization. Maps categories to directory paths using schema.yaml. |
| validate.py | Post-execution verification: check paths exist, hashes match, XATTRs are correct. |
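The verify, move, verify sequence listed for executor.py might look roughly like the sketch below. It is an illustration under stated assumptions: the function names and signature are hypothetical, and SHA256 from the standard library stands in for the BLAKE3 primary hash so that the example needs no third-party package.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def verified_move(source: Path, destination: Path, expected_hash: str, *, reviewed: bool) -> str:
    """Hypothetical helper: move one file with hash checks before and after."""
    if not reviewed:
        # Mirrors the rule that an unreviewed plan must not run.
        raise PermissionError("plan has not been marked reviewed")
    if sha256_of(source) != expected_hash:
        raise ValueError(f"pre-move hash mismatch: {source}")
    if destination.exists():
        raise FileExistsError(f"destination already exists: {destination}")
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(destination))
    actual = sha256_of(destination)
    if actual != expected_hash:
        raise ValueError(f"post-move hash mismatch: {destination}")
    return actual
```

Refusing to overwrite an existing destination is a choice made for this sketch; the source only says that the planner records conflict flags.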
enrichment/
Local-only AI enrichment pipeline. Tier 1 files never enter this pipeline.
| Module | Responsibility |
|---|---|
| inference.py | Abstracted local inference interface. Calls Ollama on localhost. Returns structured JSON (date, entity, descriptor, tags, summary, confidence). Cloud endpoints cannot be configured. |
| extractor.py | Text extraction from multiple formats: ocrmypdf + Tesseract for scanned PDFs, pypdfium2 for native PDFs, piexif for EXIF, mutagen for audio, python-docx and openpyxl for Office documents. |
| enrich.py | Orchestrator. Routes files through extraction and inference. Enforces tier restrictions. Routes results above/below confidence threshold to auto-apply or review queue. |
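The tier check and confidence routing attributed to enrich.py could be sketched like this. The `EnrichmentResult` dataclass, the `route` function, the 0.8 default threshold, and the choice to send results exactly at the threshold to auto-apply are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EnrichmentResult:
    """Hypothetical record for one inference result."""
    path: str
    tier: int
    confidence: float
    fields: dict = field(default_factory=dict)


def route(results: list[EnrichmentResult], threshold: float = 0.8):
    """Split results into (auto_apply, review_queue) by confidence."""
    auto_apply: list[EnrichmentResult] = []
    review_queue: list[EnrichmentResult] = []
    for result in results:
        if result.tier == 1:
            # Tier 1 files must never reach the enrichment pipeline.
            raise ValueError(f"Tier 1 file in enrichment pipeline: {result.path}")
        if result.confidence >= threshold:
            auto_apply.append(result)
        else:
            review_queue.append(result)
    return auto_apply, review_queue
```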
metadata/
Data storage and export.
| Module | Responsibility |
|---|---|
| db.py | SQLite operations. Schema creation, file records, path tracking, operations ledger, duplicate groups, review queue. |
| xattr.py | Extended attribute read/write. Platform-aware key prefixes (com.fialr.* on macOS, user.fialr.* on Linux). Degrades gracefully on unsupported filesystems. |
| export.py | On-demand sidecar file generation. JSON and YAML formats. Single-file and batch export. |
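To make the file records, path tracking, and operations ledger in db.py more concrete, here is a small sketch using the standard-library sqlite3 module. The table names, columns, and helper functions are hypothetical; the real schema is not shown in this document.

```python
import sqlite3

# Hypothetical two-table schema: one row per file, one row per operation.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    id INTEGER PRIMARY KEY,
    blake3 TEXT NOT NULL,
    current_path TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS operations (
    id INTEGER PRIMARY KEY,
    file_id INTEGER NOT NULL REFERENCES files(id),
    op_type TEXT NOT NULL,
    source_path TEXT NOT NULL,
    dest_path TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_file(conn: sqlite3.Connection, file_hash: str, path: str) -> int:
    with conn:
        cursor = conn.execute(
            "INSERT INTO files (blake3, current_path) VALUES (?, ?)",
            (file_hash, path),
        )
    return cursor.lastrowid


def record_move(conn: sqlite3.Connection, file_id: int, dest: str) -> None:
    # "with conn" commits both statements together or rolls both back.
    with conn:
        row = conn.execute(
            "SELECT current_path FROM files WHERE id = ?", (file_id,)
        ).fetchone()
        if row is None:
            raise KeyError(file_id)
        conn.execute(
            "INSERT INTO operations (file_id, op_type, source_path, dest_path) "
            "VALUES (?, 'move', ?, ?)",
            (file_id, row[0], dest),
        )
        conn.execute(
            "UPDATE files SET current_path = ? WHERE id = ?", (dest, file_id)
        )
```

Writing the ledger row and the path update in one transaction keeps the two tables from disagreeing if a job is interrupted.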
platform/
Platform adapters. Core modules import from base.py and never contain if sys.platform.
| Module | Responsibility |
|---|---|
| base.py | Abstract adapter interface. Runtime adapter selection via get_adapter(). |
| macos.py | iCloud sync detection and pause (brctl), APFS encrypted sparse bundle vault, com.fialr.* XATTR namespace. |
| linux.py | user.fialr.* XATTR namespace via pyxattr, VeraCrypt as default vault. |
| windows.py | NTFS Alternate Data Streams (limited XATTR support), VeraCrypt as default vault. |
reporting/
Job output generation.
| Module | Responsibility |
|---|---|
| reports.py | Generate human-readable job summaries (report.md) from job logs. |
| exporters.py | Output format adapters for JSON, Markdown, and CSV. |
plugins/
| Module | Responsibility |
|---|---|
| base.py | Plugin interface using Protocol (structural typing). Hook registry for extending fialr behavior. |
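Structural typing with Protocol means a plugin only has to have the right shape; it does not inherit from a base class. The sketch below shows one way that could work together with a hook registry. The `Plugin` members, the `HookRegistry` API, and the `post_classify` event name are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Protocol, runtime_checkable


class HookRegistry:
    """Hypothetical registry mapping event names to callbacks."""

    def __init__(self) -> None:
        self._hooks: dict[str, list[Callable[[dict], Any]]] = defaultdict(list)

    def add(self, event: str, callback: Callable[[dict], Any]) -> None:
        self._hooks[event].append(callback)

    def emit(self, event: str, payload: dict) -> list[Any]:
        # Call every callback registered for the event, in registration order.
        return [callback(payload) for callback in self._hooks.get(event, [])]


@runtime_checkable
class Plugin(Protocol):
    name: str

    def register(self, hooks: HookRegistry) -> None: ...


class TagPlugin:
    """Satisfies Plugin by shape alone; note there is no base class."""

    name = "tagger"

    def register(self, hooks: HookRegistry) -> None:
        hooks.add("post_classify", self.on_classify)

    def on_classify(self, payload: dict) -> dict:
        return {**payload, "tags": ["example"]}
```

Because the protocol is marked `runtime_checkable`, a loader can use `isinstance(obj, Plugin)` to reject objects that lack `name` or `register` before calling them.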
utils/
Shared infrastructure.
| Module | Responsibility |
|---|---|
| hashing.py | BLAKE3 (primary, canonical) and SHA256 (secondary, cross-tool verification) hash computation. xxhash is explicitly excluded because it is not a cryptographic hash. |
| logging.py | Structured JSON logging for jobs. JobLogger writes to log.json in the job directory. Debug output is suppressed by default and enabled with --verbose. |
| config.py | TOML config loader and validator. Reads fialr.toml, provides nested key access. |
| output.py | ANSI-colored CLI output using the Bronze/Ash brand palette. Writes to stderr. Respects NO_COLOR, FORCE_COLOR, and TTY detection. |
| help.py | Custom help renderer. Grouped, aligned, color-aware output following gh/cargo/stripe conventions. |
Configuration files
```
config/
  fialr.toml          # primary runtime configuration
  schema.yaml         # versioned directory schema
  sensitivity.yaml    # tier classification rules and patterns
```
Job directory structure
Every operation creates a job directory under .fialr/jobs/:
```
.fialr/jobs/{YYYY-MM-DD}_{job-name}_{uuid}/
  manifest.json       # pre-execution file state snapshot
  plan.csv            # proposed operations
  log.json            # append-only structured operation log
  report.md           # human-readable job summary
  checkpoint.json     # last completed operation index for resume
```
Tech stack
| Component | Package | Purpose |
|---|---|---|
| Runtime | Python 3.11+ | — |
| Primary hash | blake3 | Canonical file identity |
| Secondary hash | hashlib (stdlib) | SHA256 for cross-tool verification |
| Database | sqlite3 (stdlib) | Metadata ledger and audit log |
| Inference | Ollama | Local LLM (localhost only) |
| OCR | ocrmypdf + Tesseract | Scanned PDF text extraction |
| PDF | pypdfium2 | Native PDF text extraction |
| EXIF | piexif | Photo metadata |
| Audio | mutagen | Audio metadata (ID3 tags) |
| MIME | python-magic | File type detection |
| Office | python-docx, openpyxl | Word and Excel extraction |
| Config | tomllib (stdlib) | TOML parsing |
| Schema/rules | pyyaml | YAML parsing |
| Paths | pathlib (stdlib) | All path operations |
Dependency philosophy
stdlib first. Before adding a dependency, check if the standard library can do the job. Every new dependency must be logged with rationale.
See also
- Architecture Overview: design principles and platform layer
- Directory Schema: the versioned schema system