deduplicate

fialr dedup <target> [options]

Identify duplicate files by BLAKE3 content hash. In dry-run mode (default), report duplicate groups. With --execute, move non-canonical copies to _dupes/. No files are ever deleted.


Argument             Description
target               Directory to deduplicate (required)

Option               Description
--execute            Move duplicates to _dupes/ (not just report)
--strategy STRATEGY  Canonical selection strategy (default: shortest-path)
--jobs-dir PATH      Directory for job artifacts (default: .fialr/jobs)

dedup scans the target directory, groups files by BLAKE3 content hash, and identifies groups with more than one member. For each group, it selects one canonical copy and marks the rest as non-canonical.

The --strategy flag controls how the canonical copy is selected:

Strategy        Selection rule
shortest-path   File with the shortest path is canonical (default)
oldest-mtime    File with the oldest modification time is canonical
newest-mtime    File with the newest modification time is canonical

When --execute is passed, non-canonical copies are moved to _dupes/ inside the target directory. The _dupes/ directory is a staging area for review. fialr never deletes files. Purging duplicates from _dupes/ is a manual operation.

Each moved file retains full provenance in extended attributes (xattrs) and SQLite:

  • Original path (com.fialr.original_path)
  • Original name (com.fialr.original_name)
  • Content hash (com.fialr.hash)
  • Job UUID (com.fialr.job_uuid)
Additional safeguards:

  • Tier 1 files are never touched; they are flagged for manual review.
  • Pre-move and post-move hash verification is performed for every file moved.
  • All operations are logged to the append-only audit ledger.
  • Near-duplicate detection identifies version sequences (same stem, different versions) and reports them separately from exact duplicates.
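
One way to detect version sequences is to normalize the file stem before grouping. The pattern below is a hypothetical heuristic for illustration; fialr's actual matching rules are not documented here.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical pattern: strip a trailing version marker such as "_v2",
# " (3)", or "-final" from the stem before comparing.
_VERSION = re.compile(r"(?:[_\- ]v?\d+|\s*\(\d+\)|[_\- ]final)$", re.IGNORECASE)

def version_sequences(paths: list[Path]) -> dict[tuple[str, str], list[Path]]:
    """Group files whose stems differ only by a version suffix."""
    groups: dict[tuple[str, str], list[Path]] = defaultdict(list)
    for p in paths:
        base = _VERSION.sub("", p.stem)
        groups[(base.lower(), p.suffix.lower())].append(p)
    return {k: ps for k, ps in groups.items() if len(ps) > 1}
```

Because these files have different hashes, they are never moved automatically; they are only reported for review.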

Dry-run:

fialr dedup ~/Documents
────────────────────────────────────────────────────────
total 847
unique 801
groups 19
dupes 46
space 128.4 MB reclaimable

Execution:

fialr dedup ~/Documents --execute
────────────────────────────────────────────────────────
moved 46
skipped 0
errors 0

Examples:
# Find duplicates (dry-run)
fialr dedup ~/Documents
# Move duplicates to _dupes/
fialr dedup ~/Documents --execute
# Use oldest-mtime retention strategy
fialr dedup ~/Documents --execute --strategy oldest-mtime