Hash Cache

rsdedup maintains a persistent hash cache at ~/.rsdedup/cache.db to avoid rehashing files that haven’t changed.

How it works

The cache is a key-value store (using sled) where:

  • Key: absolute file path
  • Value: cached metadata and hash values

Each cache entry stores:

  • File size
  • Modification time (seconds + nanoseconds)
  • Inode number
  • Hash algorithm used
  • Partial hash (first 4KB)
  • Full file hash
  • Timestamp of when the entry was cached
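As an illustration of the entry layout listed above, here is a minimal Python sketch. It is not rsdedup's actual code or on-disk format: the `CacheEntry` and `make_entry` names, the field names, and the choice of SHA-256 as the hash algorithm are all assumptions made for the example.

```python
import hashlib
import os
import time
from dataclasses import dataclass

PARTIAL_BYTES = 4096  # the "first 4KB" covered by the partial hash


@dataclass(frozen=True)
class CacheEntry:
    """Hypothetical in-memory mirror of one cache value."""
    size: int
    mtime_secs: int
    mtime_nanos: int
    inode: int
    algorithm: str
    partial_hash: str
    full_hash: str
    cached_at: int  # Unix time at which the entry was created


def make_entry(path: str, algorithm: str = "sha256") -> CacheEntry:
    """Build an entry for `path`; the algorithm default is an assumption."""
    st = os.stat(path)
    with open(path, "rb") as f:
        data = f.read()  # acceptable for a sketch; a real tool would stream
    # Split the nanosecond mtime into whole seconds plus a remainder.
    secs, nanos = divmod(st.st_mtime_ns, 1_000_000_000)
    return CacheEntry(
        size=st.st_size,
        mtime_secs=secs,
        mtime_nanos=nanos,
        inode=st.st_ino,
        algorithm=algorithm,
        partial_hash=hashlib.new(algorithm, data[:PARTIAL_BYTES]).hexdigest(),
        full_hash=hashlib.new(algorithm, data).hexdigest(),
        cached_at=int(time.time()),
    )
```

In this sketch the cache itself would be a mapping keyed by absolute path, for example `cache[os.path.abspath(path)] = make_entry(path)`.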

Cache invalidation

A cached hash is considered valid only if all of the following still match the current file:

  • Size
  • Modification time (mtime)
  • Inode number

If any of these differ, the file is rehashed and the cache entry is updated.
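The validity rule above can be sketched in a few lines of Python. This is only an illustration of the comparison, not rsdedup's implementation; `Fingerprint`, `fingerprint`, and `is_cache_valid` are hypothetical names.

```python
import os
from typing import NamedTuple


class Fingerprint(NamedTuple):
    """The three attributes that must all match for a cache hit."""
    size: int
    mtime_ns: int  # seconds and nanoseconds combined into one integer
    inode: int


def fingerprint(path: str) -> Fingerprint:
    """Read the current size, mtime, and inode of `path`."""
    st = os.stat(path)
    return Fingerprint(st.st_size, st.st_mtime_ns, st.st_ino)


def is_cache_valid(cached: Fingerprint, path: str) -> bool:
    """True only if size, mtime, and inode are all unchanged."""
    return cached == fingerprint(path)
```

Checking the inode as well as size and mtime catches the case where a file is replaced by a different file of the same size with a preserved timestamp, for example after being restored from a backup.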

Cache operations

# Pre-populate the cache
rsdedup cache scan /path/to/directory

# View cache statistics
rsdedup cache stats

# Clear the cache
rsdedup cache clear

Disabling the cache

Use --no-cache to skip the cache entirely for a single run:

rsdedup dedup report --no-cache /path

This is useful for benchmarking or when you suspect cache corruption.

Cache location

The cache is stored at ~/.rsdedup/cache.db. The directory is created automatically on first use.

Incremental scanning

The rsdedup cache scan command is incremental. On repeated runs, only new files and files whose size, modification time, or inode number has changed are hashed; unchanged files are skipped. Both the partial hash (first 4KB) and the full hash are stored for every file.
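The incremental behaviour can be sketched as a directory walk that rehashes only files whose fingerprint is missing from the cache or no longer matches. This is a simplified Python illustration, not rsdedup's code: a plain dict stands in for the sled store, SHA-256 is an assumed algorithm, and `scan` and `hash_file` are hypothetical names.

```python
import hashlib
import os

PARTIAL_BYTES = 4096  # size of the prefix covered by the partial hash


def hash_file(path: str) -> tuple[str, str]:
    """Return (partial_hash, full_hash) as hex digests in one pass."""
    full = hashlib.sha256()
    with open(path, "rb") as f:
        head = f.read(PARTIAL_BYTES)
        partial = hashlib.sha256(head).hexdigest()
        full.update(head)
        # Stream the rest of the file in 1 MiB chunks.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            full.update(chunk)
    return partial, full.hexdigest()


def scan(root: str, cache: dict) -> int:
    """Update `cache` for every file under `root`; return how many were hashed."""
    hashed = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.abspath(os.path.join(dirpath, name))
            st = os.stat(path)
            fp = (st.st_size, st.st_mtime_ns, st.st_ino)
            entry = cache.get(path)
            if entry is not None and entry["fingerprint"] == fp:
                continue  # unchanged since the last scan: skip
            partial, full = hash_file(path)
            cache[path] = {"fingerprint": fp, "partial": partial, "full": full}
            hashed += 1
    return hashed
```

Running `scan` twice on an unchanged directory hashes every file on the first call and none on the second, which is the property that makes repeated scans cheap.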