Suggestions
Ideas for future improvements, organized by category. Completed items have been moved to suggestions-done.md.
Grades:
- Urgency: `high` (users need this), `medium` (nice to have), `low` (speculative/future)
- Complexity: `low` (hours), `medium` (days), `high` (weeks+)
Build Execution
Distributed builds
- Run builds across multiple machines, similar to distcc or icecream for C/C++.
- A coordinator node distributes work to worker nodes, each running rsconstruct in worker mode.
- Workers execute products and return outputs to the coordinator, which caches them locally.
- Challenges: network overhead for small products, identical tool versions across workers, local filesystem access.
- Urgency: low | Complexity: high
Sandboxed execution
- Run each processor in an isolated environment where it can only access its declared inputs.
- Prevents accidental undeclared dependencies.
- On Linux, namespaces can provide lightweight sandboxing.
- Urgency: low | Complexity: high
Content-addressable outputs (unchanged output pruning)
- Hash outputs too to skip downstream rebuilds when an input changes but produces identical output.
- Bazel calls this “unchanged output pruning.”
- Urgency: medium | Complexity: medium
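The pruning decision described above can be sketched as follows. This is a minimal illustration, not rsconstruct's implementation: `hash_bytes` and `downstream_needs_rebuild` are hypothetical names, and `DefaultHasher` stands in for whatever digest the real cache uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash the bytes of a product's output.
/// Assumption: DefaultHasher is only a stand-in; a real cache would use a
/// cryptographic digest.
fn hash_bytes(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// After an upstream product has re-run, decide whether its consumers must
/// rebuild. `previous` is the output hash recorded by the last build, if any.
fn downstream_needs_rebuild(previous: Option<u64>, new_output: &[u8]) -> bool {
    previous != Some(hash_bytes(new_output))
}
```

If an input changes (say, a comment edit) but the regenerated output is byte-identical, the hashes match and the downstream products can be skipped.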
Persistent daemon mode
- Keep rsconstruct running as a background daemon to avoid startup overhead.
- Benefits: instant file index via inotify, warm Lua VMs, connection pooling, faster incremental builds.
- The daemon listens on a Unix socket (`.rsconstruct/daemon.sock`). `rsconstruct watch` becomes a client that triggers rebuilds on file events.
- Urgency: low | Complexity: high
Persistent workers
- Keep long-running tool processes alive to avoid startup overhead.
- Instead of spawning `ruff` or `pylint` per invocation, keep one process alive and feed it files.
- Bazel gets a 2-4x speedup for Java this way. This could benefit pylint and mypy, which have heavy startup costs.
- Multiplex variant: multiple requests to a single worker process via threads.
- Urgency: medium | Complexity: high
Dynamic execution (race local vs remote)
- Start both local and remote execution of the same product; use whichever finishes first and cancel the other.
- Useful when remote cache is slow or flaky.
- Configurable per-processor via execution strategy.
- Urgency: low | Complexity: high
Execution strategies per processor
- Map each processor to an execution strategy: local, remote, sandboxed, or dynamic.
- Different processors may benefit from different execution models.
- Config: `[processor.ruff] execution = "remote"`, `[processor.cc_single_file] execution = "sandboxed"`.
- Urgency: low | Complexity: medium
Build profiles
- Named configuration sets for different build scenarios (ci, dev, release).
- Profiles inherit from base configuration and override specific values.
- Usage: `rsconstruct build --profile=ci`
- Urgency: medium | Complexity: medium
Conditional processors
- Enable or disable processors based on conditions (environment variables, file existence, git branch, custom commands).
- Multiple conditions can be combined with `all`/`any` logic.
- Urgency: low | Complexity: medium
Target aliases
- Define named groups of processors for easy invocation.
- Usage: `rsconstruct build @lint`, `rsconstruct build @test`
- Special aliases: `@all`, `@changed`, `@failed`
- File-based targeting: `rsconstruct build src/main.c`
- Urgency: medium | Complexity: medium
Graph & Query
Build graph query language
- Support queries like `rsconstruct query deps out/foo`, `rsconstruct query rdeps src/main.c`, `rsconstruct query processor:ruff`.
- Useful for debugging builds and CI systems that want to build only affected targets.
- Urgency: low | Complexity: medium
Affected analysis
- Given changed files (from `git diff`), determine which products are affected and only build those.
- Useful for large projects where a full build is expensive.
- Urgency: medium | Complexity: medium
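One way to compute the affected set is a breadth-first walk over reverse dependencies. The sketch below assumes a hypothetical `rdeps` map from each file or product to its direct consumers; how rsconstruct's graph actually exposes this is not specified here.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Return every product reachable from the changed files through reverse
/// dependency edges. `rdeps` (hypothetical) maps a file or product to the
/// products that consume it directly.
fn affected<'a>(
    rdeps: &HashMap<&'a str, Vec<&'a str>>,
    changed: &[&'a str],
) -> BTreeSet<&'a str> {
    let mut seen: BTreeSet<&'a str> = BTreeSet::new();
    let mut queue: VecDeque<&'a str> = changed.iter().copied().collect();
    while let Some(node) = queue.pop_front() {
        if let Some(consumers) = rdeps.get(node) {
            for &consumer in consumers {
                // Enqueue each consumer only the first time it is seen.
                if seen.insert(consumer) {
                    queue.push_back(consumer);
                }
            }
        }
    }
    seen
}
```

The changed paths would come from something like `git diff --name-only`; files that no product consumes simply yield an empty set.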
Critical path analysis
- Identify the longest sequential chain of actions in a build.
- Helps users optimize their slowest builds by showing what’s actually on the critical path.
- Display with `rsconstruct build --critical-path` or include in `--timings` output.
- Urgency: medium | Complexity: medium
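The critical path is the longest duration-weighted chain in the build DAG. A minimal sketch using memoized depth-first search follows; the `Action` struct and function names are hypothetical, and the graph is assumed to be acyclic with every dependency present as a key.

```rust
use std::collections::HashMap;

/// One build action: its own duration and the actions it must wait for.
struct Action {
    duration_ms: u64,
    deps: Vec<&'static str>,
}

/// Longest chain ending at `name`, as (total duration, chain of names).
/// Assumes an acyclic graph; panics if a dependency is missing from `graph`.
fn longest_chain(
    name: &'static str,
    graph: &HashMap<&'static str, Action>,
    memo: &mut HashMap<&'static str, (u64, Vec<&'static str>)>,
) -> (u64, Vec<&'static str>) {
    if let Some(hit) = memo.get(name) {
        return hit.clone();
    }
    let action = &graph[name];
    let mut best: (u64, Vec<&'static str>) = (0, Vec::new());
    for &dep in &action.deps {
        let candidate = longest_chain(dep, graph, memo);
        if candidate.0 > best.0 {
            best = candidate;
        }
    }
    best.0 += action.duration_ms;
    best.1.push(name);
    memo.insert(name, best.clone());
    best
}

/// The critical path is the longest chain ending at any action.
fn critical_path(graph: &HashMap<&'static str, Action>) -> (u64, Vec<&'static str>) {
    let mut memo = HashMap::new();
    let mut best: (u64, Vec<&'static str>) = (0, Vec::new());
    for &name in graph.keys() {
        let candidate = longest_chain(name, graph, &mut memo);
        if candidate.0 > best.0 {
            best = candidate;
        }
    }
    best
}
```

Durations would come from recorded product timings; shortening any action that is not on the returned chain does not shorten the build.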
Extensibility
Plugin registry
- A central repository of community-contributed Lua plugins.
- Install with `rsconstruct plugin install eslint`.
- Registry could be a GitHub repository with a JSON index.
- Version pinning in `rsconstruct.toml`.
- Urgency: low | Complexity: high
Project templates
- Initialize new projects with pre-configured processors and directory structure.
- `rsconstruct init --template=python`, `rsconstruct init --template=cpp`, etc.
- Custom templates from local directories or URLs.
- Urgency: low | Complexity: medium
Rule composition / aspects
- Attach cross-cutting behavior to all targets of a certain type (e.g., “add coverage analysis to every C++ compile”).
- Urgency: low | Complexity: high
Output groups / subtargets
- Named subsets of a target’s outputs that can be requested selectively.
- E.g., `rsconstruct build --output-group=debug` or per-product subtarget selection.
- Useful for targets that produce multiple output types (headers, binaries, docs).
- Urgency: low | Complexity: medium
Visibility / access control
- Restrict which processors can consume which files or directories.
- Prevents accidental cross-boundary dependencies in large repos.
- Config: per-processor `visibility` rules or directory-level `.rsconstruct-visibility` files.
- Urgency: low | Complexity: medium
Developer Experience
Build Event Protocol / structured event stream
- rsconstruct already has `--json` on stdout with JSON Lines events (`BuildEvent`, `ProductStart`, `ProductComplete`, `BuildSummary`) and `--trace` for Chrome trace format.
- A proper Build Event Protocol (file or gRPC stream) would enable external dashboards, CI integrations, and build analytics services beyond what JSON Lines provides.
- Write events to a file (`--build-event-log=events.pb`) or stream to a remote service.
- Richer event types: action graph, configuration, progress, test results.
- Urgency: medium | Complexity: medium
Build notifications
- Desktop notifications when builds complete, especially for long builds.
- Platform-specific: `notify-send` (Linux), `osascript` (macOS).
- Config: `notify = true`, `notify_on_success = false`.
- Urgency: low | Complexity: low
Parallel dependency analysis
- The cpp analyzer scans files sequentially, which can be slow for large codebases.
- Parallelize header scanning using rayon or tokio.
- Urgency: low | Complexity: medium
IDE / LSP integration
- Language Server Protocol server for IDE integration.
- Features: diagnostics, code actions, hover info, file decorations.
- Plugins for VS Code, Neovim, Emacs.
- Urgency: low | Complexity: high
Build log capture
- Save stdout/stderr from each product execution to a log file.
- Config: `log_dir = ".rsconstruct/logs"`, `log_retention = 10`.
- `rsconstruct log ruff:main.py` to view logs.
- Urgency: low | Complexity: medium
Build timing history
- Store timing data to `.rsconstruct/timings.json` after each build.
- `rsconstruct timings` shows slowest products, trends, time per processor.
- Urgency: low | Complexity: medium
Remote cache authentication
- S3 and HTTP/HTTPS remote caches are already supported.
- Still needed: explicit bearer token support, GCS backend, and environment variable substitution for secrets in config.
- Urgency: medium | Complexity: medium
rsconstruct lint — Run only checkers
- Convenience command to run only checker processors.
- Equivalent to `rsconstruct build -p ruff,pylint,...` but shorter.
- Urgency: low | Complexity: low
Watch mode keyboard commands
- During `rsconstruct watch`, support `r` (rebuild), `c` (clean), `q` (quit), `Enter` (rebuild now), `s` (status).
- Only activate when stdin is a TTY.
- Urgency: low | Complexity: medium
Layered config files
- Support config file layering: system (`/etc/rsconstruct/config.toml`), user (`~/.config/rsconstruct/config.toml`), project (`rsconstruct.toml`).
- Lower layers provide defaults, higher layers override.
- Per-command overrides via `[build]`, `[watch]` sections.
- Similar to Bazel’s `.bazelrc` layering.
- Urgency: low | Complexity: low
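The override rule for layered config can be illustrated with flat key/value maps. This is a simplified sketch: `merge_layers` is a hypothetical helper, and real TOML layering would also have to decide how nested tables and arrays merge.

```rust
use std::collections::BTreeMap;

/// Merge config layers given in precedence order (system first, project
/// last). Later layers override earlier ones key by key.
/// Simplification: flat string keys and values only.
fn merge_layers(layers: &[BTreeMap<String, String>]) -> BTreeMap<String, String> {
    let mut merged = BTreeMap::new();
    for layer in layers {
        for (key, value) in layer {
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}
```

Keys set only in a lower layer survive as defaults; a key set in several layers takes the value from the highest one.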
Test sharding
- Split large test targets across multiple parallel shards.
- Set `TEST_TOTAL_SHARDS` and `TEST_SHARD_INDEX` environment variables for test runners.
- Config: `shard_count = 4` per processor or product.
- Useful for pytest/doctest processors when added.
- Urgency: low | Complexity: medium
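On the test-runner side, sharding could look like the sketch below. The round-robin assignment and both function names are assumptions for illustration; only the two environment variable names come from the item above.

```rust
/// Read the shard assignment from the environment, as a test runner might.
/// Returns (shard_index, total_shards), or None if unset or unparsable.
fn shard_from_env() -> Option<(usize, usize)> {
    let total: usize = std::env::var("TEST_TOTAL_SHARDS").ok()?.parse().ok()?;
    let index: usize = std::env::var("TEST_SHARD_INDEX").ok()?.parse().ok()?;
    Some((index, total))
}

/// Pick the tests that belong to one shard using round-robin assignment.
/// The caller should pass tests in a stable (for example sorted) order so
/// that every shard computes the same partition.
fn tests_for_shard<'a>(
    tests: &[&'a str],
    shard_index: usize,
    total_shards: usize,
) -> Vec<&'a str> {
    assert!(total_shards > 0 && shard_index < total_shards);
    tests
        .iter()
        .copied()
        .enumerate()
        .filter(|(i, _)| i % total_shards == shard_index)
        .map(|(_, test)| test)
        .collect()
}
```

Every test lands in exactly one shard, so running all shards in parallel covers the full suite once.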
Runfiles / runtime dependency trees
- Track runtime dependencies (shared libs, config files, data files) separately from build dependencies.
- Generate a runfiles directory per executable with symlinks to all transitive runtime deps.
- Useful for deployment, packaging, and containerization.
- Urgency: low | Complexity: high
On-demand processors (build_by_default = false)
- Today every declared processor runs on every `rsconstruct build`. The only per-invocation escape hatches are `-x name` (remember every time) or `enabled = false` in the config (remember to flip back). Neither fits the “this processor exists, don’t run it unless I ask” use case, which is common for slow lifecycle processors like `python_package`, `docker_build`, `publish`, `release_tarball`.
- Add a per-processor boolean field defaulting to true: `build_by_default = false` on a processor means it’s discovered and classified like any other, but its products are filtered out of the default run.
- Prior art: meson’s `build_by_default: false`, Bazel’s `tags = ["manual"]`, buck2’s `tags = ["manual"]`. All use the same shape: declarative opt-out on the rule, per-invocation opt-in via target naming.
- CLI semantics map cleanly onto existing `-p`/`-x` machinery:
  - `rsconstruct build` → excludes `build_by_default = false` processors (new behaviour).
  - `rsconstruct build -p python_package` → includes only `python_package`; the explicit `-p` inclusion overrides the default-off flag.
  - `rsconstruct build -p ruff,python_package` → includes both, including the opt-in one.
  - `rsconstruct build --all` (new flag) → includes everything including on-demand processors. Useful for CI that wants to verify the opt-in path doesn’t bitrot.
- Example config:

      [processor.python_package]
      build_by_default = false
      src_dirs = ["."]

- Design considerations:
  - `@all` meta-shortcut: the existing `@checkers`/`@generators` aliases should continue to mean “all of that type, subject to the default-off filter.” Users who want “all checkers including on-demand ones” would say `rsconstruct build --all -p @checkers`, which is rare enough that the composition is fine.
  - Error on contradiction: `-p X -x X` already errors; `-p X` where X has `build_by_default = false` should just work (explicit opt-in wins over declarative opt-out).
  - Watch mode: `rsconstruct watch` should honour the same default: don’t rebuild the package processor on every file save. Users who want watch-mode packaging can add `-p python_package` to the watch invocation.
  - Discovery cost: on-demand processors still run discovery every build, because we need to know what their products would be (for output-conflict detection, graph completeness, and `--all` support). This is negligible: discovery is O(files matched), not O(cost of running).
- Follow-up idea: named goals (meson-style aggregated targets or npm-style scripts) for the “I want a lint goal / deploy goal / ci goal” pattern. That’s Pattern B, layered above per-processor config, and is not needed to solve the basic on-demand case.
- Urgency: medium | Complexity: low
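The selection rules proposed above for on-demand processors can be sketched as a single filter. The `Processor` struct and `selected` function are hypothetical, and the sketch models only explicit names passed to `-p`; alias expansion and `-x` are left out.

```rust
/// Minimal stand-in for a declared processor.
struct Processor {
    name: &'static str,
    build_by_default: bool,
}

/// Decide which processors run.
/// `include` holds names passed via `-p` (empty means no `-p` was given);
/// `all` models the proposed `--all` flag.
fn selected<'a>(processors: &'a [Processor], include: &[&str], all: bool) -> Vec<&'a str> {
    processors
        .iter()
        .filter(|p| {
            if !include.is_empty() {
                // Explicit opt-in wins over the declarative opt-out.
                include.contains(&p.name)
            } else {
                all || p.build_by_default
            }
        })
        .map(|p| p.name)
        .collect()
}
```

With no flags, a `build_by_default = false` processor is filtered out; naming it in `-p` or passing `--all` brings it back.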
Decomposed cache key for richer --explain
- Today every product has a single descriptor key that mixes input checksum + config hash + tool-version hash + variant. A miss tells us “the key changed” but not which component. `--explain` can only say `BUILD (no cache entry)` / `BUILD (output missing)`, not “your cflags changed” or “an input file changed”.
- Store the three sub-hashes (input, config, tool) in a new redb table keyed by stable product identity: `(processor_iname, primary_path)`, where `primary_path` is the first output for generators or the first input for checkers.
- Schema: `product_components: (processor, primary_path) -> { input_hash, config_hash, tool_hash, timestamp }`. ~100 bytes per product, so ~500KB extra disk for a 5000-product project.
- Reads only on `--explain`. `classify_products` already routes through `explain_descriptor`; extend that to look up the prior components row, recompute current components, diff the three, and return a richer reason like `BUILD (config changed: cflags, include_paths)`.
- Writes only when explicitly tracking. Two reasonable gates:
  - Option A (single flag): `--explain` enables both write and read. CI runs without `--explain` → zero overhead. Trade-off: the first explain run after enabling has no prior row → reports “no prior state” generically. Subsequent runs work fully.
  - Option B (separate `--track-changes` / `[build] track_changes = true`): decouples capture from query. CI omits the flag → zero overhead. Devs opt in permanently via config.
  - Lean Option A: fewer flags, the existing `--explain` carries both ends of the lifecycle, and CI/CD pays nothing by default since neither flag is set.
- Tier 1 only. Says “input bucket changed” but not which file. For a `.cc` file with 100 headers, the user still doesn’t know which header. A future Tier 2 (per-input-file checksums) would resolve that at ~5-10x storage cost; defer until users ask.
- Caveats: adds a third source of truth (alongside `descriptors` and the in-memory graph) to keep in sync. Stale entries (products dropped from config) accumulate harmlessly until `cache clear`.
- Urgency: medium | Complexity: medium
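The Tier 1 diff described above amounts to comparing three stored sub-hashes against freshly computed ones. The sketch below is illustrative only: the `Components` struct mirrors the proposed row minus the timestamp, and the `SKIP (unchanged)` and `BUILD (no prior state)` strings are hypothetical wording.

```rust
/// The three sub-hashes proposed for the `product_components` row.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Components {
    input_hash: u64,
    config_hash: u64,
    tool_hash: u64,
}

/// Compare the prior row (if any) with the current components and name the
/// buckets that changed.
fn explain(prior: Option<Components>, current: Components) -> String {
    let prior = match prior {
        Some(p) => p,
        None => return "BUILD (no prior state)".to_string(),
    };
    let mut changed = Vec::new();
    if prior.input_hash != current.input_hash {
        changed.push("input");
    }
    if prior.config_hash != current.config_hash {
        changed.push("config");
    }
    if prior.tool_hash != current.tool_hash {
        changed.push("tool");
    }
    if changed.is_empty() {
        "SKIP (unchanged)".to_string()
    } else {
        format!("BUILD ({} changed)", changed.join(", "))
    }
}
```

Naming the individual config fields or input files that changed would need finer-grained hashes than these three buckets, which is the Tier 2 trade-off noted above.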
Caching & Performance
Deferred materialization
- Don’t write cached outputs to disk until they’re actually needed by a downstream product.
- Urgency: low | Complexity: high
Garbage collection policy
- Time-based or size-based cache policies: “keep cache under 1GB” or “evict entries older than 30 days.”
- Config: `max_size = "1GB"`, `max_age = "30d"`, `gc_policy = "lru"`.
- `rsconstruct cache gc` for manual garbage collection.
- Urgency: low | Complexity: medium
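A size-based LRU policy could plan its evictions as in the sketch below. The `Entry` struct and `plan_eviction` are hypothetical; the real object store would supply sizes and last-use times from its own metadata.

```rust
/// Hypothetical view of one cache entry.
struct Entry {
    key: &'static str,
    size_bytes: u64,
    last_used_secs: u64, // seconds since some epoch; larger is more recent
}

/// Return the keys to evict, least recently used first, until the cache
/// fits within `max_size_bytes`.
fn plan_eviction(entries: &[Entry], max_size_bytes: u64) -> Vec<&'static str> {
    let mut total: u64 = entries.iter().map(|e| e.size_bytes).sum();
    let mut by_age: Vec<&Entry> = entries.iter().collect();
    by_age.sort_by_key(|e| e.last_used_secs);
    let mut evict = Vec::new();
    for entry in by_age {
        if total <= max_size_bytes {
            break;
        }
        total -= entry.size_bytes;
        evict.push(entry.key);
    }
    evict
}
```

Separating the plan from the deletion keeps the policy easy to test and would let a dry-run mode print what would be removed.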
Shared cache across branches
- Surface in `rsconstruct status` when products are restorable from another branch.
- Already works implicitly via input hash matching.
- Urgency: low | Complexity: low
Merkle tree input hashing
- Hash inputs as a Merkle tree rather than flat concatenation.
- More efficient for large input sets — changing one file only rehashes its branch, not all inputs.
- Also enables efficient transfer of input trees to remote execution workers.
- Urgency: low | Complexity: medium
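A minimal sketch of computing a Merkle root over per-file hashes follows. All names are hypothetical, and `DefaultHasher` is a non-cryptographic stand-in for whatever digest the cache really uses. The sketch recomputes the whole tree; the efficiency gain described above would come from caching the internal nodes so that only one branch is rehashed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash any hashable value. Assumption: DefaultHasher is a placeholder.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Leaf hash for one input file: its path together with its contents.
fn leaf(path: &str, contents: &[u8]) -> u64 {
    hash_of(&(path, contents))
}

/// Combine leaves pairwise, level by level, until one root remains.
/// An unpaired node at the end of a level is promoted unchanged.
fn merkle_root(leaves: &[u64]) -> u64 {
    if leaves.is_empty() {
        return 0;
    }
    let mut level: Vec<u64> = leaves.to_vec();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| {
                if pair.len() == 2 {
                    hash_of(&(pair[0], pair[1]))
                } else {
                    pair[0]
                }
            })
            .collect();
    }
    level[0]
}
```

Two input sets that share a subtree also share that subtree's hash, which is what would let a remote worker skip transferring files it already has.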
Reproducibility
Hermetic builds
- Control all inputs beyond tool versions: isolate env vars, control timestamps, sandbox network, pin system libraries.
- Config: `hermetic = true`, `allowed_env = ["HOME", "PATH"]`.
- Verification: `rsconstruct build --verify` builds twice and compares outputs.
- Urgency: low | Complexity: high
Determinism verification
- `rsconstruct build --verify` mode that builds each product twice and compares outputs.
- Urgency: low | Complexity: medium
CI & Reporting
CI config generator
- `rsconstruct ci generate` outputs a GitHub Actions or GitLab CI config that runs the build.
- Detects enabled processors and required tools, generates install steps and build commands.
- Supports `--format=github|gitlab|circleci`.
- Urgency: medium | Complexity: medium
HTML build report
- Generate a visual HTML dashboard of build times, cache hit rates, and processor statistics.
- `rsconstruct build --report=build.html` or `rsconstruct report`.
- Include charts for timing trends, per-processor breakdown, cache efficiency.
- Urgency: low | Complexity: medium
PR comment bot
- Post build results (pass/fail, timing, warnings) as a GitHub PR comment.
- `rsconstruct ci comment` reads build output and posts via GitHub API.
- Urgency: low | Complexity: medium
Content & Documentation
rsconstruct init --detect
- `rsconstruct smart auto` already scans and enables processors, but a dedicated `init --detect` could go further.
- Generate a complete `rsconstruct.toml` with processor-specific config (src_dirs, extensions, tool paths).
- Urgency: medium | Complexity: medium
rsconstruct fmt — Auto-format rsconstruct.toml
- Sort `[processor.*]` sections alphabetically, align values, remove redundant defaults.
- Urgency: low | Complexity: low
Cross-project term sync
- Automatically keep terms directories in sync across multiple repos.
- Could run as a daemon or a periodic CI job.
- `rsconstruct terms sync --repos=repo1,repo2` or config-driven.
- Urgency: low | Complexity: medium
Glossary generator
- `rsconstruct terms glossary` generates a markdown glossary from the terms directory.
- Optionally pulls definitions from context in the markdown files where terms are used.
- Urgency: low | Complexity: medium
Link checker processor
- Validate that URLs in markdown files are not broken (HTTP HEAD requests).
- Configurable timeout, retry, and allow/blocklist patterns.
- Cache results to avoid re-checking unchanged URLs.
- Urgency: medium | Complexity: medium
Image optimizer processor
- Compress and resize images referenced in markdown files.
- Uses tools like `optipng`, `jpegoptim`, `svgo`.
- Config: quality levels, max dimensions, output format.
- Urgency: low | Complexity: medium
HTML+JS compression and packaging
- Minify and bundle HTML, CSS, and JavaScript files for deployment.
- Could use tools like `terser` (JS), `csso` (CSS), `html-minifier` (HTML).
- Bundle multiple JS/CSS files into single outputs, generate source maps.
- Integrate with existing eslint/stylelint processors for a full web frontend pipeline.
- Urgency: medium | Complexity: medium
Processor Ecosystem
WASM processor plugins
- Beyond Lua, allow processors written in any language compiled to WebAssembly.
- Provides sandboxing, portability, and language flexibility.
- WASI for filesystem access within the sandbox.
- Urgency: low | Complexity: high
Processor marketplace / registry
- A central repository of community-contributed processor configs and Lua plugins.
- Install with `rsconstruct plugin install prettier`.
- Registry as a GitHub repository with a JSON index. Version pinning in `rsconstruct.toml`.
- Urgency: low | Complexity: high
Cleaning & Cache
Time-based cache purge
- `rsconstruct cache purge --older-than=7d` to remove cache entries older than a given duration.
- Currently only `cache clear` exists, which removes everything.
- Walk the object store, check file mtimes, remove old entries.
- Urgency: medium | Complexity: low
Enhanced cache statistics
- `rsconstruct cache stats` currently shows minimal info.
- Add: hit rate percentage, bytes saved vs rebuild time, per-processor breakdown, slowest processors.
- Helps users identify optimization opportunities.
- Urgency: medium | Complexity: medium
CLI & UX
Configuration
Environment variable expansion in config
- Allow `${env:HOME}` or `${env:CI}` in `rsconstruct.toml` to reference environment variables.
- The variable substitution system already exists for `[vars]`; extending it to env vars is natural.
- Useful for CI/CD systems that pass secrets or paths via environment.
- Urgency: medium | Complexity: low
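The `${env:NAME}` expansion could be sketched as a small string scanner. This is not the existing `[vars]` substitution code, whose internals are not shown here; `expand_env` is a hypothetical helper, and treating an unset variable as an error is an assumed policy.

```rust
/// Expand every `${env:NAME}` reference in `input` using `lookup`.
/// Assumed policy: an unset variable or an unterminated reference is an
/// error rather than an empty string.
fn expand_env<F>(input: &str, lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    const OPEN: &str = "${env:";
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find(OPEN) {
        // Copy the literal text before the reference.
        out.push_str(&rest[..start]);
        let after = &rest[start + OPEN.len()..];
        let end = after
            .find('}')
            .ok_or_else(|| format!("unterminated reference in {:?}", input))?;
        let name = &after[..end];
        let value = lookup(name)
            .ok_or_else(|| format!("environment variable {} is not set", name))?;
        out.push_str(&value);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

In production the lookup would be `|name| std::env::var(name).ok()`; passing it as a closure keeps the function testable without touching the process environment.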
Per-processor batch size
- Each processor config has a `batch` boolean, but batch size is global (`[build] batch_size`).
- Different tools have different startup costs: fast tools benefit from large batches, slow tools from small ones.
- Add a `batch_size` field to individual processor configs, overriding the global default.
- Urgency: medium | Complexity: low
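The override itself is a one-line fallback. In the sketch below, `batches` is a hypothetical helper, and clamping a configured size of zero up to one is an assumption rather than specified behaviour.

```rust
/// Split a processor's files into batches, preferring the per-processor
/// size and falling back to the global `[build] batch_size`.
/// Assumption: a size of 0 is clamped to 1 instead of being rejected.
fn batches<'a>(
    files: &'a [&'a str],
    per_processor: Option<usize>,
    global: usize,
) -> Vec<&'a [&'a str]> {
    let size = per_processor.unwrap_or(global).max(1);
    files.chunks(size).collect()
}
```

A processor without its own `batch_size` passes `None` and behaves exactly as it does today.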
Processor Ecosystem
Flake8 (Python linter)
- Many projects still use flake8 over ruff. Widely adopted.
- Checker processor using `flake8`. Batch-capable.
- Urgency: medium | Complexity: low
Security
Shell command execution from source file comments
- `EXTRA_*_SHELL` directives execute arbitrary shell commands parsed from source file comments.
- Document the security implications clearly.
- Urgency: medium | Complexity: low
Internal Cleanups
These are code-quality items surfaced by an architecture audit. Each is
localized; none block features. See architecture-observations.md for
larger structural items.
Consolidate processor discovery helpers
- `src/processors/mod.rs` exposes `discover_checker_products`, `discover_directory_products`, `checker_discover`, `checker_auto_detect`, `checker_auto_detect_with_scan_root`, `scan_or_skip`. All are similar, with subtle differences (some auto-apply `dep_auto`, some don’t; some validate scan roots, some don’t).
- Choosing the wrong helper is a silent correctness issue: a processor that picks `discover_checker_products` when it needed `checker_discover` loses `dep_auto` merging and never finds out.
- Collapse to one or two helpers with explicit flags for the variations. Document the contract each helper commits to.
- Urgency: medium | Complexity: low
Remove / complete remote_pull scaffold in ObjectStore
- `src/object_store/mod.rs` has a `remote_pull` field and `try_fetch_*` helpers in `operations.rs` that nothing calls.
- Either finish the feature (wire the fetch helpers into the classify path) or delete the scaffold. Unused public-ish surface rots.
- Urgency: low | Complexity: medium (complete) / low (delete)
Drop or use processor_type on ProcessorPlugin
- `src/registries/processor.rs` has `processor_type` marked `#[allow(dead_code)]` with a comment about a future `processors list --type=checker` filter.
- Either ship the filter or drop the field until it’s needed. Dead fields with comments accumulate.
- Urgency: low | Complexity: low
TOOLS registry is monolithic and unsorted
- `src/processors/mod.rs` has ~170 entries in a static array mixing Python, Node, Ruby, Rust, Perl, System categories with no alphabetic ordering within groups.
- Hard to find a tool when adding one; hard to audit for gaps (a tool with no install command makes `doctor` silently unhelpful).
- Split per-runtime into separate files or sort alphabetically within a section. A unit test already checks that every processor’s `required_tools()` entries have a matching `TOOLS` row; keep it, and make the table easier to satisfy.
- Urgency: low | Complexity: low
Centralize alias expansion
- `expand_aliases` in `src/builder/build.rs` handles `@checkers` / `@generators` / `@toolname` / bare-name syntaxes. It’s called once for `-p` and once for `-x`. Any new alias shortcut has to be added there.
- No duplication today, but the function is in `build.rs` despite being useful elsewhere (completion, `processors list`, `analyzers used`). Move to a dedicated module and make it the canonical expander.
- Urgency: low | Complexity: low
Inconsistent error-handling idioms in processors
- Some processors use `anyhow::bail!`, some `anyhow::Context::with_context()`, some construct custom messages. The coding-standards doc already calls for `with_context` on every I/O operation, but processor-level error shape varies.
- Pick one idiom per category (tool-failure vs. config-error vs. internal-error) and retrofit. Makes `--json` error events more uniform too.
- Urgency: low | Complexity: low
Config validation timing
- Unknown-field and must-field validation runs inside `Config::load`, which is correct. However, some cross-field validations (e.g. “cc_single_file needs include_paths if compiling C++”) happen later during processor creation or build.
- Either pull all semantic validation into `Config::load` (so `toml check` catches everything) or accept that semantic errors surface later and document which is which.
- Urgency: low | Complexity: medium
products list CLI
- Users can run `rsconstruct graph show` (full graph) or `rsconstruct status` (per-processor summary), but there’s no flat list of “here is every product that would execute, with its primary input and output.”
- Add `rsconstruct products list` (parallel to `processors list` and `analyzers used`). Respects `-p`/`-x`/`--target` filters.
- Urgency: low | Complexity: low
ProductTiming.start_offset not populated for batch execution
- `src/processors/mod.rs` defines `start_offset` on `ProductTiming`; it’s populated for non-batch execution but may be `None` for batch paths.
- Trace visualizations (`--trace`) look jagged or incomplete when batches are involved.
- Urgency: low | Complexity: low