Tags

Tags give the ability to mark specific points in history as being important
  • v0.1.0

    protected
    release v0.1.0
    
    Initial public release. Bootstrapped fast to enable the agent-shape
    methodology research; no breaking-change history before this point.
    
    - `agent-shape.toml` schema (`subject`, `fixture`, `run`, `judge`,
      `tasks.tuning`, `tasks.holdout`, optional `commands.top_level`).
    - `jig run`: spawns `claude -p --output-format stream-json` per
      `(task, model, trial_index)` cell, captures the transcript, scores
      it with an LLM-as-judge, and emits a JSON or Markdown report.
    - `jig check`: validates an `agent-shape.toml` and, with `--binary`,
      cross-references the rubric's `commands.top_level` against the
      binary's `--help` output to catch rubric drift before it produces
      phantom inventions.
    - `jig render`: rerenders a previously-emitted JSON report as
      Markdown with no API calls.
    - `jig compare`: per-cell delta between baseline and treated reports,
      pure JSON-in / Markdown-out.
    - `jig rejudge`: re-scores transcripts in a checkpoint against an
      updated rubric without re-running the agent trials. Supports
      resume from output checkpoint and skips judge-error cells with a
      summary.
    - Trial checkpointing: every completed `(trial, verdict)` pair is
      appended as one JSON line so a killed run resumes without losing
      prior work.
    - Subject-mismatch guard on `run`: `--subject <name>` aborts when the
      TOML's `subject.name` does not match, preventing the wrong rubric
      from being applied to a fixture.
    - `[fixture].strip_env` field in `agent-shape.toml`: list of
      caller-side environment variables to remove before spawning the
      agent or fixture. Replaces an earlier hardcoded subject-specific
      strip list, so the runner is genuinely subject-agnostic. The
      default behavior strips known subject-tool environment variables
      before spawning so trials start from a clean slate.
    - `examples/agent-shape.example.toml`: worked example.
      `templates/agent-shape.toml`: starter template with REPLACE-ME
      markers for new adopters.
    - Library crate: `nomograph-jig` exposes `runner`, `judge`, `report`,
      `schema`, and `checkpoint` modules so callers can drive the harness
      programmatically.
    - Project hygiene for the first tag: shared `rustfmt.toml` (edition
      2024, max width 100) and `clippy.toml` (MSRV 1.88); integration
      test suite covering `--version`, `--help`, `check`, `render`, and
      `compare` against the release binary; `Makefile` with `build` /
      `test` / `lint` / `check` / `install` verbs; `CONTRIBUTING.md`
      short-form contributor guide; `deny.toml` with a standard license
      allow-list; self-hosted `agent-shape.toml` so jig itself can be
      measured under the methodology it implements.
    - Cargo metadata: `rust-version = "1.88"` pin and an `exclude` list
      so the published tarball drops CI config, local audit notes, and
      generated artifacts.
    
    - Inter-rater reliability via optional `double_score`; the judge runs
      twice and the per-trial absolute delta is reported.
    - Tuning vs holdout battery split is in the schema from v1; the
      holdout corpus is intentionally empty until independent authors
      populate it.
    
    - README written to a higher information density: badges, install,
      quickstart, command reference, methodology pointer.
    - CLAUDE.md with build verbs, release checklist, house style, and
      architecture notes.
    - Em dashes purged from every shipped artifact: source comments,
      help text, templates, README, CLAUDE.md, and the Markdown report's
      empty-list placeholder.
    
    [0.1.0]: https://gitlab.com/nomograph/jig/-/tags/v0.1.0