ADR 0017: Story Bundle Binary Format¶
Status¶
Accepted
Problem¶
Story analysis artifacts are currently exchanged as separate JSON payloads. That is easy to inspect but inefficient for sharing full analysis sessions (document, timeline, themes, arcs, confidence data, and graph export).
We need one portable bundle artifact with strong integrity checks and stable versioning, without adding external runtime dependencies.
Non-goals¶
- Replacing database persistence with bundle files.
- Encrypting bundle payloads in v1.
- Backward compatibility across major bundle schema versions.
Decision¶
Introduce a custom .sgb binary container (story_bundle.v1) with:
- fixed header (
magic,format_version, manifest length, payload length) - UTF-8 JSON manifest describing records and offsets
- zlib-compressed payload block containing concatenated artifact records
- fixed trailer with SHA-256 digests for manifest and compressed payload
- per-record SHA-256 checksums in manifest
Public API¶
Python core surface:
pack_story_analysis_bundle(...) -> bytesunpack_story_analysis_bundle(...) -> UnpackedStoryAnalysisBundle
File extension for shared artifacts:
.sgb
Invariants¶
- Bundle magic remains
SGBN. - Manifest and payload checksums are validated before decode.
- Record offsets are contiguous and fully consume payload.
story_analysis.v1content remains unchanged inside record payloads.
Test plan¶
- Round-trip bundle encode/decode for real analysis output.
- Deterministic bytes for fixed input + fixed timestamp.
- Failure tests for invalid magic and corrupted payload digest.
- Full repository quality gates remain green.