ADR 0006: Story-First Feature Extraction and Schema Enforcement¶
Status¶
Accepted
Problem¶
We need a robust "start from story, then extract features" pipeline with predictable schemas so downstream generation controls and evaluations stay stable over time.
Non-goals¶
- Full semantic NLP stack rollout in this ADR.
- Production analytics warehouse design.
- Replacing existing story CRUD contracts.
Decision¶
Introduce a deterministic chapter feature extraction pipeline:
- core extraction logic in
core/story_feature_pipeline.py - persistence adapter in
adapters/sqlite_feature_store.py - API endpoints for extraction and latest run retrieval
Adopt explicit schema/version enforcement:
- pydantic contract strict mode (
extra="forbid") - schema constant
story_features.v1 - table-level version registry (
feature_schema_versions) - fail-fast on schema mismatch
Public API¶
New endpoints:
POST /api/v1/stories/{story_id}/features/extractGET /api/v1/stories/{story_id}/features/latest
New CLI:
story-features
Invariants¶
- extraction starts from persisted story blueprint chapters
- feature run is owner-scoped and story-scoped
- schema version mismatch blocks writes/reads until migrated
Test plan¶
- unit tests for core extraction behavior
- adapter tests for persistence and schema mismatch failures
- API tests for extraction lifecycle and owner isolation
- contract tests for docs/entrypoints updates