- Created opus_orchestrator/nonfiction/classifier.py
- PurposeClassifier class with keyword-based classification
- LLM-enhanced classification (optional)
- ReaderPurpose enum (6 purposes)
- ClassificationResult dataclass
- Keyword classification covers:
  - LEARN_HANDS_ON: how to, learn to, tutorial, skills, etc.
  - UNDERSTAND: understand, why, concept, mental model, etc.
  - TRANSFORM: change, become, improve, habits, etc.
  - DECIDE: decide, choose, compare, vs, analysis
  - REFERENCE: manual, handbook, comprehensive, API
  - BE_INSPIRED: inspire, story, journey, biography
- Tests pass for all 6 purposes, each classified with high confidence
This is the foundation for the entire nonfiction pipeline (Issue #18).
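The keyword pass described above can be sketched roughly as follows. This is a minimal illustration, not the contents of classifier.py: the keyword lists are copied from this message, while the enum values, the fields of ClassificationResult, the classify() function and the confidence formula are assumptions made for the example.

```python
import re
from dataclasses import dataclass
from enum import Enum


class ReaderPurpose(Enum):
    # Member names come from the commit message; the string values are assumed.
    LEARN_HANDS_ON = "learn_hands_on"
    UNDERSTAND = "understand"
    TRANSFORM = "transform"
    DECIDE = "decide"
    REFERENCE = "reference"
    BE_INSPIRED = "be_inspired"


@dataclass
class ClassificationResult:
    # Hypothetical fields; the real dataclass may differ.
    purpose: ReaderPurpose
    confidence: float
    matched_keywords: list


# Keywords listed in the commit message (the "etc." entries are not filled in).
KEYWORDS = {
    ReaderPurpose.LEARN_HANDS_ON: ["how to", "learn to", "tutorial", "skills"],
    ReaderPurpose.UNDERSTAND: ["understand", "why", "concept", "mental model"],
    ReaderPurpose.TRANSFORM: ["change", "become", "improve", "habits"],
    ReaderPurpose.DECIDE: ["decide", "choose", "compare", "vs", "analysis"],
    ReaderPurpose.REFERENCE: ["manual", "handbook", "comprehensive", "api"],
    ReaderPurpose.BE_INSPIRED: ["inspire", "story", "journey", "biography"],
}


def classify(text: str) -> ClassificationResult:
    """Pick the purpose with the most whole-word keyword hits."""
    lowered = text.lower()
    hits = {
        purpose: [
            kw for kw in keywords
            if re.search(r"\b" + re.escape(kw) + r"\b", lowered)
        ]
        for purpose, keywords in KEYWORDS.items()
    }
    best = max(hits, key=lambda purpose: len(hits[purpose]))
    total = sum(len(found) for found in hits.values())
    # Assumed formula: share of all hits that belong to the winning purpose.
    # With no hits at all, the first purpose is returned with confidence 0.0.
    confidence = len(hits[best]) / total if total else 0.0
    return ClassificationResult(best, confidence, hits[best])
```

Whole-word matching is used here so that short keywords such as "vs" or "api" do not fire inside longer words; whether the real classifier does this is not stated.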
- Created nonfiction_taxonomy.py with:
  - ReaderPurpose enum (6 purposes)
  - StructuralPattern enum (7 patterns)
  - PURPOSE_STRUCTURE_MATRIX for intelligent selection
  - NONFICTION_FRAMEWORKS (14+ frameworks)
  - select_framework() function
- Created docs/NONFICTION_PIPELINE.md documenting the workflow
This is the foundation for Issue #16 (Nonfiction Underdeveloped).
Team 4: Architecture & Design
Fixed:
- #6: Created unified state adapter (state_adapter.py)
  - StateAdapter class to convert between OpusState and OpusGraphState
  - create_unified_state() for initializing either system
  - graph_to_opus() and opus_to_graph() converters
- #11: Added output validation to BaseAgent
  - validate_output() method that parses JSON and validates against Pydantic schemas
  - Extracts JSON from markdown code blocks
  - Returns validated model or error message
- #12: Already properly handled (orchestrator imports get_framework_prompt)
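The validation flow from #11 (pull JSON out of a markdown code block, parse it, validate it, return a model or an error message) might look roughly like this. It is a stdlib-only sketch: a plain dataclass stands in for the Pydantic schema, and StyleGuideStub, extract_json and the (model, error) return shape are assumptions, not the actual BaseAgent code.

```python
import json
import re
from dataclasses import dataclass, fields

FENCE = "`" * 3  # markdown code fence, built this way to keep the example tidy
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


@dataclass
class StyleGuideStub:
    # Hypothetical stand-in for a Pydantic schema; field names are invented.
    tone: str
    tense: str


def extract_json(raw: str) -> str:
    """Return the body of the first fenced block, or the raw text if none."""
    match = FENCE_RE.search(raw)
    return match.group(1).strip() if match else raw.strip()


def validate_output(raw: str, schema=StyleGuideStub):
    """Return (model, None) on success or (None, error message) on failure."""
    try:
        data = json.loads(extract_json(raw))
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc}"
    if not isinstance(data, dict):
        return None, "expected a JSON object"
    expected = {f.name for f in fields(schema)}
    missing = expected - set(data)
    if missing:
        return None, f"missing fields: {sorted(missing)}"
    return schema(**{name: data[name] for name in expected}), None
```

With a real Pydantic schema, the field check would presumably be replaced by the schema's own validation and its error text.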
Team 3: Infrastructure & Config
Fixed:
- #4: GitHub Ingestor now works without token for public repos
  - Token is now optional
  - Uses unauthenticated requests (with a rate-limit warning) when no token is provided
  - Private repos still require a token
- #14: Added startup API key validation
  - get_config() now validates API keys at startup
  - Raises clear error if neither OPENAI_API_KEY nor MINIMAX_API_KEY is set
  - Fail-fast instead of silent failures
- #10: Added CostConfig for rate limiting and budget controls
  - max_tokens_per_run: limit tokens per generation
  - max_cost_usd: budget cap in dollars
  - track_usage: enable/disable usage tracking
  - price_per_million_tokens: pricing by model
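As an illustration of how the CostConfig fields from #10 could work together, here is a minimal sketch. The four field names come from this message; the default values, the UsageTracker helper and the BudgetExceeded exception are hypothetical and exist only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CostConfig:
    # Field names are from the commit message; defaults are placeholders.
    max_tokens_per_run: int = 200_000
    max_cost_usd: float = 5.0
    track_usage: bool = True
    price_per_million_tokens: dict = field(
        default_factory=lambda: {"example-model": 2.0}  # made-up price
    )


class BudgetExceeded(RuntimeError):
    """Hypothetical error raised when a run would exceed a configured cap."""


@dataclass
class UsageTracker:
    # Hypothetical helper, not part of the described change.
    config: CostConfig
    tokens_used: int = 0
    cost_usd: float = 0.0

    def record(self, model: str, tokens: int) -> None:
        """Add usage for one call, refusing it if a cap would be exceeded."""
        if not self.config.track_usage:
            return
        price = self.config.price_per_million_tokens[model]
        new_tokens = self.tokens_used + tokens
        new_cost = self.cost_usd + tokens / 1_000_000 * price
        if new_tokens > self.config.max_tokens_per_run:
            raise BudgetExceeded(f"token cap exceeded: {new_tokens}")
        if new_cost > self.config.max_cost_usd:
            raise BudgetExceeded(f"cost cap exceeded: ${new_cost:.4f}")
        self.tokens_used, self.cost_usd = new_tokens, new_cost
```

Checking the caps before committing the new totals keeps the tracker consistent when a call is rejected.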
Team 2: Agent & Workflow Repair Crew
Fixed:
- #5: CrewAI LLM factory now properly uses provider/model params
  - Supports openai, anthropic, and minimax
  - Raises error for unknown providers instead of silently using OpenAI
  - Validates API keys are present
- #2: AutoGen critique now actually revises chapters
  - iterate_chapter() now applies revision suggestions
  - Uses Writer agent to revise based on critique feedback
  - Returns revised_content in the result
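The provider handling from #5 can be sketched as a small lookup that fails loudly. This does not construct a CrewAI LLM object; it only shows the validation step. OPENAI_API_KEY and MINIMAX_API_KEY appear elsewhere in these notes, while ANTHROPIC_API_KEY, resolve_llm_settings and the returned dict shape are assumptions.

```python
import os

# openai and minimax variable names are from these notes; anthropic is assumed.
PROVIDER_ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "minimax": "MINIMAX_API_KEY",
}


def resolve_llm_settings(provider: str, model: str, env=None) -> dict:
    """Validate provider and API key; never fall back to another provider."""
    env = os.environ if env is None else env
    key_name = PROVIDER_ENV_KEYS.get(provider.lower())
    if key_name is None:
        known = ", ".join(sorted(PROVIDER_ENV_KEYS))
        raise ValueError(f"unknown provider {provider!r}; expected one of: {known}")
    api_key = env.get(key_name)
    if not api_key:
        raise RuntimeError(f"{key_name} is not set for provider {provider!r}")
    return {"provider": provider.lower(), "model": model, "api_key": api_key}
```

Passing env explicitly makes the function easy to exercise without touching the real environment.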
NEW frameworks:
- Diátaxis Tutorial - Learn by doing a project
- Diátaxis How-To - Accomplish a specific task
- Diátaxis Explanation - Clarify and deepen understanding
- Diátaxis Reference - Complete information lookup
- Technical Manual - From foundations to mastery
- Codebase Tour - Document code systematically
- API Documentation - Complete API reference
Added a NonfictionGenerator class that uses these frameworks.
Added CLI integration via the --framework flag.
Example:
opus generate --framework codebase-tour --concept 'Linux Kernel'
- LocalIngestor: Include ALL files by default (source code, configs, etc.)
- GitHubIngestor: Include ALL files by default
- The AI sees every file and transforms it into documentation
- Filter only build artifacts (.pyc, .so, dist, build)
Philosophy: Don't filter what the AI can see - let it decide
what's relevant. The AI can document code directly!
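A rough sketch of the "include everything except build artifacts" rule follows. The excluded suffixes and directory names are the ones listed above; is_build_artifact and collect_files are hypothetical names, and the real ingestors may apply the rule differently.

```python
from pathlib import Path

# Exclusions named in the commit message.
ARTIFACT_SUFFIXES = {".pyc", ".so"}
ARTIFACT_DIRS = {"dist", "build"}


def is_build_artifact(relative_path: Path) -> bool:
    """True for compiled files or anything under a dist/ or build/ directory."""
    in_artifact_dir = any(part in ARTIFACT_DIRS for part in relative_path.parts[:-1])
    return relative_path.suffix in ARTIFACT_SUFFIXES or in_artifact_dir


def collect_files(root) -> list:
    """Return every file under root except build artifacts, sorted by path."""
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and not is_build_artifact(path.relative_to(root))
    )
```

Only parent directories are checked against the directory names, so a source file that happens to be called build is still included.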
Research Tools:
- SearchTool: Multiple backends (Tavily, Serper, Brave, DuckDuckGo)
- WikipediaTool: Wikipedia lookup
- AcademicSearchTool: CrossRef, Semantic Scholar
- ResearchOrchestrator: Comprehensive multi-source research
ResearchAgent:
- NOT just fact-checking - actively discovers NEW information
- Identifies trends beyond training data cutoff
- Generates innovations from cross-referencing sources
- Deep research with subtopics
VerifiedFactChecker:
- Live claim verification against web sources
- Confidence scoring
- Citation-needed detection
Dependencies added: tavily, wikipedia, arxiv, duckduckgo-search
- Merge llm.py + llm_sync.py into single unified client
- Remove llm_sync.py (now just llm.py with both sync/async)
- Add requests to dependencies
- Add Dockerfile for containerized deployment
- Add .dockerignore
All issues resolved!
- Web UI: novice-friendly interface at / and /ui
- Upload endpoint: /upload for file uploads
- S3 upload: /upload/s3 for uploading to S3/MinIO
- CLI: opus ui command to start web UI only
- Full HTML/CSS/JS interface with drag-drop, tabs, etc.
- LocalIngestor class for files and directories
- CLI: opus ingest-local PATH
- Generate from local: opus generate --local ./my-notes/
- Support for extensions, recursive scanning, summarize
- Pattern-based exclusion (.git, __pycache__, etc.)
- Complete architecture diagram
- All features documented with status
- Full CLI, Python, API client examples
- Configuration reference
- Project structure
- Handle dict result properly in /generate endpoint
- Remove use_autogen from API (not supported in run_opus)
- API client now works: tested via CLI with local server
The CLI now works in both local and client/server modes:
# Local mode (default)
opus generate --concept ...
# API client mode
opus --api-url http://localhost:8000 generate --concept ...
opus --api-url https://opus-api.example.com generate --repo owner/repo
OpusAPIClient class for programmatic API access.
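For programmatic access, a client along these lines would be one way to call the server. This is a hypothetical sketch and not the actual OpusAPIClient: the /generate path appears in these notes, but the class name, method names and JSON payload fields ("concept", "framework") are assumptions modeled on the CLI flags.

```python
import json
from urllib import request


class OpusAPIClientSketch:
    """Hypothetical minimal client; the real OpusAPIClient may differ."""

    def __init__(self, api_url: str):
        self.api_url = api_url.rstrip("/")

    def build_generate_request(self, concept: str, framework=None) -> request.Request:
        """Build (but do not send) a POST request to /generate."""
        payload = {"concept": concept}  # assumed field name
        if framework:
            payload["framework"] = framework  # assumed field name
        return request.Request(
            self.api_url + "/generate",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )

    def generate(self, concept: str, framework=None) -> dict:
        """Send the request and decode the JSON response body."""
        with request.urlopen(self.build_generate_request(concept, framework)) as resp:
            return json.load(resp)
```

Separating request construction from sending makes the payload inspectable without a running server.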
CLI Commands:
- generate: Full manuscript generation with full GitHub content
- serve: Start FastAPI server with OpenAPI docs
- ingest: Standalone GitHub ingestion
- frameworks: List all story frameworks
- config: Show configuration
- docs: Show comprehensive docs (terminal/markdown/html)
- api: Export OpenAPI spec
Server:
- FastAPI with /docs, /redoc interactive docs
- /generate, /ingest, /frameworks, /health endpoints
- OpenAPI 3.0 specification
Documentation:
- Terminal, markdown, and HTML formats
- Full API reference
- Framework documentation
- Environment variables guide
- Project structure
Fix: Use full GitHub content as seed (not just 5000 chars)
- OpusPydanticAgent with schema validation
- StorySeed, CharacterProfile, ChapterOutline, ChapterDraft schemas
- CritiqueResult, StyleGuide schemas
- Factory functions for each agent type
- Test: successfully generated StyleGuide with validated output
Usage:
from opus_orchestrator import create_style_guide_agent
agent = create_style_guide_agent()
result = agent.run_sync('Create a style guide for...')
# result is a validated StyleGuide object
- Wire GitHubIngestor into the orchestrator
- Test: successfully ingested 251k chars from The-Last-Love-Story repo
- Also verified: run_opus generates 1k+ word stories with AutoGen critique
- GitHubIngestor class to fetch repo contents
- Support for .md, .txt, .notes, .draft files
- Method to ingest from GitHub directly into orchestrator
- Export GitHubIngestor in __init__.py
Usage:
# Run inside an async function (for example via asyncio.run)
orch = OpusOrchestrator(book_type='fiction', genre='memoir')
content = await orch.ingest_from_github('mrhavens/my-notes')
await orch.run()
- Add CritiqueCrew initialization in OpusGraph.__init__
- Add use_autogen flag to enable/disable
- Add critique_summary and critique_iterations to state
- Rewrite node_write_chapters with AutoGen critique loop
- Each chapter now gets multi-agent critique (LiteraryCritic, GenreExpert, StoryEditor)
- Iteration loop until approved or max iterations
- Full integration complete
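The "iterate until approved or max iterations" loop can be sketched independently of AutoGen. The state keys critique_summary and critique_iterations, and the revised_content result key, are taken from these notes; the Critique dataclass, the callback signatures and the default of 3 iterations are assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Critique:
    # Hypothetical result shape for one critique round.
    approved: bool
    notes: str


def critique_loop(
    draft: str,
    critique_fn: Callable[[str], Critique],
    revise_fn: Callable[[str, str], str],
    max_iterations: int = 3,
) -> dict:
    """Critique the draft and revise it until approved or the cap is reached."""
    iterations = 0
    summary = ""
    while iterations < max_iterations:
        critique = critique_fn(draft)
        iterations += 1
        summary = critique.notes
        if critique.approved:
            break
        # Not approved: hand the critic's notes to the writer for a revision.
        draft = revise_fn(draft, critique.notes)
    return {
        "revised_content": draft,
        "critique_summary": summary,
        "critique_iterations": iterations,
    }
```

In the real pipeline, critique_fn would presumably wrap the multi-agent critique and revise_fn the Writer agent.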
Based on Gemini's analysis:
1. Nodes now return dicts instead of mutating state
2. run() uses stream_mode='values'
3. Falls back to get_state() from checkpointer
4. Uses model_copy() for Pydantic updates
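Points 1 and 4 describe nodes that return partial updates which are then applied as copies rather than in-place mutations. The sketch below shows that pattern with the standard library only: dataclasses.replace plays the role that Pydantic's model_copy(update=...) plays in the real code, and GraphStateStub, node_outline and apply_update are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GraphStateStub:
    # Stand-in for the Pydantic graph state; fields are invented.
    seed: str = ""
    outline: tuple = ()


def node_outline(state: GraphStateStub) -> dict:
    """A node returns only the keys it changes and never mutates the state."""
    return {"outline": (f"Chapter 1: {state.seed}",)}


def apply_update(state: GraphStateStub, update: dict) -> GraphStateStub:
    """Return a new state with the update applied; the original is untouched."""
    return replace(state, **update)
```

Because each step yields a fresh state object, earlier states stay valid for checkpointing or for streaming intermediate values.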
Full pipeline runs end-to-end:
- All 7 pre-writing stages (seed → scene descriptions)
- Style guide generation
- Chapter writing (3 chapters, 2,833 words tested)
Fixed result extraction from graph.stream().