- Created utils/retry.py with:
- RetryHandler with exponential backoff
- CircuitBreaker pattern
- Config for max attempts, delays
- Graceful degradation
- Updated LLM client to use retry logic
- API failures now retry with backoff
- Circuit breaker prevents cascade failures
- Graceful degradation on prolonged failures
This addresses the reliability gap identified in code review.
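The retry and circuit-breaker behavior described above could look roughly like the sketch below. It is not the contents of utils/retry.py: the names `SimpleCircuitBreaker`, `CircuitOpenError`, and `retry_with_backoff`, along with every parameter and default, are assumptions made for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and calls are rejected."""


class SimpleCircuitBreaker:
    """Hypothetical breaker: opens after `failure_threshold` consecutive failures."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        # Allow a trial call once the cool-down period has elapsed.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def retry_with_backoff(func, breaker, max_attempts=3, base_delay=0.5,
                       max_delay=8.0, sleep=time.sleep):
    """Call `func`, doubling the delay after each failure up to `max_delay`."""
    for attempt in range(max_attempts):
        if not breaker.allow():
            raise CircuitOpenError("circuit open; skipping call")
        try:
            result = func()
        except Exception:
            breaker.record_failure()
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(min(max_delay, base_delay * 2 ** attempt))
        else:
            breaker.record_success()
            return result
```

A caller could catch `CircuitOpenError` and fall back to a degraded response, which is one way to read "graceful degradation" here.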
- test_nonfiction.py with 100+ test cases
- Tests for PurposeClassifier
- Tests for taxonomy/frameworks
- Tests for critique criteria
- Tests for all framework types
- Tests for intake agent
- Tests for multi-source ingest
This addresses the testing gap identified in code review.
Created agent_crawler.py:
AgentWebCrawler - AI-powered crawling that:
1. Analyzes site structure (LLM)
2. Decides what to crawl based on purpose
3. Scores relevance dynamically
4. Adapts as it learns more
5. Knows when it has enough
Purpose types:
- DOCUMENTATION - Technical docs, guides
- TRAINING - Learning materials
- KNOWLEDGE - General knowledge base
- RESEARCH - Research papers
- REFERENCE - Reference material
Usage:
Features:
- Content extraction (not HTML dump)
- Relevance scoring
- Rate limiting
- Configurable depth/pages
- Integration with multi-source ingest
Created academic_papers.py with 12 established academic paper types:
RESEARCH PAPERS:
1. Empirical Paper - experiments, data collection
2. Theoretical Paper - concepts, models, proofs
3. Methodology Paper - new methods/techniques
4. Case Study Paper - in-depth single case
5. Survey Paper - comprehensive field overview
ARGUMENTATIVE:
6. Position Paper - argue for a stance
7. Policy Brief - recommendations to decision-makers
CRITICAL ANALYSIS:
8. Critical Review - evaluate existing work
9. Meta-Analysis - statistical synthesis
SHORT FORMS:
10. Short Communication - brief findings report
11. Conference Proposal - conference abstract
12. Thesis Proposal - graduate research proposal
Each includes:
- Detailed stages
- Prompt templates
- Tone guidance
- Typical length
- Audience
Functions:
- get_academic_paper_types()
- suggest_academic_paper()
- Add --thread-id flag to CLI for checkpointing
- Add --resume flag to resume from checkpoint
- Generate UUID if no thread_id provided
- Display thread_id for user to save for resume
Usage:
opus generate --concept "My book" --thread-id abc123
# If the run fails, resume with the same thread ID:
opus generate --concept "My book" --thread-id abc123 --resume
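A minimal sketch of how the two flags and the UUID fallback might be wired, assuming an argparse-based CLI (the actual CLI framework is not stated here). `build_parser` and `resolve_thread_id` are hypothetical names, and rejecting `--resume` without `--thread-id` is an assumed policy, not confirmed behavior.

```python
import argparse
import uuid


def build_parser():
    parser = argparse.ArgumentParser(prog="opus")
    sub = parser.add_subparsers(dest="command", required=True)
    gen = sub.add_parser("generate")
    gen.add_argument("--concept", required=True)
    gen.add_argument("--thread-id", default=None)
    gen.add_argument("--resume", action="store_true")
    return parser


def resolve_thread_id(args):
    """Return the supplied thread ID, or a fresh UUID when none was given."""
    if args.thread_id:
        return args.thread_id
    if args.resume:
        # Assumption: resuming needs a known checkpoint to load.
        raise SystemExit("--resume requires --thread-id")
    return str(uuid.uuid4())


if __name__ == "__main__":
    cli_args = build_parser().parse_args()
    # Show the ID so the user can save it for a later --resume.
    print(f"thread_id: {resolve_thread_id(cli_args)}")
```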
Created research_integration.py to connect research agent to pipeline:
ResearchIntegrator class:
- research_for_book(): Research for entire book
- research_chapter(): Research specific chapter
- should_use_research(): Determine if purpose needs research
- get_research_stages(): When to integrate research
Research stages:
- Pre-writing: Gather research before writing
- Per-chapter: Research each chapter
- Verification: Check facts post-writing
- Enhancement: Strengthen content with research
Purpose-specific research config:
- UNDERSTAND: Deep research, include academic
- DECIDE: Deep, studies, data, comparisons
- TRANSFORM: Case studies, success stories
- LEARN_HANDS_ON: Best practices, methods
- REFERENCE: Comprehensive documentation
- BE_INSPIRED: Stories, journeys, examples
Functions:
- get_research_config_for_purpose()
The research agent is now integrated into the nonfiction pipeline.
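The purpose-specific research config above can be pictured as a lookup table. The sketch below is illustrative only: the `depth` and `focus` field names and the `lookup_research_config` helper are assumptions, and `depth` is left as `None` wherever the list above does not state one.

```python
# Hypothetical table mirroring the purpose list above.
RESEARCH_CONFIG = {
    "UNDERSTAND": {"depth": "deep", "focus": ["academic"]},
    "DECIDE": {"depth": "deep", "focus": ["studies", "data", "comparisons"]},
    "TRANSFORM": {"depth": None, "focus": ["case studies", "success stories"]},
    "LEARN_HANDS_ON": {"depth": None, "focus": ["best practices", "methods"]},
    "REFERENCE": {"depth": None, "focus": ["comprehensive documentation"]},
    "BE_INSPIRED": {"depth": None, "focus": ["stories", "journeys", "examples"]},
}


def lookup_research_config(purpose):
    """Return the config for a purpose name, case-insensitively."""
    try:
        return RESEARCH_CONFIG[purpose.upper()]
    except KeyError:
        raise ValueError(f"Unknown reader purpose: {purpose!r}") from None
```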
- Created opus_orchestrator/nonfiction/classifier.py
- PurposeClassifier class with keyword-based classification
- LLM-enhanced classification (optional)
- ReaderPurpose enum (6 purposes)
- ClassificationResult dataclass
- Keyword classification covers:
- LEARN_HANDS_ON: how to, learn to, tutorial, skills, etc.
- UNDERSTAND: understand, why, concept, mental model, etc.
- TRANSFORM: change, become, improve, habits, etc.
- DECIDE: decide, choose, compare, vs, analysis
- REFERENCE: manual, handbook, comprehensive, API
- BE_INSPIRED: inspire, story, journey, biography
- Tests pass for all 6 purposes with high confidence
This is the foundation for the entire nonfiction pipeline (Issue #18).
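Keyword-based classification along these lines can be sketched as follows. This is not the code in classifier.py: the enum values, the `ClassificationResult` fields, the `classify` function, and the confidence formula (share of all keyword hits won by the top purpose) are assumptions, and the keyword lists contain only the examples named above.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ReaderPurpose(Enum):
    LEARN_HANDS_ON = "learn_hands_on"
    UNDERSTAND = "understand"
    TRANSFORM = "transform"
    DECIDE = "decide"
    REFERENCE = "reference"
    BE_INSPIRED = "be_inspired"


KEYWORDS = {
    ReaderPurpose.LEARN_HANDS_ON: ["how to", "learn to", "tutorial", "skills"],
    ReaderPurpose.UNDERSTAND: ["understand", "why", "concept", "mental model"],
    ReaderPurpose.TRANSFORM: ["change", "become", "improve", "habits"],
    ReaderPurpose.DECIDE: ["decide", "choose", "compare", "vs", "analysis"],
    ReaderPurpose.REFERENCE: ["manual", "handbook", "comprehensive", "api"],
    ReaderPurpose.BE_INSPIRED: ["inspire", "story", "journey", "biography"],
}


@dataclass
class ClassificationResult:
    purpose: Optional[ReaderPurpose]
    confidence: float
    matched: List[str] = field(default_factory=list)


def classify(text):
    """Pick the purpose whose keywords match the text most often."""
    lowered = text.lower()
    hits = {}
    for purpose, words in KEYWORDS.items():
        # Whole-word matching keeps "vs" from firing inside other words.
        matched = [w for w in words
                   if re.search(r"\b" + re.escape(w) + r"\b", lowered)]
        if matched:
            hits[purpose] = matched
    if not hits:
        return ClassificationResult(None, 0.0)
    best = max(hits, key=lambda p: len(hits[p]))
    total = sum(len(m) for m in hits.values())
    return ClassificationResult(best, len(hits[best]) / total, hits[best])
```

An optional LLM pass could then be reserved for inputs where this score is low, though the threshold for that is not specified here.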
- Created nonfiction_taxonomy.py with:
- ReaderPurpose enum (6 purposes)
- StructuralPattern enum (7 patterns)
- PURPOSE_STRUCTURE_MATRIX for intelligent selection
- NONFICTION_FRAMEWORKS (14+ frameworks)
- select_framework() function
- Created docs/NONFICTION_PIPELINE.md documenting the workflow
This is the foundation for Issue #16 (Nonfiction Underdeveloped).
Team 4: Architecture & Design
Fixed:
- #6: Created unified state adapter (state_adapter.py)
- StateAdapter class to convert between OpusState and OpusGraphState
- create_unified_state() for initializing either system
- graph_to_opus() and opus_to_graph() converters
- #11: Added output validation to BaseAgent
- validate_output() method that parses JSON and validates against Pydantic schemas
- Extracts JSON from markdown code blocks
- Returns validated model or error message
- #12: Already properly handled (orchestrator imports get_framework_prompt)
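The output validation added for #11 can be approximated without third-party dependencies. The sketch below deliberately swaps the Pydantic schema check for a plain required-fields check so that it runs on the standard library alone; `extract_json`, the `required` mapping, and the `(data, error)` return shape are assumptions, not the signature of `BaseAgent.validate_output()`.

```python
import json
import re

FENCE = "`" * 3  # markdown code-fence delimiter
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(text):
    """Return the body of the first fenced block, or the whole text."""
    match = FENCE_RE.search(text)
    return (match.group(1) if match else text).strip()


def validate_output(text, required):
    """Return (data, None) on success or (None, error_message) on failure.

    `required` maps field names to expected Python types; it stands in
    for a Pydantic model in this simplified version.
    """
    try:
        data = json.loads(extract_json(text))
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "expected a JSON object"
    for key, expected_type in required.items():
        if key not in data:
            return None, f"missing field: {key}"
        if not isinstance(data[key], expected_type):
            return None, f"wrong type for field: {key}"
    return data, None
```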
Team 3: Infrastructure & Config
Fixed:
- #4: GitHub Ingestor now works without token for public repos
- Token is now optional
- Uses unauthenticated requests (with a rate limit warning) when no token is provided
- Private repos still require token
- #14: Added startup API key validation
- get_config() now validates API keys at startup
- Raises clear error if neither OPENAI_API_KEY nor MINIMAX_API_KEY is set
- Fail-fast instead of silent failures
- #10: Added CostConfig for rate limiting and budget controls
- max_tokens_per_run: limit tokens per generation
- max_cost_usd: budget cap in dollars
- track_usage: enable/disable usage tracking
- price_per_million_tokens: pricing by model
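The fail-fast key check (#14) and the budget controls (#10) could be sketched as below. The `CostConfig` field names and the two environment variable names come from the notes above; the defaults, the `ConfigError` class, and the `require_api_key`, `estimate_cost`, and `within_budget` helpers are assumptions for illustration.

```python
import os
from dataclasses import dataclass, field


class ConfigError(RuntimeError):
    """Raised at startup when required configuration is missing."""


def require_api_key(env=None):
    """Return the name of the first key found, or fail fast with a clear error."""
    env = os.environ if env is None else env
    for name in ("OPENAI_API_KEY", "MINIMAX_API_KEY"):
        if env.get(name):
            return name
    raise ConfigError("Set OPENAI_API_KEY or MINIMAX_API_KEY before starting.")


@dataclass
class CostConfig:
    max_tokens_per_run: int = 500_000      # assumed default
    max_cost_usd: float = 10.0             # assumed default
    track_usage: bool = True
    price_per_million_tokens: dict = field(default_factory=dict)

    def estimate_cost(self, model, tokens):
        """Cost in USD for `tokens` tokens at the model's listed price."""
        return tokens / 1_000_000 * self.price_per_million_tokens[model]

    def within_budget(self, model, tokens):
        """True when both the token cap and the dollar cap are respected."""
        return (tokens <= self.max_tokens_per_run
                and self.estimate_cost(model, tokens) <= self.max_cost_usd)
```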
Team 2: Agent & Workflow Repair Crew
Fixed:
- #5: CrewAI LLM factory now properly uses provider/model params
- Supports openai, anthropic, and minimax
- Raises error for unknown providers instead of silently using OpenAI
- Validates API keys are present
- #2: AutoGen critique now actually revises chapters
- iterate_chapter() now applies revision suggestions
- Uses Writer agent to revise based on critique feedback
- Returns revised_content in the result
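The provider handling fixed in #5 can be illustrated independently of CrewAI. In this sketch, `resolve_llm_settings` and the returned dict are hypothetical, and `ANTHROPIC_API_KEY` is an assumed variable name (only the OpenAI and MiniMax names appear in these notes).

```python
import os

# Provider name -> environment variable holding its API key.
PROVIDER_ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",  # assumed name
    "minimax": "MINIMAX_API_KEY",
}


def resolve_llm_settings(provider, model, env=None):
    """Validate provider and key; never fall back silently to a default."""
    env = os.environ if env is None else env
    name = provider.lower()
    key_name = PROVIDER_ENV_KEYS.get(name)
    if key_name is None:
        raise ValueError(f"Unknown LLM provider: {provider!r}")
    api_key = env.get(key_name)
    if not api_key:
        raise ValueError(f"{key_name} is not set for provider {provider!r}")
    return {"provider": name, "model": model, "api_key": api_key}
```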
NEW frameworks:
- Diátaxis Tutorial - Learn by doing a project
- Diátaxis How-To - Accomplish a specific task
- Diátaxis Explanation - Clarify and deepen understanding
- Diátaxis Reference - Complete information lookup
- Technical Manual - From foundations to mastery
- Codebase Tour - Document code systematically
- API Documentation - Complete API reference
Added a NonfictionGenerator class that uses these frameworks.
Added CLI integration via the --framework flag.
Example:
opus generate --framework codebase-tour --concept 'Linux Kernel'
- LocalIngestor: Include ALL files by default (source code, configs, etc.)
- GitHubIngestor: Include ALL files by default
- The AI sees every file and transforms it into documentation
- Filter only build artifacts (.pyc, .so, dist, build)
Philosophy: Don't filter what the AI can see - let it decide
what's relevant. The AI can document code directly!
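An include-everything filter that drops only build artifacts might look like the sketch below. The function names are hypothetical, and treating `dist` and `build` as directory names anywhere in the path is an assumption about how the ingestors apply the rule.

```python
from pathlib import PurePosixPath

ARTIFACT_SUFFIXES = {".pyc", ".so"}
ARTIFACT_DIRS = {"dist", "build"}


def is_build_artifact(path):
    """True for compiled files and anything under a dist/ or build/ directory."""
    p = PurePosixPath(path)
    if p.suffix in ARTIFACT_SUFFIXES:
        return True
    # Check parent directories only, so a file named build.md is kept.
    return any(part in ARTIFACT_DIRS for part in p.parts[:-1])


def select_files(paths):
    """Keep every path except build artifacts, preserving order."""
    return [p for p in paths if not is_build_artifact(p)]
```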
Research Tools:
- SearchTool: Multiple backends (Tavily, Serper, Brave, DuckDuckGo)
- WikipediaTool: Wikipedia lookup
- AcademicSearchTool: CrossRef, Semantic Scholar
- ResearchOrchestrator: Comprehensive multi-source research
ResearchAgent:
- NOT just fact-checking - actively discovers NEW information
- Identifies trends beyond training data cutoff
- Generates innovations from cross-referencing sources
- Deep research with subtopics
VerifiedFactChecker:
- Live claim verification against web sources
- Confidence scoring
- Citation needed detection
Dependencies added: tavily, wikipedia, arxiv, duckduckgo-search
- Merge llm.py + llm_sync.py into single unified client
- Remove llm_sync.py (now just llm.py with both sync/async)
- Add requests to dependencies
- Add Dockerfile for containerized deployment
- Add .dockerignore
All issues resolved!
- Web UI: novice-friendly interface at / and /ui
- Upload endpoint: /upload for file uploads
- S3 upload: /upload/s3 for uploading to S3/MinIO
- CLI: opus ui command to start web UI only
- Full HTML/CSS/JS interface with drag-drop, tabs, etc.
- LocalIngestor class for files and directories
- CLI: opus ingest-local PATH
- Generate from local: opus generate --local ./my-notes/
- Support for extension filters, recursive scanning, and summarization
- Pattern-based exclusion (.git, __pycache__, etc.)
- Complete architecture diagram
- All features documented with status
- Full CLI, Python, API client examples
- Configuration reference
- Project structure
- Handle dict result properly in /generate endpoint
- Remove use_autogen from API (not supported in run_opus)
- API client now works: tested via CLI with local server
The CLI now works in both local and client/server modes:
# Local mode (default)
opus generate --concept ...
# API client mode
opus --api-url http://localhost:8000 generate --concept ...
opus --api-url https://opus-api.example.com generate --repo owner/repo
OpusAPIClient class for programmatic API access.