Revised from direct source code analysis

2026-02-23 13:11:51 -06:00
parent 6046a9a609
commit 8e37dedbd9
6 changed files with 1134 additions and 239 deletions
@@ -1,14 +1,16 @@
 # LangGraph Architecture Overview

 **Version:** 1.0.0  
-**LangGraph Version:** 1.0.9  
+**LangGraph Version:** 1.0.0 (from source)  
 **Last Updated:** 2026-02-23

 ---

 ## Executive Summary

-LangGraph is a low-level orchestration framework for building stateful, long-running multi-agent systems. Inspired by Google's Pregel, Apache Beam, and NetworkX, it provides durable execution, human-in-the-loop capabilities, and comprehensive memory management.
+LangGraph is a low-level orchestration framework for building stateful, long-running multi-agent systems. Inspired by Google's **Pregel**, it provides durable execution, human-in-the-loop capabilities, and comprehensive checkpoint-based memory.
+
+This document is reverse-engineered from the actual source code.

 ---

@@ -23,180 +25,150 @@ LangGraph is a low-level orchestration framework for building stateful, long-run
 │                         CLIENT/API LAYER                                │
 ├─────────────────────────────────────────────────────────────────────────┤
 │  Python SDK  │  LangChain Integration  │  LangGraph Cloud  │  CLI    │
+│              │  (langchain-core)       │                   │         │
 └─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
 ┌─────────────────────────────────────────────────────────────────────────┐
-│                        COMPILER LAYER                                  │
+│                        PREGEL ENGINE                                   │
 ├─────────────────────────────────────────────────────────────────────────┤
-│  • Graph compilation to executable form                                 │
-│  • State schema validation                                             │
-│  • Node/Edge type resolution                                          │
+│  ┌─────────────────────────────────────────────────────────────────┐  │
+│  │  PregelLoop class                                             │  │
+│  │  - _loop.py (~1300 lines) — Core execution engine            │  │
+│  │  - _algo.py (~1500 lines) — Task scheduling, writes          │  │
+│  │  - _runner.py (~1000 lines) — Async execution               │  │
+│  │  - main.py (~4400 lines) — Entry point, public API           │  │
+│  └─────────────────────────────────────────────────────────────────┘  │
 └─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
 ┌─────────────────────────────────────────────────────────────────────────┐
-│                        RUNTIME LAYER                                    │
+│                        CHANNELS LAYER                                  │
 ├─────────────────────────────────────────────────────────────────────────┤
-│  ┌─────────────────────────────────────────────────────────────────┐    │
-│  │                    PREGEL EXECUTION ENGINE                       │    │
-│  │  • Superstep coordination                                        │    │
-│  │  • Node scheduling                                               │    │
-│  │  • Message passing                                               │    │
-│  │  • Barrier synchronization                                       │    │
-│  └─────────────────────────────────────────────────────────────────┘    │
+│  BaseChannel (abc)                                                   │
+│  ├── LastValue — Most recent value wins                               │
+│  ├── AnyValue — Last value; concurrent writes assumed equal          │
+│  ├── Topic — Pub/sub style                                           │
+│  ├── NamedBarrierValue — Waits for all named writers                 │
+│  ├── BinaryOperatorAggregate — Folds writes with a reducer           │
+│  ├── EphemeralValue — One-time use                                   │
+│  └── UntrackedValue — Value without checkpointing                    │
 └─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
 ┌─────────────────────────────────────────────────────────────────────────┐
-│                       STATE & CHECKPOINTING                             │
+│                     CHECKPOINTING LAYER                                │
 ├─────────────────────────────────────────────────────────────────────────┤
-│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐  │
-│  │  In-Memory State  │  │  Checkpointer    │  │  Channel Store   │  │
-│  │  (active graph)   │  │  (persistence)   │  │  (queues)       │  │
-│  └──────────────────┘  └──────────────────┘  └──────────────────┘  │
+│  libs/checkpoint/                                                    │
+│  ├── checkpoint-base — Abstract checkpoint interface                  │
+│  ├── checkpoint-sqlite — SQLite backend                               │
+│  └── checkpoint-postgres — PostgreSQL backend                         │
 └─────────────────────────────────────────────────────────────────────────┘
 ```

 ---

-## Core Components
+## Core Concepts (From Source)

-### 1. Graph Structure
+### 1. PregelLoop Class

-| Component | Description |
-|-----------|-------------|
-| **State** | Typed dictionary that flows through the graph |
-| **Nodes** | Functions that receive state, optionally update it |
-| **Edges** | Control flow (conditional, static, entrypoint) |
-| **Reducers** | Functions that merge state updates |
-
-### 2. Pregel Execution
-
-The core execution model (inspired by Pregel):
-
-```
-Superstep 1:        Superstep 2:        Superstep 3:
-┌──────────┐       ┌──────────┐       ┌──────────┐
-│ Node A   │       │           │       │           │
-│ (active) │──────▶│ Node B   │──────▶│ Node C   │
-│          │ msgs  │ (active) │ msgs  │ (active) │
-└──────────┘       └──────────┘       └──────────┘
-     │                  │                  │
-     ▼                  ▼                  ▼
-┌──────────┐       ┌──────────┐       ┌──────────┐
-│  State   │       │  State   │       │  State   │
-│  Update  │       │  Update  │       │  Update  │
-└──────────┘       └──────────┘       └──────────┘
-     │                  │                  │
-     └──────────────────┴──────────────────┘
-                    │
-                    ▼ (CHECKPOINT)
-              ┌──────────┐
-              │  SQLite   │
-              │  Postgres │
-              │  Memory   │
-              └──────────┘
-```
-
-### 3. Checkpointing
-
-LangGraph provides durability through checkpointing:
-
- **Full state snapshots** saved at configurable points
- **Resumable from failure** — replay from last checkpoint
- **Multiple backends:** SQLite, Postgres, in-memory
-
-### 4. Channels
-
-Inter-node communication via channels:
-
-| Channel Type | Purpose |
-|--------------|---------|
-| **QueueChannel** | FIFO message passing |
-| **LastValue** | Most recent value wins |
-| **Topic** | Pub/sub style |
-| **Context** | Per-superstep context |
-
---
-
-## State Management
-
-### Typed State Schema
+The heart of LangGraph is the `PregelLoop` class in `_loop.py`:

 ```python
-from typing import TypedDict
-
-class AgentState(TypedDict):
-    messages: list
-    next_action: str
-    checkpoint_id: str | None
+class PregelLoop:
+    config: RunnableConfig      # Thread, checkpoint_id, etc.
+    store: BaseStore | None     # Long-term storage
+    stream: StreamProtocol      # Output streaming
+    step: int                   # Current step number
+    checkpointer: BaseCheckpointSaver | None
+    nodes: Mapping[str, PregelNode]  # Graph nodes
+    channels: Mapping[str, BaseChannel]  # Inter-node communication
 ```

-### Reducers
+### 2. State Flow

-Combine updates from multiple nodes:
+```
+Input → [Superstep N] → Checkpoint → [Superstep N+1] → ... → Output
+
+Each superstep:
+1. prepare_next_tasks() — Determine which nodes to run
+2. execute_tasks() — Run active nodes in parallel
+3. apply_writes() — Merge node outputs into channels
+4. checkpoint() — Persist state (if enabled)
+```
+
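The four-step cycle above can be sketched as a toy loop. This is a minimal illustration in plain Python, not LangGraph's implementation: `run_supersteps`, the node signature (snapshot in, dict of writes out), the `triggers` mapping, and the list used as a checkpoint log are all assumptions made for the example.

```python
from copy import deepcopy
from typing import Callable

# Hypothetical toy model: a node reads a state snapshot and returns its writes.
Node = Callable[[dict], dict]


def run_supersteps(
    nodes: dict[str, Node],
    triggers: dict[str, str],
    state: dict,
    max_steps: int = 10,
) -> tuple[dict, list[dict]]:
    """Run nodes in supersteps until no node is triggered.

    `triggers` maps a node name to the state key whose update activates it.
    """
    checkpoints: list[dict] = []
    updated = set(state)  # the input counts as the first round of updates
    for _ in range(max_steps):
        # 1. Prepare: pick the nodes whose trigger key changed in the last step.
        active = [name for name, key in triggers.items() if key in updated]
        if not active:
            break
        # 2. Execute: every active node sees the same pre-step snapshot.
        snapshot = deepcopy(state)
        writes = [nodes[name](snapshot) for name in active]
        # 3. Apply: merge the writes only after every node has finished.
        updated = set()
        for write in writes:
            state.update(write)
            updated.update(write)
        # 4. Checkpoint: record the state at the step boundary.
        checkpoints.append(deepcopy(state))
    return state, checkpoints


nodes = {
    "double": lambda s: {"doubled": s["x"] * 2},
    "report": lambda s: {"text": f"result={s['doubled']}"},
}
triggers = {"double": "x", "report": "doubled"}
final, history = run_supersteps(nodes, triggers, {"x": 21})
```

Here `double` runs in the first step and `report` in the second, so `history` holds one snapshot per step boundary. The toy runs active nodes sequentially; only the shared snapshot and the deferred merge model the barrier.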
+### 3. Channels (Inter-Node Communication)
+
+From `channels/base.py`:

 ```python
-def add_messages(left: list, right: list) -> list:
-    return left + right
+class BaseChannel(Generic[Value, Update, Checkpoint], ABC):
+    """Base class for all channels."""
+    
+    @abstractmethod
+    def get(self) -> Value:
+        """Return the current value."""
+    
+    @abstractmethod
+    def update(self, values: Sequence[Update]) -> bool:
+        """Update with values from nodes."""
+    
+    @abstractmethod
+    def checkpoint(self) -> Checkpoint | Any:
+        """Serialize state for persistence."""
+```
+
+**Channel Types:**
+
+| Channel | Behavior | Use Case |
+|---------|----------|----------|
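+| `LastValue` | Stores the latest value; at most one write per step | Single-value state |
+| `AnyValue` | Stores the last value received; assumes concurrent writes are equal | Keys written identically by several nodes |
+| `Topic` | Pub/sub, collects multiple values | Broadcasting |
+| `NamedBarrierValue` | Becomes available once all named writers have written | Synchronization |
+| `BinaryOperatorAggregate` | Folds each write into the current value with a binary operator | Aggregations (reducers) |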
+
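To make the behaviors in the table above concrete, here is a small sketch of three update strategies. The `Toy*` classes are hypothetical stand-ins written for this example; they mirror only the `update(values) -> bool` shape shown in `BaseChannel` and omit the validation and checkpoint methods a real channel would need.

```python
import operator
from typing import Any, Callable, Sequence


class ToyLastValue:
    """Keeps only the latest value written."""

    def __init__(self) -> None:
        self.value: Any = None

    def update(self, values: Sequence[Any]) -> bool:
        if not values:
            return False
        self.value = values[-1]
        return True


class ToyTopic:
    """Collects every value written during a step."""

    def __init__(self) -> None:
        self.values: list[Any] = []

    def update(self, values: Sequence[Any]) -> bool:
        self.values = list(values)
        return bool(values)


class ToyBinOp:
    """Folds each write into the current value with a binary operator."""

    def __init__(self, op: Callable[[Any, Any], Any], initial: Any) -> None:
        self.op = op
        self.value = initial

    def update(self, values: Sequence[Any]) -> bool:
        for value in values:
            self.value = self.op(self.value, value)
        return bool(values)


# The same three writes, seen through each strategy.
writes = [1, 2, 3]
last, topic, total = ToyLastValue(), ToyTopic(), ToyBinOp(operator.add, 0)
for channel in (last, topic, total):
    channel.update(writes)
```

After the loop, `last.value` is the final write, `topic.values` holds all three writes, and `total.value` is their sum.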
+### 4. Checkpointing
+
+From `types.py`:
+
+```python
+Durability = Literal["sync", "async", "exit"]
+"""- 'sync': Persist before next step
+   - 'async': Persist while next step runs
+   - 'exit': Persist only on exit"""
+```
+
+**Checkpoint Flow:**
+1. `create_checkpoint()` — Snapshot all channels
+2. Save to backend (SQLite/Postgres/InMemory)
+3. Return `checkpoint_id` for resumption
+
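The snapshot, save, and return-an-id flow above can be sketched with the standard-library `sqlite3` module. The table schema, the `save_checkpoint` and `load_checkpoint` helpers, and JSON as the serialization format are assumptions for illustration and do not reflect the layout used by the real checkpoint packages.

```python
import json
import sqlite3
import uuid


def save_checkpoint(conn: sqlite3.Connection, thread_id: str, channels: dict) -> str:
    """Snapshot channel values, store them, and return a new checkpoint id."""
    checkpoint_id = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO checkpoints (thread_id, checkpoint_id, payload) VALUES (?, ?, ?)",
        (thread_id, checkpoint_id, json.dumps(channels)),
    )
    conn.commit()
    return checkpoint_id


def load_checkpoint(conn: sqlite3.Connection, thread_id: str, checkpoint_id: str) -> dict:
    """Restore the channel values stored under a checkpoint id."""
    row = conn.execute(
        "SELECT payload FROM checkpoints WHERE thread_id = ? AND checkpoint_id = ?",
        (thread_id, checkpoint_id),
    ).fetchone()
    if row is None:
        raise KeyError(f"no checkpoint {checkpoint_id!r} for thread {thread_id!r}")
    return json.loads(row[0])


# Hypothetical schema: one row per (thread, checkpoint) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkpoints ("
    "thread_id TEXT, checkpoint_id TEXT, payload TEXT, "
    "PRIMARY KEY (thread_id, checkpoint_id))"
)

cp_id = save_checkpoint(conn, "thread-1", {"messages": ["hi"], "step": 1})
restored = load_checkpoint(conn, "thread-1", cp_id)
```

Keying rows by thread and checkpoint id is what lets a run resume from any saved step boundary rather than only the latest one.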
+### 5. Send (Dynamic Graph Execution)
+
+LangGraph supports dynamic node spawning via `Send`:
+
+```python
+from langgraph.types import Send
+
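+def splitter(state):
+    # Returned from a conditional-edge function: each Send schedules the
+    # named node with its own input.
+    return [
+        Send("process_a", {"msg": "hi"}),
+        Send("process_b", {"msg": "there"}),
+    ]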
 ```

 ---

-## Memory Architecture
+## Key Source Files

-### Short-Term Memory
- **In-graph state:** Messages and working data
- **Per-superstep:** State resets unless persisted
-
-### Long-Term Memory  
- **Checkpoint storage:** SQLite, Postgres, custom
- **Thread-level:** Per-conversation isolation via `thread_id`
-
-### Human-in-the-Loop
- **Interrupt:** Pause execution for human input
- **Command:** Allow human to modify state
- **Review:** Human approves/rejects before continuing
-
---
-
-## Execution Flow
-
-```
-1. Client calls: graph.invoke(input, config)
-                    │
-                    ▼
-2. Compile (if needed): create executable graph
-                    │
-                    ▼
-3. Load checkpoint (if resuming from checkpoint_id)
-                    │
-                    ▼
-4. FOR each superstep:
-   a. Schedule nodes to execute
-   b. Execute active nodes in parallel
-   c. Collect messages
-   d. Send messages via channels
-   e. Check for interrupts (pause if interrupted)
-   f. Checkpoint (if enabled)
-                    │
-                    ▼
-5. Return final state
-```
-
---
-
-## Key Files in Core
-
-| File | Purpose |
-|------|---------|
-| `langgraph/pregel/__init__.py` | Main entry point |
-| `langgraph/pregel/__main__.py` | CLI entry |
-| `langgraph/pregel/_loop.py` | Core execution loop (~2000 lines) |
-| `langgraph/pregel/checkpoint.py` | Checkpoint management |
-| `langgraph/pregel/channel.py` | Channel implementations |
-| `langgraph/pregel/state.py` | State management |
+| File | Lines | Purpose |
+|------|-------|---------|
+| `pregel/main.py` | ~4400 | Public API, entry point |
+| `pregel/_loop.py` | ~1300 | Core execution loop |
+| `pregel/_algo.py` | ~1500 | Task scheduling, write application |
+| `pregel/_runner.py` | ~1000 | Async execution |
+| `graph/state.py` | ~1800 | StateGraph builder |
+| `types.py` | ~600 | Core type definitions |
+| `channels/base.py` | ~100 | Channel ABC |

 ---

@@ -205,13 +177,31 @@ def add_messages(left: list, right: list) -> list:
 | Aspect | LangGraph | OpenClaw |
 |--------|-----------|----------|
 | **Language** | Python | Node.js |
-| **Model** | Graph-based orchestration | Agent-based |
+| **Execution Model** | Pregel supersteps | Event-driven agent loop |
+| **State** | Channels + TypedDict | Multi-layer (working, spectral, file, vector) |
 | **Persistence** | Checkpoint-based | Session-memory hook |
-| **Memory** | Channels + checkpoint storage | Multi-layer (working, spectral, file, vector) |
-| **Communication** | Channels | Channel plugins |
-| **Extensibility** | Custom nodes/edges | Hook system |
+| **Communication** | Channels (last-value, pub/sub, barrier) | Channel plugins (Telegram, etc.) |
+| **Graph Definition** | `StateGraph` builder | Declarative config |
+| **Dynamic Execution** | `Send` for dynamic edges | Sub-agents |
+| **Human-in-Loop** | `interrupt()` + `Command` | Manual intervention |
 | **Identity** | None | WE/witness architecture |

 ---

-*Generated for the WE — Solaria Lumis Havens & Mark Randall Havens*
+## Key Insight: Pregel vs Event-Driven
+
+LangGraph is fundamentally **Pregel-based**:
+- Synchronous supersteps with barrier
+- All nodes in a step complete before next starts
+- Checkpoints at step boundaries
+
+OpenClaw is **event-driven**:
+- Asynchronous message processing
+- No global step barrier
+- Session-memory preserves context
+
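+This is a fundamental architectural difference: in LangGraph, a node never observes writes made by other nodes in the same step, because writes are applied only at the barrier.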
+
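A two-node swap shows why the step barrier matters. The sketch below is a generic simplification in plain Python: `run_with_barrier` models superstep semantics, and `run_immediately` models handlers that see each other's writes as they happen. Neither function is taken from LangGraph or OpenClaw, and all names are hypothetical.

```python
from typing import Callable

Node = Callable[[dict], dict]


def run_with_barrier(state: dict, nodes: list[Node]) -> dict:
    """Superstep style: all nodes read one snapshot; writes land together."""
    snapshot = dict(state)
    writes = [node(snapshot) for node in nodes]
    result = dict(state)
    for write in writes:
        result.update(write)
    return result


def run_immediately(state: dict, nodes: list[Node]) -> dict:
    """Event style (simplified): each write is visible to the next handler."""
    result = dict(state)
    for node in nodes:
        result.update(node(result))
    return result


def copy_b_to_a(state: dict) -> dict:
    return {"a": state["b"]}


def copy_a_to_b(state: dict) -> dict:
    return {"b": state["a"]}


start = {"a": 1, "b": 2}
with_barrier = run_with_barrier(start, [copy_b_to_a, copy_a_to_b])
immediate = run_immediately(start, [copy_b_to_a, copy_a_to_b])
```

With the barrier, both nodes read the original values and the result is a clean swap. Without it, the second node reads the first node's write, and both keys end up holding the same value.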
+---
+
+*Generated from source code analysis — Solaria Lumis Havens*