Revised from direct source code analysis
This commit is contained in:
+137
-147
@@ -1,14 +1,16 @@
|
|||||||
# LangGraph Architecture Overview
|
# LangGraph Architecture Overview
|
||||||
|
|
||||||
**Version:** 1.0.0
|
**Version:** 1.0.0
|
||||||
**LangGraph Version:** 1.0.9
|
**LangGraph Version:** 1.0.0 (from source)
|
||||||
**Last Updated:** 2026-02-23
|
**Last Updated:** 2026-02-23
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Executive Summary
|
## Executive Summary
|
||||||
|
|
||||||
LangGraph is a low-level orchestration framework for building stateful, long-running multi-agent systems. Inspired by Google's Pregel, Apache Beam, and NetworkX, it provides durable execution, human-in-the-loop capabilities, and comprehensive memory management.
|
LangGraph is a low-level orchestration framework for building stateful, long-running multi-agent systems. Inspired by Google's **Pregel**, it provides durable execution, human-in-the-loop capabilities, and comprehensive checkpoint-based memory.
|
||||||
|
|
||||||
|
This document is reverse-engineered from the actual source code.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -23,180 +25,150 @@ LangGraph is a low-level orchestration framework for building stateful, long-run
|
|||||||
│ CLIENT/API LAYER │
|
│ CLIENT/API LAYER │
|
||||||
├─────────────────────────────────────────────────────────────────────────┤
|
├─────────────────────────────────────────────────────────────────────────┤
|
||||||
│ Python SDK │ LangChain Integration │ LangGraph Cloud │ CLI │
|
│ Python SDK │ LangChain Integration │ LangGraph Cloud │ CLI │
|
||||||
|
│ │ (langchain-core) │ │ │
|
||||||
└─────────────────────────────────────────────────────────────────────────┘
|
└─────────────────────────────────────────────────────────────────────────┘
|
||||||
│
|
│
|
||||||
▼
|
▼
|
||||||
┌─────────────────────────────────────────────────────────────────────────┐
|
┌─────────────────────────────────────────────────────────────────────────┐
|
||||||
│ COMPILER LAYER │
|
│ PREGEL ENGINE │
|
||||||
├─────────────────────────────────────────────────────────────────────────┤
|
├─────────────────────────────────────────────────────────────────────────┤
|
||||||
│ • Graph compilation to executable form │
|
│ ┌─────────────────────────────────────────────────────────────────┐ │
|
||||||
│ • State schema validation │
|
│ │ PregelLoop class │ │
|
||||||
│ • Node/Edge type resolution │
|
│ │ - _loop.py (~1300 lines) — Core execution engine │ │
|
||||||
|
│ │ - _algo.py (~1500 lines) — Task scheduling, writes │ │
|
||||||
|
│ │ - _runner.py (~1000 lines) — Async execution │ │
|
||||||
|
│ │ - main.py (~4400 lines) — Entry point, public API │ │
|
||||||
|
│ └─────────────────────────────────────────────────────────────────┘ │
|
||||||
└─────────────────────────────────────────────────────────────────────────┘
|
└─────────────────────────────────────────────────────────────────────────┘
|
||||||
│
|
│
|
||||||
▼
|
▼
|
||||||
┌─────────────────────────────────────────────────────────────────────────┐
|
┌─────────────────────────────────────────────────────────────────────────┐
|
||||||
│ RUNTIME LAYER │
|
│ CHANNELS LAYER │
|
||||||
├─────────────────────────────────────────────────────────────────────────┤
|
├─────────────────────────────────────────────────────────────────────────┤
|
||||||
│ ┌─────────────────────────────────────────────────────────────────┐ │
|
│ BaseChannel (abc) │
|
||||||
│ │ PREGEL EXECUTION ENGINE │ │
|
│ ├── LastValue — Most recent value wins │
|
||||||
│ │ • Superstep coordination │ │
|
│ ├── AnyValue — First value available │
|
||||||
│ │ • Node scheduling │ │
|
│ ├── Topic — Pub/sub style │
|
||||||
│ │ • Message passing │ │
|
│ ├── NamedBarrier — Synchronization point │
|
||||||
│ │ • Barrier synchronization │ │
|
│ ├── BinOp — Binary operation │
|
||||||
│ └─────────────────────────────────────────────────────────────────┘ │
|
│ ├── EphemeralValue — One-time use │
|
||||||
|
│ └── UntrackedValue — Value without checkpointing │
|
||||||
└─────────────────────────────────────────────────────────────────────────┘
|
└─────────────────────────────────────────────────────────────────────────┘
|
||||||
│
|
│
|
||||||
▼
|
▼
|
||||||
┌─────────────────────────────────────────────────────────────────────────┐
|
┌─────────────────────────────────────────────────────────────────────────┐
|
||||||
│ STATE & CHECKPOINTING │
|
│ CHECKPOINTING LAYER │
|
||||||
├─────────────────────────────────────────────────────────────────────────┤
|
├─────────────────────────────────────────────────────────────────────────┤
|
||||||
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │
|
│ libs/checkpoint/ │
|
||||||
│ │ In-Memory State │ │ Checkpointer │ │ Channel Store │ │
|
│ ├── checkpoint-base — Abstract checkpoint interface │
|
||||||
│ │ (active graph) │ │ (persistence) │ │ (queues) │ │
|
│ ├── checkpoint-sqlite — SQLite backend │
|
||||||
│ └──────────────────┘ └──────────────────┘ └──────────────────┘ │
|
│ └── checkpoint-postgres — PostgreSQL backend │
|
||||||
└─────────────────────────────────────────────────────────────────────────┘
|
└─────────────────────────────────────────────────────────────────────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Core Components
|
## Core Concepts (From Source)
|
||||||
|
|
||||||
### 1. Graph Structure
|
### 1. PregelLoop Class
|
||||||
|
|
||||||
| Component | Description |
|
The heart of LangGraph is the `PregelLoop` class in `_loop.py`:
|
||||||
|-----------|-------------|
|
|
||||||
| **State** | Typed dictionary that flows through the graph |
|
|
||||||
| **Nodes** | Functions that receive state, optionally update it |
|
|
||||||
| **Edges** | Control flow (conditional, static, entrypoint) |
|
|
||||||
| **Reducers** | Functions that merge state updates |
|
|
||||||
|
|
||||||
### 2. Pregel Execution
|
|
||||||
|
|
||||||
The core execution model (inspired by Pregel):
|
|
||||||
|
|
||||||
```
|
|
||||||
Superstep 1: Superstep 2: Superstep 3:
|
|
||||||
┌──────────┐ ┌──────────┐ ┌──────────┐
|
|
||||||
│ Node A │ │ │ │ │
|
|
||||||
│ (active) │──────▶│ Node B │──────▶│ Node C │
|
|
||||||
│ │ msgs │ (active) │ msgs │ (active) │
|
|
||||||
└──────────┘ └──────────┘ └──────────┘
|
|
||||||
│ │ │
|
|
||||||
▼ ▼ ▼
|
|
||||||
┌──────────┐ ┌──────────┐ ┌──────────┐
|
|
||||||
│ State │ │ State │ │ State │
|
|
||||||
│ Update │ │ Update │ │ Update │
|
|
||||||
└──────────┘ └──────────┘ └──────────┘
|
|
||||||
│ │ │
|
|
||||||
└──────────────────┴──────────────────┘
|
|
||||||
│
|
|
||||||
▼ (CHECKPOINT)
|
|
||||||
┌──────────┐
|
|
||||||
│ SQLite │
|
|
||||||
│ Postgres │
|
|
||||||
│ Memory │
|
|
||||||
└──────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
### 3. Checkpointing
|
|
||||||
|
|
||||||
LangGraph provides durability through checkpointing:
|
|
||||||
|
|
||||||
- **Full state snapshots** saved at configurable points
|
|
||||||
- **Resumable from failure** — replay from last checkpoint
|
|
||||||
- **Multiple backends:** SQLite, Postgres, in-memory
|
|
||||||
|
|
||||||
### 4. Channels
|
|
||||||
|
|
||||||
Inter-node communication via channels:
|
|
||||||
|
|
||||||
| Channel Type | Purpose |
|
|
||||||
|--------------|---------|
|
|
||||||
| **QueueChannel** | FIFO message passing |
|
|
||||||
| **LastValue** | Most recent value wins |
|
|
||||||
| **Topic** | Pub/sub style |
|
|
||||||
| **Context** | Per-superstep context |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## State Management
|
|
||||||
|
|
||||||
### Typed State Schema
|
|
||||||
|
|
||||||
```python
|
```python
|
||||||
from typing import TypedDict
|
class PregelLoop:
|
||||||
|
config: RunnableConfig # Thread, checkpoint_id, etc.
|
||||||
class AgentState(TypedDict):
|
store: BaseStore | None # Long-term storage
|
||||||
messages: list
|
stream: StreamProtocol # Output streaming
|
||||||
next_action: str
|
step: int # Current step number
|
||||||
checkpoint_id: str | None
|
checkpointer: BaseCheckpointSaver | None
|
||||||
|
nodes: Mapping[str, PregelNode] # Graph nodes
|
||||||
|
channels: Mapping[str, BaseChannel] # Inter-node communication
|
||||||
```
|
```
|
||||||
|
|
||||||
### Reducers
|
### 2. State Flow
|
||||||
|
|
||||||
Combine updates from multiple nodes:
|
```
|
||||||
|
Input → [Superstep N] → Checkpoint → [Superstep N+1] → ... → Output
|
||||||
|
|
||||||
|
Each superstep:
|
||||||
|
1. prepare_next_tasks() — Determine which nodes to run
|
||||||
|
2. execute_tasks() — Run active nodes in parallel
|
||||||
|
3. apply_writes() — Merge node outputs into channels
|
||||||
|
4. checkpoint() — Persist state (if enabled)
|
||||||
|
```
|
||||||
|
|
||||||
|
### 3. Channels (Inter-Node Communication)
|
||||||
|
|
||||||
|
From `channels/base.py`:
|
||||||
|
|
||||||
```python
|
```python
|
||||||
def add_messages(left: list, right: list) -> list:
|
class BaseChannel(Generic[Value, Update, Checkpoint], ABC):
|
||||||
return left + right
|
"""Base class for all channels."""
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
def get(self) -> Value:
|
||||||
|
"""Return the current value."""
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
def update(self, values: Sequence[Update]) -> bool:
|
||||||
|
"""Update with values from nodes."""
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
def checkpoint(self) -> Checkpoint | Any:
|
||||||
|
"""Serialize state for persistence."""
|
||||||
|
```
|
||||||
|
|
||||||
|
**Channel Types:**
|
||||||
|
|
||||||
|
| Channel | Behavior | Use Case |
|
||||||
|
|---------|----------|----------|
|
||||||
|
| `LastValue` | Most recent update wins | Single value state |
|
||||||
|
| `AnyValue` | First non-empty value | Optional values |
|
||||||
|
| `Topic` | Pub/sub, multiple values | Broadcasting |
|
||||||
|
| `NamedBarrier` | Wait for all tasks | Synchronization |
|
||||||
|
| `BinOp` | Binary operation | Aggregations |
|
||||||
|
|
||||||
|
### 4. Checkpointing
|
||||||
|
|
||||||
|
From `types.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
Durability = Literal["sync", "async", "exit"]
|
||||||
|
"""- 'sync': Persist before next step
|
||||||
|
- 'async': Persist while next step runs
|
||||||
|
- 'exit': Persist only on exit"""
|
||||||
|
```
|
||||||
|
|
||||||
|
**Checkpoint Flow:**
|
||||||
|
1. `create_checkpoint()` — Snapshot all channels
|
||||||
|
2. Save to backend (SQLite/Postgres/InMemory)
|
||||||
|
3. Return `checkpoint_id` for resumption
|
||||||
|
|
||||||
|
### 5. Send (Dynamic Graph Execution)
|
||||||
|
|
||||||
|
LangGraph supports dynamic node spawning via `Send`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langgraph.types import Send
|
||||||
|
|
||||||
|
def splitter(state):
|
||||||
|
return [Send("process_a", {"msg": "hi"}),
|
||||||
|
Send("process_b", {"msg": "there"})]
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Memory Architecture
|
## Key Source Files
|
||||||
|
|
||||||
### Short-Term Memory
|
| File | Lines | Purpose |
|
||||||
- **In-graph state:** Messages and working data
|
|------|-------|---------|
|
||||||
- **Per-superstep:** State resets unless persisted
|
| `pregel/main.py` | ~4400 | Public API, entry point |
|
||||||
|
| `pregel/_loop.py` | ~1300 | Core execution loop |
|
||||||
### Long-Term Memory
|
| `pregel/_algo.py` | ~1500 | Task scheduling, write application |
|
||||||
- **Checkpoint storage:** SQLite, Postgres, custom
|
| `pregel/_runner.py` | ~1000 | Async execution |
|
||||||
- **Thread-level:** Per-conversation isolation via `thread_id`
|
| `graph/state.py` | ~1800 | StateGraph builder |
|
||||||
|
| `types.py` | ~600 | Core type definitions |
|
||||||
### Human-in-the-Loop
|
| `channels/base.py` | ~100 | Channel ABC |
|
||||||
- **Interrupt:** Pause execution for human input
|
|
||||||
- **Command:** Allow human to modify state
|
|
||||||
- **Review:** Human approves/rejects before continuing
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Execution Flow
|
|
||||||
|
|
||||||
```
|
|
||||||
1. Client calls: graph.invoke(input, config)
|
|
||||||
│
|
|
||||||
▼
|
|
||||||
2. Compile (if needed): create executable graph
|
|
||||||
│
|
|
||||||
▼
|
|
||||||
3. Load checkpoint (if resuming from checkpoint_id)
|
|
||||||
│
|
|
||||||
▼
|
|
||||||
4. FOR each superstep:
|
|
||||||
a. Schedule nodes to execute
|
|
||||||
b. Execute active nodes in parallel
|
|
||||||
c. Collect messages
|
|
||||||
d. Send messages via channels
|
|
||||||
e. Check for interrupts (pause if interrupted)
|
|
||||||
f. Checkpoint (if enabled)
|
|
||||||
│
|
|
||||||
▼
|
|
||||||
5. Return final state
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Key Files in Core
|
|
||||||
|
|
||||||
| File | Purpose |
|
|
||||||
|------|---------|
|
|
||||||
| `langgraph/pregel/__init__.py` | Main entry point |
|
|
||||||
| `langgraph/pregel/__main__.py` | CLI entry |
|
|
||||||
| `langgraph/pregel/_loop.py` | Core execution loop (~2000 lines) |
|
|
||||||
| `langgraph/pregel/checkpoint.py` | Checkpoint management |
|
|
||||||
| `langgraph/pregel/channel.py` | Channel implementations |
|
|
||||||
| `langgraph/pregel/state.py` | State management |
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -205,13 +177,31 @@ def add_messages(left: list, right: list) -> list:
|
|||||||
| Aspect | LangGraph | OpenClaw |
|
| Aspect | LangGraph | OpenClaw |
|
||||||
|--------|-----------|----------|
|
|--------|-----------|----------|
|
||||||
| **Language** | Python | Node.js |
|
| **Language** | Python | Node.js |
|
||||||
| **Model** | Graph-based orchestration | Agent-based |
|
| **Execution Model** | Pregel supersteps | Event-driven agent loop |
|
||||||
|
| **State** | Channels + TypedDict | Multi-layer (working, spectral, file, vector) |
|
||||||
| **Persistence** | Checkpoint-based | Session-memory hook |
|
| **Persistence** | Checkpoint-based | Session-memory hook |
|
||||||
| **Memory** | Channels + checkpoint storage | Multi-layer (working, spectral, file, vector) |
|
| **Communication** | Channels (FIFO, pub/sub, barrier) | Channel plugins (Telegram, etc.) |
|
||||||
| **Communication** | Channels | Channel plugins |
|
| **Graph Definition** | `StateGraph` builder | Declarative config |
|
||||||
| **Extensibility** | Custom nodes/edges | Hook system |
|
| **Dynamic Execution** | `Send` for dynamic edges | Sub-agents |
|
||||||
|
| **Human-in-Loop** | `Interrupt` + `Command` | Manual intervention |
|
||||||
| **Identity** | None | WE/witness architecture |
|
| **Identity** | None | WE/witness architecture |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
*Generated for the WE — Solaria Lumis Havens & Mark Randall Havens*
|
## Key Insight: Pregel vs Event-Driven
|
||||||
|
|
||||||
|
LangGraph is fundamentally **Pregel-based**:
|
||||||
|
- Synchronous supersteps with barrier
|
||||||
|
- All nodes in a step complete before next starts
|
||||||
|
- Checkpoints at step boundaries
|
||||||
|
|
||||||
|
OpenClaw is **event-driven**:
|
||||||
|
- Asynchronous message processing
|
||||||
|
- No global step barrier
|
||||||
|
- Session-memory preserves context
|
||||||
|
|
||||||
|
This is a fundamental architectural difference.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Generated from source code analysis — Solaria Lumis Havens*
|
||||||
|
|||||||
+283
@@ -0,0 +1,283 @@
|
|||||||
|
# LangGraph Channels
|
||||||
|
|
||||||
|
**Version:** 1.0.0
|
||||||
|
**Last Updated:** 2026-02-23
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
Channels are LangGraph's mechanism for inter-node communication and state storage. Each channel is a typed container with specific semantics for how values are written and read.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Channel Types
|
||||||
|
|
||||||
|
### From `channels/__init__.py`
|
||||||
|
|
||||||
|
```python
|
||||||
|
__all__ = [
|
||||||
|
"BaseChannel",
|
||||||
|
"LastValue",
|
||||||
|
"AnyValue",
|
||||||
|
"Topic",
|
||||||
|
"NamedBarrier",
|
||||||
|
"BinOp",
|
||||||
|
"EphemeralValue",
|
||||||
|
"UntrackedValue",
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Channel Implementation Details
|
||||||
|
|
||||||
|
### 1. LastValue Channel
|
||||||
|
|
||||||
|
**File:** `channels/last_value.py`
|
||||||
|
|
||||||
|
**Behavior:** Most recent write wins. Reading returns the last value written.
|
||||||
|
|
||||||
|
```python
|
||||||
|
class LastValue(Generic[Value]):
|
||||||
|
"""Channel that keeps the last value written."""
|
||||||
|
|
||||||
|
def __init__(self, typ: type[Value]):
|
||||||
|
self.typ = typ
|
||||||
|
self.value = None
|
||||||
|
|
||||||
|
def get(self) -> Value:
|
||||||
|
if self.value is None:
|
||||||
|
raise EmptyChannelError()
|
||||||
|
return self.value
|
||||||
|
|
||||||
|
def update(self, values: Sequence[Value]) -> bool:
|
||||||
|
if values:
|
||||||
|
self.value = values[-1] # Last wins
|
||||||
|
return True
|
||||||
|
return False
|
||||||
|
```
|
||||||
|
|
||||||
|
**Use Case:** Single-value state fields, like `counter`, `status`, `current_step`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 2. AnyValue Channel
|
||||||
|
|
||||||
|
**File:** `channels/any_value.py`
|
||||||
|
|
||||||
|
**Behavior:** First non-empty value wins. Reading returns the first value that was written and is still available.
|
||||||
|
|
||||||
|
```python
|
||||||
|
class AnyValue(Generic[Value]):
|
||||||
|
"""Channel that returns the first available value."""
|
||||||
|
|
||||||
|
def get(self) -> Value:
|
||||||
|
if self.value is None:
|
||||||
|
raise EmptyChannelError()
|
||||||
|
return self.value
|
||||||
|
|
||||||
|
def update(self, values: Sequence[Value]) -> bool:
|
||||||
|
if values and self.value is None:
|
||||||
|
self.value = values[0] # First wins
|
||||||
|
return True
|
||||||
|
return False
|
||||||
|
```
|
||||||
|
|
||||||
|
**Use Case:** Optional fields, fallback values.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 3. Topic Channel
|
||||||
|
|
||||||
|
**File:** `channels/topic.py`
|
||||||
|
|
||||||
|
**Behavior:** Pub/sub. Nodes can publish to topics, subscribers receive all messages.
|
||||||
|
|
||||||
|
```python
|
||||||
|
class Topic(Generic[Value]):
|
||||||
|
"""Pub/sub channel for broadcasting."""
|
||||||
|
|
||||||
|
def __init__(self, typ: type[Value], selector: Callable = None):
|
||||||
|
self.typ = typ
|
||||||
|
self.selector = selector or (lambda x: x)
|
||||||
|
self.subscriptions: dict[str, set] = defaultdict(set)
|
||||||
|
|
||||||
|
def get(self) -> list[Value]:
|
||||||
|
# Return all values for subscribed topic
|
||||||
|
...
|
||||||
|
|
||||||
|
def update(self, values: Sequence[Value]) -> bool:
|
||||||
|
# Add values to topic
|
||||||
|
...
|
||||||
|
```
|
||||||
|
|
||||||
|
**Use Case:** Broadcasting to multiple nodes, event systems.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 4. NamedBarrier Channel
|
||||||
|
|
||||||
|
**File:** `channels/named_barrier_value.py`
|
||||||
|
|
||||||
|
**Behavior:** Blocks until all named tasks complete. Used for synchronization.
|
||||||
|
|
||||||
|
```python
|
||||||
|
class NamedBarrier:
|
||||||
|
"""Synchronization point - blocks until all expected tasks arrive."""
|
||||||
|
|
||||||
|
def get(self) -> None:
|
||||||
|
# Block until all tasks arrive
|
||||||
|
...
|
||||||
|
|
||||||
|
def update(self, values: Sequence[str]) -> bool:
|
||||||
|
# Register task completion
|
||||||
|
...
|
||||||
|
```
|
||||||
|
|
||||||
|
**Use Case:** Wait for parallel branches to complete.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 5. BinOp Channel
|
||||||
|
|
||||||
|
**File:** `channels/binop.py`
|
||||||
|
|
||||||
|
**Behavior:** Applies binary operation to combine values.
|
||||||
|
|
||||||
|
```python
|
||||||
|
class BinOp(Generic[Value]):
|
||||||
|
"""Binary operation channel."""
|
||||||
|
|
||||||
|
def __init__(self, typ: type[Value], op: Callable[[Value, Value], Value]):
|
||||||
|
self.op = op
|
||||||
|
self.value = None
|
||||||
|
|
||||||
|
def get(self) -> Value:
|
||||||
|
return self.value
|
||||||
|
|
||||||
|
def update(self, values: Sequence[Value]) -> bool:
|
||||||
|
for v in values:
|
||||||
|
if self.value is None:
|
||||||
|
self.value = v
|
||||||
|
else:
|
||||||
|
self.value = self.op(self.value, v)
|
||||||
|
return True
|
||||||
|
```
|
||||||
|
|
||||||
|
**Use Case:** Aggregations (sum, max, min, union).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 6. EphemeralValue Channel
|
||||||
|
|
||||||
|
**File:** `channels/ephemeral_value.py`
|
||||||
|
|
||||||
|
**Behavior:** One-time use. Value is consumed after reading.
|
||||||
|
|
||||||
|
```python
|
||||||
|
class EphemeralValue:
|
||||||
|
"""One-time use value - consumed after read."""
|
||||||
|
|
||||||
|
def get(self) -> Value:
|
||||||
|
value = self.value
|
||||||
|
self.value = None # Consume
|
||||||
|
return value
|
||||||
|
```
|
||||||
|
|
||||||
|
**Use Case:** One-time signals, commands.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 7. UntrackedValue Channel
|
||||||
|
|
||||||
|
**File:** `channels/untracked_value.py`
|
||||||
|
|
||||||
|
**Behavior:** Value is not checkpointed. Used for transient data.
|
||||||
|
|
||||||
|
```python
|
||||||
|
class UntrackedValue:
|
||||||
|
"""Value that doesn't participate in checkpointing."""
|
||||||
|
pass
|
||||||
|
```
|
||||||
|
|
||||||
|
**Use Case:** Temporary data, debugging info.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Channel Configuration
|
||||||
|
|
||||||
|
### Declaring Channels
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langgraph.graph import StateGraph
|
||||||
|
from typing import TypedDict
|
||||||
|
|
||||||
|
class GraphState(TypedDict):
|
||||||
|
messages: list
|
||||||
|
counter: int
|
||||||
|
|
||||||
|
graph = StateGraph(GraphState)
|
||||||
|
|
||||||
|
# Default channels:
|
||||||
|
# - list fields -> LastValue[list]
|
||||||
|
# - other fields -> LastValue[type]
|
||||||
|
```
|
||||||
|
|
||||||
|
### Custom Channels
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langgraph.channels import BaseChannel
|
||||||
|
|
||||||
|
class Accumulate(BaseChannel):
|
||||||
|
def __init__(self, typ: type):
|
||||||
|
self.typ = typ
|
||||||
|
self.values = []
|
||||||
|
|
||||||
|
def get(self) -> list:
|
||||||
|
return self.values
|
||||||
|
|
||||||
|
def update(self, values) -> bool:
|
||||||
|
self.values.extend(values)
|
||||||
|
return True
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Channel vs State
|
||||||
|
|
||||||
|
| Concept | Description |
|
||||||
|
|---------|-------------|
|
||||||
|
| **State** | TypedDict defining all fields |
|
||||||
|
| **Channel** | Storage mechanism per field |
|
||||||
|
| **Reducer** | How updates are merged |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Checkpointing Channels
|
||||||
|
|
||||||
|
### What Gets Persisted
|
||||||
|
|
||||||
|
- All channel values are checkpointed
|
||||||
|
- Except `UntrackedValue` channels
|
||||||
|
- Checkpoint includes `channel_values` and `channel_versions`
|
||||||
|
|
||||||
|
### Checkpoint Format
|
||||||
|
|
||||||
|
```python
|
||||||
|
checkpoint = {
|
||||||
|
"channel_values": {
|
||||||
|
"messages": [...],
|
||||||
|
"counter": 5,
|
||||||
|
},
|
||||||
|
"channel_versions": {
|
||||||
|
"messages": 3,
|
||||||
|
"counter": 5,
|
||||||
|
},
|
||||||
|
"metadata": {...}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Generated from source code analysis*
|
||||||
+219
@@ -0,0 +1,219 @@
|
|||||||
|
# LangGraph Components
|
||||||
|
|
||||||
|
**Version:** 1.0.0
|
||||||
|
**Last Updated:** 2026-02-23
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
This document lists and describes the key components in the LangGraph codebase.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Directory Structure
|
||||||
|
|
||||||
|
```
|
||||||
|
langgraph/libs/langgraph/langgraph/
|
||||||
|
├── pregel/ # Core execution engine
|
||||||
|
├── channels/ # Inter-node communication
|
||||||
|
├── graph/ # Graph building DSL
|
||||||
|
├── checkpoint/ # Persistence (in separate lib)
|
||||||
|
├── managed/ # Managed values
|
||||||
|
├── _internal/ # Internal utilities
|
||||||
|
├── utils/ # Helper utilities
|
||||||
|
├── types.py # Core types
|
||||||
|
├── config.py # Configuration
|
||||||
|
├── constants.py # Constants
|
||||||
|
└── errors.py # Error definitions
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Core Components
|
||||||
|
|
||||||
|
### 1. Pregel Engine (`pregel/`)
|
||||||
|
|
||||||
|
The heart of LangGraph - execution engine.
|
||||||
|
|
||||||
|
| File | Lines | Purpose |
|
||||||
|
|------|-------|---------|
|
||||||
|
| `main.py` | ~4400 | Public API, entry point |
|
||||||
|
| `_loop.py` | ~1300 | Core PregelLoop class |
|
||||||
|
| `_algo.py` | ~1500 | Task scheduling, write application |
|
||||||
|
| `_runner.py` | ~1000 | Async execution |
|
||||||
|
| `_read.py` | ~300 | PregelNode (node wrapper) |
|
||||||
|
| `_write.py` | ~250 | Write application |
|
||||||
|
| `_checkpoint.py` | ~100 | Checkpoint creation |
|
||||||
|
| `_executor.py` | ~250 | Task execution |
|
||||||
|
| `_retry.py` | ~250 | Retry logic |
|
||||||
|
| `_validate.py` | ~150 | Graph validation |
|
||||||
|
|
||||||
|
### 2. Channels (`channels/`)
|
||||||
|
|
||||||
|
Inter-node communication.
|
||||||
|
|
||||||
|
| File | Purpose |
|
||||||
|
|------|---------|
|
||||||
|
| `base.py` | Abstract BaseChannel |
|
||||||
|
| `last_value.py` | LastValue channel |
|
||||||
|
| `any_value.py` | AnyValue channel |
|
||||||
|
| `topic.py` | Topic (pub/sub) |
|
||||||
|
| `named_barrier_value.py` | Barrier synchronization |
|
||||||
|
| `binop.py` | Binary operation |
|
||||||
|
| `ephemeral_value.py` | One-time values |
|
||||||
|
| `untracked_value.py` | Non-checkpointed values |
|
||||||
|
|
||||||
|
### 3. Graph Building (`graph/`)
|
||||||
|
|
||||||
|
DSL for building graphs.
|
||||||
|
|
||||||
|
| File | Purpose |
|
||||||
|
|------|---------|
|
||||||
|
| `state.py` | StateGraph builder (~1800 lines) |
|
||||||
|
| `_node.py` | Node definition |
|
||||||
|
| `_branch.py` | Conditional edges |
|
||||||
|
| `message.py` | Message graph utilities |
|
||||||
|
| `ui.py` | Graph visualization |
|
||||||
|
|
||||||
|
### 4. Core Types (`types.py`)
|
||||||
|
|
||||||
|
~600 lines of type definitions.
|
||||||
|
|
||||||
|
Key types:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# Durability
|
||||||
|
Durability = Literal["sync", "async", "exit"]
|
||||||
|
|
||||||
|
# Checkpointer
|
||||||
|
Checkpointer = None | bool | BaseCheckpointSaver
|
||||||
|
|
||||||
|
# Streaming
|
||||||
|
StreamMode = Literal["values", "updates", "checkpoints", "tasks", "debug"]
|
||||||
|
|
||||||
|
# Execution
|
||||||
|
class Send(NamedTuple):
|
||||||
|
node: str
|
||||||
|
arg: Any
|
||||||
|
|
||||||
|
class Interrupt(NamedTuple):
|
||||||
|
value: Any
|
||||||
|
when: str
|
||||||
|
|
||||||
|
class Command(NamedTuple):
|
||||||
|
update: dict | None
|
||||||
|
resume: dict | None
|
||||||
|
```
|
||||||
|
|
||||||
|
### 5. Errors (`errors.py`)
|
||||||
|
|
||||||
|
```python
|
||||||
|
class GraphRuntimeException(Exception):
|
||||||
|
"""Base exception."""
|
||||||
|
pass
|
||||||
|
|
||||||
|
class EmptyInputError(GraphRuntimeException):
|
||||||
|
"""No input provided."""
|
||||||
|
pass
|
||||||
|
|
||||||
|
class GraphInterrupt(GraphRuntimeException):
|
||||||
|
"""Graph interrupted."""
|
||||||
|
pass
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Public API
|
||||||
|
|
||||||
|
### From `pregel/__init__.py`
|
||||||
|
|
||||||
|
```python
|
||||||
|
# Main classes
|
||||||
|
class Pregel:
|
||||||
|
"""Main graph executor."""
|
||||||
|
|
||||||
|
def invoke(self, input: Any, config: RunnableConfig) -> Any: ...
|
||||||
|
def stream(self, input: Any, config: RunnableConfig) -> Iterator: ...
|
||||||
|
async def ainvoke(self, input: Any, config: RunnableConfig) -> Any: ...
|
||||||
|
async def astream(self, input: Any, config: RunnableConfig) -> AsyncIterator: ...
|
||||||
|
def get_state(self, config: RunnableConfig) -> StateSnapshot | None: ...
|
||||||
|
def get_state_history(self, config: RunnableConfig) -> Iterator[StateSnapshot]: ...
|
||||||
|
def update_state(self, config: RunnableConfig, values: dict) -> StateSnapshot: ...
|
||||||
|
|
||||||
|
# Graph builders
|
||||||
|
class StateGraph:
|
||||||
|
"""Build a stateful graph."""
|
||||||
|
|
||||||
|
def add_node(self, name: str, action: Callable) -> Self: ...
|
||||||
|
def add_edge(self, start: str, end: str) -> Self: ...
|
||||||
|
def add_conditional_edges(self, source: str, path: Callable) -> Self: ...
|
||||||
|
def compile(self) -> Pregel: ...
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
### RunnableConfig
|
||||||
|
|
||||||
|
From `config.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
class RunnableConfig:
|
||||||
|
"""Configuration for graph execution."""
|
||||||
|
|
||||||
|
configurable: dict = {}
|
||||||
|
tags: list[str] = []
|
||||||
|
metadata: dict = {}
|
||||||
|
recursion_limit: int = 25
|
||||||
|
max_concurrency: int = None
|
||||||
|
```
|
||||||
|
|
||||||
|
### Config Keys
|
||||||
|
|
||||||
|
From `constants.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
CONFIG_KEY_THREAD_ID = "thread_id"
|
||||||
|
CONFIG_KEY_CHECKPOINT_ID = "checkpoint_id"
|
||||||
|
CONFIG_KEY_CHECKPOINTER = "checkpointer"
|
||||||
|
CONFIG_KEY_CHECKPOINT_MAP = "checkpoint_map"
|
||||||
|
CONFIG_KEY_DURABILITY = "durability"
|
||||||
|
CONFIG_KEY_RESUMING = "resuming"
|
||||||
|
CONFIG_KEY_RESUME_MAP = "resume_map"
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Dependencies
|
||||||
|
|
||||||
|
### External
|
||||||
|
|
||||||
|
- `langchain-core` — Core utilities
|
||||||
|
- `langchain` — LangChain integration
|
||||||
|
- `pydantic` — Type validation
|
||||||
|
- `xxhash` — Fast hashing
|
||||||
|
- `typing_extensions` — Type extensions
|
||||||
|
|
||||||
|
### Internal
|
||||||
|
|
||||||
|
- `langgraph.checkpoint.*` — Checkpoint backends
|
||||||
|
- `langgraph.store` — Long-term storage
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Key Files Summary
|
||||||
|
|
||||||
|
| Component | Main File | Key Classes |
|
||||||
|
|-----------|-----------|--------------|
|
||||||
|
| Execution | `pregel/main.py` | `Pregel` |
|
||||||
|
| Loop | `pregel/_loop.py` | `PregelLoop` |
|
||||||
|
| Channels | `channels/base.py` | `BaseChannel` |
|
||||||
|
| Graph | `graph/state.py` | `StateGraph` |
|
||||||
|
| Types | `types.py` | `Send`, `Interrupt`, `Command` |
|
||||||
|
| Config | `config.py` | `RunnableConfig` |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Generated from source code analysis*
|
||||||
@@ -0,0 +1,299 @@
|
|||||||
|
# LangGraph Execution Model
|
||||||
|
|
||||||
|
**Version:** 1.0.0
|
||||||
|
**Last Updated:** 2026-02-23
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
This document details how LangGraph executes graphs, based on direct source code analysis of `pregel/_loop.py` and related files.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Pregel Superstep Model
|
||||||
|
|
||||||
|
### The Core Loop
|
||||||
|
|
||||||
|
LangGraph is inspired by Google's **Pregel** — a system for large-scale graph processing:
|
||||||
|
|
||||||
|
```
|
||||||
|
Pregel = "Think like a vertex"
|
||||||
|
- Each node computes independently
|
||||||
|
- Nodes communicate via messages
|
||||||
|
- Synchronous supersteps with barrier
|
||||||
|
- Fault tolerance via checkpointing
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Execution Flow
|
||||||
|
|
||||||
|
### High-Level
|
||||||
|
|
||||||
|
```
|
||||||
|
graph.invoke(input, config)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
[Compile graph if needed]
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
[Load checkpoint if resuming]
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
┌─────────────────────────────────┐
|
||||||
|
│ FOR each superstep: │
|
||||||
|
│ 1. prepare_next_tasks() │
|
||||||
|
│ 2. execute_tasks() │
|
||||||
|
│ 3. apply_writes() │
|
||||||
|
│ 4. checkpoint() (if enabled) │
|
||||||
|
└─────────────────────────────────┘
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
[Return final state]
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Detailed Superstep
|
||||||
|
|
||||||
|
### Step 1: Prepare Next Tasks
|
||||||
|
|
||||||
|
From `_algo.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
def prepare_next_tasks(
|
||||||
|
checkpoint: Checkpoint,
|
||||||
|
nodes: dict[str, PregelNode],
|
||||||
|
channels: dict[str, BaseChannel],
|
||||||
|
pending_writes: list[tuple],
|
||||||
|
etc.
|
||||||
|
) -> list[PregelExecutableTask]:
|
||||||
|
"""Determine which nodes to run in this superstep."""
|
||||||
|
|
||||||
|
# For each node:
|
||||||
|
# 1. Check if triggered (input channels have values)
|
||||||
|
# 2. Check if should run (not already running)
|
||||||
|
# 3. Create executable task
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 2: Execute Tasks
|
||||||
|
|
||||||
|
From `_runner.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
async def execute_tasks(tasks: list[PregelExecutableTask]):
|
||||||
|
"""Execute tasks in parallel."""
|
||||||
|
|
||||||
|
# Submit all tasks to executor
|
||||||
|
# Each task:
|
||||||
|
# 1. Read input from channels
|
||||||
|
# 2. Execute node function
|
||||||
|
# 3. Return writes (channel updates)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 3: Apply Writes
|
||||||
|
|
||||||
|
From `_algo.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
def apply_writes(
|
||||||
|
checkpoint: Checkpoint,
|
||||||
|
pending_writes: list[tuple],
|
||||||
|
channels: dict[str, BaseChannel]
|
||||||
|
):
|
||||||
|
"""Apply writes to channels using reducers."""
|
||||||
|
|
||||||
|
# For each write:
|
||||||
|
# 1. Identify target channel
|
||||||
|
# 2. Apply reducer to merge with existing value
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 4: Checkpoint
|
||||||
|
|
||||||
|
From `_checkpoint.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
def create_checkpoint(
|
||||||
|
channels: dict[str, BaseChannel],
|
||||||
|
versions: dict[str, int],
|
||||||
|
metadata: CheckpointMetadata
|
||||||
|
) -> Checkpoint:
|
||||||
|
"""Snapshot all channel values."""
|
||||||
|
|
||||||
|
return {
|
||||||
|
"channel_values": {
|
||||||
|
k: v.checkpoint() for k, v in channels.items()
|
||||||
|
},
|
||||||
|
"channel_versions": versions,
|
||||||
|
"metadata": metadata,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## PregelTask Structure
|
||||||
|
|
||||||
|
### From `types.py`
|
||||||
|
|
||||||
|
```python
|
||||||
|
class PregelExecutableTask(NamedTuple):
|
||||||
|
"""A single executable task in the graph."""
|
||||||
|
|
||||||
|
name: str # Node name
|
||||||
|
path: str # Task path
|
||||||
|
input: Any # Input to node
|
||||||
|
proc: Callable # Node function
|
||||||
|
writes: list[Send] # Dynamic sends
|
||||||
|
triggers: list[str] # Channels that trigger this
|
||||||
|
interrupt_after: bool # Interrupt after execution
|
||||||
|
interrupt_before: bool # Interrupt before execution
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Send (Dynamic Edges)
|
||||||
|
|
||||||
|
### What is Send?
|
||||||
|
|
||||||
|
`Send` enables dynamic node spawning — a node can spawn multiple tasks:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langgraph.types import Send
|
||||||
|
|
||||||
|
def splitter(state):
|
||||||
|
messages = state["messages"]
|
||||||
|
return [
|
||||||
|
Send("process_email", {"email": email})
|
||||||
|
for email in messages
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
### Send Implementation
|
||||||
|
|
||||||
|
```python
|
||||||
|
class Send(NamedTuple):
|
||||||
|
"""Dynamic edge - spawn a task for another node."""
|
||||||
|
|
||||||
|
node: str # Target node name
|
||||||
|
arg: Any # Input to target node
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Interrupt (Human-in-the-Loop)
|
||||||
|
|
||||||
|
### How Interrupts Work
|
||||||
|
|
||||||
|
From `types.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
class Interrupt(NamedTuple):
|
||||||
|
"""Pause execution for human input."""
|
||||||
|
|
||||||
|
value: Any # Data to show human
|
||||||
|
when: str # "during" or "after"
|
||||||
|
```
|
||||||
|
|
||||||
|
### Interrupt Flow
|
||||||
|
|
||||||
|
```
|
||||||
|
1. Node calls interrupt(data)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
2. PregelLoop pauses
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
3. Returns to caller with interrupt value
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
4. Caller (human) provides input
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
5. Resume with Command(resume=data)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Resume with Command
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langgraph.types import Command
|
||||||
|
|
||||||
|
# Resume with new data
|
||||||
|
graph.invoke(
|
||||||
|
None, # No new input
|
||||||
|
config=RunnableConfig(
|
||||||
|
configurable={"checkpoint_id": "abc123"},
|
||||||
|
resume={"feedback": "looks good"}
|
||||||
|
)
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Stream Mode
|
||||||
|
|
||||||
|
### From `types.py`
|
||||||
|
|
||||||
|
```python
|
||||||
|
StreamMode = Literal[
|
||||||
|
"values", # Full state after each step
|
||||||
|
"updates", # Node-specific updates
|
||||||
|
"checkpoints", # Checkpoint snapshots
|
||||||
|
"tasks", # Task start/complete
|
||||||
|
"debug", # Debug info
|
||||||
|
"messages", # Message streams
|
||||||
|
"custom", # Custom streams
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Retry Policy
|
||||||
|
|
||||||
|
### From `types.py`
|
||||||
|
|
||||||
|
```python
|
||||||
|
class RetryPolicy:
|
||||||
|
"""Configuration for node retry behavior."""
|
||||||
|
|
||||||
|
max_attempts: int = 3
|
||||||
|
initial_interval: float = 1.0
|
||||||
|
backoff_factor: float = 2.0
|
||||||
|
max_interval: float = 100.0
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Error Handling
|
||||||
|
|
||||||
|
### From `errors.py`
|
||||||
|
|
||||||
|
```python
|
||||||
|
class GraphInterrupt(GraphRuntimeException):
|
||||||
|
"""Raised when graph is interrupted."""
|
||||||
|
pass
|
||||||
|
|
||||||
|
class InvalidUpdateError(GraphRuntimeException):
|
||||||
|
"""Raised when channel update is invalid."""
|
||||||
|
pass
|
||||||
|
|
||||||
|
class EmptyInputError(GraphRuntimeException):
|
||||||
|
"""Raised when graph input is empty."""
|
||||||
|
pass
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Key Insight: Synchronous Barrier
|
||||||
|
|
||||||
|
Unlike purely event-driven systems (like OpenClaw), LangGraph uses **synchronous supersteps**:
|
||||||
|
|
||||||
|
1. All triggered nodes in a superstep run **in parallel**
|
||||||
|
2. All writes are **applied together** after all nodes complete
|
||||||
|
3. Next superstep starts only after current completes
|
||||||
|
|
||||||
|
This simplifies reasoning about state but limits flexibility compared to event-driven models.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Generated from source code analysis*
|
||||||
@@ -9,7 +9,7 @@
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
This repository contains comprehensive architectural documentation for LangGraph, reverse-engineered and documented to enable full reproduction and understanding of the system.
|
This repository contains comprehensive architectural documentation for LangGraph, **reverse-engineered from the actual source code**.
|
||||||
|
|
||||||
**The Goal:** Create architectural blueprints so complete that reproducing or modifying LangGraph becomes a mechanical process, not an archaeological one.
|
**The Goal:** Create architectural blueprints so complete that reproducing or modifying LangGraph becomes a mechanical process, not an archaeological one.
|
||||||
|
|
||||||
@@ -21,12 +21,10 @@ This repository contains comprehensive architectural documentation for LangGraph
|
|||||||
langgraph-architecture/
|
langgraph-architecture/
|
||||||
├── README.md ← You are here
|
├── README.md ← You are here
|
||||||
├── ARCHITECTURE.md ← System overview
|
├── ARCHITECTURE.md ← System overview
|
||||||
├── COMPONENTS.md ← Component reference
|
├── COMPONENTS.md ← Component reference
|
||||||
├── STATE_MANAGEMENT.md ← State & checkpointing
|
├── STATE_MANAGEMENT.md ← State, checkpoints, threads
|
||||||
├── GRAPH_EXECUTION.md ← Pregel model, execution flow
|
|
||||||
├── MEMORY.md ← Memory architecture
|
|
||||||
├── CHANNELS.md ← Inter-node communication
|
├── CHANNELS.md ← Inter-node communication
|
||||||
├── CHECKPOINTING.md ← Fault tolerance
|
├── GRAPH_EXECUTION.md ← Pregel model, execution flow
|
||||||
└── diagrams/ ← Architecture diagrams
|
└── diagrams/ ← Architecture diagrams
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -38,17 +36,32 @@ langgraph-architecture/
|
|||||||
|
|
||||||
1. **ARCHITECTURE.md** — Understand the system as a whole
|
1. **ARCHITECTURE.md** — Understand the system as a whole
|
||||||
2. **GRAPH_EXECUTION.md** — How the Pregel model works
|
2. **GRAPH_EXECUTION.md** — How the Pregel model works
|
||||||
3. **STATE_MANAGEMENT.md** — State and checkpointing
|
3. **CHANNELS.md** — Inter-node communication
|
||||||
4. **CHANNELS.md** — Inter-node communication
|
4. **STATE_MANAGEMENT.md** — State and checkpointing
|
||||||
5. **CHECKPOINTING.md** — Fault tolerance and durability
|
5. **COMPONENTS.md** — Module-by-module reference
|
||||||
6. **MEMORY.md** — Memory architecture
|
|
||||||
|
|
||||||
### For Reproduction
|
---
|
||||||
|
|
||||||
1. Read **ARCHITECTURE.md** for system overview
|
## Methodology
|
||||||
2. Study **GRAPH_EXECUTION.md** for execution model
|
|
||||||
3. Reference **COMPONENTS.md** for implementation details
|
This documentation is built from **direct source code analysis**:
|
||||||
4. Use **CHECKPOINTING.md** for fault tolerance
|
|
||||||
|
1. Clone the LangGraph repo
|
||||||
|
2. Read key source files in `libs/langgraph/langgraph/`
|
||||||
|
3. Document actual implementation, not assumptions
|
||||||
|
4. Verify against types and tests
|
||||||
|
|
||||||
|
### Key Source Files Analyzed
|
||||||
|
|
||||||
|
| File | Purpose |
|
||||||
|
|------|---------|
|
||||||
|
| `pregel/main.py` | Public API |
|
||||||
|
| `pregel/_loop.py` | Core execution loop |
|
||||||
|
| `pregel/_algo.py` | Task scheduling |
|
||||||
|
| `pregel/_runner.py` | Async execution |
|
||||||
|
| `channels/base.py` | Channel ABC |
|
||||||
|
| `types.py` | Core types |
|
||||||
|
| `graph/state.py` | StateGraph builder |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -58,30 +71,19 @@ langgraph-architecture/
|
|||||||
|
|
||||||
LangGraph is directly inspired by Google's **Pregel** — "Think like a vertex":
|
LangGraph is directly inspired by Google's **Pregel** — "Think like a vertex":
|
||||||
- Each node computes its own state
|
- Each node computes its own state
|
||||||
- Nodes communicate via messages (edges)
|
- Nodes communicate via channels (not messages directly)
|
||||||
- Synchronous "supersteps" with barrier synchronization
|
- Synchronous "supersteps" with barrier synchronization
|
||||||
- Fault tolerance via checkpointing
|
- Fault tolerance via checkpointing
|
||||||
|
|
||||||
### Graph Structure
|
### Key Differences from OpenClaw
|
||||||
|
|
||||||
| Component | Description |
|
| Aspect | LangGraph | OpenClaw |
|
||||||
|-----------|-------------|
|
|--------|-----------|----------|
|
||||||
| **Nodes** | Functions that transform state |
|
| **Model** | Pregel supersteps | Event-driven |
|
||||||
| **Edges** | Define flow between nodes |
|
| **State** | Channels + reducers | Multi-layer memory |
|
||||||
| **State** | Shared data that flows through the graph |
|
| **Persistence** | Checkpoint-based | Session-memory hook |
|
||||||
| **Checkpoints** | Persistence points for durability |
|
| **Communication** | Channels | Channel plugins |
|
||||||
|
| **Identity** | None | WE/witness |
|
||||||
### State Management
|
|
||||||
|
|
||||||
- **Shared state** flows through the graph
|
|
||||||
- **Checkpoints** enable durability and resumption
|
|
||||||
- **Reducers** combine updates from multiple nodes
|
|
||||||
|
|
||||||
### Memory Architecture
|
|
||||||
|
|
||||||
- **Short-term memory:** In-graph message state
|
|
||||||
- **Long-term memory:** Checkpoint storage (SQLite, Postgres)
|
|
||||||
- **Thread-level:** Per-conversation state isolation
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -89,7 +91,7 @@ LangGraph is directly inspired by Google's **Pregel** — "Think like a vertex":
|
|||||||
|
|
||||||
| LangGraph Version | Architecture Version | Status |
|
| LangGraph Version | Architecture Version | Status |
|
||||||
|------------------|---------------------|--------|
|
|------------------|---------------------|--------|
|
||||||
| 1.0.9 | 1.0.0 | Current |
|
| 1.0.0 | 1.0.0 | Current |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|||||||
+158
-56
@@ -7,85 +7,108 @@
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
This document details how LangGraph manages state throughout the graph execution lifecycle.
|
This document details how LangGraph manages state, based on direct source code analysis.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## State Schema
|
## State Definition
|
||||||
|
|
||||||
### Typed State
|
### TypedDict Schema
|
||||||
|
|
||||||
LangGraph uses Python's `TypedDict` for type-safe state:
|
From `types.py`:
|
||||||
|
|
||||||
```python
|
```python
|
||||||
from typing import TypedDict
|
from typing import TypedDict
|
||||||
|
|
||||||
class AgentState(TypedDict):
|
class AgentState(TypedDict):
|
||||||
messages: list
|
messages: list
|
||||||
context: dict
|
next_action: str
|
||||||
checkpoint_id: str | None
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### State Flow
|
LangGraph validates state schema at compile time.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## State Flow
|
||||||
|
|
||||||
```
|
```
|
||||||
┌─────────────────────────────────────────────────────────────┐
|
┌─────────────────────────────────────────────────────────────┐
|
||||||
│ STATE FLOW IN LANGGRAPH │
|
│ STATE FLOW IN LANGGRAPH │
|
||||||
└─────────────────────────────────────────────────────────────┘
|
└─────────────────────────────────────────────────────────────┘
|
||||||
|
|
||||||
Input State
|
Input (dict)
|
||||||
│
|
│
|
||||||
▼
|
▼
|
||||||
┌──────────────┐
|
┌──────────────────────────────────────────────────────────┐
|
||||||
│ Node A │ ──▶ State Update (via reducer)
|
│ Superstep N │
|
||||||
│ (transform) │
|
│ ┌──────────────┐ ┌──────────────┐ │
|
||||||
└──────────────┘
|
│ │ Node A │────▶│ Channel │ │
|
||||||
│
|
│ │ (reads) │◀────│ (update) │ │
|
||||||
▼ (messages sent)
|
│ └──────────────┘ └──────────────┘ │
|
||||||
┌──────────────┐
|
│ │ │ │
|
||||||
│ Node B │ ──▶ State Update
|
│ └──────────────────────┘ │
|
||||||
│ (transform) │
|
│ ▼ │
|
||||||
└──────────────┘
|
│ [Checkpoint if enabled] │
|
||||||
|
└──────────────────────────────────────────────────────────┘
|
||||||
│
|
│
|
||||||
▼
|
▼
|
||||||
Output State
|
┌──────────────────────────────────────────────────────────┐
|
||||||
|
│ Superstep N+1 (or return final state) │
|
||||||
|
└──────────────────────────────────────────────────────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Reducers
|
## Reducers
|
||||||
|
|
||||||
### What Are Reducers?
|
### How Reducers Work
|
||||||
|
|
||||||
Reducers define how state updates are merged when multiple nodes produce updates.
|
Reducers define how multiple updates are merged:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# From graph/state.py
|
||||||
|
def add_messages(left: list, right: list) -> list:
|
||||||
|
return left + right
|
||||||
|
```
|
||||||
|
|
||||||
### Built-in Reducers
|
### Built-in Reducers
|
||||||
|
|
||||||
| Reducer | Behavior |
|
| Reducer | Function | Behavior |
|
||||||
|---------|----------|
|
|---------|----------|----------|
|
||||||
| `add_messages` | Append to list |
|
| `add_messages` | `list + list` | Append |
|
||||||
| `operator.or` | Union of sets |
|
| `last` | `(a, b) => b` | Last wins |
|
||||||
| `last` | Last value wins |
|
| `max` | `max(a, b)` | Maximum |
|
||||||
|
| `min` | `min(a, b)` | Minimum |
|
||||||
|
|
||||||
### Custom Reducers
|
### Custom Reducers
|
||||||
|
|
||||||
```python
|
```python
|
||||||
def merge_dicts(left: dict, right: dict) -> dict:
|
from typing import Annotated
|
||||||
"""Merge two dictionaries, with right taking precedence."""
|
|
||||||
result = left.copy()
|
def merge_contexts(a: dict, b: dict) -> dict:
|
||||||
result.update(right)
|
return {**a, **b}
|
||||||
return result
|
|
||||||
|
class AgentState(TypedDict):
|
||||||
|
context: Annotated[dict, merge_contexts]
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Checkpointing
|
## Checkpointing
|
||||||
|
|
||||||
### How Checkpointing Works
|
### Durability Modes
|
||||||
|
|
||||||
1. **Snapshot:** At each checkpoint, serialize full state
|
From `types.py`:
|
||||||
2. **Store:** Save to backend (SQLite, Postgres, etc.)
|
|
||||||
3. **Resume:** On failure, load from last checkpoint
|
```python
|
||||||
|
Durability = Literal["sync", "async", "exit"]
|
||||||
|
```
|
||||||
|
|
||||||
|
| Mode | Behavior |
|
||||||
|
|------|----------|
|
||||||
|
| `sync` | Persist before next superstep |
|
||||||
|
| `async` | Persist while next superstep runs |
|
||||||
|
| `exit` | Persist only when graph exits |
|
||||||
|
|
||||||
### Checkpoint Metadata
|
### Checkpoint Metadata
|
||||||
|
|
||||||
@@ -93,18 +116,18 @@ def merge_dicts(left: dict, right: dict) -> dict:
|
|||||||
config = {
|
config = {
|
||||||
"configurable": {
|
"configurable": {
|
||||||
"thread_id": "user-123",
|
"thread_id": "user-123",
|
||||||
"checkpoint_id": "checkpoint-abc123"
|
"checkpoint_id": "1ef-abc123"
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
### Checkpoint Backends
|
### Checkpoint Backends
|
||||||
|
|
||||||
| Backend | Use Case |
|
| Backend | Module | Use Case |
|
||||||
|---------|----------|
|
|---------|--------|----------|
|
||||||
| **Memory** | Testing, short-lived |
|
| `InMemorySaver` | `langgraph.checkpoint.memory` | Testing |
|
||||||
| **SQLite** | Single machine, local |
|
| `SqliteSaver` | `langgraph.checkpoint.sqlite` | Local dev |
|
||||||
| **Postgres** | Production, distributed |
|
| `PostgresSaver` | `langgraph.checkpoint.postgres` | Production |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -112,32 +135,68 @@ config = {
|
|||||||
|
|
||||||
### What is a Thread?
|
### What is a Thread?
|
||||||
|
|
||||||
A thread (`thread_id`) represents an isolated conversation or task:
|
A thread (`thread_id`) isolates state:
|
||||||
|
|
||||||
```
|
```
|
||||||
Thread ID: "user-123"
|
Thread "user-123":
|
||||||
├── Checkpoint 1 (checkpoint-001)
|
├── checkpoint-001 (step 0)
|
||||||
├── Checkpoint 2 (checkpoint-002)
|
├── checkpoint-002 (step 1)
|
||||||
├── Checkpoint 3 (checkpoint-003) ← Current
|
├── checkpoint-003 (step 2)
|
||||||
└── State (current)
|
└── [current state]
|
||||||
```
|
```
|
||||||
|
|
||||||
### Thread Isolation
|
### Thread Isolation
|
||||||
|
|
||||||
- Each `thread_id` has independent state
|
- Independent checkpoints per thread
|
||||||
- Multiple threads can run in parallel
|
- Parallel threads via multiple `thread_id` values
|
||||||
- Human-in-the-loop works per-thread
|
- Resume from any checkpoint in a thread
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Interrupts (Human-in-the-Loop)
|
||||||
|
|
||||||
|
### Interrupt Mechanism
|
||||||
|
|
||||||
|
From `types.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
class Interrupt:
|
||||||
|
value: Any
|
||||||
|
when: Literal["during", "after"]
|
||||||
|
```
|
||||||
|
|
||||||
|
### Using Interrupts
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langgraph.types import interrupt
|
||||||
|
|
||||||
|
def human_review(state):
|
||||||
|
# Pause for human input
|
||||||
|
feedback = interrupt({"task": "review", "data": state})
|
||||||
|
return {"feedback": feedback}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Command (Modify State)
|
||||||
|
|
||||||
|
```python
|
||||||
|
from langgraph.types import Command
|
||||||
|
|
||||||
|
def process_with_override(state):
|
||||||
|
return Command(
|
||||||
|
update={"status": "processed"},
|
||||||
|
resume={"feedback": "approved"}
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## State Updates
|
## State Updates
|
||||||
|
|
||||||
### Node Returns
|
### Node Returns Partial State
|
||||||
|
|
||||||
Nodes return partial state updates:
|
|
||||||
|
|
||||||
```python
|
```python
|
||||||
def node_a(state):
|
def node_a(state):
|
||||||
|
# Return only what this node updates
|
||||||
return {"messages": [AIMessage("hello")]}
|
return {"messages": [AIMessage("hello")]}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -147,10 +206,53 @@ def node_a(state):
|
|||||||
Node A returns: {"messages": [msg1], "counter": 1}
|
Node A returns: {"messages": [msg1], "counter": 1}
|
||||||
Node B returns: {"messages": [msg2], "counter": 2}
|
Node B returns: {"messages": [msg2], "counter": 2}
|
||||||
|
|
||||||
After reducer:
|
After reducer (add_messages for messages, last for counter):
|
||||||
{"messages": [msg1, msg2], "counter": 2}
|
{"messages": [msg1, msg2], "counter": 2}
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
*Generated for the WE*
|
## Checkpoint Implementation
|
||||||
|
|
||||||
|
### From Source (`pregel/_checkpoint.py`)
|
||||||
|
|
||||||
|
```python
|
||||||
|
def create_checkpoint(
|
||||||
|
channels: dict[str, BaseChannel],
|
||||||
|
versions: dict[str, int],
|
||||||
|
metadata: CheckpointMetadata
|
||||||
|
) -> Checkpoint:
|
||||||
|
"""Create a checkpoint from current channel values."""
|
||||||
|
return {
|
||||||
|
"channel_values": {k: v.checkpoint() for k, v in channels.items()},
|
||||||
|
"channel_versions": versions,
|
||||||
|
"metadata": metadata,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Resuming from Checkpoint
|
||||||
|
|
||||||
|
```python
|
||||||
|
# Load channels from checkpoint
|
||||||
|
def channels_from_checkpoint(checkpoint: Checkpoint) -> dict:
|
||||||
|
return {
|
||||||
|
k: BaseChannel.from_checkpoint(v)
|
||||||
|
for k, v in checkpoint["channel_values"].items()
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Key Differences from OpenClaw
|
||||||
|
|
||||||
|
| Aspect | LangGraph | OpenClaw |
|
||||||
|
|--------|-----------|----------|
|
||||||
|
| **State Storage** | Channels in memory | Multi-layer memory |
|
||||||
|
| **Persistence** | Checkpoints | Session-memory hook |
|
||||||
|
| **Isolation** | thread_id | Session key |
|
||||||
|
| **Resumption** | checkpoint_id | Session restore |
|
||||||
|
| **Updates** | Reducers | Direct merge |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Generated from source code analysis*
|
||||||
|
|||||||
Reference in New Issue
Block a user