Revised from direct source code analysis

Solaria Lumis Havens
2026-02-23 13:11:51 -06:00
parent 6046a9a609
commit 8e37dedbd9
6 changed files with 1134 additions and 239 deletions
+137 -147
@@ -1,14 +1,16 @@
# LangGraph Architecture Overview
**Version:** 1.0.0
**LangGraph Version:** 1.0.9
**LangGraph Version:** 1.0.0 (from source)
**Last Updated:** 2026-02-23
---
## Executive Summary
LangGraph is a low-level orchestration framework for building stateful, long-running multi-agent systems. Inspired by Google's Pregel, Apache Beam, and NetworkX, it provides durable execution, human-in-the-loop capabilities, and comprehensive memory management.
LangGraph is a low-level orchestration framework for building stateful, long-running multi-agent systems. Inspired by Google's **Pregel**, it provides durable execution, human-in-the-loop capabilities, and comprehensive checkpoint-based memory.
This document is reverse-engineered from the actual source code.
---
@@ -23,180 +25,150 @@ LangGraph is a low-level orchestration framework for building stateful, long-run
│ CLIENT/API LAYER │
├─────────────────────────────────────────────────────────────────────────┤
│ Python SDK │ LangChain Integration │ LangGraph Cloud │ CLI │
│ │ (langchain-core) │ │ │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
COMPILER LAYER
PREGEL ENGINE
├─────────────────────────────────────────────────────────────────────────┤
• Graph compilation to executable form
• State schema validation
• Node/Edge type resolution
┌─────────────────────────────────────────────────────────────────┐
│ PregelLoop class
│ - _loop.py (~1300 lines) — Core execution engine
│ │ - _algo.py (~1500 lines) — Task scheduling, writes │ │
│ │ - _runner.py (~1000 lines) — Async execution │ │
│ │ - main.py (~4400 lines) — Entry point, public API │ │
│ └─────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
RUNTIME LAYER
CHANNELS LAYER │
├─────────────────────────────────────────────────────────────────────────┤
┌─────────────────────────────────────────────────────────────────┐
│ PREGEL EXECUTION ENGINE
│ • Superstep coordination
│ • Node scheduling
│ • Message passing
│ • Barrier synchronization
─────────────────────────────────────────────────────────────────┘
BaseChannel (abc)
├── LastValue — Most recent value wins
├── AnyValue — Last value received; writes assumed equal
├── Topic — Pub/sub style
├── NamedBarrier — Synchronization point
├── BinOp — Binary operation │
── EphemeralValue — One-time use
│ └── UntrackedValue — Value without checkpointing │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
STATE & CHECKPOINTING │
│ CHECKPOINTING LAYER
├─────────────────────────────────────────────────────────────────────────┤
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ In-Memory State │ │ Checkpointer │ │ Channel Store
│ (active graph) (persistence) (queues)
│ └──────────────────┘ └──────────────────┘ └──────────────────┘
libs/checkpoint/
├── checkpoint-base — Abstract checkpoint interface
├── checkpoint-sqlite — SQLite backend
│ └── checkpoint-postgres — PostgreSQL backend
└─────────────────────────────────────────────────────────────────────────┘
```
---
## Core Components
## Core Concepts (From Source)
### 1. Graph Structure
### 1. PregelLoop Class
| Component | Description |
|-----------|-------------|
| **State** | Typed dictionary that flows through the graph |
| **Nodes** | Functions that receive state, optionally update it |
| **Edges** | Control flow (conditional, static, entrypoint) |
| **Reducers** | Functions that merge state updates |
### 2. Pregel Execution
The core execution model (inspired by Pregel):
```
Superstep 1: Superstep 2: Superstep 3:
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Node A │ │ │ │ │
│ (active) │──────▶│ Node B │──────▶│ Node C │
│ │ msgs │ (active) │ msgs │ (active) │
└──────────┘ └──────────┘ └──────────┘
│ │ │
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ State │ │ State │ │ State │
│ Update │ │ Update │ │ Update │
└──────────┘ └──────────┘ └──────────┘
│ │ │
└──────────────────┴──────────────────┘
▼ (CHECKPOINT)
┌──────────┐
│ SQLite │
│ Postgres │
│ Memory │
└──────────┘
```
### 3. Checkpointing
LangGraph provides durability through checkpointing:
- **Full state snapshots** saved at configurable points
- **Resumable from failure** — replay from last checkpoint
- **Multiple backends:** SQLite, Postgres, in-memory
### 4. Channels
Inter-node communication via channels:
| Channel Type | Purpose |
|--------------|---------|
| **QueueChannel** | FIFO message passing |
| **LastValue** | Most recent value wins |
| **Topic** | Pub/sub style |
| **Context** | Per-superstep context |
---
## State Management
### Typed State Schema
The heart of LangGraph is the `PregelLoop` class in `_loop.py`:
```python
from typing import TypedDict
class AgentState(TypedDict):
    messages: list
    next_action: str
    checkpoint_id: str | None
class PregelLoop:
    config: RunnableConfig  # Thread, checkpoint_id, etc.
    store: BaseStore | None  # Long-term storage
    stream: StreamProtocol  # Output streaming
    step: int  # Current step number
    checkpointer: BaseCheckpointSaver | None
    nodes: Mapping[str, PregelNode]  # Graph nodes
    channels: Mapping[str, BaseChannel]  # Inter-node communication
```
### Reducers
### 2. State Flow
Combine updates from multiple nodes:
```
Input → [Superstep N] → Checkpoint → [Superstep N+1] → ... → Output
Each superstep:
1. prepare_next_tasks() — Determine which nodes to run
2. execute_tasks() — Run active nodes in parallel
3. apply_writes() — Merge node outputs into channels
4. checkpoint() — Persist state (if enabled)
```
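The four phases above can be sketched as a toy loop in plain Python. This is an illustration of the bulk-synchronous idea only, not LangGraph's implementation: `run_supersteps`, the `triggers` mapping, and the dict-based channels are hypothetical names invented for this example.

```python
from typing import Callable

# Toy model: a node reads a snapshot of the channels and returns its writes.
# All names here are illustrative assumptions, not LangGraph APIs.
Node = Callable[[dict], dict]

def run_supersteps(nodes: dict[str, Node], triggers: dict[str, str],
                   channels: dict, max_steps: int = 10) -> tuple[dict, list[dict]]:
    """Run nodes in synchronous steps; return final channels and checkpoints."""
    checkpoints = []
    updated = set(channels)  # channels written by the input count as updated
    for _ in range(max_steps):
        # 1. Prepare: a node is active if its trigger channel was just updated.
        active = [name for name, ch in triggers.items() if ch in updated]
        if not active:
            break
        # 2. Execute: every active node sees the same snapshot (the barrier).
        snapshot = dict(channels)
        writes = [nodes[name](snapshot) for name in active]
        # 3. Apply writes: merged only after all nodes in the step finished.
        updated = set()
        for w in writes:
            channels.update(w)
            updated.update(w)
        # 4. Checkpoint: record the state at the step boundary.
        checkpoints.append(dict(channels))
    return channels, checkpoints

nodes = {
    "double": lambda s: {"doubled": s["input"] * 2},
    "report": lambda s: {"output": f"result={s['doubled']}"},
}
triggers = {"double": "input", "report": "doubled"}
final, history = run_supersteps(nodes, triggers, {"input": 21})
print(final["output"])  # result=42
```

Because writes are applied only in phase 3, `report` cannot see `doubled` until the step after `double` ran, which is why the run takes two supersteps and yields two checkpoints.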
### 3. Channels (Inter-Node Communication)
From `channels/base.py`:
```python
def add_messages(left: list, right: list) -> list:
    return left + right
class BaseChannel(Generic[Value, Update, Checkpoint], ABC):
    """Base class for all channels."""
    @abstractmethod
    def get(self) -> Value:
        """Return the current value."""
    @abstractmethod
    def update(self, values: Sequence[Update]) -> bool:
        """Update with values from nodes."""
    @abstractmethod
    def checkpoint(self) -> Checkpoint | Any:
        """Serialize state for persistence."""
```
**Channel Types:**
| Channel | Behavior | Use Case |
|---------|----------|----------|
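| `LastValue` | Stores the most recent value; accepts at most one write per step | Single-value state |
| `AnyValue` | Stores the last value received, assuming all writes in a step are equal | Values set identically by several nodes |
| `Topic` | Pub/sub; collects multiple values | Broadcasting |
| `NamedBarrier` (`NamedBarrierValue` in source) | Becomes available once every named writer has written | Synchronization |
| `BinOp` (`BinaryOperatorAggregate` in source) | Folds each write into the current value with a binary operator | Aggregations |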
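
To make the difference between a last-value channel and a binary-operator channel concrete, here is a minimal sketch in plain Python. `ToyLastValue` and `ToyBinOp` are hypothetical stand-ins written for this example; they mimic only the `update()` contract shown above and are not the LangGraph classes.

```python
import operator
from typing import Any, Callable, Sequence

class ToyLastValue:
    """Keeps the most recent value; this toy allows one write per step."""
    def __init__(self) -> None:
        self.value: Any = None

    def update(self, values: Sequence[Any]) -> bool:
        if not values:
            return False  # nothing written this step
        if len(values) > 1:
            raise ValueError("ToyLastValue accepts one write per step")
        self.value = values[-1]
        return True

class ToyBinOp:
    """Folds every write into the current value with a binary operator."""
    def __init__(self, op: Callable[[Any, Any], Any], initial: Any) -> None:
        self.op = op
        self.value = initial

    def update(self, values: Sequence[Any]) -> bool:
        for v in values:
            self.value = self.op(self.value, v)
        return bool(values)

status = ToyLastValue()
status.update(["running"])  # step 1
status.update(["done"])     # step 2 overwrites step 1

messages = ToyBinOp(operator.add, [])
messages.update([["hi"], ["there"]])  # two nodes wrote in the same step
print(status.value, messages.value)
```

The design point is that a reducer-style channel can absorb several writes from one superstep, whereas a last-value channel has no rule for choosing between them.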
### 4. Checkpointing
From `types.py`:
```python
Durability = Literal["sync", "async", "exit"]
"""- 'sync': Persist before next step
- 'async': Persist while next step runs
- 'exit': Persist only on exit"""
```
**Checkpoint Flow:**
1. `create_checkpoint()` — Snapshot all channels
2. Save to backend (SQLite/Postgres/InMemory)
3. Return `checkpoint_id` for resumption
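
The snapshot, save, and resume steps can be sketched with an in-memory store. `ToyCheckpointStore` and its `save`/`load` methods are hypothetical names for this illustration and do not reflect the interface of LangGraph's checkpoint savers.

```python
import copy
import uuid

class ToyCheckpointStore:
    """In-memory stand-in for a checkpoint backend (hypothetical)."""
    def __init__(self) -> None:
        self._saved: dict[tuple[str, str], dict] = {}

    def save(self, thread_id: str, channels: dict) -> str:
        checkpoint_id = uuid.uuid4().hex
        # Deep-copy so later mutation of live state cannot alter the snapshot.
        self._saved[(thread_id, checkpoint_id)] = copy.deepcopy(channels)
        return checkpoint_id

    def load(self, thread_id: str, checkpoint_id: str) -> dict:
        return copy.deepcopy(self._saved[(thread_id, checkpoint_id)])

store = ToyCheckpointStore()
state = {"messages": ["hi"], "step": 1}
cp_id = store.save("thread-1", state)

state["messages"].append("there")  # work continues after the checkpoint
state["step"] = 2

restored = store.load("thread-1", cp_id)  # resume from the earlier snapshot
print(restored)
```

Keying snapshots by both thread and checkpoint ID is what lets one conversation be resumed without touching another.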
### 5. Send (Dynamic Graph Execution)
LangGraph supports dynamic node spawning via `Send`:
```python
from langgraph.types import Send
def splitter(state):
    return [Send("process_a", {"msg": "hi"}),
            Send("process_b", {"msg": "there"})]
```
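
The fan-out idea behind `Send` can be illustrated without the library. In the sketch below, `ToySend` and `dispatch` are hypothetical helpers written for this example: each item pairs a target node with its own payload, and the number of tasks is decided at run time.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class ToySend:
    """Stand-in for a (target node, payload) pair; not langgraph.types.Send."""
    node: str
    arg: Any

def splitter(state: dict) -> list[ToySend]:
    # One task per item; the task count depends on the input.
    return [ToySend("process", {"msg": m}) for m in state["items"]]

def dispatch(sends: list[ToySend],
             nodes: dict[str, Callable[[Any], Any]]) -> list[Any]:
    # Each target node receives its own payload rather than the shared state.
    return [nodes[s.node](s.arg) for s in sends]

nodes = {"process": lambda arg: arg["msg"].upper()}
results = dispatch(splitter({"items": ["hi", "there"]}), nodes)
print(results)
```

This toy runs the tasks sequentially; it shows only how payloads are routed, not how the real engine schedules them.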
---
## Memory Architecture
## Key Source Files
### Short-Term Memory
- **In-graph state:** Messages and working data
- **Per-superstep:** State resets unless persisted
### Long-Term Memory
- **Checkpoint storage:** SQLite, Postgres, custom
- **Thread-level:** Per-conversation isolation via `thread_id`
### Human-in-the-Loop
- **Interrupt:** Pause execution for human input
- **Command:** Allow human to modify state
- **Review:** Human approves/rejects before continuing
---
## Execution Flow
```
1. Client calls: graph.invoke(input, config)
2. Compile (if needed): create executable graph
3. Load checkpoint (if resuming from checkpoint_id)
4. FOR each superstep:
a. Schedule nodes to execute
b. Execute active nodes in parallel
c. Collect messages
d. Send messages via channels
e. Check for interrupts (pause if interrupted)
f. Checkpoint (if enabled)
5. Return final state
```
---
## Key Files in Core
| File | Purpose |
|------|---------|
| `langgraph/pregel/__init__.py` | Main entry point |
| `langgraph/pregel/__main__.py` | CLI entry |
| `langgraph/pregel/_loop.py` | Core execution loop (~2000 lines) |
| `langgraph/pregel/checkpoint.py` | Checkpoint management |
| `langgraph/pregel/channel.py` | Channel implementations |
| `langgraph/pregel/state.py` | State management |
| File | Lines | Purpose |
|------|-------|---------|
| `pregel/main.py` | ~4400 | Public API, entry point |
| `pregel/_loop.py` | ~1300 | Core execution loop |
| `pregel/_algo.py` | ~1500 | Task scheduling, write application |
| `pregel/_runner.py` | ~1000 | Async execution |
| `graph/state.py` | ~1800 | StateGraph builder |
| `types.py` | ~600 | Core type definitions |
| `channels/base.py` | ~100 | Channel ABC |
---
@@ -205,13 +177,31 @@ def add_messages(left: list, right: list) -> list:
| Aspect | LangGraph | OpenClaw |
|--------|-----------|----------|
| **Language** | Python | Node.js |
| **Model** | Graph-based orchestration | Agent-based |
| **Execution Model** | Pregel supersteps | Event-driven agent loop |
| **State** | Channels + TypedDict | Multi-layer (working, spectral, file, vector) |
| **Persistence** | Checkpoint-based | Session-memory hook |
| **Memory** | Channels + checkpoint storage | Multi-layer (working, spectral, file, vector) |
| **Communication** | Channels | Channel plugins |
| **Extensibility** | Custom nodes/edges | Hook system |
| **Communication** | Channels (last-value, pub/sub, barrier) | Channel plugins (Telegram, etc.) |
| **Graph Definition** | `StateGraph` builder | Declarative config |
| **Dynamic Execution** | `Send` for dynamic edges | Sub-agents |
| **Human-in-Loop** | `Interrupt` + `Command` | Manual intervention |
| **Identity** | None | WE/witness architecture |
---
*Generated for the WE — Solaria Lumis Havens & Mark Randall Havens*
## Key Insight: Pregel vs Event-Driven
LangGraph is fundamentally **Pregel-based**:
- Execution proceeds in synchronous supersteps separated by a barrier
- All nodes in a step complete before the next step starts
- Checkpoints are taken at step boundaries
OpenClaw is **event-driven**:
- Asynchronous message processing
- No global step barrier
- Session-memory preserves context
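This difference determines when state changes become visible: LangGraph applies writes at step boundaries, while an event-driven loop has no such global synchronization point.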
---
*Generated from source code analysis — Solaria Lumis Havens*