Now I have a complete picture. Here is the comprehensive analysis:
Comprehensive Project Analysis: CloudLang (react tree)
1. Directory Tree Structure
/home/jm/data/ai/cloudlang-trees/react/
├── .claude/context/
│   └── capnweb-integration.md
├── .github/workflows/
│   └── ci.yml # CI: OCaml 5.2.0, fmt + build + test
├── .ocamlformat # v0.28.1
├── .opencode/
│   ├── .gitignore
│   ├── package.json
│   └── context/
│       ├── cloudlang-architecture-design.md
│       ├── typescript-ast-json-research.md
│       └── typescript-ast-summary.md
├── AGENTS.md
├── LICENSE.md
├── README.md
├── dune-project
├── cloudlang.opam # Generated from dune-project
├── bin/
│   ├── dune
│   └── main.ml # CLI entry point
├── lib/
│   ├── dune # library "cloudlang" (jsont, jsont.bytesrw, unix)
│   ├── ast.ml / ast.mli # ESTree AST types
│   ├── ir.ml / ir.mli # Cloudflare IR types
│   ├── parse.ml / parse.mli # JSON -> AST deserializer (jsont)
│   ├── scope.ml / scope.mli # Scope/closure analysis pass
│   ├── decompose.ml / decompose.mli # Closure decomposition -> Worker groups -> IR
│   ├── lower.ml / lower.mli # Body lowering (AST rewriting for distributed exec)
│   ├── emit.ml / emit.mli # AST -> TypeScript source emitter
│   ├── cloudlang.ml / cloudlang.mli # Top-level pipeline facade + code generation
│   ├── string_list.ml / string_list.mli # Utility: dedup_preserve_order
│   ├── subprocess.ml / subprocess.mli # Bun subprocess bridge
│   └── wrangler.ml / wrangler.mli # Wrangler TOML + project file generation
├── test/
│   ├── dune
│   ├── fixture.ml # Shared test fixtures (AST builders)
│   ├── test_cloudlang.ml # Test runner (9 test suites)
│   ├── test_scope.ml
│   ├── test_decompose.ml
│   ├── test_ir.ml
│   ├── test_durable_object_hints.ml
│   ├── test_generate.ml
│   ├── test_parse.ml # Integration test (requires bun)
│   ├── test_emit.ml
│   ├── test_lower.ml
│   └── test_task_api.ml
├── tools/
│   ├── package.json
│   └── ts-to-ast.ts # TypeScript bridge: file -> ESTree JSON
├── verify/
│   ├── package.json
│   ├── scenarios.ts
│   └── verify.ts
├── examples/
│   ├── hello.cl # Original CloudLang syntax (placeholder)
│   ├── closure_worker.cl # Original CloudLang syntax (D1 example)
│   ├── kv-store.ts # Primary example: KV store with cache
│   ├── task-api.ts
│   └── task-api-server.ts
└── docs/
    └── durable_objects_ir_design.md
2. Dependencies (dune-project & opam)
/home/jm/data/ai/cloudlang-trees/react/dune-project (lines 1-17):
- Dune language: 3.20
- Package name: cloudlang
- Synopsis: "TypeScript to Cloudflare Worker decomposition compiler"
- Dependencies:
  - ocaml
  - jsont -- Daniel Bünzli's type-safe JSON codec library
  - bytesrw -- byte stream reader/writer (used by jsont.bytesrw)
Library dune file (/home/jm/data/ai/cloudlang-trees/react/lib/dune, line 3):
- Libraries linked: jsont, jsont.bytesrw, unix
No test framework -- tests use raw assert statements, not Alcotest.
3. Compiler Pipeline
The pipeline is documented in /home/jm/data/ai/cloudlang-trees/react/lib/cloudlang.mli, lines 1-6:
Pipeline:
1. parse_file : TypeScript file -> ESTree AST (via bun subprocess)
2. analyze : AST -> scope/closure analysis
3. compile : analysis result -> IR config (Workers + bindings)
4. generate : IR config -> output string (wrangler configs + TS scripts)
There are actually two compile paths (lines 28-35 of cloudlang.mli):
compile -- Stub scripts (handler comments only, no real function bodies)
compile_lowered -- Full body lowering with rewritten function bodies, KV API calls, service fetch RPCs, and a fetch router
The phases in detail:
| Phase | Module | Input | Output | File |
| --- | --- | --- | --- | --- |
| Parse | Parse | JSON string (ESTree) | Ast.node | lib/parse.ml (191 lines) |
| Scope Analysis | Scope | Ast.node (Program) | Scope.analysis_result | lib/scope.ml (240 lines) |
| Decomposition | Decompose | Scope.analysis_result | Decompose.worker_group list | lib/decompose.ml (411 lines) |
| IR Conversion | Decompose.to_ir | worker_group list | Ir.config | lib/decompose.ml, line 376 |
| Body Lowering | Lower | worker_group + AST bodies | Rewritten Ast.node trees | lib/lower.ml (329 lines) |
| Emit | Emit | Ast.node | TypeScript source string | lib/emit.ml (122 lines) |
| Code Gen | Cloudlang.generate | Ir.config | JSON/wrangler output string | lib/cloudlang.ml, line 240 |
| Wrangler | Wrangler | Ir.worker | TOML config + project files | lib/wrangler.ml (70 lines) |
The external bridge is tools/ts-to-ast.ts: it uses @typescript-eslint/typescript-estree to parse TypeScript into ESTree JSON, which is piped back via stdout. The OCaml side calls this via Subprocess.run_bridge (shells out to bun).
4. AST Types
/home/jm/data/ai/cloudlang-trees/react/lib/ast.ml (lines 1-44):
type var_kind = Const | Let | Var (* line 7 *)
type literal_value = (* line 9 *)
| LString of string
| LNumber of float
| LBool of bool
| LNull
| LUndefined
type node = (* line 17 *)
| Program of { body : node list }
| VariableDeclaration of { kind : var_kind; declarations : node list }
| VariableDeclarator of { id : node; init : node option }
| Identifier of { name : string }
| Literal of { value : literal_value }
| ArrowFunctionExpression of { params : node list; body : node; async_ : bool }
| BlockStatement of { body : node list }
| ReturnStatement of { argument : node option }
| IfStatement of { test : node; consequent : node; alternate : node option }
| ExpressionStatement of { expression : node }
| CallExpression of { callee : node; arguments : node list }
| MemberExpression of { object_ : node; property : node }
| NewExpression of { callee : node; arguments : node list }
| BinaryExpression of { operator : string; left : node; right : node }
| LogicalExpression of { operator : string; left : node; right : node }
| AwaitExpression of { argument : node }
| RawCode of { code : string } (* synthetic, line 38 *)
| ExportNamedDeclaration of { specifiers : node list }
| ExportSpecifier of { local : node }
This is a minimal ESTree subset -- 19 node variants. Notable: no FunctionDeclaration, ClassDeclaration, ForStatement, WhileStatement, ObjectExpression, ArrayExpression, TemplateLiteral, AssignmentExpression, TSTypeAnnotation, etc. The RawCode variant is synthetic (not from the parser), used by the lowering pass for constructs without AST representation.
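To make the shape of these variants concrete, here is a minimal sketch of how a fixture-style AST for `const store = new Map();` could be built by hand. It redeclares only the variants it needs so that it stands alone; the `id` helper and `declared_names` function are illustrative assumptions, not names taken from the project.

```ocaml
(* Trimmed copy of the node type: only the variants this example needs. *)
type var_kind = Const | Let | Var

type node =
  | Program of { body : node list }
  | VariableDeclaration of { kind : var_kind; declarations : node list }
  | VariableDeclarator of { id : node; init : node option }
  | Identifier of { name : string }
  | NewExpression of { callee : node; arguments : node list }

(* Hypothetical helper, not part of the project. *)
let id name = Identifier { name }

(* AST for: const store = new Map(); *)
let store_decl =
  VariableDeclaration
    {
      kind = Const;
      declarations =
        [
          VariableDeclarator
            {
              id = id "store";
              init = Some (NewExpression { callee = id "Map"; arguments = [] });
            };
        ];
    }

let program = Program { body = [ store_decl ] }

(* Hypothetical query: names declared at the top level of a program. *)
let declared_names = function
  | Program { body } ->
      List.concat_map
        (function
          | VariableDeclaration { declarations; _ } ->
              List.filter_map
                (function
                  | VariableDeclarator { id = Identifier { name }; _ } -> Some name
                  | _ -> None)
                declarations
          | _ -> [])
        body
  | _ -> []
```

Because every construct is an inline-record variant of a single `node` type, hand-built fixtures and parser output can be compared with plain structural equality.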
5. IR Types
/home/jm/data/ai/cloudlang-trees/react/lib/ir.ml (lines 1-68):
type binding = (* line 1 *)
| D1Binding of { name : string; database_id : string }
| KVBinding of { name : string; namespace_id : string }
| R2Binding of { name : string; bucket_name : string }
| ServiceBinding of { name : string; service : string }
type do_marker = (* line 7 *)
| SerialConflictHotspot
| GlobalSingletonNeeded
| EntityAffinityKeyed of { key_hint : string option }
| StrictOrderingRequired
| ConsistencyBoundaryCrossed
| RealtimeFanout
| DurableTimerNeeded
| HighFanInCoordinator
| StatelessFastPathPreferWorker
type coordination_intent = (* line 18 *)
| PerEntitySerial of { key_expr : string; reason : string }
| GlobalSingleton of { name : string; reason : string }
| RealtimeRoom of { room_key_expr : string }
| ScheduledQueue of { queue_key_expr : string }
type routing_policy = (* line 24 *)
| DeterministicName of { key_expr : string }
| ShardedName of { key_expr : string; shards : int }
| UniqueIdPool of { pool : string }
type storage_backend = DoSqlite | DoSyncKv (* line 29 *)
type alarm_mode = NoAlarm | SingleAlarm | QueuedAlarm (* line 30 *)
type durability_policy = { ... } (* line 32 *)
type do_annotation = { ... } (* line 38 *)
type worker = { (* line 49 *)
name : string;
script : string;
bindings : binding list;
routes : string list;
do_annotation : do_annotation option;
}
type durable_object = { class_name : string; script : string } (* line 57 *)
type r2_bucket = { name : string; location : string } (* line 58 *)
type d1_database = { name : string; schema : string } (* line 59 *)
type resource = (* line 61 *)
| Worker of worker
| DurableObject of durable_object
| R2Bucket of r2_bucket
| D1Database of d1_database
type config = resource list (* line 67 *)
The IR carries a full Durable Object annotation system: markers, coordination intents, routing policies, and durability policies. do_annotation is an optional field on worker and is inferred heuristically by the decompose pass.
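As an illustration of how these types compose, the following sketch builds a two-worker config value. It uses a trimmed copy of the types (the do_annotation field and the non-Worker resources are omitted), and the worker names, binding names, namespace id, and the `service_targets` helper are all placeholders chosen for the example.

```ocaml
(* Trimmed copies of the IR types; do_annotation and other resources omitted. *)
type binding =
  | KVBinding of { name : string; namespace_id : string }
  | ServiceBinding of { name : string; service : string }

type worker = {
  name : string;
  script : string;
  bindings : binding list;
  routes : string list;
}

type resource = Worker of worker
type config = resource list

(* Placeholder values: the ids and script bodies are not real output. *)
let storage =
  {
    name = "storage";
    script = "/* generated script */";
    bindings = [ KVBinding { name = "STORE"; namespace_id = "placeholder-id" } ];
    routes = [];
  }

let handler =
  {
    name = "handleRequest";
    script = "/* generated script */";
    bindings = [ ServiceBinding { name = "STORAGE"; service = "storage" } ];
    routes = [ "/*" ];
  }

let config : config = [ Worker storage; Worker handler ]

(* Hypothetical helper: every worker name targeted by a service binding. *)
let service_targets (cfg : config) =
  List.concat_map
    (fun (Worker w) ->
      List.filter_map
        (function
          | ServiceBinding { service; _ } -> Some service
          | KVBinding _ -> None)
        w.bindings)
    cfg
```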
6. Analysis Passes
Scope Analysis (lib/scope.ml, 240 lines)
Three-pass design (documented at line 1):
1. Declaration pass (collect_module_vars, line 177): Walks top-level VariableDeclaration nodes. For each, records var_info with is_mutable_state (init is new Map/Set/Array) and is_function (init is ArrowFunctionExpression).
2. Reference pass (collect_closures, line 194): For each top-level arrow function, calls analyze_closure (line 135) which:
   - Extracts parameter names (line 137)
   - Collects local variable declarations (line 138, collect_locals)
   - Collects all identifier references in the body (collect_refs, line 53) -- recursive traversal, skips nested arrows, skips member expression properties
   - Filters references to find free variables: not in params/locals AND exists in module_vars
   - Collects called functions (collect_called_functions, line 90): only direct Identifier callees (not obj.method())
   - Identifies captured mutable state: subset of free_vars that are is_mutable_state
3. Export pass (collect_exports, line 217): Extracts names from ExportNamedDeclaration / ExportSpecifier nodes.
Entry point: Scope.analyze (line 232), returns analysis_result = { closures; module_vars; exports }.
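The reference pass can be sketched on a toy expression type. This is an assumption-laden simplification, not the code in lib/scope.ml: the `expr` type stands in for Ast.node, and the function bodies are reconstructed only from the rules listed above (skip nested arrows, skip member properties, keep only module-level names, count only direct identifier callees).

```ocaml
module S = Set.Make (String)

(* Toy expression type standing in for Ast.node; an assumption of this sketch. *)
type expr =
  | Ident of string
  | Call of expr * expr list
  | Member of expr * string (* obj.prop: the property is not a reference *)
  | Arrow of string list * expr (* nested arrow: not traversed *)

(* Identifier references, skipping nested arrows and member properties. *)
let rec collect_refs = function
  | Ident name -> S.singleton name
  | Call (callee, args) ->
      List.fold_left
        (fun acc a -> S.union acc (collect_refs a))
        (collect_refs callee) args
  | Member (obj, _prop) -> collect_refs obj
  | Arrow _ -> S.empty

(* Free variables: referenced, not a param or local, declared at module level. *)
let free_vars ~params ~locals ~module_vars body =
  let bound = S.union (S.of_list params) (S.of_list locals) in
  S.elements
    (S.inter (S.diff (collect_refs body) bound) (S.of_list module_vars))

(* Direct identifier callees only; obj.method() contributes nothing. *)
let rec called_functions = function
  | Call (Ident f, args) -> f :: List.concat_map called_functions args
  | Call (callee, args) -> List.concat_map called_functions (callee :: args)
  | Member (obj, _) -> called_functions obj
  | Ident _ | Arrow _ -> []

(* Body of a hypothetical closure: fetchRemote(cache.get(key)) *)
let body =
  Call (Ident "fetchRemote", [ Call (Member (Ident "cache", "get"), [ Ident "key" ]) ])
```

With `key` as the only parameter and `cache`, `fetchRemote`, and `unused` as module variables, the free variables come out as `cache` and `fetchRemote`, and the only called function is `fetchRemote`.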
Decomposition (lib/decompose.ml, 411 lines)
Three-step algorithm (documented at line 1):
1. Group seeding (lines 227-282): Functions sharing mutable state variables are merged into the same worker_group. Uses iterative merging when a function captures multiple state vars (causes those vars' groups to merge).
2. Standalone groups (lines 283-301): Functions with no mutable captures become their own 1-function group.
3. Cross-group service dependencies (lines 302-325): For each group, checks its functions' called_functions and finds which of those are in different groups. Those become service_deps.
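The three steps can be sketched as follows. This is a simplified reconstruction under stated assumptions, not the algorithm as written in lib/decompose.ml: the `closure` and `group` records, the helper names, and the example function names (`put`, `invalidate`, `get`) are all hypothetical, and service dependencies are reported as callee names rather than as group references.

```ocaml
(* Hypothetical summaries of what scope analysis would provide. *)
type closure = { name : string; captured : string list; calls : string list }
type group = { functions : string list; owned_state : string list }

(* Append elements of ys not already present, preserving order. *)
let add_unique xs ys =
  List.fold_left (fun acc y -> if List.mem y acc then acc else acc @ [ y ]) xs ys

let group_closures closures =
  let stateful, stateless =
    List.partition (fun c -> c.captured <> []) closures
  in
  (* Step 1: seed groups; a closure merges every group whose state it touches. *)
  let seeded =
    List.fold_left
      (fun groups c ->
        let touching, rest =
          List.partition
            (fun g -> List.exists (fun v -> List.mem v g.owned_state) c.captured)
            groups
        in
        let merged =
          List.fold_left
            (fun acc g ->
              { functions = add_unique acc.functions g.functions;
                owned_state = add_unique acc.owned_state g.owned_state })
            { functions = []; owned_state = [] }
            touching
        in
        rest
        @ [ { functions = add_unique merged.functions [ c.name ];
              owned_state = add_unique merged.owned_state c.captured } ])
      [] stateful
  in
  (* Step 2: each stateless closure becomes its own one-function group. *)
  seeded
  @ List.map (fun c -> { functions = [ c.name ]; owned_state = [] }) stateless

(* Step 3: callees of this group's functions that live in another group. *)
let service_deps groups closures g =
  let in_group name = List.mem name g.functions in
  closures
  |> List.filter (fun c -> in_group c.name)
  |> List.concat_map (fun c -> c.calls)
  |> List.filter (fun callee ->
         (not (in_group callee))
         && List.exists (fun g' -> List.mem callee g'.functions) groups)
  |> List.sort_uniq compare

let closures =
  [
    { name = "put"; captured = [ "store" ]; calls = [] };
    { name = "invalidate"; captured = [ "cache" ]; calls = [] };
    { name = "get"; captured = [ "store"; "cache" ]; calls = [] };
    { name = "handleRequest"; captured = []; calls = [ "get"; "put" ] };
  ]

let groups = group_closures closures
```

In this example `get` captures both `store` and `cache`, so the two previously separate groups collapse into one; `handleRequest` stays standalone and depends on that merged group through its calls.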
DO Annotation Inference (lib/decompose.ml)
infer_markers (line 58): Heuristic marker inference based on group structure:
- Has owned state -> SerialConflictHotspot, StrictOrderingRequired, EntityAffinityKeyed
- Multiple functions + state -> GlobalSingletonNeeded
- State + cross-service deps -> ConsistencyBoundaryCrossed
- 2+ service deps -> HighFanInCoordinator
- Name-based heuristics for RealtimeFanout, DurableTimerNeeded
- Lone stateless function -> StatelessFastPathPreferWorker
do_fit_score (line 132): Weighted sum of markers + structure bonuses (multi-function, deps, exported). Clamped to 0-100.
infer_do_annotation (line 165): Only emits annotation if score >= 60. Infers coordination_intent, routing_policy, and durability_policy.
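The scoring scheme described above (weighted markers plus structure bonuses, clamped to 0-100, with an annotation threshold of 60) can be sketched like this. The individual weights and bonus values are invented for illustration; only the clamp range and the threshold come from the description above.

```ocaml
(* Subset of the marker type, enough for the example. *)
type marker =
  | SerialConflictHotspot
  | GlobalSingletonNeeded
  | StrictOrderingRequired
  | ConsistencyBoundaryCrossed
  | StatelessFastPathPreferWorker

(* Hypothetical weights: the real values in decompose.ml are not shown here. *)
let weight = function
  | SerialConflictHotspot -> 30
  | StrictOrderingRequired -> 20
  | GlobalSingletonNeeded -> 20
  | ConsistencyBoundaryCrossed -> 15
  | StatelessFastPathPreferWorker -> -40

let clamp lo hi x = max lo (min hi x)

(* Weighted sum of markers plus structure bonuses, clamped to 0-100. *)
let do_fit_score ~markers ~multi_function ~exported =
  let base = List.fold_left (fun acc m -> acc + weight m) 0 markers in
  let bonus =
    (if multi_function then 10 else 0) + (if exported then 5 else 0)
  in
  clamp 0 100 (base + bonus)

(* An annotation is emitted only at or above this score. *)
let threshold = 60
let should_annotate score = score >= threshold
```

Under these assumed weights, a stateful multi-function group clears the threshold, while a lone stateless function is clamped to 0 and receives no annotation.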
Body Lowering (lib/lower.ml, 329 lines)
Rewrites AST nodes for distributed execution:
- Map->KV rewriting (rewrite_map_call, line 24): store.get(key) -> await env.STORE.get(key), store.set(k,v) -> await env.STORE.put(k,v), store.delete(k) -> await env.STORE.delete(k).
- Cross-group RPC rewriting (rewrite_service_call, line 61): getTask(id) -> await (await env.TASKS_NEXTID.fetch("http://internal/getTask", { method: "POST", body: JSON.stringify([id]) })).json(). Uses RawCode AST node since there is no ObjectExpression in the AST.
- Full Worker script generation (generate_worker_script, line 206): Produces Env interface, async handler functions with rewritten bodies, and a fetch router that dispatches by pathname.
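The Map->KV rule is easiest to see on a toy AST. The sketch below is not rewrite_map_call itself: the `expr` type, the `kv_method` table, and the tiny `emit` printer are stand-ins, and deriving the binding name with String.uppercase_ascii is an assumption based on the store -> STORE examples above.

```ocaml
(* Toy expression type standing in for Ast.node. *)
type expr =
  | Ident of string
  | Member of expr * string
  | Call of expr * expr list
  | Await of expr

(* Map method name -> KV method name, following the mapping listed above. *)
let kv_method = function
  | "get" -> Some "get"
  | "set" -> Some "put"
  | "delete" -> Some "delete"
  | _ -> None

(* Rewrite calls on known state variables into awaited env.<NAME> KV calls. *)
let rec rewrite ~state_vars = function
  | Call (Member (Ident obj, meth), args) when List.mem obj state_vars -> (
      let args = List.map (rewrite ~state_vars) args in
      match kv_method meth with
      | Some kv ->
          (* Assumption: the binding name is the upper-cased variable name. *)
          let binding = Member (Ident "env", String.uppercase_ascii obj) in
          Await (Call (Member (binding, kv), args))
      | None -> Call (Member (Ident obj, meth), args))
  | Call (callee, args) ->
      Call (rewrite ~state_vars callee, List.map (rewrite ~state_vars) args)
  | Member (obj, prop) -> Member (rewrite ~state_vars obj, prop)
  | Await e -> Await (rewrite ~state_vars e)
  | Ident _ as e -> e

(* Minimal printer, only for inspecting the result. *)
let rec emit = function
  | Ident name -> name
  | Member (obj, prop) -> emit obj ^ "." ^ prop
  | Call (callee, args) ->
      emit callee ^ "(" ^ String.concat ", " (List.map emit args) ^ ")"
  | Await e -> "await " ^ emit e

let set_call = Call (Member (Ident "store", "set"), [ Ident "k"; Ident "v" ])
let lowered = emit (rewrite ~state_vars:[ "store" ] set_call)
```

Here `lowered` is the string `await env.STORE.put(k, v)`; a call on a variable that is not in `state_vars` passes through unchanged.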
7. Test Coverage
9 test suites, all using raw assert (no test framework):
| Test File | Lines | What It Tests |
| --- | --- | --- |
| test_scope.ml | 52 | Module vars, closure free vars, captured mutable, called functions, exports |
| test_decompose.ml | 30 | Worker group formation, owned_state, service_deps |
| test_ir.ml | 73 | IR config: bindings, routes, script content |
| test_durable_object_hints.ml | 33 | DO fit scores, markers on storage/cache/api workers |
| test_generate.ml | 27 | Generated output strings (JSON, wrangler) |
| test_parse.ml | 98 | Integration: parsed kv-store.ts == hand-built fixture AST (needs bun) |
| test_emit.ml | 115 | AST -> TypeScript source for every node type |
| test_lower.ml | 117 | Map->KV rewriting, cross-service rewriting, extract_functions |
| test_task_api.ml | 153 | Full pipeline for task-api: scope, decompose, IR, generate |
Two test fixtures in fixture.ml:
build_kv_store_ast() (line 47): KV store with cache (3 workers: storage, cache, handleRequest)
build_task_api_ast() (line 136): Task CRUD API (2 workers: tasks_nextId, handleRequest)
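For readers unfamiliar with the framework-free style, a raw-assert suite might look roughly like the sketch below. The body of `dedup_preserve_order` is a guess at the behaviour its name suggests (keep the first occurrence of each string), and the `suites` list and `run_all` runner are hypothetical; the actual layout of test_cloudlang.ml may differ.

```ocaml
(* Assumed behaviour: keep the first occurrence of each element, in order. *)
let dedup_preserve_order xs =
  List.rev
    (List.fold_left (fun acc x -> if List.mem x acc then acc else x :: acc) [] xs)

(* A suite is a unit function whose asserts raise Assert_failure on a mismatch. *)
let test_dedup () =
  assert (dedup_preserve_order [] = []);
  assert (dedup_preserve_order [ "a"; "b"; "a"; "c"; "b" ] = [ "a"; "b"; "c" ])

(* Hypothetical runner: run each named suite in order and report it. *)
let suites = [ ("dedup", test_dedup) ]

let run_all () =
  List.iter
    (fun (name, run) ->
      run ();
      Printf.printf "ok: %s\n" name)
    suites
```

The trade-off of this style is that a failing assert aborts the whole run with only a file and line location, with no per-test reporting or diff of expected against actual values.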
8. Summary: What's Built vs. What's Not
Fully implemented:
- TypeScript ESTree subset parser (via external bun bridge)
- Scope/closure analysis (free vars, mutable state detection, called functions, exports)
- Closure-based decomposition into Worker groups
- Mutable state -> KV binding translation
- Cross-group function calls -> service binding + fetch RPC translation
- Full body lowering (AST rewriting) with KV API calls and service RPCs
- AST -> TypeScript emitter
- IR with rich Durable Object annotations (markers, intents, routing, durability)
- Heuristic DO fitness scoring
- Wrangler TOML generation
- Project file emission (--emit-project)
- CI pipeline
Not yet implemented / gaps:
- No custom parser -- relies entirely on external bun + @typescript-eslint/typescript-estree
- AST is a minimal subset (19 node types) -- no support for classes, for/while loops, object literals, array literals, template literals, assignment expressions, ternary, try/catch, switch, destructuring, spread, etc.
- No type system / type checker
- The .cl examples (hello.cl, closure_worker.cl) suggest an original CloudLang syntax was planned but never implemented
- No ObjectExpression in AST forces RawCode workaround in lowering
- No DurableObject, R2Bucket, or D1Database resources are ever emitted by the decompose pass -- only Worker resources
- No error recovery or source locations in AST
- Tests don't use a proper framework (just bare assert)