Mental Models for the AI Semi & Server Space — Fable Extraction

Extracted recursively from Nomura's "Asia AI Semi & Server — Is the cycle over?" (Anchor Report, 30 June 2026). Three passes: Pass 1 inventories the heuristics (breadth — 50 models in 7 clusters). Pass 2 finds the machinery underneath — the causal engines that generate the Pass-1 models, their second-order effects, and what breaks them. Pass 3 compresses to the five kernels that regenerate everything, plus the working playbook.

Standing caveat: the logic below is meant to be durable; every number attached to it (who holds which bottleneck, which share, which multiple) is a 30-Jun-2026 calibration and will rot. Keep the model, re-anchor the number.


PASS 1 — The inventory

A · Scarcity & allocation — who eats

B · Bottleneck physics — where the constraint lives and moves

C · Demand epistemology — real vs booked

D · Technology forcing functions — what the physics mints

E · Money mechanics — who captures, who pays

F · Valuation & market behavior

G · Ecosystem structure


PASS 2 — The engines underneath

The 50 models are not independent; they are outputs of six causal engines. Each engine is stated as mechanism → what it generates → what breaks it. This is the recursion: models made of models.

Engine I — The Allocation Cascade

Mechanism: upstream demand measurement (C4/C5) → buyers race to book the binding input (A1/A3) → the duopoly crowds out marginal buyers (A5) → the giant expands what it owns, so the constraint migrates outward (B1/B3) → periphery owners inherit pricing power (E1/E3) → their estimates ratchet up (F1) → the ratchet licenses high-end multiples (F2/F5). Generates: the report's entire Buy-list ordering — the durable longs are bottleneck owners one or two layers down (ASE, Unimicron, EMC, TUC, ZDT, KYEC), while the marquee chips are demand drivers rather than capture points. Also generates A2 automatically: as the constraint migrates, the "destiny variable" migrates with it. Breaks when: the revision breadth rolls over (F1's own kill switch), or supply relief arrives on schedule (B4) and the pricing power fades where you're long — the cascade doesn't stop, it moves, and the failure mode is holding last year's constraint owner.

Engine II — The Mismatch Generator (why the bottleneck always moves outward)

Mechanism: planning conviction scales with balance sheet and customer proximity (B3) → in every demand shock, the center (TSMC) plans closest to reality and the periphery under-builds → the constraint lodges at the smallest node → price hikes there (E1) fund capacity → greenfield takes 2+ years (B4), during which demand grows again → the constraint hops to the next least-convinced node. Generates: the observed conveyor: 2026 CoW/memory/CPU → 2027 WoS/substrate/CCL/capacitors/PMIC/optics → 2028 SerDes/CPO/thermal (the report's own sequence). Predicts you can forecast the next bottleneck by asking "who in the chain still doesn't believe the demand?" — conviction lag, not technology, locates it. Breaks when: demand actually stops growing (the conveyor needs fuel), or when a coordinating actor (TSMC co-investment, hyperscaler prepayments, Broadcom-style backstops) transmits conviction down the chain faster than organic ordering — watch prepayment/LTA structures reaching small suppliers as the damping signal.

Engine III — The Demand-Truth Triangle

Mechanism: three legs — bookings (orders, RPO, book-to-bill), usage (tokens, app traffic), funding (FCF, financing) — and a set of named wedges that push bookings above real demand: strategic overbooking (A3/C11), stockpiling under inflation (C11), neocloud bridging (C7), circular financing (C8), capex-price inflation (C9). Generates: the report's title answer. "Is the cycle over?" = "do usage and funding still corroborate bookings?" As of June 2026: usage yes (tokens 20x), funding straining (2027F FCF), bookings inflated but backed. Each wedge carries its own unwind trigger: transition air pockets (C7), financier withdrawal (C8), inventory digestion (C11). Breaks when: treated as static. The triangle's legs update at different frequencies (usage monthly, bookings quarterly, funding annually-ish) — the discipline is noticing when a leg turns before its narrative does. The funding leg turns first this cycle (memory costs → FCF), which is why the report names it the top risk.

Engine IV — The Content Ratchet

Mechanism: physics escalates (reticle, TDP — D1/D2) → each escalation forces new packaging/materials/thermal (D4) → new BOM lines mint new suppliers (D2) → entrants climb the yield ladder into them (B8) → customers dual-source for leverage (A8) → share contests open at each platform transition (G3). Generates: revenue = units × content/unit × price, and this cycle uniquely grows on all three axes at once (units: racks 54.5k→62k; content: 22L→44L boards, SiC, liquid cooling; price: across-the-board hikes). Explains why supplier revenue forecasts (EMC +110%, TUC +100%) can exceed any believable unit growth — most of it is content × price. Also generates D7 as a special case: agentic AI is a content ratchet on the CPU side of the rack. Breaks when: physics offers a cheaper path — a packaging/integration breakthrough that reduces content per unit (the report's own CoWoP and panel-level candidates), or when a platform transition simplifies rather than complicates. Content ratchets reverse rarely but violently; watch for "cost-down generation" language in roadmaps.
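
The three-axis identity can be made concrete with a log decomposition, where growth contributions add up. The following is a minimal sketch: the rack figures (54.5k→62k) and the +110% revenue forecast come from the paragraph above, but pairing them in one calculation (treating rack growth as the supplier's unit growth) is an illustrative assumption, and `decompose_growth` is a hypothetical helper.

```python
import math

def decompose_growth(revenue_factor: float, unit_factor: float) -> dict:
    """Split revenue growth into a unit part and a residual content-x-price part.

    revenue = units * content_per_unit * price, so in logs:
    ln(revenue growth) = ln(unit growth) + ln(content-x-price growth).
    """
    residual_factor = revenue_factor / unit_factor
    log_revenue = math.log(revenue_factor)
    return {
        "unit_factor": unit_factor,
        "content_price_factor": residual_factor,
        "unit_share_of_log_growth": math.log(unit_factor) / log_revenue,
        "content_price_share_of_log_growth": math.log(residual_factor) / log_revenue,
    }

# Assumption: rack units 54.5k -> 62k stand in for the supplier's unit growth.
result = decompose_growth(revenue_factor=2.10, unit_factor=62.0 / 54.5)
for key, value in result.items():
    print(f"{key}: {value:.3f}")
```

Under these assumptions, content × price accounts for over 80% of the log growth, which is the sense in which the forecast "can exceed any believable unit growth".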

Engine V — The Reflexivity Pair (the cycle attacks itself twice)

Mechanism, macro circuit: AI capex boom → measured economic acceleration → Fed forced from insurance hikes to earnest tightening → 10y >5% → the multiple (F2) — the most levered part of every thesis — de-rates complex-wide (F6). Mechanism, micro circuit: shortage → overbooking (C11) → inflated order signals feed capacity planning (B3's data input) → over-conviction builds exactly at the top → glut. Meanwhile the smoothers — neocloud bridging (C7), mix back-fill (A4), vendor financing (C8) — hide the transition seams that would otherwise reveal where real demand ends. Generates: the deep reading that the system's stabilizers are also its opacity: everything that makes the cycle resilient (flexible mix, bridge capacity, financed demand) also makes the eventual turn harder to see in order data. Hence the primacy of usage data (C2) — tokens can't be double-ordered. Breaks when: — it doesn't; it's the boundary condition. The practical model: position for the cascade (Engine I) while renting, not owning, the assumption that the reflexive pair stays dormant; the two circuit breakers (yields, revision breadth) are cheap to monitor daily.

Engine VI — The Milestone Ledger (research as dated falsification)

Mechanism: every load-bearing thesis is attached to a dated, observable event; the portfolio of open questions resolves on a calendar, not on argument. Grounding examples: TPU v9 tape-out end-2026E ("a key reality check for EMIB-T timing"); VR200 = 15-20% of shipments concentrated in 4Q26F; Rubin-Ultra transition 2Q27F; substrate constraints capping ASPEED's 3Q26F; Crusoe/G42 halts as demand-reality events. Generates: the difference between a view and a position. The EMIB-T question (A6/A7) isn't argued to conclusion — it's carried as an explicit coin-flip with both branches pre-priced (MediaTek "benefit of the doubt" on one side, TSMC's SoIC/CoPoS defense on the other). Second-order: whoever maintains the best milestone calendar systematically front-runs resolution repricings. Breaks when: milestones are allowed to slip silently. A slipped date is information (usually negative); re-dating without noting the slip is how theses become zombies.

Second-order observations (models about the model-set)


PASS 3 — The kernels

Compression test: which minimal set of questions regenerates all 50 models? Five kernels, plus the method rule. If you internalize these, Pass 1 is re-derivable on demand — including for constraints, players, and technologies that don't exist yet.

The underwriting playbook — pricing a name in this space

  1. Position (K1): what has it secured — allocation, qualification, BOM slots? Whose leftovers cap it?
  2. Constraint (K2): does it own the current or the next binding step? If neither, it's a demand story — apply K3 harder.
  3. Foot (K0): reconcile its implied volumes to the common unit and the capacity ceiling. If its plan requires more than its allocation, someone's number is wrong — probably its.
  4. Truth (K3): how much of its backlog is wedge (double-order, bridge, financed)? What's the unwind event?
  5. Content (K4): is its content-per-unit rising with the spec roadmap? Content growth survives unit stalls.
  6. Torque (E3): smallest qualified player at the bottleneck = max earnings elasticity; size accordingly.
  7. Multiple (K5): where in the historical band, what narrative holds it there, which forward year is the basis, and what's the consensus-gap slope telling you about the real bet?
  8. Calendar (Engine VI): list the dated events that resolve the thesis; if you can't, you don't have one.

The standing watchlist (dated, as of 30-Jun-2026 — re-anchor quarterly)


Source: all grounding drawn from Nomura Global Markets Research, "Asia AI Semi & Server — Is the cycle over?", Anchor Report, 30 June 2026 (Jeng, Lee, Teng, Yang, Chen, Hu). The models are this extractor's synthesis; the report supplies the evidence. Logic durable; calibration expires.

04 — An Autonomous Research System for Anchor-Class Sell-Side Work

Designed from first principles against Nomura's "Asia AI Semi & Server — Is the cycle over?" (30 June 2026). The question this document answers: if you wanted software agents to produce research of this class — not a summary of it, the research itself — continuously and autonomously, what would you build? The plan starts from the research problems, derives the epistemic operations they require, and only then names tools, skills, and workflows. Anything in the inventory that cannot be traced back to a problem should be deleted.


1. What the artifact actually is, and what "replicate" means

The report is not 120 pages of content. It is one decision, warranted: stay long Asia AI semis after an 85% run, because the cycle has not peaked — plus the tradeable decomposition of that call into 25 tickers and 9 target-price raises. Everything else is warrant.

Read closely, the warrant rests on exactly three pillars, and they are worth naming because they are the specification:

  1. A proprietary leading indicator, footed to supply. The data-center build tracker (280 projects → GW → chips → CoWoS wafers) lets the analysts see demand before it appears in Asia supply-chain data, in units that reconcile against TSMC's capacity. The edge is not a better opinion; it is an earlier, commensurable measurement.
  2. Internal consistency at scale. The same 1,800kpcs CoWoS-output number flows into the exec summary, the allocation table, the TSMC chapter, ASE's LEAP bridge, ASPEED's BMC SAM, and the server forecast — and foots everywhere. A reader who checks any two sections against each other finds they agree. That is what makes 120 pages feel like one mind.
  3. Vintage discipline. Nearly every table is old-vs-new. The product is not the level; it is the revision — what changed since December, and why. The narrative is literally a diff with reasons attached.

So "replicate" means: reproduce the warrant, not the pages. A system that emits the same figures without provenance, footing, and diffs has replicated nothing; a system that maintains those three pillars over a living state can emit this report — and next year's report about bottlenecks that don't exist yet — as a rendering step.

The honest ledger: where agents lose, where they win

Be precise about the gap between a Claude-driven system and the Nomura desk, because the architecture must be shaped by it.

Agents lose on:
  - Field checks. "Our latest supply chain survey suggests…" is human relationship capital. Per-node kpcs figures, OSAT booking tone, testing-time-per-chip: not scrapeable, not licensable. This is the single genuine moat in the report.
  - Licensed data. Bloomberg consensus (BEst), Similarweb, TrendForce pricing. Deterministic but paid.
  - Being talked to. Companies pre-brief analysts. No feed replaces that.

Agents win on:
  - Exhaustiveness. The desk samples; an agent fleet can read everything: every earnings call including the 40 no analyst covers, every 8-K exhibit, every Taiwanese-language board resolution, every ERCOT interconnection filing, every Korean HBM trade story. Breadth substitutes for access more often than people expect.
  - Language. The supply chain discloses in Chinese, Japanese, and Korean first. MOPS filings, DIGITIMES Chinese edition, Nikkei, Korean IR: native multilingual reading is a structural advantage over any Anglophone desk.
  - Persistence. The system never forgets a vintage, never loses a source URL, never fails to re-check a number it published.
  - Consistency. Perfect footing is trivial for software and hard for six analysts with spreadsheets. (The report has visible OCR-era artifacts and at least one duplicated figure.)
  - Adversarial rigor at scale. Every major claim can get a dedicated refuter agent before publication. Humans do this for the thesis, not for 500 numbers.
  - Accountability. The system can score every dated forecast it ever made against outcomes and publish the scoreboard. Sell-side does not.

The design principle that falls out: substitute exhaustive public-source triangulation for privileged access, and persistent versioned state for analyst memory — and wherever the field-check moat is genuinely load-bearing, degrade to calibrated intervals rather than fake the point estimate.


2. The five research problems

Strip the report to its irreducible questions and there are five. Every section serves one of them.

P1 · Demand reality — Is the demand real, how large, arriving when?

The hardest problem, because the conventional signals (price hikes, LTAs, book-to-bill >2x, overbooking) all read "cycle top," and the analysts must decide whether to override them. The report's answer: construct an independent measurement upstream of consensus (the DC build tracker), triangulate against usage ground truth (tokens processed, Gen-AI traffic share), and separate real demand from booked demand (strategic overbooking, neocloud bridging, vendor financing, SpaceX-style circular deals all inflate bookings). Failure mode if skipped: you are long a double-ordering mirage, or flat for a supercycle.

P2 · Constraint location — Which step of the chain binds, who owns it, when does it relieve, where does it go next?

The report's signature intellectual move: TSMC's CoW is not the bottleneck precisely because TSMC controls it and is expanding aggressively; the constraint migrates to what the giant does not control (WoS, substrates, CCL, capacitors), and its owners inherit the pricing power. Supply relief is datable: greenfield ≈ 2 years from build-start, so late-2025 starts ⇒ constrained through 2027F. Failure mode: you buy the marquee chip vendor when the money is being made two layers down.
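
The "which step binds, and when does it relieve" question reduces to a minimum over a serial chain plus a lead-time offset. A minimal sketch follows; every capacity figure, owner label, and build-start date in it is hypothetical, and `Step`, `binding_step`, and `relief_date` are illustrative names rather than anything the report defines.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Step:
    name: str
    owner: str
    capacity_kpcs: float   # hypothetical annual capacity, thousand wafers
    expansion_start: date  # hypothetical greenfield build-start

def binding_step(chain: list[Step]) -> Step:
    """A serial chain's realizable output is capped by its smallest step."""
    return min(chain, key=lambda step: step.capacity_kpcs)

def relief_date(step: Step, build_years: int = 2) -> date:
    """Greenfield relief lands roughly `build_years` after build-start."""
    start = step.expansion_start
    return start.replace(year=start.year + build_years)

chain = [
    Step("CoW", "TSMC", 2000.0, date(2025, 1, 1)),
    Step("WoS", "OSAT", 1800.0, date(2025, 11, 1)),
    Step("substrate", "substrate maker", 1900.0, date(2025, 10, 1)),
]
bottleneck = binding_step(chain)
print(bottleneck.name, bottleneck.capacity_kpcs, relief_date(bottleneck))
```

Because the chain is plain data, a constraint that migrates to a new step is handled by appending a `Step`, not by changing the solver.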

P3 · Allocation under scarcity — Given a binding resource, who gets what?

Zero-sum share assignment over the constrained output: NVIDIA ~55%, Google ~27%, AMD 8-9%, AWS 5-6% of 1.8mn wafers. This is what converts a macro call into per-company revenue. Note its logical structure: it is a constrained estimation problem, in which shares must sum to the (interval-valued) total and each customer's share is evidenced by roadmaps, guided capex, rack shipments, and channel noise. Failure mode: per-company forecasts that don't jointly reconcile to physical supply, which is precisely the error the rest of the Street makes.
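
The footing requirement on these shares can be checked mechanically. The sketch below uses the share figures and the 1.8mn-wafer total quoted above; treating the "~" figures as point values and the ranges as hard bounds is a simplifying assumption.

```python
TOTAL_KPCS = 1800.0  # 1.8mn wafers, from the text

# (low, high) share bounds as quoted; point shares get a zero-width interval.
shares = {
    "NVIDIA": (0.55, 0.55),
    "Google": (0.27, 0.27),
    "AMD": (0.08, 0.09),
    "AWS": (0.05, 0.06),
}

low_sum = sum(lo for lo, _ in shares.values())
high_sum = sum(hi for _, hi in shares.values())

# Footing assertion: named customers can never take more than 100% of output.
assert high_sum <= 1.0 + 1e-9, "named shares exceed the constrained output"

# Whatever is unassigned belongs to customers not named in the split.
residual = (1.0 - high_sum, 1.0 - low_sum)

for name, (lo, hi) in shares.items():
    print(f"{name}: {lo * TOTAL_KPCS:.0f}-{hi * TOTAL_KPCS:.0f} kpcs")
print(f"others: {residual[0] * TOTAL_KPCS:.0f}-{residual[1] * TOTAL_KPCS:.0f} kpcs")
```

The named shares sum to 95-97%, so the split implies a 3-5% residual for everyone else; a per-company forecast that needs more than that residual fails the footing check.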

P4 · Translation to securities — What does the above do to revenue, EPS, and what's priced in?

Mechanical but merciless: segment bridges (LEAP, SiC thermal plates, BMC SAM) → quarterly P&L → EPS → gap vs consensus → target price = chosen multiple × chosen forward basis. The report's craft details matter: the multiple sits at the high end of the historical band (the re-rating IS the trade), the EPS basis rolls forward to keep TPs rising, and the consensus gap widening into out-years marks where the differentiated bet lives (MediaTek: +2.7% vs consensus 2026F → +35.5% 2028F). Failure mode: right thesis, no tradeable expression — or a TP that silently disagrees with its own model.
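
The last two links of that chain are simple enough to sketch. In the example below, the EPS figures and the multiple are hypothetical, chosen only so that the 2026F and 2028F gaps reproduce the +2.7% and +35.5% quoted for MediaTek; `target_price` and `consensus_gap` are illustrative names.

```python
def target_price(multiple: float, forward_eps: float) -> float:
    """TP = chosen multiple x chosen forward basis."""
    return multiple * forward_eps

def consensus_gap(own_eps: float, consensus_eps: float) -> float:
    """Fractional gap of the house estimate over consensus."""
    return own_eps / consensus_eps - 1.0

# Hypothetical EPS ladder by forecast year: (house estimate, consensus).
ladder = {"2026F": (10.27, 10.0), "2027F": (13.0, 11.0), "2028F": (16.26, 12.0)}

gaps = {year: consensus_gap(own, cons) for year, (own, cons) in ladder.items()}
for year, gap in gaps.items():
    print(f"{year}: {gap:+.1%}")

# Rolling the basis from 2027F to 2028F lifts the TP at an unchanged multiple,
# so the basis year is recorded next to the TP instead of being left implicit.
multiple = 20.0  # hypothetical P/E
tp_old_basis = target_price(multiple, ladder["2027F"][0])
tp_new_basis = target_price(multiple, ladder["2028F"][0])
print(f"TP on 2027F basis: {tp_old_basis:.1f}; on 2028F basis: {tp_new_basis:.1f}")
```

A gap that widens year by year, as in this ladder, is the pattern the text describes as marking where the differentiated bet lives.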

P5 · Falsification and timing — What would make this wrong, and when do we find out?

The report carries its own kill switches: EPS-revision breadth rolling over (the true top signal), 10-year yields through 5%, TPU-v9 tape-out on EMIB-T (end-2026E — a dated reality check on the biggest structural threat), project halts (Crusoe Jade, MSFT/G42 Kenya as canaries), memory-cost trajectory versus hyperscaler FCF. Failure mode: a thesis with no resolution criteria is an opinion, and it will be defended long after it dies.

P0, cross-cutting: the pillars from §1 — provenance, footing, vintage discipline — are not a sixth problem but the quality bar every answer must clear.


3. From problems to operations

Each problem decomposes into a small set of epistemic operations — the verbs the system must be able to perform. There are nine, and the entire tool inventory in §5 exists to perform them.

| # | Operation | What it is | Where the report does it |
|---|---|---|---|
| O1 | Census | Enumerate a population exhaustively, don't sample | 280 DC projects; 18+ AWS deals; every CSP quote for 18 months |
| O2 | Normalize | Convert heterogeneous signals to one currency | GW → chips (TDP, 70% load) → CoWoS wafers (÷16 or ÷9) |
| O3 | Corroborate | ≥2 independent paths to each fact; deals confirmed from both counterparties | Rubin Ultra 2-die floorplan validated against the Kyber blade demo; IREN deals visible in both IREN and Dell/Microsoft disclosures |
| O4 | Foot | Force every aggregate and every consumer of it to reconcile | Customer shares sum to output; output ≤ capacity; TP identical in Fig.1, chapter, and appendix |
| O5 | Bound | When the point is unknowable, carry an interval and let constraints tighten it | "2,500–3,500kpcs by 2029F depending on price hikes"; "15-20% VR200 mix" |
| O6 | Diff | Express every claim as a change vs prior vintage and vs consensus | Old/new TP columns; 28→32GW; forecast-revision tables in every chapter |
| O7 | Refute | Attack the claim before publishing it; keep the strongest surviving objection attached | "We once expected testing-time cuts…"; overbooking risk carried alongside the ASPEED Buy |
| O8 | Date | Attach a resolution timestamp to every forward claim | TPU v9 tape-out end-2026E; GB300→VR200 transition late-2Q26F; relief-by-2028F |
| O9 | Render | Project the state into human artifacts, last and least | The 120 pages themselves |

Two of these deserve emphasis because they are where an autonomous system can exceed the desk:

O4/O5 together produce a mechanism the report only gestures at: infeasibility as signal. If the sum of evidenced customer demand exceeds the capacity interval, that is not a modeling error to be smoothed — it is a measurement of overbooking, publishable as such. The constraint solver's residual is a research finding. (The report reaches the same conclusion — "biggest-ever component supply mismatch" — by analyst intuition; the system gets it as an arithmetic byproduct, quarterly, for free.)
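
A minimal sketch of that residual, using interval arithmetic: per-customer demand and capacity are each carried as (low, high) bounds, and the gap between them is reported as an interval too. All figures and customer labels are hypothetical, and `overbooking_residual` is an illustrative name.

```python
def overbooking_residual(demand: dict[str, tuple[float, float]],
                         capacity: tuple[float, float]) -> tuple[float, float]:
    """Interval for (total evidenced demand - capacity), floored at zero.

    A strictly positive lower bound means demand exceeds supply even under
    the most forgiving reading of both intervals: measured overbooking.
    """
    demand_lo = sum(lo for lo, _ in demand.values())
    demand_hi = sum(hi for _, hi in demand.values())
    cap_lo, cap_hi = capacity
    return (max(0.0, demand_lo - cap_hi), max(0.0, demand_hi - cap_lo))

# Hypothetical evidenced wafer demand per customer and capacity, in kpcs.
demand = {"A": (1100.0, 1200.0), "B": (500.0, 600.0), "C": (300.0, 400.0)}
capacity = (1700.0, 1800.0)

low, high = overbooking_residual(demand, capacity)
print(f"overbooking: {low:.0f}-{high:.0f} kpcs")
```

Here the lower bound is positive, so the infeasibility survives every admissible combination of inputs and can be published as a finding instead of being smoothed away.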

O7 at scale is a new product tier. The desk red-teams its thesis over lunch. A workflow can spawn a refuter per material claim — 200 adversarial passes per publication — and publish each claim with its surviving counter-argument. No human desk can match that, and it directly addresses the reader's actual question ("what would these analysts say if they were wrong?").


4. The architecture that falls out

Four layers plus an evaluation harness. The organizing unit is the claim, not the section, the company, or the model — because the warrant (§1) is a property of claims.

4.1 The claim ledger — a derivation DAG under version control

Everything the system knows is one of four node types in a single graph, stored as human-readable files (YAML/Parquet + DuckDB views) in a git repository:

Properties this buys, each mapped to an operation:

4.2 Sensors — and the channel-check substitution stack

Sensors are ingestion agents, each owning a source class and a cadence, each emitting facts into the ledger (never derived claims). The interesting design work is not the scraper list — it is the substitution stack for the one input the desk has and agents don't. For each thing a channel check delivers, the public proxy chain:

| Field-check deliverable | Public substitution chain | Residual gap |
|---|---|---|
| CoWoS/WoS kpcs per node | TSMC capex + tool-maker calls (ASML, AMAT, BESI backlog & lead times) + Taiwan fab construction permits + MOPS monthly revenue of every listed OSAT/substrate name + TSMC's own guided ratios | Point → interval; timing ±1Q |
| OSAT/substrate booking tone | MOPS monthly revenue inflections (Taiwan's unique disclosure: every listed supplier prints revenue monthly), book-to-bill remarks in calls, job postings, local-language trade press | Tone → lagged by ~4-6 weeks |
| Component shortage severity | Distributor lead-time indices, spot-price series (memory, capacitors), price-hike announcements in CJK trade press, purchasing-manager commentary across all downstream industries' calls | Good; arguably better than anecdote |
| Deal confirmation | Two-sided corroboration protocol: every deal has ≥2 counterparties and usually one is SEC/TWSE/HKEX-registered even when the other isn't (IREN 8-Ks confirm Microsoft; Dell confirms IREN; utility interconnection queues confirm GW claims) | Near-complete for material deals |
| Product/packaging intel | Symposium decks (TSMC NA Symposium, NEPCON, GTC, Computex), patents, teardown photos, HBM vendor roadmaps | Confirmed-vs-inferred must be labeled |
| Per-rack BOM constants (BMC/rack, layer counts) | OCP contributions, vendor slide OCR, teardown literature | Sparse; carry as assumptions with review dates |

The MOPS monthly-revenue monitor deserves star billing: Taiwan mandates monthly revenue disclosure for all listed companies. That is a free, 12×/year, ground-truth read on the entire supply chain the report covers — TSMC, ASE, ASPEED, KYEC, EMC, TUC, ZDT, Unimicron, and a hundred smaller names. A monitor that ingests it within hours of each print, normalizes it, and diffs it against the system's own implied trajectories is the single highest-value/lowest-cost sensor in the whole design, and it is the honest backbone of channel-check substitution.
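
The "diff against implied trajectories" step could look like the sketch below. It assumes the monthly prints have already been ingested and normalized into plain dictionaries; the tickers, revenue figures, and the 10% tolerance are hypothetical, and `flag_inflections` is an illustrative name.

```python
def flag_inflections(actual: dict[str, float],
                     implied: dict[str, float],
                     tolerance: float = 0.10) -> dict[str, float]:
    """Return tickers whose reported monthly revenue deviates from the
    system-implied figure by more than `tolerance` (as a fraction)."""
    flags = {}
    for ticker, reported in actual.items():
        expected = implied.get(ticker)
        if not expected:  # no model trajectory for this name: nothing to diff
            continue
        deviation = reported / expected - 1.0
        if abs(deviation) > tolerance:
            flags[ticker] = deviation
    return flags

# Hypothetical monthly revenue (NT$ bn) vs. the ledger's implied trajectory.
actual = {"SUPPLIER_A": 13.2, "SUPPLIER_B": 7.9, "SUPPLIER_C": 4.0}
implied = {"SUPPLIER_A": 11.0, "SUPPLIER_B": 8.0}

flags = flag_inflections(actual, implied)
for ticker, deviation in flags.items():
    print(f"{ticker}: {deviation:+.1%} vs implied")
```

Only the name that prints well above its implied path is flagged; names without a modeled trajectory are skipped, which keeps the uncovered long tail from generating noise.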

Cadences layer as: continuous (news/deal wire), daily (prices, EDGAR/MOPS filings), monthly (MOPS revenue sweep — a fixed calendar event, ~day 10), quarterly (earnings-season transcript sweep — the big recalibration), event-driven (GTC/Computex/symposia, tape-out rumors, project-halt reports).

4.3 Models — code with assumptions factored out

The report's quantitative spine is surprisingly few distinct models. Each becomes a small, tested Python module whose every tunable input is an assumption node in the ledger (so recalibration is a data change, never a code change):

  1. GW-deployment bridge — project ledger → GW by year → chips → wafer demand (the Fig. 3 engine).
  2. Capacity/output bridge — nameplate capacity by step → binding-step solve → realizable output (the 2,000 vs 1,800kpcs engine). The solver takes the chain as data (steps, owners, capacity intervals, lead times), so when the constraint migrates to optics in 2028 the chain grows a node — no new model.
  3. Allocation solver — constrained estimation of share vectors given the output interval + per-customer evidence; residual infeasibility emitted as an overbooking measurement (§3).
  4. Unit×ASP revenue bridge — per customer per generation → the Fig. 35-class master table.
  5. Server-market rollup — GPU supply → module mix → racks (with yield/bottleneck discount) → units/revenue by segment.
  6. Three-statement company model — driver-based quarterly P&L → BS/CF with identity checks; one parameterized engine, N tickers.
  7. Valuation engine — historical multiple bands, TP = multiple × basis, consensus-gap ladder, band-percentile flags. (It should flag the report's own tricks: "TP basis rolled forward from 2027F to 2028F" is a disclosure the system makes automatically.)
  8. TAM/content bridges — attach-rate stacks (BMC per rack, SiC per Feynman, CCL layer content per platform) for the per-name kickers.

Eight models, each under ~500 lines, each with golden tests pinned to the June-2026 report's published numbers.
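
As an illustration of "assumptions factored out", here is a minimal sketch of the first model's core arithmetic. The 70% load and the ÷16 wafer divisor appear in the report's normalization chain and 32GW in its revision tables; the 1,400W TDP is a hypothetical placeholder, and reading "load" as the share of power that reaches the accelerators is an assumption of this sketch.

```python
def gw_to_wafers(gw: float, chip_tdp_watts: float, load: float = 0.70,
                 chips_per_wafer: int = 16) -> tuple[float, float]:
    """Convert deployed gigawatts to (chips, CoWoS wafers).

    Assumption: `load` is the share of facility power that reaches the
    accelerators, so chips = GW * 1e9 * load / TDP.
    """
    chips = gw * 1e9 * load / chip_tdp_watts
    return chips, chips / chips_per_wafer

# Assumptions live in data, not code: recalibration edits this dict only.
assumptions = {"chip_tdp_watts": 1400.0, "load": 0.70, "chips_per_wafer": 16}

chips, wafers = gw_to_wafers(32.0, **assumptions)
print(f"{chips / 1e6:.1f}mn chips, {wafers / 1e3:.0f}kpcs wafers")
```

Switching `chips_per_wafer` to 9 for a larger package, or updating the TDP for a new generation, changes the output without touching the function, which is the property the golden tests would pin.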

4.4 The hypothesis book — theses as objects under test

Research is distinguished from aggregation by carrying theses, and a thesis is only honest if it can die. The hypothesis book is a directory of standing claims, each a file:

id: H-2026-03            # "WoS, not CoW, is the 2027 binding constraint"
status: active           # active | resolved-true | resolved-false | retired
claim: >
  Wafer-on-substrate and small components bind 2027F AI-chip output below
  TSMC's CoW nameplate; owners of those steps capture the price hikes.
evidence_for: [fact-ids...]        # auto-appended by sensors
evidence_against: [fact-ids...]    # auto-appended — see refuter below
resolution:
  criteria: CoWoS output/capacity gap at 4Q27 print; substrate ASP trajectory
  date: 2028-01-31
falsifiers:                        # each becomes a scheduled watch
  - substrate spot ASP rolls over two consecutive months
  - TSMC reports CoWoS output ≈ capacity for 2 quarters
tradeable_expression: [ASE, Unimicron, TUC, EMC positioning]

Two agents service every hypothesis: a curator (routes new facts to the right evidence list) and a standing refuter whose only job is to find disconfirming evidence — searched for as actively as the confirming kind. Falsifiers compile into the scheduled-watch list (§5). When a resolution date arrives, the hypothesis must resolve or be explicitly re-dated with a stated reason — no silent evergreen theses.
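
The "resolve or explicitly re-date" rule is easy to enforce in code. The sketch below mirrors a few fields of the hypothesis file as a dataclass; the class, the `redate_log` field, and the re-dating reason are hypothetical illustrations, not part of the file schema shown above.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Hypothesis:
    id: str
    status: str              # active | resolved-true | resolved-false | retired
    resolution_date: date
    redate_log: list[tuple[date, date, str]] = field(default_factory=list)

    def redate(self, new_date: date, reason: str) -> None:
        """Move the resolution date, recording the slip and its stated reason."""
        if not reason.strip():
            raise ValueError("re-dating requires a stated reason")
        self.redate_log.append((self.resolution_date, new_date, reason))
        self.resolution_date = new_date

def overdue(book: list[Hypothesis], today: date) -> list[str]:
    """Active hypotheses whose resolution date has passed: resolve or re-date."""
    return [h.id for h in book
            if h.status == "active" and h.resolution_date <= today]

book = [Hypothesis("H-2026-03", "active", date(2028, 1, 31))]
print(overdue(book, today=date(2028, 2, 1)))   # the thesis is flagged
book[0].redate(date(2028, 4, 30), reason="4Q27 print delayed")  # hypothetical
print(overdue(book, today=date(2028, 2, 1)))   # no longer overdue, slip logged
```

Keeping the old date, new date, and reason in the log is what makes a slip visible later as information about the thesis.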

The report's ~10 major calls (cycle-not-over; WoS bottleneck; Google-share-rising; agentic-CPU renaissance; EMIB-T coin-flip; FCF squeeze 2027; shortage spillover; neocloud bridge; re-rating persistence; memory-cost double-edge) seed the book on day one.

4.5 Publication passes — views over the ledger

Rendering is terminal and cheap once the state is right. Three artifact classes:

Every artifact embeds its git vintage tag; every number is hyperlinked (in the HTML render) to its provenance subtree.

4.6 The evaluation harness — the report as a labeled dataset

This is the step most designs skip and the one that makes the system trustworthy.


5. The concrete inventory

Mapped to this harness's actual primitives: tools = deterministic code invoked by agents; skills = codified procedures (slash-invocable prompt programs); workflows = multi-agent orchestrations; scheduled agents = cron-driven autonomy.

Tools (deterministic, testable — the ledger and the models)

| Tool | Spec | Serves |
|---|---|---|
| ledger | Claim-graph CRUD over git-backed store; provenance walk; dirty-propagation; footing assertions as CI | P0, O3/O4/O6 |
| bridge-gw | Project rows → GW/chips/wafers with interval arithmetic | P1, O2/O5 |
| bridge-capacity | Step-chain → binding constraint → output interval; chain is data | P2, O5 |
| solve-allocation | Constrained share estimation; infeasibility → overbooking metric | P3, O4/O5 |
| model-company | Parameterized 3-statement + segment-bridge engine | P4 |
| value | Bands, TP arithmetic, consensus-gap ladder, basis-roll disclosure | P4 |
| rollup-server | GPU→module→rack→segment units/revenue | P1/P3 |
| render | Typed data-contracts → tables/charts/Gantt/report assembly | O9 |
| score | Forecast-resolution ledger and calibration metrics | §4.6 |
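
Dirty-propagation in the `ledger` tool amounts to a downstream walk of the derivation DAG: when an input changes, everything derived from it is marked for recomputation. A minimal sketch follows, assuming the graph is held as an adjacency map; the node ids and `dirty_set` are hypothetical.

```python
from collections import deque

def dirty_set(dependents: dict[str, list[str]], changed: str) -> set[str]:
    """All nodes downstream of `changed` in the derivation DAG (breadth-first)."""
    dirty: set[str] = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, []):
            if child not in dirty:
                dirty.add(child)
                queue.append(child)
    return dirty

# Hypothetical node ids; edges point from an input to the claims that consume it.
dependents = {
    "fact:tsmc-capex": ["claim:cowos-output"],
    "assumption:chips-per-wafer": ["claim:cowos-output"],
    "claim:cowos-output": ["claim:allocation", "claim:server-units"],
    "claim:allocation": ["claim:tp-ase"],
}
print(sorted(dirty_set(dependents, "fact:tsmc-capex")))
```

The same walk, run in the opposite direction over inverted edges, would give the provenance subtree for any published number.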

Skills (procedures an agent follows; each ends by writing facts/updating the ledger)

| Skill | Procedure |
|---|---|
| /ingest-call TICKER | Fetch transcript → extract KPI/capex/capacity quotes verbatim with spans → corroborate figures vs filings → facts into ledger → dirty-propagate |
| /ingest-deal URL | Parse announcement → normalize units → seek the counterparty's disclosure (two-sided rule) → dedup vs project ledger → status-flag |
| /mops-sweep | Monthly Taiwan revenue sweep → normalize → diff vs system-implied trajectories → flag inflections to hypothesis curator |
| /update-hypothesis H-ID | Curator pass: route new evidence, recheck falsifiers, draft status note |
| /red-team CLAIM | Structured refutation: strongest counter-case, missing-evidence list, verdict + confidence; attaches to the claim |
| /foot | Run all DAG assertions; report violations with node paths (pre-publication gate) |
| /company-refresh TICKER | Rerun model from latest facts → revision table → TP check → chapter render |
| /flash EVENT | Blast-radius computation → short note render → publish |
| /anchor-pass | Orchestrates the full quarterly publication (invokes the workflow below) |
| /recoverability-audit | The §4.6 backtest against a pinned source cutoff |

Workflows (fan-out orchestrations; where parallel agents earn their cost)

| Workflow | Shape |
|---|---|
| earnings-season-sweep | ~60 transcripts × /ingest-call in parallel → barrier → cross-company contradiction scan (same fact claimed differently by two counterparties is a finding) → hypothesis-book routing |
| deal-census | Multi-modal search fan-out (newsrooms, 8-K, CJK trade press, ISO queues, permits) → dedup/entity-resolution → two-sided corroboration pass → ledger upsert; loop-until-dry, not fixed-N |
| allocation-refresh | Per-customer evidence agents (roadmap, guidance, rack data) in parallel → solve-allocation → infeasibility report → refuter pass on the share vector |
| anchor-report | Chapter renders in parallel from one pinned vintage → footing gate → refuter fleet over material claims (O7 at scale) → assembly → human sign-off |
| red-team-fleet | For publication: one refuter per material claim, adversarial verify with majority-kill, survivors ship with their counter-arguments |
| moat-audit | The recoverability backtest: claim-extraction agents over the target report → per-claim recovery attempts from pinned sources → moat map |
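
One possible reading of "majority-kill" in red-team-fleet is a vote among verifiers on whether a refutation landed. The sketch below encodes that reading; the interpretation itself, the tie-breaking rule, the claim ids, and the votes are all assumptions made for illustration.

```python
def survives(verdicts: list[bool]) -> bool:
    """True if the claim survives: strictly fewer than half the verifiers
    judged it refuted. Ties kill the claim (an assumed, conservative rule)."""
    kills = sum(verdicts)
    return kills * 2 < len(verdicts)

# Hypothetical verifier votes per claim: True means "the refutation succeeded".
fleet = {
    "claim:wos-binds-2027": [False, True, False],
    "claim:google-share-rising": [True, True, False],
    "claim:fcf-squeeze-2027": [True, False],
}
survivors = [claim for claim, votes in fleet.items() if survives(votes)]
print(survivors)
```

Surviving claims would then be rendered alongside the strongest counter-argument raised against them, as the workflow describes.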

Scheduled agents (the autonomy layer)

| Schedule | Agent |
|---|---|
| Continuous/hourly | Deal & news watcher (feeds /ingest-deal; halt/pause keywords page immediately) |
| Daily | Filings sweep (EDGAR/MOPS/HKEX), price & yield monitor (the 10y>5% falsifier lives here) |
| ~Day 10 monthly | /mops-sweep, the supply-chain heartbeat |
| Earnings season | earnings-season-sweep trigger per calendar |
| Weekly | Hypothesis-book review: falsifier checks, resolution-date enforcement, refuter refresh |
| Quarterly | anchor-report workflow → human review → publish; score update |

6. What stays human, and the degradation contract

The system should be honest about its boundary rather than paper over it:


7. Build order — compounding assets first

The correct ordering criterion is not architectural elegance; it is time-in-market of the assets that compound. Two things in this design get more valuable every week they run, and cannot be backfilled later: the deal/project ledger (P1's census) and the MOPS monitor's history (the supply-chain heartbeat). Start them first even while everything else is a stub.

Phase 0 (weeks 1-2) — the spine. ledger tool with git vintaging + footing assertions; seed assumptions and the hypothesis book (10 theses from the report); stand up the daily deal watcher and /mops-sweep. The system is already useful here: it is a living tracker with provenance.

Phase 1 (weeks 3-6) — measurement. /ingest-call + earnings-season-sweep; bridge-gw and bridge-capacity with golden tests against Figs 3/18-22; two-sided deal corroboration; first flash notes fire off real events.

Phase 2 (weeks 7-10) — the calls. solve-allocation (with the overbooking residual), model-company for the nine tickers, value, /red-team. First allocation-refresh produces the system's own Fig. 35-class table with intervals.

Phase 3 (weeks 11-14) — the proof. moat-audit against the June-2026 report: publish the recoverability map internally. First full anchor-report pass, human-reviewed, diffed against Nomura's own next update when it ships — that comparison is the system's first real exam.

Steady state. Quarterly anchors, event flashes, monthly heartbeat, weekly hypothesis hygiene, scoreboard accruing. Headcount equivalent: the desk that wrote this report is ~6 analysts; the goal is not to fire them — it is a system where one analyst-operator plus the fleet covers what six did, with better footing, total provenance, a public track record, and no claim it can't defend.


One-line summary: build a git-versioned claim graph fed by exhaustive multilingual sensors, with eight small models whose assumptions are data, a hypothesis book that is forced to resolve, adversarial refutation at publication, and an evaluation harness that treats the Nomura report itself as the labeled test set — then the 120 pages become a render, and the research becomes the thing that runs every day.