<!--
******************************************************
* RECALL COMPILED OUTPUT
* SOURCE: (embedded below)
* RECALL VERSION: 1.2.1
******************************************************

* IDENTIFICATION DIVISION.
*    PROGRAM-ID.   UC-254-CASE.
*    PAGE-TITLE.   CASE-TITLE-FULL.
*    DESCRIPTION.  META-DESCRIPTION.
*    AUTHOR.       "StratIQX Research".
*    DATE-WRITTEN. "July 2026".
*
* ENVIRONMENT DIVISION.
*    SUPPRESS-DEFAULT-CSS YES.
*    LOAD PLUGIN @stratiqx/recall-components.
*    COPY FROM "@stratiqx/recall-components/themes/stratiqx-case.rcpy".
*
* DATA DIVISION.
*
*    WORKING-STORAGE SECTION.
*       01 CASE-ID              PIC X(8)    VALUE "UC-254".
*       01 CASE-TITLE           PIC X(200)  VALUE "The Efficiency Escape Hatch: *The Fragile Number in the Bear Case*".
*       01 CASE-TITLE-FULL      PIC X(200)  VALUE "UC-254: The Efficiency Escape Hatch — The Fragile Number in the Bear Case".
*       01 CASE-SUBTITLE        PIC X(200)  VALUE "If Cost-Per-Token Keeps Collapsing, Today's Insane Demand Curve Is Tomorrow's Rational One".
*       01 META-DESCRIPTION     PIC X(500)  VALUE "The bear case on AI capex rests on one assumption: that cost-per-token stays high. It has fallen roughly 10x a year — 280x for GPT-3.5-class capability in under two years. This is the deliberate counterexample to the reckoning cluster: the efficiency escape hatch, argued in full, and then its own honest weakness. A 6D amplifying analysis.".
*       01 HERO-LABEL           PIC X(80)   VALUE "• 6D Amplifying Analysis · The Counterexample".
*       01 CASE-EXEC-SUMMARY    PIC X(1600) VALUE "Every bearish case in this cluster shares one hidden assumption: that a token stays expensive. It has not. The cost to reach a fixed capability level has fallen roughly 10x per year — by Stanford's measure, 280x for GPT-3.5-class quality in under two years, from $20 to $0.07 per million tokens.[1][2] Mixture-of-experts models activate ~5% of their parameters per token; distilled 32B models now beat last year's frontier reasoning models; compute-to-fixed-capability halves about every 8 months.[3] Satya Nadella named the mechanism: Jevons paradox — cheaper AI means more of it, not less.[4] If that curve holds, a demand forecast that looks insane at today's cost is rational at next year's, and the overbuild thesis is the fragile number on the page. This case argues the escape hatch in full — and then, honestly, names where it does not hold.".
*       01 CASE-TYPE            PIC X(80)   VALUE "Amplifying · AI Economics · Counterexample".
*       01 CASE-SECTOR          PIC X(80)   VALUE "AI Infrastructure / Compute Economics".
*       01 CASE-DATE            PIC X(30)   VALUE "July 2026".
*       01 CASE-REVIEW-DATE     PIC X(30)   VALUE "".
*       01 THEME-OVERRIDES      PIC X(600)  VALUE "--accent:#059669;--accent-rgb:5,150,105;--accent-bg:rgba(5,150,105,0.08);--accent-glow:#34d399;--border-accent:rgba(5,150,105,0.2);".
*       01 CASE-CHIRP           PIC X(10)   VALUE "80".
*       01 CASE-DRIFT           PIC X(6)    VALUE "42".
*       01 CASE-CONFIDENCE      PIC X(6)    VALUE "0.79".
*       01 CASE-FETCH           PIC X(10)   VALUE "2,654".
*       01 CASE-VERDICT         PIC X(30)   VALUE "EXECUTE — HIGH PRIORITY".
*       01 SECTION-1-NUMBER     PIC X(4)    VALUE "01".
*       01 SECTION-1-TITLE      PIC X(60)   VALUE "The Insight".
*       01 SECTION-1-BODY       PIC X(1600) VALUE "This case exists to argue against the rest of its own cluster. The reckoning thesis — that the ~$725B AI buildout will not earn its return — rests on an assumption almost nobody states out loud: that the cost of running the models stays roughly where it is. It has not stayed. It has collapsed.".
*       01 SECTION-1-BODY-2     PIC X(1600) VALUE "The numbers are not marketing. a16z documented that the cost to run a model of equivalent performance has fallen about 10x per year: roughly 1,000x over three years, from $60 to $0.06 per million tokens for GPT-3-class capability.[1] Stanford's AI Index put it more conservatively and more precisely: the cost to query a model at GPT-3.5 quality fell 280x in under two years, from $20.00 to $0.07 per million tokens.[2] Epoch AI, studying the question independently, found that the compute needed to reach a fixed capability halves roughly every 8 months, far faster than Moore's Law.[3]".
*       01 SECTION-1-BODY-3     PIC X(1600) VALUE "The mechanism is real engineering, not hope. Mixture-of-experts architectures activate only ~5% of a model's parameters per token (DeepSeek: 671B total, 37B active). Speculative decoding delivers 2–3x throughput in production. Distillation now yields 32B models that beat last year's frontier reasoning models (R1-Distill-Qwen-32B against OpenAI's o1-mini) on hard benchmarks.[3][5] Satya Nadella gave the demand side its name the day DeepSeek's release sent Nvidia down ~17%: Jevons paradox strikes again; as AI gets more efficient, its use skyrockets.[4] If a token keeps getting cheaper and demand keeps expanding to absorb the savings, the capex is not an overbuild; it is a floor.".
*       01 SECTION-1-BODY-4     PIC X(1600) VALUE "Now the honest weakness, the seam a smart critic attacks, named before they do. Two things. First, what fell was the cost to reach a *fixed* capability; the frontier price *floor* barely moved: GPT-3's 2021 launch price and a late-2024 frontier model both sat near $60 per million output tokens.[1] Second, Jevons only rescues *provider* revenue if the price elasticity of demand exceeds 1, and the economists asked (Northeastern's Hanser and Venkatesan) explicitly decline to assert that it does.[6] Sequoia's David Cahn, no bear, put the risk in his own words: GPU compute is turning into a commodity, competed down to marginal cost.[6] Cheaper tokens can grow the buyer's bill while compressing the seller's margin toward zero. Efficiency is the escape hatch, but it can deflate revenue as surely as it expands demand. It cuts both ways, and this case says so.".
*       01 SECTION-1-STAT           PIC X(20)   VALUE "280x".
*       01 SECTION-1-STAT-LABEL     PIC X(140)   VALUE "Fall in cost to run GPT-3.5-class capability, Nov 2022 to Oct 2024 (Stanford AI Index)".
*       01 SECTION-1-STAT-CONTEXT   PIC X(500)  VALUE "$20.00 to $0.07 per million tokens for the same capability, in under two years.[2] The bear case assumes this stops. Its most honest counter is that the frontier price floor did not fall at all — the collapse is in reaching yesterday's bar, not today's.[1]".
*       01 SECTION-2-NUMBER     PIC X(4)    VALUE "02".
*       01 SECTION-2-TITLE      PIC X(60)   VALUE "The Timeline".
*       01 SECTION-2-BODY       PIC X(1600) VALUE "The efficiency curve that the bear case has to assume away — and the shock that proved it moves markets.".
*       01 SECTION-2-BODY-2     PIC X(1600) VALUE "".
*       01 SECTION-2-BODY-3     PIC X(1600) VALUE "".
*       01 SECTION-2-QUOTE      PIC X(400)  VALUE "Jevons paradox strikes again! As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of. — Satya Nadella, Microsoft CEO, January 27, 2025".
*       01 SECTION-3-NUMBER     PIC X(4)    VALUE "03".
*       01 SECTION-3-TITLE      PIC X(60)   VALUE "6D Cascade Analysis".
*       01 SECTION-3-BODY       PIC X(1600) VALUE "The cascade originates in D5 (Quality) because the lever is capability-per-dollar: the same output delivered at a collapsing cost.[1][2] From D5 it amplifies into D6 (the operational engineering that makes the fall real: mixture-of-experts, speculative decoding, distillation, quantization) and D2 (the demand economics: Jevons, cheaper compute pulling more use) together, then into D1 (broader access as inference approaches free) and D3 (the developers and workloads that expand to use the cheaper capacity). D4 (the regulatory/structural question of whether commoditized inference sustains a market) is the longest-lag dimension. This case is deliberately the counter-cascade to the cluster: [UC-251] documents the market pricing an overbuild; UC-254 is the case that, if the efficiency curve holds, breaks that read. [UC-044] is the sibling efficiency case it amplifies; [UC-220] is the buildout whose demand curve efficiency would rationalize. The honest hedge is stated in the analysis, not hidden: the frontier floor did not fall, and falling cost can deflate provider revenue.".
*       01 SECTION-3-BODY-2     PIC X(1600) VALUE "".
*       01 SECTION-3-BODY-3     PIC X(1600) VALUE "".
*       01 INSIGHTS-NUMBER      PIC X(4)    VALUE "04".
*       01 INSIGHTS-TITLE       PIC X(40)   VALUE "Key Insights".
*       01 FETCH-DATA-CHIRP         PIC X(60)   VALUE "80".
*       01 FETCH-DATA-DRIFT         PIC X(20)   VALUE "42".
*       01 FETCH-DATA-CONFIDENCE    PIC X(6)    VALUE "0.79".
*       01 FETCH-DATA-FETCH         PIC X(30)   VALUE "2,654".
*       01 FETCH-DATA-VERDICT       PIC X(50)   VALUE "EXECUTE — HIGH PRIORITY".
*       01 FETCH-DATA-CONTEXT       PIC X(900)  VALUE "FETCH 2,654 sits just below its sibling UC-251 (2,692) by design — this is the counterweight to the rotation, not a louder claim than it. DRIFT 42 is the widest in the cluster on purpose: the methodology is strong (the cost-decline is primary-verified by Stanford and Epoch) but the performance claim — that efficiency rescues the buildout's economics — is genuinely contested, with named economists declining to assert the elasticity it requires. Confidence 0.79, the lowest in the cluster, because this is an argument that concedes its own load-bearing weakness: the frontier floor did not fall, and cheaper tokens can cut provider revenue as well as expand demand. A strong counterexample is measured by how honestly it states the case against itself.".
*       01 CAL-SOURCE-HINT      PIC X(240)  VALUE "efficiency-escape-hatch · amplifying · D5 origin · the counterexample — if cost-per-token keeps falling the bear case breaks".
*       01 CAL-SOURCE-DOI       PIC X(40)   VALUE "10.5281/zenodo.18905193".
*       01 CAL-SOURCE-CODE      PIC X(3000) VALUE "-- UC-254: The Efficiency Escape Hatch: 6D Amplifying Cascade (COUNTEREXAMPLE)\n-- The fragile number in the bear case (counters UC-251/255; amplifies UC-044)\nFORAGE efficiency_escape_hatch\nWHERE cost_per_token_collapsing = true\n  AND demand_expands_to_fill = true\n  AND bear_case_assumes_high_cost = true\nACROSS D5, D6, D2, D1, D3, D4\nDEPTH 3\nSURFACE efficiency_escape_hatch\n\nDIVE INTO capability_per_dollar\nWHEN fixed_capability_cost_falls_10x_yr = true\n  AND frontier_floor_holds = true\nTRACE efficiency_counter_cascade\nEMIT efficiency_escape_hatch_signal\n\nDRIFT efficiency_escape_hatch\nMETHODOLOGY 84\nPERFORMANCE 42\n\nFETCH efficiency_escape_hatch\nTHRESHOLD 1000\nON EXECUTE CHIRP high 'Cost to run a fixed-capability model fell ~10x a year - 280x for GPT-3.5-class in under two years - so a demand curve that looks insane today is rational tomorrow and the overbuild thesis is the fragile number; but the frontier floor did not fall and cheaper tokens can deflate provider revenue - the escape hatch cuts both ways'\n\nSURFACE analysis AS json".
*       01 CTA-HEAD             PIC X(200)  VALUE "Before you short the buildout, price the token it runs on falling 10x a year.".
*       01 CTA-BODY             PIC X(360)  VALUE "Every overbuild thesis has one assumption it never states. Find it. Here, it is that a token stays expensive.".
*       01 FOOTER-BRAND         PIC X(80)   VALUE "StratIQX · Strategic Cascade Intelligence".
*       01 FOOTER-META          PIC X(100)  VALUE "Case UC-254 · July 2026 · v1.0 · The Efficiency Escape Hatch".
*       01 FOOTER-DISCLOSURE    PIC X(280)  VALUE "All dimension scores and FETCH calculations are derived from the 6D Foraging Methodology™ applied to cited primary sources.".
*       01 FOOTER-LEGAL         PIC X(200)  VALUE "Disclaimer https://stratiqx.com/disclaimer, Terms https://stratiqx.com/terms, Privacy https://stratiqx.com/privacy".
*       01 SOURCES-INTRO        PIC X(600)  VALUE "Seven sources: Stanford's AI Index and a16z for the primary cost-decline series, Epoch AI for the independent efficiency rate, the architectural levers (mixture-of-experts, distillation, speculative decoding), Nadella's Jevons framing — and, for the honest counter, the frontier-floor caveat and the economists who decline to assert the elasticity the bull case needs.".
*       01 WINDOW-STATUS        PIC X(10)   VALUE "".
*       01 WINDOW-HEALTH        PIC X(4)    VALUE "".
*       01 WINDOW-CONTEXT       PIC X(400)  VALUE "".
*
*    ITEMS SECTION.
*       01 HERO-STATS.
*          05 HERO-STATS-1.
*             10 HERO-STATS-1-VALUE  PIC X(20)  VALUE "280x".
*             10 HERO-STATS-1-LABEL  PIC X(40)  VALUE "Cost fall, GPT-3.5-class <2yr (Stanford)".
*          05 HERO-STATS-2.
*             10 HERO-STATS-2-VALUE  PIC X(20)  VALUE "~10x / yr".
*             10 HERO-STATS-2-LABEL  PIC X(40)  VALUE "Inference cost decline, fixed capability".
*          05 HERO-STATS-3.
*             10 HERO-STATS-3-VALUE  PIC X(20)  VALUE "8 months".
*             10 HERO-STATS-3-LABEL  PIC X(40)  VALUE "Fixed-capability compute halving (Epoch)".
*          05 HERO-STATS-4.
*             10 HERO-STATS-4-VALUE  PIC X(20)  VALUE "~5%".
*             10 HERO-STATS-4-LABEL  PIC X(40)  VALUE "Parameters active per token (MoE)".
*          05 HERO-STATS-5.
*             10 HERO-STATS-5-VALUE  PIC X(20)  VALUE "~$60/M".
*             10 HERO-STATS-5-LABEL  PIC X(40)  VALUE "Frontier price floor (did not fall)".
*          05 HERO-STATS-6.
*             10 HERO-STATS-6-VALUE  PIC X(20)  VALUE "6 of 6".
*             10 HERO-STATS-6-LABEL  PIC X(40)  VALUE "Dimensions amplified".
*       01 CASCADE.
*          05 CASCADE-1.
*             10 CASCADE-1-CASCADE-PATTERN    PIC X(50)  VALUE "D5 > D6+D2 > D1+D3 > D4".
*          05 CASCADE-2.
*             10 CASCADE-2-CASCADE-DIMS-HIT   PIC X(8)   VALUE "6 of 6".
*          05 CASCADE-3.
*             10 CASCADE-3-CASCADE-MULTIPLIER PIC X(20)  VALUE "Efficiency = a floor".
*          05 CASCADE-4.
*             10 CASCADE-4-CASCADE-FETCH      PIC X(10)  VALUE "2,654".
*       01 CAL-TRACE.
*          05 CAL-TRACE-1.
*             10 CAL-TRACE-1-LAYER   PIC X(10)   VALUE "SENSE".
*             10 CAL-TRACE-1-DETAIL  PIC X(1000) VALUE "FORAGE: cost to run a fixed-capability model fell ~10x/yr (a16z: $60 to $0.06/M for GPT-3-class over 3yr; Stanford: 280x, $20 to $0.07/M for GPT-3.5-class in <2yr); Epoch: compute-to-fixed-capability halves ~every 8mo. Levers: mixture-of-experts (DeepSeek 671B total / 37B active), speculative decoding (2-3x), distillation (R1-Distill-32B beats o1-mini). Nadella: Jevons paradox strikes again (Jan 27 2025). Honest counter: frontier price floor held near $60/M output; Jevons rescues provider revenue only if elasticity>1 (economists decline); Cahn: compute is a commodity competed to marginal cost. Signal: efficiency is the escape hatch, and it cuts both ways.".
*          05 CAL-TRACE-2.
*             10 CAL-TRACE-2-LAYER   PIC X(10)   VALUE "ANALYZE".
*             10 CAL-TRACE-2-DETAIL  PIC X(1000) VALUE "DRIFT 42 (widest in the cluster) — methodology strong (cost-decline primary-verified, Stanford + Epoch) against a contested performance claim (does efficiency rescue the buildout's economics?). D5 origin (capability-per-dollar) cascades to D6 (the engineering levers) + D2 (Jevons demand economics), then D1 (access) + D3 (workloads), with D4 (does commoditized inference sustain a market) the longest lag. This is the deliberate counter-cascade to UC-251/255 — the case built to break the bear read if the efficiency curve holds.".
*          05 CAL-TRACE-3.
*             10 CAL-TRACE-3-LAYER   PIC X(10)   VALUE "DECIDE".
*             10 CAL-TRACE-3-DETAIL  PIC X(1000) VALUE "FETCH 2,654 exceeds threshold 1,000. EXECUTE — HIGH PRIORITY, calibrated just below UC-251 so the counterweight does not out-shout the rotation. The cost-decline is primary; the rescue claim is conceded as contested. WATCH: whether cost-per-token deflation slows (it has not — still steep, which supports this case) and whether provider margins hold or commoditize toward marginal cost (Cahn's risk). Confidence 0.79 reflects a counterexample that states its own weakest joint on the page.".
*       01 SOURCES.
*          05 SOURCES-1.
*             10 SOURCES-1-NUM       PIC X(4)    VALUE "1".
*             10 SOURCES-1-TIER      PIC X(80)   VALUE "Tier 1 — Official & Structural Data".
*             10 SOURCES-1-TEXT      PIC X(500)  VALUE "a16z (Guido Appenzeller) — LLMflation: LLM inference cost is going down fast (November 12, 2024). Cost for equivalent performance falls ~10x/year; ~1,000x over three years, from $60 to $0.06 per million tokens for GPT-3-class capability. Includes the load-bearing caveat this case cites: the frontier price floor did not fall — a late-2024 frontier model still cost ~$60/M output, the same as GPT-3 in 2021.".
*             10 SOURCES-1-URL       PIC X(200)  VALUE "https://a16z.com/llmflation-llm-inference-cost/".
*             10 SOURCES-1-URL-LABEL PIC X(30)   VALUE "a16z.com · Nov 2024".
*          05 SOURCES-2.
*             10 SOURCES-2-NUM       PIC X(4)    VALUE "2".
*             10 SOURCES-2-TEXT      PIC X(500)  VALUE "Stanford HAI — 2025 AI Index Report, Research & Development chapter. The cost to query a model at GPT-3.5-equivalent quality (MMLU 64.8) fell from $20.00 per million tokens (November 2022) to $0.07 (October 2024, Gemini-1.5-Flash-8B) — a more than 280-fold reduction in ~18 months. The cleanest fully-primary-verified figure in the case.".
*             10 SOURCES-2-URL       PIC X(200)  VALUE "https://hai.stanford.edu/ai-index/2025-ai-index-report/research-and-development".
*             10 SOURCES-2-URL-LABEL PIC X(30)   VALUE "hai.stanford.edu · 2025".
*          05 SOURCES-3.
*             10 SOURCES-3-NUM       PIC X(4)    VALUE "3".
*             10 SOURCES-3-TEXT      PIC X(500)  VALUE "Epoch AI — Algorithmic progress in language models (231 models through March 2024). Compute required to reach a fixed capability level halves roughly every 8 months (95% CI: 5–14 months), far faster than Moore's Law. Authors' caveat, cited for honesty: 60–95% of the gains came from more compute/data, only 5–40% from novel algorithms. Levers: mixture-of-experts, distillation, speculative decoding.".
*             10 SOURCES-3-URL       PIC X(200)  VALUE "https://epoch.ai/blog/algorithmic-progress-in-language-models".
*             10 SOURCES-3-URL-LABEL PIC X(30)   VALUE "epoch.ai · Mar 2025".
*          05 SOURCES-4.
*             10 SOURCES-4-NUM       PIC X(4)    VALUE "4".
*             10 SOURCES-4-TEXT      PIC X(500)  VALUE "Satya Nadella (Microsoft CEO), on X, January 27, 2025: Jevons paradox strikes again! As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of. Posted the day DeepSeek's efficiency drove a ~$600B one-day Nvidia decline — the canonical bull-case demand argument.".
*             10 SOURCES-4-URL       PIC X(200)  VALUE "https://x.com/satyanadella/status/1883753899255046301".
*             10 SOURCES-4-URL-LABEL PIC X(30)   VALUE "x.com · Jan 27 2025".
*          05 SOURCES-5.
*             10 SOURCES-5-NUM       PIC X(4)    VALUE "5".
*             10 SOURCES-5-TIER      PIC X(80)   VALUE "Tier 2 — Industry Analysis".
*             10 SOURCES-5-TEXT      PIC X(500)  VALUE "DeepSeek-V3 / R1 technical reports and Hugging Face model cards. Mixture-of-experts: 671B total parameters, 37B active per token (~5.5%). R1-Distill-Qwen-32B beats OpenAI o1-mini on AIME/MATH/GPQA. R1 (Jan 20, 2025) rivaled o1 at API pricing of $0.55/$2.19 per million input/output tokens vs o1's $15/$60: the concrete proof that capability is diffusing down the cost curve.".
*             10 SOURCES-5-URL       PIC X(200)  VALUE "https://huggingface.co/deepseek-ai/DeepSeek-R1".
*             10 SOURCES-5-URL-LABEL PIC X(30)   VALUE "huggingface.co · 2025".
*          05 SOURCES-6.
*             10 SOURCES-6-NUM       PIC X(4)    VALUE "6".
*             10 SOURCES-6-TEXT      PIC X(500)  VALUE "The honest counter (Northeastern economists + Sequoia). Northeastern's Philip Hanser and Madhavi Venkatesan explicitly decline to assert that AI demand has a price elasticity above 1, the condition Jevons requires to grow provider revenue rather than just buyer spend. Sequoia's David Cahn, in his own words: GPU compute is turning into a commodity, competed down to marginal cost. The case against the escape hatch, stated fairly.".
*             10 SOURCES-6-URL       PIC X(200)  VALUE "https://news.northeastern.edu/2025/02/07/jevons-paradox-ai-future/".
*             10 SOURCES-6-URL-LABEL PIC X(30)   VALUE "northeastern.edu · Feb 2025".
*          05 SOURCES-7.
*             10 SOURCES-7-NUM       PIC X(4)    VALUE "7".
*             10 SOURCES-7-TEXT      PIC X(500)  VALUE "The margin caveat (Ed Zitron and inference-economics analysis). Newer reasoning models burn far more tokens per query, so per-query cost has in some cases risen even as per-token price fell; AI-provider gross margins (~52%) run well below traditional SaaS (70–90%) because inference cost scales with every query. This is why falling cost-per-token can deflate revenue and compress margin, the seam this case concedes.".
*             10 SOURCES-7-URL       PIC X(200)  VALUE "https://www.wheresyoured.at/ais-economics-dont-make-sense/".
*             10 SOURCES-7-URL-LABEL PIC X(30)   VALUE "wheresyoured.at · Apr 2026".
*       01 INSIGHTS.
*          05 INSIGHTS-1.
*             10 INSIGHTS-1-TITLE  PIC X(80)   VALUE "Every overbuild thesis hides one assumption".
*             10 INSIGHTS-1-BODY   PIC X(800)  VALUE "That the unit cost of the thing stays where it is. For AI, the unit is a token, and it has fallen ~10x a year.[1] Before you call the buildout an overbuild, price the token it runs on falling by an order of magnitude annually — then re-run the demand curve.".
*          05 INSIGHTS-2.
*             10 INSIGHTS-2-TITLE  PIC X(80)   VALUE "Efficiency is engineering, not hope".
*             10 INSIGHTS-2-BODY   PIC X(800)  VALUE "Mixture-of-experts (~5% of parameters active), speculative decoding (2–3x throughput), distillation (32B beats last year's frontier). These are shipped, measured techniques.[3] The cost fall is not a forecast; it already happened, measured twice (a16z and Stanford), and Epoch independently confirmed the underlying efficiency rate.".
*          05 INSIGHTS-3.
*             10 INSIGHTS-3-TITLE  PIC X(80)   VALUE "The frontier floor did not fall — the honest asterisk".
*             10 INSIGHTS-3-BODY   PIC X(800)  VALUE "What collapsed was the cost to reach a fixed capability. The frontier price floor stayed near $60/M output from 2021 to late 2024.[1] The escape hatch works for commodity capability, not for the newest frontier — a distinction the loudest bull cases skip and this one does not.".
*          05 INSIGHTS-4.
*             10 INSIGHTS-4-TITLE  PIC X(80)   VALUE "It cuts both ways".
*             10 INSIGHTS-4-BODY   PIC X(800)  VALUE "Jevons grows the buyer's bill; it does not automatically grow the seller's margin. If inference commoditizes to marginal cost — Cahn's own warning — cheaper tokens can deflate provider revenue faster than they expand demand.[6][7] The escape hatch is real and double-edged; a counterexample that hid the second edge would not be worth citing.".
*       01 TIMELINE.
*          05 TIMELINE-1.
*             10 TIMELINE-1-DATE      PIC X(40)   VALUE "Nov 2022 → Oct 2024".
*             10 TIMELINE-1-TITLE     PIC X(80)   VALUE "280x, in under two years".
*             10 TIMELINE-1-BODY      PIC X(400)  VALUE "Stanford's AI Index measures the cost to query a GPT-3.5-class model falling from $20.00 to $0.07 per million tokens — a 280x reduction — as capability diffuses down to smaller, cheaper models.[2]".
*             10 TIMELINE-1-DOT       PIC X(20)   VALUE "breakthrough".
*             10 TIMELINE-1-TAG       PIC X(60)   VALUE "The Curve".
*             10 TIMELINE-1-TAG-TYPE  PIC X(20)   VALUE "".
*          05 TIMELINE-2.
*             10 TIMELINE-2-DATE      PIC X(40)   VALUE "Nov 2024".
*             10 TIMELINE-2-TITLE     PIC X(80)   VALUE "a16z names it: LLMflation".
*             10 TIMELINE-2-BODY      PIC X(400)  VALUE "a16z documents that inference cost for equivalent performance falls about 10x per year — roughly 1,000x over three years for GPT-3-class capability. The efficiency curve gets a name and a slope.[1]".
*             10 TIMELINE-2-DOT       PIC X(20)   VALUE "breakthrough".
*             10 TIMELINE-2-TAG       PIC X(60)   VALUE "".
*             10 TIMELINE-2-TAG-TYPE  PIC X(20)   VALUE "".
*          05 TIMELINE-3.
*             10 TIMELINE-3-DATE      PIC X(40)   VALUE "Jan 27, 2025".
*             10 TIMELINE-3-TITLE     PIC X(80)   VALUE "DeepSeek proves it moves markets".
*             10 TIMELINE-3-BODY      PIC X(400)  VALUE "DeepSeek R1 matches frontier reasoning at a fraction of the cost. Nvidia falls ~17% in a day — the largest one-day market-cap loss in history — and Nadella posts: Jevons paradox strikes again. Efficiency is now a market force, not a footnote.[4]".
*             10 TIMELINE-3-DOT       PIC X(20)   VALUE "breakthrough".
*             10 TIMELINE-3-TAG       PIC X(60)   VALUE "The Shock".
*             10 TIMELINE-3-TAG-TYPE  PIC X(20)   VALUE "".
*          05 TIMELINE-4.
*             10 TIMELINE-4-DATE      PIC X(40)   VALUE "Mar 2025".
*             10 TIMELINE-4-TITLE     PIC X(80)   VALUE "Independently confirmed".
*             10 TIMELINE-4-BODY      PIC X(400)  VALUE "Epoch AI finds compute needed to reach a fixed performance level halves roughly every 8 months — faster than Moore's Law. But its own authors note 60–95% of gains came from scaling, only 5–40% from algorithms.[3]".
*             10 TIMELINE-4-DOT       PIC X(20)   VALUE "breakthrough".
*             10 TIMELINE-4-TAG       PIC X(60)   VALUE "".
*             10 TIMELINE-4-TAG-TYPE  PIC X(20)   VALUE "".
*          05 TIMELINE-5.
*             10 TIMELINE-5-DATE      PIC X(40)   VALUE "2026".
*             10 TIMELINE-5-TITLE     PIC X(80)   VALUE "The floor that did not fall".
*             10 TIMELINE-5-BODY      PIC X(400)  VALUE "The honest asterisk holds through 2026: the frontier price floor stayed near $60 per million output tokens, and named economists decline to assert the demand elasticity the bull case needs. The escape hatch is real — and it cuts both ways.[1]".
*             10 TIMELINE-5-DOT       PIC X(20)   VALUE "crisis".
*             10 TIMELINE-5-TAG       PIC X(60)   VALUE "The Caveat".
*             10 TIMELINE-5-TAG-TYPE  PIC X(20)   VALUE "".
*       01 DIMENSIONS.
*          05 DIMENSIONS-1.
*             10 DIMENSIONS-1-CODE      PIC X(4)     VALUE "D5".
*             10 DIMENSIONS-1-NAME      PIC X(50)    VALUE "Quality".
*             10 DIMENSIONS-1-LAYER     PIC X(10)    VALUE "origin".
*             10 DIMENSIONS-1-SCORE     PIC X(4)     VALUE "88".
*             10 DIMENSIONS-1-TAG       PIC X(60)    VALUE "Capability Per Dollar".
*             10 DIMENSIONS-1-EVIDENCE  PIC X(1200)  VALUE "The lever is capability-per-dollar collapsing: the same GPT-3.5-class quality that cost $20 per million tokens in 2022 cost $0.07 by late 2024 — 280x.[2] a16z's ~10x/year and Epoch's 8-month halving corroborate it independently.[1][3] D5 is the origin because the entire counterexample rests on one measured fact — that quality is getting radically cheaper to deliver — which, if it continues, resets every downstream economic assumption in the cluster.".
*          05 DIMENSIONS-2.
*             10 DIMENSIONS-2-CODE      PIC X(4)     VALUE "D6".
*             10 DIMENSIONS-2-NAME      PIC X(50)    VALUE "Operational".
*             10 DIMENSIONS-2-LAYER     PIC X(10)    VALUE "l1".
*             10 DIMENSIONS-2-SCORE     PIC X(4)     VALUE "84".
*             10 DIMENSIONS-2-TAG       PIC X(60)    VALUE "The Engineering".
*             10 DIMENSIONS-2-EVIDENCE  PIC X(1200)  VALUE "The fall is produced by real, shipped engineering: mixture-of-experts activating ~5% of parameters (DeepSeek 671B total / 37B active), speculative decoding at 2–3x production throughput, distillation yielding 32B models that beat last year's frontier reasoning models, and quantization (FP8/4-bit) as the mainstream serving default.[3][5] D6 amplifies from D5 because efficiency is not a market mood — it is a stack of techniques that keep compounding, which is why the curve has held across multiple independent measurements.".
*          05 DIMENSIONS-3.
*             10 DIMENSIONS-3-CODE      PIC X(4)     VALUE "D2".
*             10 DIMENSIONS-3-NAME      PIC X(50)    VALUE "Revenue".
*             10 DIMENSIONS-3-LAYER     PIC X(10)    VALUE "l1".
*             10 DIMENSIONS-3-SCORE     PIC X(4)     VALUE "80".
*             10 DIMENSIONS-3-TAG       PIC X(60)    VALUE "The Double Edge".
*             10 DIMENSIONS-3-EVIDENCE  PIC X(1200)  VALUE "The revenue dimension is where the escape hatch becomes two-sided. Jevons (Nadella) says cheaper compute expands total use and thus the market.[4] But Jevons only grows *provider* revenue if demand elasticity exceeds 1 — which named economists decline to assert — and Cahn warns compute is commoditizing to marginal cost.[6] D2 carries both the bull mechanism and its honest refutation: falling cost can grow the buyer's bill while compressing the seller's margin. The single most contested dimension in the case.".
*          05 DIMENSIONS-4.
*             10 DIMENSIONS-4-CODE      PIC X(4)     VALUE "D1".
*             10 DIMENSIONS-4-NAME      PIC X(50)    VALUE "Customer".
*             10 DIMENSIONS-4-LAYER     PIC X(10)    VALUE "l2".
*             10 DIMENSIONS-4-SCORE     PIC X(4)     VALUE "78".
*             10 DIMENSIONS-4-EVIDENCE  PIC X(1200)  VALUE "As inference approaches free, access broadens and use expands — the demand-pull half of Jevons. Enterprise GenAI spend rose from $1.7B to $37B across 2023–2025 even as per-token price fell more than 90%.[4] D1 shows the mechanism working on the buyer side: usage genuinely skyrockets. The unresolved question this dimension inherits from D2 is whether that expanding usage lands as provider revenue or as commodity throughput.".
*          05 DIMENSIONS-5.
*             10 DIMENSIONS-5-CODE      PIC X(4)     VALUE "D3".
*             10 DIMENSIONS-5-NAME      PIC X(50)    VALUE "Employee".
*             10 DIMENSIONS-5-LAYER     PIC X(10)    VALUE "l2".
*             10 DIMENSIONS-5-SCORE     PIC X(4)     VALUE "66".
*             10 DIMENSIONS-5-EVIDENCE  PIC X(1200)  VALUE "Cheaper inference expands the set of workloads and developers that can afford to run AI — the agentic and reasoning workflows that multiply token consumption exist because the token got cheap enough to burn.[3] D3 is where efficiency turns into new demand rather than saved cost: the same fall that could deflate a provider's price is what lets a developer run 50–500x more tokens per task, which is precisely the behavior the bull case needs and the bear case fears.".
*          05 DIMENSIONS-6.
*             10 DIMENSIONS-6-CODE      PIC X(4)     VALUE "D4".
*             10 DIMENSIONS-6-NAME      PIC X(50)    VALUE "Regulatory".
*             10 DIMENSIONS-6-LAYER     PIC X(10)    VALUE "l3".
*             10 DIMENSIONS-6-SCORE     PIC X(4)     VALUE "60".
*             10 DIMENSIONS-6-TAG       PIC X(60)    VALUE "Watch — Market Structure".
*             10 DIMENSIONS-6-EVIDENCE  PIC X(1200)  VALUE "D4 is the longest-lag dimension: whether a commoditized inference market sustains the capital that built it. Open-source token share rose from 34% to 65% in the first half of 2026, and Chinese models undercut frontier pricing by an order of magnitude.[7] If inference becomes a true commodity with no pricing power, the market structure that funds the buildout is the thing at risk — the slowest-moving and most decisive question the escape hatch raises about itself.".
*       01 RELATED.
*          05 RELATED-1.
*             10 RELATED-1-NUM    PIC X(8)    VALUE "251".
*             10 RELATED-1-TITLE  PIC X(200)  VALUE "The Show-Me Rotation".
*             10 RELATED-1-TYPE   PIC X(30)   VALUE "diagnostic".
*             10 RELATED-1-FETCH  PIC X(10)   VALUE "2,692".
*             10 RELATED-1-LABEL  PIC X(80)   VALUE "The Case It Counters".
*             10 RELATED-1-BODY   PIC X(800)  VALUE "UC-251 documents the market pricing AI capex as an overbuild. UC-254 is the deliberate counterexample: if the cost-per-token curve holds, the rotation is pricing a floor as if it were a ceiling. The two are meant to be read against each other — the ledger's disconfirmation, built in.".
*          05 RELATED-2.
*             10 RELATED-2-NUM    PIC X(8)    VALUE "044".
*             10 RELATED-2-TITLE  PIC X(200)  VALUE "The $80 Billion Pressure Valve".
*             10 RELATED-2-TYPE   PIC X(30)   VALUE "amplifying".
*             10 RELATED-2-FETCH  PIC X(10)   VALUE "2,325".
*             10 RELATED-2-LABEL  PIC X(80)   VALUE "Sibling Efficiency Case".
*             10 RELATED-2-BODY   PIC X(800)  VALUE "UC-044 traced efficiency as a release valve on the compute-demand pressure. UC-254 is the same force one turn later, measured on the cost curve: the pressure valve is the reason the demand forecasts that look reckless may simply be early.".
*          05 RELATED-3.
*             10 RELATED-3-NUM    PIC X(8)    VALUE "220".
*             10 RELATED-3-TITLE  PIC X(200)  VALUE "The Data Center Gold Rush".
*             10 RELATED-3-TYPE   PIC X(30)   VALUE "diagnostic".
*             10 RELATED-3-FETCH  PIC X(10)   VALUE "2,772".
*             10 RELATED-3-LABEL  PIC X(80)   VALUE "The Buildout It Rationalizes".
*             10 RELATED-3-BODY   PIC X(800)  VALUE "UC-220 documented a buildout whose demand curve looked stretched. UC-254 is the argument that the curve is only stretched at today's cost — at next year's, the same capacity is rational. Efficiency is what turns an apparent overbuild into a foundation.".
*          05 RELATED-4.
*             10 RELATED-4-NUM    PIC X(8)    VALUE "063".
*             10 RELATED-4-TITLE  PIC X(200)  VALUE "The Stock Reward Ceiling".
*             10 RELATED-4-TYPE   PIC X(30)   VALUE "prognostic".
*             10 RELATED-4-FETCH  PIC X(10)   VALUE "1,537".
*             10 RELATED-4-LABEL  PIC X(80)   VALUE "The Other Side of the Ledger".
*             10 RELATED-4-BODY   PIC X(800)  VALUE "UC-063 and UC-251 are the bearish spine of the cluster — the market punishing AI spend. UC-254 is the counterweight the whole cluster is built around: one honest disconfirmation, scored with the same rigor, so the ledger argues with itself instead of only confirming.".
*
* PROCEDURE DIVISION.
*    MAIN.
*       COPY FROM "@stratiqx/recall-components/components/case-study.rcpy".
*       STOP RUN.

******************************************************
-->
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>UC-254: The Efficiency Escape Hatch — The Fragile Number in the Bear Case</title>
  <meta name="description" content="The bear case on AI capex rests on one assumption: that cost-per-token stays high. It has fallen roughly 10x a year — 280x for GPT-3.5-class capability in under two years. This is the deliberate counterexample to the reckoning cluster: the efficiency escape hatch, argued in full, and then its own honest weakness. A 6D amplifying analysis.">
  <meta name="author" content="StratIQX Research">
  
  <meta property="og:title" content="The Efficiency Escape Hatch: The Fragile Number in the Bear Case">
  <meta property="og:description" content="The bear case on AI capex rests on one assumption: that cost-per-token stays high. It has fallen roughly 10x a year — 280x for GPT-3.5-class capability in under two years. This is the deliberate counterexample to the reckoning cluster: the efficiency escape hatch, argued in full, and then its own honest weakness. A 6D amplifying analysis.">
  <meta property="og:type" content="article">
  <style>
@import url('https://fonts.googleapis.com/css2?family=Source+Serif+4:opsz,wght@8..60,300;8..60,400;8..60,600;8..60,700&family=IBM+Plex+Mono:wght@400;500;600&family=Source+Sans+3:wght@300;400;500;600&display=swap');
:root {
--bg:#080a10; --bg-secondary:#0c1018; --bg-card:#121a2a; --bg-card-hover:#1a2438;
--text:#f0f2f5; --text-secondary:#b8c4d8; --text-muted:#6b7a96;
--accent:#22d3ee; --accent-glow:#67e8f9; --accent-dim:#06b6d4;
--accent-bg:rgba(34,211,238,0.07); --accent-bg-strong:rgba(34,211,238,0.16);
--green:#4ade80; --green-bg:rgba(74,222,128,0.08);
--red:#f87171; --red-bg:rgba(248,113,113,0.08);
--amber:#fbbf24; --amber-bg:rgba(251,191,36,0.08);
--gold:#fcd34d; --gold-bg:rgba(252,211,77,0.08);
--teal:#2dd4bf; --teal-bg:rgba(45,212,191,0.08);
--orange:#fb923c; --blue:#60a5fa;
--cyan:#22d3ee; --cyan-bg:rgba(34,211,238,0.08);
--border:#1e2a3e; --border-accent:rgba(34,211,238,0.35);
--font-serif:'Source Serif 4',Georgia,serif;
--font-mono:'IBM Plex Mono','Courier New',monospace;
--font-sans:'Source Sans 3','Segoe UI',sans-serif;
}
h1, h2, h3, h4, h5, h6 { margin:0; }
body { font-family:var(--font-serif); background:var(--bg); color:var(--text); line-height:1.7; font-size:17px; -webkit-font-smoothing:antialiased; }
.container { max-width:900px; margin:0 auto; padding:0 24px; }
a { color:var(--accent-glow); text-decoration:none; transition:color 0.2s; }
a:hover { color:var(--gold); }
/* Nav */
nav { padding:20px 0; border-bottom:1px solid var(--border); background:var(--bg); position:sticky; top:0; z-index:100; backdrop-filter:blur(12px); }
nav .container { display:flex; justify-content:space-between; align-items:center; }
.nav-brand { font-family:var(--font-sans); font-weight:600; font-size:14px; letter-spacing:1.5px; text-transform:uppercase; color:var(--text); }
.nav-meta { font-family:var(--font-mono); font-size:12px; color:var(--text-muted); }
/* Hero */
.hero { padding:80px 0 60px; border-bottom:1px solid var(--border); position:relative; overflow:hidden; }
.hero::before { content:''; position:absolute; top:-200px; right:-200px; width:600px; height:600px; background:radial-gradient(circle,rgba(34,211,238,0.04) 0%,transparent 70%); pointer-events:none; }
.hero-label { font-family:var(--font-mono); font-size:12px; font-weight:500; letter-spacing:2px; text-transform:uppercase; color:var(--accent); margin-bottom:8px; }
.hero-badge { display:inline-block; font-family:var(--font-mono); font-size:11px; font-weight:600; padding:4px 12px; border-radius:3px; background:var(--accent-bg); color:var(--accent-glow); border:1px solid var(--border-accent); margin-bottom:24px; }
.hero h1 { font-family:var(--font-serif); font-size:clamp(30px,5vw,44px); font-weight:700; line-height:1.15; color:var(--text); margin-bottom:20px; max-width:820px; }
.hero h1 em { font-style:normal; color:var(--accent-glow); }
.hero-subtitle { font-family:var(--font-serif); font-size:19px; font-weight:300; color:var(--text-secondary); max-width:700px; line-height:1.6; margin-bottom:48px; }
.hero-methodology { margin-top:32px; text-align:right; }
.hero-methodology a { font-family:var(--font-mono); font-size:12px; color:var(--text-muted); }
/* Hero stats strip */
.hero-stats { display:grid; grid-template-columns:repeat(auto-fit,minmax(120px,1fr)); gap:1px; background:var(--border); border:1px solid var(--border); border-radius:4px; overflow:hidden; }
.hero-stat { background:var(--bg-card); padding:20px 16px; text-align:center; }
.hero-stat-value { font-family:var(--font-mono); font-size:24px; font-weight:600; color:var(--accent-glow); line-height:1; margin-bottom:6px; }
.hero-stat-label { font-family:var(--font-sans); font-size:11px; font-weight:500; color:var(--text-muted); letter-spacing:0.5px; text-transform:uppercase; }
/* Sections */
.section { padding:72px 0; border-bottom:1px solid var(--border); }
.section-number { font-family:var(--font-mono); font-size:13px; font-weight:600; color:var(--accent-dim); letter-spacing:1px; margin-bottom:12px; }
.section h2 { font-family:var(--font-serif); font-size:28px; font-weight:600; line-height:1.3; margin-bottom:24px; color:var(--text); }
.section p { color:var(--text-secondary); margin-bottom:20px; max-width:760px; }
/* Dimension table (6D cascade breakdown) */
.cascade-table { width:100%; border-collapse:collapse; margin:32px 0; font-size:15px; }
.cascade-table thead th { font-family:var(--font-sans); font-size:11px; font-weight:600; letter-spacing:1px; text-transform:uppercase; color:var(--text-muted); text-align:left; padding:12px 16px; border-bottom:2px solid var(--border); }
.cascade-table tbody td { padding:16px; border-bottom:1px solid var(--border); vertical-align:top; color:var(--text-secondary); }
.cascade-table tbody td:first-child { width:220px; white-space:nowrap; }
.cascade-table tbody tr:hover { background:var(--bg-card); }
.dimension-name { display:block; font-family:var(--font-serif); font-weight:600; color:var(--text); margin-bottom:6px; }
.dimension-score { display:inline-block; font-family:var(--font-mono); font-size:10px; font-weight:600; padding:2px 6px; border-radius:2px; }
.dimension-score.origin { background:var(--accent-bg-strong); color:var(--accent-glow); }
.dimension-score.l1 { background:var(--accent-bg); color:var(--accent-glow); }
.dimension-score.l2 { background:rgba(107,122,150,0.1); color:var(--text-muted); }
.dimension-score.at-risk { background:var(--red-bg); color:var(--red); border:1px solid rgba(239,68,68,0.2); }
.dimension-tag { display:inline-block; font-family:var(--font-mono); font-size:9px; font-weight:600; padding:1px 5px; border-radius:2px; background:var(--accent-bg); color:var(--accent-dim); border:1px solid var(--border-accent); margin-left:6px; letter-spacing:0.5px; text-transform:uppercase; vertical-align:middle; }
/* Supply grid (at-risk evidence cards) */
.supply-grid { display:grid; grid-template-columns:repeat(auto-fit,minmax(260px,1fr)); gap:16px; margin:32px 0; }
.supply-card { padding:24px; background:var(--bg-card); border:1px solid var(--border); border-radius:6px; border-top:2px solid var(--accent); }
.supply-card-stat { font-family:var(--font-mono); font-size:28px; font-weight:600; color:var(--accent-glow); line-height:1; margin-bottom:8px; }
.supply-card-title { font-family:var(--font-sans); font-size:11px; font-weight:600; letter-spacing:1px; text-transform:uppercase; color:var(--text-muted); margin-bottom:12px; }
.supply-card-body { font-size:14px; color:var(--text-secondary); line-height:1.6; margin:0; }
/* Blockquote */
p.blockquote { border-left:3px solid var(--accent); margin:32px 0; padding:16px 24px; background:var(--bg-card); border-radius:0 6px 6px 0; font-family:var(--font-serif); font-size:17px; font-style:italic; color:var(--text-secondary); max-width:none; }
/* Section stat box */
.stat-box { margin:48px 0; padding:40px 32px; background:var(--bg-card); border:1px solid var(--border); border-radius:6px; text-align:center; }
.stat-box-value { font-family:var(--font-mono); font-size:clamp(48px,8vw,72px); font-weight:600; color:var(--accent-glow); line-height:1; margin-bottom:16px; }
.stat-box-label { font-family:var(--font-sans); font-size:11px; font-weight:600; letter-spacing:1.5px; text-transform:uppercase; color:var(--text-muted); margin-bottom:24px; }
.stat-box-context { font-family:var(--font-serif); font-size:16px; color:var(--text-secondary); max-width:600px; margin:0 auto; line-height:1.65; }
/* Evidence table */
.evidence-table { width:100%; border-collapse:collapse; margin:32px 0; font-size:14px; }
.evidence-table thead th { font-family:var(--font-sans); font-size:11px; font-weight:600; letter-spacing:1px; text-transform:uppercase; color:var(--text-muted); text-align:left; padding:12px 16px; border-bottom:2px solid var(--border); }
.evidence-table tbody td { padding:14px 16px; border-bottom:1px solid var(--border); color:var(--text-secondary); }
.evidence-table tbody tr:hover { background:var(--bg-card); }
.case-ref { font-family:var(--font-mono); font-size:12px; color:var(--accent-glow); }
a.case-ref:hover { color:var(--gold); }
/* Inline citations */
a.cite { font-family:var(--font-mono); font-size:11px; color:var(--accent-glow); vertical-align:super; line-height:0; text-decoration:none; }
a.cite:hover { color:var(--gold); }
/* Prognostic window box */
.window-box { margin:40px 0; padding:32px; background:var(--bg-card); border:1px solid var(--border-accent); border-radius:6px; text-align:center; }
.window-status-label { font-family:var(--font-mono); font-size:48px; font-weight:700; color:var(--accent-glow); line-height:1; margin-bottom:8px; }
.window-health-label { font-family:var(--font-sans); font-size:14px; color:var(--text-muted); text-transform:uppercase; letter-spacing:1px; }
.window-context { font-family:var(--font-serif); font-size:15px; color:var(--text-secondary); margin-top:16px; max-width:600px; margin-left:auto; margin-right:auto; }
/* Cascade timeline */
.timeline { margin:40px 0; }
.timeline-item { display:grid; grid-template-columns:100px 3px 1fr; gap:20px; margin-bottom:0; position:relative; }
.timeline-date { font-family:var(--font-mono); font-size:12px; font-weight:500; color:var(--accent-glow); text-align:right; padding-top:20px; }
.timeline-line { background:var(--border); position:relative; }
.timeline-dot { width:11px; height:11px; background:var(--accent-glow); border-radius:50%; position:absolute; left:-4px; top:22px; box-shadow:0 0 12px var(--accent-bg-strong); }
.timeline-dot.crisis { background:var(--red); box-shadow:0 0 16px rgba(248,113,113,0.6); }
.timeline-dot.breakthrough { width:15px; height:15px; left:-6px; top:20px; background:var(--gold); box-shadow:0 0 20px rgba(252,211,77,0.5); }
.timeline-content { padding:16px 0 32px; }
.timeline-content h4 { font-family:var(--font-serif); font-size:17px; font-weight:600; color:var(--text); margin-bottom:6px; }
.timeline-content p { font-size:15px; color:var(--text-secondary); margin-bottom:0; }
.timeline-tag { display:inline-block; font-family:var(--font-mono); font-size:10px; font-weight:600; padding:2px 8px; border-radius:3px; margin-top:8px; background:var(--accent-bg); color:var(--accent-glow); border:1px solid var(--border-accent); }
.timeline-tag.catalyst { background:var(--gold-bg); color:var(--gold); border-color:rgba(252,211,77,0.3); }
.timeline-tag.crisis-tag { background:var(--red-bg); color:var(--red); border-color:rgba(248,113,113,0.3); }
/* Trigger/watch grid */
.trigger-grid { display:grid; gap:16px; margin:32px 0; }
.trigger-item { display:grid; grid-template-columns:auto 1fr auto; gap:16px; padding:20px; background:var(--bg-card); border:1px solid var(--border); border-radius:6px; align-items:start; }
.trigger-number { font-family:var(--font-mono); font-size:14px; font-weight:700; color:var(--accent-glow); padding-top:2px; }
.trigger-name { font-family:var(--font-serif); font-weight:600; color:var(--text); font-size:15px; margin-bottom:4px; }
.trigger-detail { font-size:13px; color:var(--text-secondary); line-height:1.5; margin:0; }
.trigger-link { font-family:var(--font-mono); font-size:11px; color:var(--text-muted); margin-top:6px; }
.trigger-status { font-family:var(--font-mono); font-size:10px; font-weight:600; padding:4px 10px; border-radius:3px; text-transform:uppercase; letter-spacing:1px; white-space:nowrap; }
.trigger-status.inactive { background:rgba(100,116,139,0.1); color:var(--text-muted); border:1px solid var(--border); }
.trigger-status.active { background:var(--accent-bg); color:var(--accent-glow); border:1px solid var(--border-accent); }
.trigger-status.warming { background:var(--amber-bg); color:var(--amber); border:1px solid rgba(251,191,36,0.2); }
/* Cascade stats */
.cascade-stats { display:grid; grid-template-columns:repeat(3,1fr); gap:1px; background:var(--border); border:1px solid var(--border); border-radius:4px; overflow:hidden; margin:40px 0; }
.cascade-stat { background:var(--bg-card); padding:24px; text-align:center; }
.cascade-stat-value { font-family:var(--font-mono); font-size:28px; font-weight:600; color:var(--accent-glow); }
.cascade-stat-label { font-family:var(--font-sans); font-size:12px; color:var(--text-muted); text-transform:uppercase; letter-spacing:0.5px; margin-top:4px; }
/* Cascade flow */
.flow-row { display:flex; align-items:center; gap:8px; margin:16px 0; flex-wrap:wrap; }
.flow-label { font-family:var(--font-sans); font-size:11px; font-weight:600; letter-spacing:0.5px; color:var(--text-muted); text-transform:uppercase; width:60px; flex-shrink:0; }
.flow-node { font-family:var(--font-mono); font-size:12px; font-weight:500; padding:6px 12px; border-radius:4px; background:var(--accent-bg); color:var(--accent-glow); border:1px solid var(--border-accent); }
.flow-arrow { color:var(--text-muted); font-size:14px; }
/* FETCH breakdown box */
.doi-box { margin:40px 0; padding:24px; background:var(--bg-card); border:1px solid var(--border); border-radius:6px; border-left:3px solid var(--teal); }
.doi-box h4 { font-family:var(--font-sans); font-size:11px; font-weight:600; letter-spacing:1.5px; text-transform:uppercase; color:var(--teal); margin-bottom:12px; }
.doi-item { font-family:var(--font-mono); font-size:13px; color:var(--text-secondary); margin-bottom:8px; line-height:1.5; }
.doi-label { font-family:var(--font-sans); font-size:12px; color:var(--text-muted); }
/* Insight grid */
.insight-grid { display:grid; grid-template-columns:1fr 1fr; gap:20px; margin:32px 0; }
.insight-card { padding:24px; background:var(--bg-card); border:1px solid var(--border); border-radius:6px; border-left:3px solid var(--accent); }
.insight-card h4 { font-family:var(--font-serif); font-size:16px; font-weight:600; color:var(--text); margin-bottom:10px; }
.insight-card p { font-size:14px; color:var(--text-secondary); margin-bottom:0; line-height:1.6; }
/* CTA + footer */
.cta { padding:80px 0; border-bottom:1px solid var(--border); }
.cta h2,
.cta h3 { font-family:var(--font-serif); font-size:28px; font-weight:600; color:var(--text); margin-bottom:16px; max-width:540px; line-height:1.25; }
.cta p { color:var(--text-secondary); max-width:480px; margin:0 0 32px; font-size:17px; }
.cta-links { font-family:var(--font-sans); font-size:15px; }
.cta-links a { color:var(--gold); font-weight:600; margin-right:32px; }
/* CTA buttons: strip box model, inherit link colour from the page theme.
Per-case THEME-OVERRIDES drives the colour — if links are gold, buttons are gold. */
.cta .recall-btn,
.cta .recall-btn.ghost { background:none; border:none; box-shadow:none; padding:0; margin-right:32px; font-family:var(--font-sans); font-size:17px; font-weight:600; letter-spacing:0; text-transform:none; display:inline; border-radius:0; }
.cta .recall-btn:hover,
.cta .recall-btn.ghost:hover { opacity:1; }
footer { padding:40px 0; }
footer .container { display:flex; flex-direction:column; gap:14px; }
.footer-brand { font-family:var(--font-sans); font-size:14px; color:var(--text-secondary); }
.footer-meta { font-family:var(--font-mono); font-size:11px; color:var(--text-muted); letter-spacing:0.2px; }
.footer-links { display:flex; gap:32px; flex-wrap:wrap; }
.footer-links a { font-family:var(--font-sans); font-size:13px; color:var(--text-muted); }
.footer-links a:hover { color:var(--text-secondary); }
.footer-disclosure { font-family:var(--font-mono); font-size:11px; color:var(--text-muted); opacity:0.6; line-height:1.6; }
.footer-legal a { color:var(--accent-glow); text-decoration:none; }
.footer-legal a:hover { color:var(--gold); }
/* Sources section */
.sources { padding:48px 0; border-bottom:1px solid var(--border); }
.sources h3 { font-family:var(--font-sans); font-size:13px; font-weight:600; letter-spacing:1.5px; text-transform:uppercase; color:var(--text-muted); margin-bottom:24px; }
.sources > .container > p { font-size:14px; color:var(--text-muted); margin-bottom:24px; max-width:760px; }
.source-tier { font-family:var(--font-mono); font-size:10px; font-weight:600; letter-spacing:1px; text-transform:uppercase; color:var(--teal); margin:24px 0 16px; padding-bottom:8px; border-bottom:1px solid var(--border); }
.source-item { display:grid; grid-template-columns:30px 1fr; gap:12px; margin-bottom:16px; font-size:14px; }
.source-num { font-family:var(--font-mono); font-size:12px; font-weight:600; color:var(--accent-glow); }
.source-text { color:var(--text-secondary); line-height:1.5; }
.source-text a { color:var(--text-muted); font-size:12px; word-break:break-all; display:block; margin-top:4px; }
/* CAL source block */
.cal-source { margin:48px 0 0; border:1px solid var(--border); border-radius:6px; background:var(--bg-card); overflow:hidden; }
.cal-source[open] { border-color:var(--border-accent); }
.cal-source-toggle { display:flex; align-items:center; gap:12px; padding:16px 24px; cursor:pointer; list-style:none; user-select:none; }
.cal-source-toggle::-webkit-details-marker { display:none; }
.cal-source-toggle::before { content:'\25B8'; font-family:var(--font-mono); font-size:12px; color:var(--accent-glow); transition:transform 0.2s; }
.cal-source[open] .cal-source-toggle::before { transform:rotate(90deg); }
.cal-source-label { font-family:var(--font-mono); font-size:11px; font-weight:600; letter-spacing:1.5px; text-transform:uppercase; color:var(--accent-glow); }
.cal-source-hint { font-family:var(--font-sans); font-size:12px; color:var(--text-muted); }
.cal-source-file { margin-left:auto; font-family:var(--font-mono); font-size:11px; color:var(--text-muted); background:var(--bg); border:1px solid var(--border); border-radius:4px; padding:3px 9px; letter-spacing:0.3px; }
.cal-source-body { padding:0 24px 24px; }
.cal-code { background:var(--bg); border:1px solid var(--border); border-radius:4px; padding:20px 24px; overflow-x:auto; font-family:var(--font-mono); font-size:12px; line-height:1.65; color:var(--text-secondary); margin-bottom:20px; }
.cal-keyword { color:var(--accent-glow); font-weight:600; }
.cal-comment { color:var(--text-muted); font-style:italic; }
.cal-string  { color:var(--gold); }
.cal-number  { color:var(--green); }
.cal-dim     { color:var(--orange); font-weight:500; }
.cal-runtime-link { font-family:var(--font-mono); font-size:11px; color:var(--text-muted); margin-bottom:0; }
.cal-trace { margin:20px 0 16px; border-top:1px solid var(--border); padding-top:16px; display:flex; flex-direction:column; gap:10px; }
.cal-trace-row { display:grid; grid-template-columns:80px 1fr; gap:12px; font-size:13px; }
.cal-trace-layer { font-family:var(--font-mono); font-size:10px; font-weight:600; letter-spacing:1px; text-transform:uppercase; color:var(--teal); padding-top:2px; }
.cal-trace-detail { font-family:var(--font-sans); color:var(--text-secondary); line-height:1.5; }
/* Related cases cross-reference */
.crossref-stack { display:flex; flex-direction:column; gap:20px; margin:8px 0; }
.crossref-box { padding:24px; background:var(--bg-card); border:1px solid var(--border); border-left:3px solid var(--accent); border-radius:0 6px 6px 0; }
.crossref-box h4 { font-family:var(--font-mono); font-size:12px; font-weight:600; color:var(--text); margin-bottom:12px; display:flex; align-items:baseline; gap:8px; flex-wrap:wrap; line-height:1.5; }
.crossref-label { color:var(--accent-glow); letter-spacing:0.5px; flex-shrink:0; }
a.crossref-title { color:var(--text); font-family:var(--font-mono); }
a.crossref-title:hover { color:var(--gold); }
.crossref-chips { display:inline-flex; gap:6px; align-items:center; margin-left:2px; }
.crossref-type { font-family:var(--font-mono); font-size:10px; font-weight:600; padding:1px 6px; border-radius:2px; text-transform:uppercase; letter-spacing:0.5px; }
.crossref-type.amplifying { background:var(--green-bg); color:var(--green); }
.crossref-type.diagnostic { background:var(--red-bg); color:var(--red); }
.crossref-type.at-risk { background:var(--amber-bg); color:var(--amber); }
.crossref-type.prognostic { background:var(--cyan-bg); color:var(--cyan); }
.crossref-fetch { font-family:var(--font-mono); font-size:10px; color:var(--text-muted); padding:1px 6px; border:1px solid var(--border); border-radius:2px; }
.crossref-box p { font-size:15px; color:var(--text-secondary); margin-bottom:0; line-height:1.65; max-width:none; }
@media(max-width:768px) {
.hero-stats { grid-template-columns:repeat(2,1fr); }
.insight-grid { grid-template-columns:1fr; }
.cascade-stats { grid-template-columns:1fr; }
.trigger-item { grid-template-columns:1fr; }
.footer-links { gap:20px; }
}
  </style>
  <style id="theme-overrides">
  :root { --accent:#059669;--accent-rgb:5,150,105;--accent-bg:rgba(5,150,105,0.08);--accent-glow:#34d399;--border-accent:rgba(5,150,105,0.2); }
  </style>
</head>
<body id="uc-254-case">
<nav>
  <div class="container">
    <a href="https://stratiqx.com" class="nav-brand">StratIQX</a>
    <span class="nav-meta">Case UC-254 &middot; July 2026<span style="display:inline-block;font-family:var(--font-mono);font-size:10px;font-weight:600;padding:2px 8px;border-radius:3px;background:var(--accent-bg);color:var(--accent-glow);border:1px solid var(--border-accent);margin-left:8px;">Amplifying</span></span>
  </div>
</nav>
<section id="hero" class="hero">
  <div class="container layout-stack">
  <div class="hero-label">• 6D Amplifying Analysis · The Counterexample</div>
  <div class="hero-badge">Amplifying · AI Economics · Counterexample</div>
  <h1>The Efficiency Escape Hatch: <em>The Fragile Number in the Bear Case</em></h1>
  <p class="hero-subtitle">Every bearish case in this cluster shares one hidden assumption: that a token stays expensive. It has not. The cost to reach a fixed capability level has fallen roughly 10x per year — by Stanford's measure, 280x for GPT-3.5-class quality in under two years, from $20 to $0.07 per million tokens.<a class="cite" href="#source-1">[1]</a><a class="cite" href="#source-2">[2]</a> Mixture-of-experts models activate ~5% of their parameters per token; distilled 32B models now beat last year's frontier reasoning models; compute-to-fixed-capability halves about every 8 months.<a class="cite" href="#source-3">[3]</a> Satya Nadella named the mechanism: Jevons paradox — cheaper AI means more of it, not less.<a class="cite" href="#source-4">[4]</a> If that curve holds, a demand forecast that looks insane at today's cost is rational at next year's, and the overbuild thesis is the fragile number on the page. This case argues the escape hatch in full — and then, honestly, names where it does not hold.</p>
  <div class="hero-stats">
    <div class="hero-stat">
      <div class="hero-stat-value">280x</div>
      <div class="hero-stat-label">Cost fall, GPT-3.5-class, &lt;2yr (Stanford AI Index)</div>
    </div>
    <div class="hero-stat">
      <div class="hero-stat-value">~10x / yr</div>
      <div class="hero-stat-label">Inference cost decline, fixed capability</div>
    </div>
    <div class="hero-stat">
      <div class="hero-stat-value">8 months</div>
      <div class="hero-stat-label">Compute-to-fixed-capability halving (Epoch AI)</div>
    </div>
    <div class="hero-stat">
      <div class="hero-stat-value">~5%</div>
      <div class="hero-stat-label">Parameters active per token (mixture-of-experts)</div>
    </div>
    <div class="hero-stat">
      <div class="hero-stat-value">~$60/M</div>
      <div class="hero-stat-label">Frontier price floor: the part that did not fall</div>
    </div>
    <div class="hero-stat">
      <div class="hero-stat-value">6 of 6</div>
      <div class="hero-stat-label">Dimensions amplified</div>
    </div></div>
  <p class="hero-methodology"><a href="https://6d.cormorantforaging.dev">6D Foraging Methodology™</a></p>
  </div>
</section>
<section id="section-1" class="section">
  <div class="container layout-stack">
  <div class="section-number">01</div>
  <h2>The Insight</h2>
  <p>This case exists to argue against the rest of its own cluster. The reckoning thesis — that the ~$725B AI buildout will not earn its return — rests on an assumption almost nobody states out loud: that the cost of running the models stays roughly where it is. It has not stayed. It has collapsed.</p>
  <p>The numbers are not marketing. a16z documented that the cost to run a model of equivalent performance has fallen about 10x per year, or roughly 1,000x over three years, from $60 to $0.06 per million tokens for GPT-3-class capability.<a class="cite" href="#source-1">[1]</a> Stanford's AI Index measured it more precisely over a shorter window: the cost to query a model at GPT-3.5 quality fell 280x in under two years, from $20.00 to $0.07 per million tokens, an annualized pace steeper than a16z's 10x.<a class="cite" href="#source-2">[2]</a> Epoch AI, studying it independently, found that the compute needed to reach a fixed capability halves roughly every 8 months, far faster than Moore's Law.<a class="cite" href="#source-3">[3]</a></p>
  <p>The mechanism is real engineering, not hope. Mixture-of-experts architectures activate only ~5% of a model's parameters per token (DeepSeek: 671B total, 37B active). Speculative decoding delivers 2–3x throughput in production. Distillation now yields 32B models that beat last year's frontier reasoning models on hard benchmarks.<a class="cite" href="#source-3">[3]</a> Satya Nadella gave the demand side its name the day DeepSeek crashed Nvidia: Jevons paradox strikes again; as AI gets more efficient, its use skyrockets.<a class="cite" href="#source-4">[4]</a> If a token keeps getting cheaper and demand keeps expanding to fill it, the capex is not an overbuild; it is a floor.</p>
  <p>Now the honest weakness, the seam a smart critic attacks, named before they do. Two things. First, what fell was the cost to reach a <em>fixed</em> capability; the frontier price <em>floor</em> barely moved: GPT-3's 2021 launch price and a late-2024 frontier model both sat near $60 per million output tokens.<a class="cite" href="#source-1">[1]</a> Second, Jevons rescues <em>provider</em> revenue only if the price elasticity of demand exceeds 1, so that usage grows faster than price falls, and the economists asked (Northeastern's Hanser, Venkatesan) explicitly decline to assert it. Sequoia's David Cahn, no bear, put the risk in his own words: GPU compute is turning into a commodity, competed down to marginal cost. Cheaper tokens can grow the buyer's bill while compressing the seller's margin toward zero. Efficiency is the escape hatch, but it can deflate revenue as surely as it expands demand. It cuts both ways, and this case says so.</p>
  <div class="stat-box">
  <div class="stat-box-value">280x</div>
  <div class="stat-box-label">Fall in cost to run GPT-3.5-class capability, Nov 2022 to Oct 2024 (Stanford AI Index)</div>
  <p class="stat-box-context">$20.00 to $0.07 per million tokens for the same capability, in under two years.<a class="cite" href="#source-2">[2]</a> The bear case assumes this stops. Its most honest rejoinder is that the frontier price floor barely moved: the collapse is in reaching yesterday's bar, not today's.<a class="cite" href="#source-1">[1]</a></p>
</div>
  
  </div>
</section>
<section id="section-2" class="section">
  <div class="container layout-stack">
  <div class="section-number">02</div>
  <h2>The Timeline</h2>
  <p>The efficiency curve that the bear case has to assume away — and the shock that proved it moves markets.</p>
  
  
  <div class="timeline">
  <div class="timeline-item">
    <div class="timeline-date">Nov 2022 → Oct 2024</div>
    <div class="timeline-line"><div class="timeline-dot breakthrough"></div></div>
    <div class="timeline-content">
      <h4>280x, in under two years</h4>
      <p>Stanford's AI Index measures the cost to query a GPT-3.5-class model falling from $20.00 to $0.07 per million tokens — a 280x reduction — as capability diffuses down to smaller, cheaper models.<a class="cite" href="#source-2">[2]</a></p>
      <span class="timeline-tag">The Curve</span>
    </div>
  </div>
  <div class="timeline-item">
    <div class="timeline-date">Nov 2024</div>
    <div class="timeline-line"><div class="timeline-dot breakthrough"></div></div>
    <div class="timeline-content">
      <h4>a16z names it: LLMflation</h4>
      <p>a16z documents that inference cost for equivalent performance falls about 10x per year — roughly 1,000x over three years for GPT-3-class capability. The efficiency curve gets a name and a slope.<a class="cite" href="#source-1">[1]</a></p>
    </div>
  </div>
  <div class="timeline-item">
    <div class="timeline-date">Jan 27, 2025</div>
    <div class="timeline-line"><div class="timeline-dot breakthrough"></div></div>
    <div class="timeline-content">
      <h4>DeepSeek proves it moves markets</h4>
      <p>DeepSeek R1 matches frontier reasoning at a fraction of the cost. Nvidia falls ~17% in a day, a roughly $600B decline and the largest one-day market-cap loss for a single company on record, and Nadella posts: Jevons paradox strikes again. Efficiency is now a market force, not a footnote.<a class="cite" href="#source-4">[4]</a></p>
      <span class="timeline-tag">The Shock</span>
    </div>
  </div>
  <div class="timeline-item">
    <div class="timeline-date">Mar 2025</div>
    <div class="timeline-line"><div class="timeline-dot breakthrough"></div></div>
    <div class="timeline-content">
      <h4>Independently confirmed</h4>
      <p>Epoch AI finds compute needed to reach a fixed performance level halves roughly every 8 months — faster than Moore's Law. But its own authors note 60–95% of gains came from scaling, only 5–40% from algorithms.<a class="cite" href="#source-3">[3]</a></p>
    </div>
  </div>
  <div class="timeline-item">
    <div class="timeline-date">2026</div>
    <div class="timeline-line"><div class="timeline-dot crisis"></div></div>
    <div class="timeline-content">
      <h4>The floor that did not fall</h4>
      <p>The honest asterisk holds through 2026: the frontier price floor stayed near $60 per million output tokens, and named economists decline to assert the demand elasticity the bull case needs. The escape hatch is real — and it cuts both ways.<a class="cite" href="#source-1">[1]</a></p>
      <span class="timeline-tag">The Caveat</span>
    </div>
  </div>
</div>
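  <p>The timeline quotes its rates in different units: a yearly factor (a16z) and a halving time (Epoch). The sketch below puts them on one yearly scale. The helper names are hypothetical, and the two series measure different quantities (price for equivalent performance versus compute for a fixed performance level), so the comparison is illustrative only.</p>

```python
def annual_factor_from_halving(halving_months: float) -> float:
    """Yearly improvement factor implied by a halving time in months."""
    return 2.0 ** (12.0 / halving_months)


def compound(yearly_factor: float, years: float) -> float:
    """Total factor after compounding a yearly factor for `years`."""
    return yearly_factor ** years


# Epoch: compute to reach a fixed performance level halves about every 8 months.
epoch_yearly = annual_factor_from_halving(8)
# Epoch's stated 95% interval is 5 to 14 months.
epoch_fast = annual_factor_from_halving(5)
epoch_slow = annual_factor_from_halving(14)
# a16z: about 10x per year, compounded over three years.
a16z_three_year = compound(10.0, 3)

print(f"Epoch yearly factor: {epoch_yearly:.2f}x "
      f"(range {epoch_slow:.2f}x to {epoch_fast:.2f}x)")
print(f"a16z over three years: {a16z_three_year:.0f}x")
```

  <p>The gap between the two yearly factors is expected: price per token also reflects hardware, serving techniques, and competition, not only the compute needed to reach a capability level.</p>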
  
  <p class="blockquote">Jevons paradox strikes again! As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of. — Satya Nadella, Microsoft CEO, January 27, 2025</p>
  
<table class="cascade-table">
  <thead><tr><th>Dimension</th><th>Evidence</th></tr></thead>
  <tbody>
    <tr>
      <td><span class="dimension-name">Quality (D5)</span>
<span class="dimension-score origin">Origin · 88</span></td>
      <td>The lever is capability-per-dollar collapsing: the same GPT-3.5-class quality that cost $20 per million tokens in 2022 cost $0.07 by late 2024 — 280x.<a class="cite" href="#source-2">[2]</a> a16z's ~10x/year and Epoch's 8-month halving corroborate it independently.<a class="cite" href="#source-1">[1]</a><a class="cite" href="#source-3">[3]</a> D5 is the origin because the entire counterexample rests on one measured fact — that quality is getting radically cheaper to deliver — which, if it continues, resets every downstream economic assumption in the cluster.<span class="dimension-tag">Capability Per Dollar</span></td>
    </tr>
    <tr>
      <td><span class="dimension-name">Operational (D6)</span>
<span class="dimension-score l1">L1 · 84</span></td>
      <td>The fall is produced by real, shipped engineering: mixture-of-experts activating ~5% of parameters (DeepSeek 671B total / 37B active), speculative decoding at 2–3x production throughput, distillation yielding 32B models that beat last year's frontier reasoning models, and quantization (FP8/4-bit) as the mainstream serving default.<a class="cite" href="#source-3">[3]</a><a class="cite" href="#source-5">[5]</a> D6 amplifies from D5 because efficiency is not a market mood — it is a stack of techniques that keep compounding, which is why the curve has held across multiple independent measurements.<span class="dimension-tag">The Engineering</span></td>
    </tr>
    <tr>
      <td><span class="dimension-name">Revenue (D2)</span>
<span class="dimension-score l1">L1 · 80</span></td>
      <td>The revenue dimension is where the escape hatch becomes two-sided. Jevons (Nadella) says cheaper compute expands total use and thus the market.<a class="cite" href="#source-4">[4]</a> But Jevons only grows <em>provider</em> revenue if demand elasticity exceeds 1, which named economists decline to assert, and Cahn warns compute is commoditizing to marginal cost.<a class="cite" href="#source-6">[6]</a> D2 carries both the bull mechanism and its honest refutation: falling cost can grow the buyer's bill while compressing the seller's margin. The single most contested dimension in the case.<span class="dimension-tag">The Double Edge</span></td>
    </tr>
    <tr>
      <td><span class="dimension-name">Customer (D1)</span>
<span class="dimension-score l2">L2 · 78</span></td>
      <td>As inference approaches free, access broadens and use expands — the demand-pull half of Jevons. Enterprise GenAI spend rose from $1.7B to $37B across 2023–2025 even as per-token price fell more than 90%.<a class="cite" href="#source-4">[4]</a> D1 shows the mechanism working on the buyer side: usage genuinely skyrockets. The unresolved question this dimension inherits from D2 is whether that expanding usage lands as provider revenue or as commodity throughput.</td>
    </tr>
    <tr>
      <td><span class="dimension-name">Employee (D3)</span>
<span class="dimension-score l2">L2 · 66</span></td>
      <td>Cheaper inference expands the set of workloads and developers that can afford to run AI — the agentic and reasoning workflows that multiply token consumption exist because the token got cheap enough to burn.<a class="cite" href="#source-3">[3]</a> D3 is where efficiency turns into new demand rather than saved cost: the same fall that could deflate a provider's price is what lets a developer run 50–500x more tokens per task, which is precisely the behavior the bull case needs and the bear case fears.</td>
    </tr>
    <tr>
      <td><span class="dimension-name">Regulatory (D4)</span>
<span class="dimension-score l3">L3 · 60</span></td>
      <td>D4 is the longest-lag dimension: whether a commoditized inference market sustains the capital that built it. Open-source token share rose from 34% to 65% in the first half of 2026, and Chinese models undercut frontier pricing by an order of magnitude.<a class="cite" href="#source-7">[7]</a> If inference becomes a true commodity with no pricing power, the market structure that funds the buildout is the thing at risk — the slowest-moving and most decisive question the escape hatch raises about itself.<span class="dimension-tag">Watch — Market Structure</span></td>
    </tr>
  </tbody>
</table>
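  <p>The D3 row turns on a simple product: cost per task is price per token times tokens per task. A minimal sketch of that arithmetic follows. The baseline price and token count are hypothetical values chosen for illustration; only the 10x price fall and the 50x to 500x token growth come from the case.</p>

```python
def cost_per_task(price_per_million_tokens: float, tokens_per_task: float) -> float:
    """Dollar cost of one task at a given token price and token count."""
    return price_per_million_tokens * tokens_per_task / 1_000_000


def task_cost_ratio(price_drop_factor: float, token_growth_factor: float) -> float:
    """New cost per task divided by old cost per task."""
    return token_growth_factor / price_drop_factor


# Hypothetical baseline, for illustration only: $10 per million tokens
# and 2,000 tokens per task.
before = cost_per_task(10.0, 2_000)
# Price falls 10x (the case's yearly rate); tokens per task grow 50x
# (the low end of the case's 50-500x range).
after = cost_per_task(10.0 / 10, 2_000 * 50)

print(f"before: ${before:.2f} per task, after: ${after:.2f} per task")
print(f"ratio at 500x tokens: {task_cost_ratio(10, 500):.0f}x")
```

  <p>When token growth outruns the price fall, the cost of a task rises even though each token is cheaper, which is how usage can expand on the buyer side without settling the provider-revenue question.</p>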
  
  </div>
</section>
<section id="section-3" class="section">
  <div class="container layout-stack">
  <div class="section-number">03</div>
  <h2>6D Cascade Analysis</h2>
  <p>The cascade originates in D5 (Quality) because the lever is capability-per-dollar: the same output delivered at a collapsing cost.<a class="cite" href="#source-1">[1]</a><a class="cite" href="#source-2">[2]</a> From D5 it amplifies into two L1 dimensions together: D6, the operational engineering that makes the fall real (mixture-of-experts, speculative decoding, distillation, quantization), and D2, the demand economics (Jevons: cheaper compute pulling more use). It then reaches D1 (broader access as inference approaches free) and D3 (the developers and workloads that expand to fill it). D4, the regulatory and structural question of whether commoditized inference sustains a market, is the longest-lag dimension. This case is deliberately the counter-cascade to the cluster: UC-251 documents the market pricing an overbuild, and UC-254 is the case that breaks that read if the efficiency curve holds. UC-044 is the sibling efficiency case it amplifies; UC-220 is the buildout whose demand curve efficiency would rationalize. The honest hedge is stated in the analysis, not hidden: the frontier floor did not fall, and falling cost can deflate provider revenue.</p>
  
  
  
  
<div class="doi-box" style="border-left-color:var(--gold);">
  <h4 style="color:var(--gold);">FETCH Score Breakdown</h4>
  <div class="doi-item"><span class="doi-label">Chirp</span>: <strong style="color:var(--accent-glow);">80</strong></div>
  <div class="doi-item"><span class="doi-label">|DRIFT|</span>: <strong style="color:var(--accent-glow);">42</strong></div>
  <div class="doi-item"><span class="doi-label">Confidence</span>: <strong style="color:var(--accent-glow);">0.79</strong></div>
  
  <div class="doi-item" style="margin-top:12px;padding-top:12px;border-top:1px solid var(--border);">
    <span class="doi-label">FETCH</span> = 80 &times; 42 &times; 0.79 =
    <strong style="color:var(--gold);font-size:18px;">2,654</strong>
    &nbsp;&rarr;&nbsp;
    <strong style="color:var(--green);">EXECUTE — HIGH PRIORITY</strong>
    <span style="color:var(--text-muted);">(threshold: 1,000)</span>
  </div>
  <div class="doi-item" style="margin-top:8px;"><span class="doi-label">Calibration</span>: FETCH 2,654 sits just below its sibling UC-251 (2,692) by design — this is the counterweight to the rotation, not a louder claim than it. DRIFT 42 is the widest in the cluster on purpose: the methodology is strong (the cost-decline is primary-verified by Stanford and Epoch) but the performance claim — that efficiency rescues the buildout's economics — is genuinely contested, with named economists declining to assert the elasticity it requires. Confidence 0.79, the lowest in the cluster, because this is an argument that concedes its own load-bearing weakness: the frontier floor did not fall, and cheaper tokens can cut provider revenue as well as expand demand. A strong counterexample is measured by how honestly it states the case against itself.</div>
</div>
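  <p>The score breakdown above multiplies three numbers, which the following sketch reproduces. Two things are assumptions rather than statements from the methodology: that |DRIFT| is the absolute gap between the METHODOLOGY (84) and PERFORMANCE (42) values in the CAL source, and the label <code>HOLD</code> for scores at or below the threshold. All class and function names are hypothetical.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FetchInputs:
    chirp: float
    methodology: float
    performance: float
    confidence: float


def drift(inputs: FetchInputs) -> float:
    """Assumed definition: absolute gap between methodology and performance."""
    return abs(inputs.methodology - inputs.performance)


def fetch_score(inputs: FetchInputs) -> float:
    """FETCH = Chirp x |DRIFT| x Confidence, as shown in the breakdown."""
    return inputs.chirp * drift(inputs) * inputs.confidence


def decide(score: float, threshold: float = 1000.0) -> str:
    """EXECUTE when the score exceeds the threshold; HOLD is a placeholder label."""
    return "EXECUTE" if score > threshold else "HOLD"


uc254 = FetchInputs(chirp=80, methodology=84, performance=42, confidence=0.79)
score = fetch_score(uc254)
print(f"|DRIFT| = {drift(uc254):.0f}, FETCH = {round(score):,}, {decide(score)}")
```

  <p>Because the three factors multiply, lowering confidence from 0.79 or narrowing the drift scales the score proportionally, which is how the calibration note can place this case just below UC-251.</p>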
  
  <div class="cascade-stats">
    <div class="cascade-stat"><div class="cascade-stat-value">6 of 6</div><div class="cascade-stat-label">Dimensions Hit</div></div>
    <div class="cascade-stat"><div class="cascade-stat-value">Efficiency = a floor</div><div class="cascade-stat-label">Multiplier</div></div>
    <div class="cascade-stat"><div class="cascade-stat-value">2,654</div><div class="cascade-stat-label">FETCH Score</div></div>
  </div>
  <div class="flow-row">
    <span class="flow-label">Origin</span>
    <span class="flow-node">D5 Quality</span>
  </div>
  <div class="flow-row">
    <span class="flow-label">L1</span>
    <span class="flow-node">D6 Operational</span><span class="flow-arrow">+</span>
      <span class="flow-node">D2 Revenue</span>
  </div>
  <div class="flow-row">
    <span class="flow-label">L2</span>
    <span class="flow-node">D1 Customer</span><span class="flow-arrow">+</span>
      <span class="flow-node">D3 Employee</span>
  </div>
  <div class="flow-row">
    <span class="flow-label">L3</span>
    <span class="flow-node">D4 Regulatory</span>
  </div>
  <details class="cal-source" open>
  <summary class="cal-source-toggle">
    <span class="cal-source-label">CAL Source</span>
    <span class="cal-source-hint">efficiency-escape-hatch · amplifying · D5 origin · the counterexample — if cost-per-token keeps falling the bear case breaks</span>
    <span class="cal-source-file">efficiency-escape-hatch.cal</span>
  </summary>
  <div class="cal-source-body">
    <pre class="cal-code"><code><span class="cal-comment">-- UC-254: The Efficiency Escape Hatch: 6D Amplifying Cascade (COUNTEREXAMPLE)</span>
<span class="cal-comment">-- The fragile number in the bear case (counters UC-251/255; amplifies UC-044)</span>
<span class="cal-keyword">FORAGE</span> efficiency_escape_hatch
<span class="cal-keyword">WHERE</span> cost_per_token_collapsing = true
  <span class="cal-keyword">AND</span> demand_expands_to_fill = true
  <span class="cal-keyword">AND</span> bear_case_assumes_high_cost = true
<span class="cal-keyword">ACROSS</span> <span class="cal-dim">D5</span>, <span class="cal-dim">D6</span>, <span class="cal-dim">D2</span>, <span class="cal-dim">D1</span>, <span class="cal-dim">D3</span>, <span class="cal-dim">D4</span>
<span class="cal-keyword">DEPTH</span> <span class="cal-number">3</span>
<span class="cal-keyword">SURFACE</span> efficiency_escape_hatch

<span class="cal-keyword">DIVE</span> <span class="cal-keyword">INTO</span> capability_per_dollar
<span class="cal-keyword">WHEN</span> fixed_capability_cost_falls_10x_yr = true
  <span class="cal-keyword">AND</span> frontier_floor_holds = true
<span class="cal-keyword">TRACE</span> efficiency_counter_cascade
<span class="cal-keyword">EMIT</span> efficiency_escape_hatch_signal

<span class="cal-keyword">DRIFT</span> efficiency_escape_hatch
<span class="cal-keyword">METHODOLOGY</span> <span class="cal-number">84</span>
<span class="cal-keyword">PERFORMANCE</span> <span class="cal-number">42</span>

<span class="cal-keyword">FETCH</span> efficiency_escape_hatch
<span class="cal-keyword">THRESHOLD</span> <span class="cal-number">1000</span>
<span class="cal-keyword">ON</span> <span class="cal-keyword">EXECUTE</span> <span class="cal-keyword">CHIRP</span> high <span class="cal-string">'Cost to run a fixed-capability model fell ~10x a year - 280x for GPT-<span class="cal-number">3.5</span>-class in under two years - so a demand curve that looks insane today is rational tomorrow and the overbuild thesis is the fragile number; but the frontier floor did not fall and cheaper tokens can deflate provider revenue - the escape hatch cuts both ways'</span>

<span class="cal-keyword">SURFACE</span> analysis <span class="cal-keyword">AS</span> json</code></pre>
    <div class="cal-trace">
    <div class="cal-trace-row">
      <span class="cal-trace-layer">SENSE</span>
      <span class="cal-trace-detail">FORAGE: cost to run a fixed-capability model fell ~10x/yr (a16z: $60 to $0.06/M for GPT-3-class over 3yr; Stanford: 280x, $20 to $0.07/M for GPT-3.5-class in &lt;2yr); Epoch: compute-to-fixed-capability halves ~every 8mo. Levers: mixture-of-experts (DeepSeek 671B total / 37B active), speculative decoding (2-3x), distillation (R1-Distill-32B beats o1-mini). Nadella: Jevons paradox strikes again (Jan 27 2025). Honest counter: frontier price floor held near $60/M output; Jevons rescues provider revenue only if elasticity&gt;1 (economists decline); Cahn: compute is a commodity competed to marginal cost. Signal: efficiency is the escape hatch, and it cuts both ways.</span>
    </div>
    <div class="cal-trace-row">
      <span class="cal-trace-layer">ANALYZE</span>
      <span class="cal-trace-detail">DRIFT 42 (widest in the cluster) — methodology strong (cost-decline primary-verified, Stanford + Epoch) against a contested performance claim (does efficiency rescue the buildout's economics?). D5 origin (capability-per-dollar) cascades to D6 (the engineering levers) + D2 (Jevons demand economics), then D1 (access) + D3 (workloads), with D4 (does commoditized inference sustain a market) the longest lag. This is the deliberate counter-cascade to UC-251/255 — the case built to break the bear read if the efficiency curve holds.</span>
    </div>
    <div class="cal-trace-row">
      <span class="cal-trace-layer">DECIDE</span>
      <span class="cal-trace-detail">FETCH 2,654 exceeds threshold 1,000. EXECUTE — HIGH PRIORITY, calibrated just below UC-251 so the counterweight does not out-shout the rotation. The cost-decline is primary; the rescue claim is conceded as contested. WATCH: whether cost-per-token deflation slows (it has not — still steep, which supports this case) and whether provider margins hold or commoditize toward marginal cost (Cahn's risk). Confidence 0.79 reflects a counterexample that states its own weakest joint on the page.</span>
    </div>
    </div>
    <p class="cal-runtime-link">Runtime: <a href="https://npmjs.com/package/@stratiqx/cal-runtime">@stratiqx/cal-runtime</a> · Spec: <a href="https://cal.semanticintent.dev">cal.semanticintent.dev</a> · DOI: <a href="https://doi.org/10.5281/zenodo.18905193">10.5281/zenodo.18905193</a></p>
  </div>
</details>
  </div>
</section>
<section id="section-insights" class="section">
  <div class="container layout-stack">
  <div class="section-number">04</div>
  <h2>Key Insights</h2>
  <div class="insight-grid">
  <div class="insight-card">
    <h4>Every overbuild thesis hides one assumption</h4>
    <p>That the unit cost of the thing stays where it is. For AI, the unit is a token, and its cost at a fixed capability has fallen ~10x a year.<a class="cite" href="#source-1">[1]</a> Before you call the buildout an overbuild, price the token it runs on falling by an order of magnitude annually, then re-run the demand curve.</p>
  </div>
  <div class="insight-card">
    <h4>Efficiency is engineering, not hope</h4>
    <p>Mixture-of-experts (~5% of parameters active), speculative decoding (2–3x throughput), distillation (32B models that beat last year's frontier reasoning models): these are shipped, measured techniques.<a class="cite" href="#source-3">[3]</a> The cost fall is not a forecast; it has already been measured in two separate series (a16z and Stanford), and Epoch confirmed the trend independently.</p>
  </div>
  <div class="insight-card">
    <h4>The frontier floor did not fall — the honest asterisk</h4>
    <p>What collapsed was the cost to reach a fixed capability. The frontier price floor stayed near $60/M output from 2021 to late 2024.<a class="cite" href="#source-1">[1]</a> The escape hatch works for commodity capability, not for the newest frontier, a distinction the loudest bull cases skip and this one does not.</p>
  </div>
  <div class="insight-card">
    <h4>It cuts both ways</h4>
    <p>Jevons grows the buyer's bill; it does not automatically grow the seller's margin. If inference commoditizes to marginal cost (Cahn's own warning), cheaper tokens can deflate provider revenue faster than they expand demand.<a class="cite" href="#source-6">[6]</a><a class="cite" href="#source-7">[7]</a> The escape hatch is real and double-edged; a counterexample that hid the second edge would not be worth citing.</p>
  </div></div>
  </div>
</section>

<section class="section" id="section-related">
  <div class="container layout-stack">
  <h2>Related Cases</h2>
  <div class="crossref-stack">
  <div class="crossref-box">
    <h4><span class="crossref-label">The Case It Counters</span> &mdash; <a class="crossref-title" href="https://uc-251.stratiqx.com">UC-251: The Show-Me Rotation</a><span class="crossref-chips"><span class="crossref-type diagnostic">Diagnostic</span><span class="crossref-fetch">FETCH 2,692</span></span></h4>
    <p>UC-251 documents the market pricing AI capex as an overbuild. UC-254 is the deliberate counterexample: if the cost-per-token curve holds, the rotation is pricing a floor as if it were a ceiling. The two are meant to be read against each other — the ledger's disconfirmation, built in.</p>
  </div>
  <div class="crossref-box">
    <h4><span class="crossref-label">Sibling Efficiency Case</span> &mdash; <a class="crossref-title" href="https://uc-044.stratiqx.com">UC-044: The $80 Billion Pressure Valve</a><span class="crossref-chips"><span class="crossref-type amplifying">Amplifying</span><span class="crossref-fetch">FETCH 2,325</span></span></h4>
    <p>UC-044 traced efficiency as a release valve on the compute-demand pressure. UC-254 is the same force one turn later, measured on the cost curve: the pressure valve is the reason the demand forecasts that look reckless may simply be early.</p>
  </div>
  <div class="crossref-box">
    <h4><span class="crossref-label">The Buildout It Rationalizes</span> &mdash; <a class="crossref-title" href="https://uc-220.stratiqx.com">UC-220: The Data Center Gold Rush</a><span class="crossref-chips"><span class="crossref-type diagnostic">Diagnostic</span><span class="crossref-fetch">FETCH 2,772</span></span></h4>
    <p>UC-220 documented a buildout whose demand curve looked stretched. UC-254 is the argument that the curve is only stretched at today's cost — at next year's, the same capacity is rational. Efficiency is what turns an apparent overbuild into a foundation.</p>
  </div>
  <div class="crossref-box">
    <h4><span class="crossref-label">The Other Side of the Ledger</span> &mdash; <a class="crossref-title" href="https://uc-063.stratiqx.com">UC-063: The Stock Reward Ceiling</a><span class="crossref-chips"><span class="crossref-type prognostic">Prognostic</span><span class="crossref-fetch">FETCH 1,537</span></span></h4>
    <p>UC-063 and UC-251 are the bearish spine of the cluster — the market punishing AI spend. UC-254 is the counterweight the whole cluster is built around: one honest disconfirmation, scored with the same rigor, so the ledger argues with itself instead of only confirming.</p>
  </div>
  </div>
  </div>
</section>
<section class="sources">
  <div class="container">
    <h3>Sources</h3>
    <p style="font-size:14px;color:var(--text-muted);margin-bottom:24px;">Seven sources: Stanford's AI Index and a16z for the primary cost-decline series, Epoch AI for the independent efficiency rate, the architectural levers (mixture-of-experts, distillation, speculative decoding), Nadella's Jevons framing — and, for the honest counter, the frontier-floor caveat and the economists who decline to assert the elasticity the bull case needs.</p>
    <div class="source-tier">Tier 1 — Official &amp; Structural Data</div>
<div class="source-item" id="source-1">
  <span class="source-num">[1]</span>
  <div class="source-text">a16z (Guido Appenzeller) — LLMflation: LLM inference cost is going down fast (November 12, 2024). Cost for equivalent performance falls ~10x/year; ~1,000x over three years, from $60 to $0.06 per million tokens for GPT-3-class capability. Includes the load-bearing caveat this case cites: the frontier price floor did not fall — a late-2024 frontier model still cost ~$60/M output, the same as GPT-3 in 2021.<a href="https://a16z.com/llmflation-llm-inference-cost/">a16z.com · Nov 2024</a></div>
</div>

<div class="source-item" id="source-2">
  <span class="source-num">[2]</span>
  <div class="source-text">Stanford HAI — 2025 AI Index Report, Research &amp; Development chapter. The cost to query a model at GPT-3.5-equivalent quality (MMLU 64.8) fell from $20.00 per million tokens (November 2022) to $0.07 (October 2024, Gemini-1.5-Flash-8B) — a more than 280-fold reduction in ~18 months. The cleanest fully-primary-verified figure in the case.<a href="https://hai.stanford.edu/ai-index/2025-ai-index-report/research-and-development">stanford hai · 2025</a></div>
</div>

<div class="source-item" id="source-3">
  <span class="source-num">[3]</span>
  <div class="source-text">Epoch AI — Algorithmic progress in language models (231 models through March 2024). Compute required to reach a fixed capability level halves roughly every 8 months (95% CI: 5–14 months), far faster than Moore's Law. Authors' caveat, cited for honesty: 60–95% of the gains came from more compute/data, only 5–40% from novel algorithms. Levers: mixture-of-experts, distillation, speculative decoding.<a href="https://epoch.ai/blog/algorithmic-progress-in-language-models">epoch.ai · Mar 2025</a></div>
</div>

<div class="source-item" id="source-4">
  <span class="source-num">[4]</span>
  <div class="source-text">Satya Nadella (Microsoft CEO), on X, January 27, 2025: Jevons paradox strikes again! As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of. Posted the day DeepSeek's efficiency drove a ~$600B one-day Nvidia decline — the canonical bull-case demand argument.<a href="https://x.com/satyanadella/status/1883753899255046301">x.com · Jan 27 2025</a></div>
</div>
<div class="source-tier">Tier 2 — Industry Analysis</div>
<div class="source-item" id="source-5">
  <span class="source-num">[5]</span>
  <div class="source-text">DeepSeek-V3 / R1 technical reports and Hugging Face model cards. Mixture-of-experts: 671B total parameters, 37B active per token (~5.5%). R1-Distill-Qwen-32B beats OpenAI o1-mini on AIME/MATH/GPQA. R1 (Jan 20, 2025) rivaled o1 at API pricing of $0.55/$2.19 per million vs o1's $15/$60 — the concrete proof that capability is diffusing down the cost curve.<a href="https://huggingface.co/deepseek-ai/DeepSeek-R1">huggingface.co · 2025</a></div>
</div>

<div class="source-item" id="source-6">
  <span class="source-num">[6]</span>
  <div class="source-text">The honest counter (Northeastern economists + Sequoia). Northeastern's Philip Hanser and Madhavi Venkatesan explicitly decline to assert that AI demand is price-elastic above 1 — the condition Jevons requires to grow provider revenue rather than just buyer spend. Sequoia's David Cahn, in his own words: GPU compute is turning into a commodity, competed down to marginal cost. The case against the escape hatch, stated fairly.<a href="https://news.northeastern.edu/2025/02/07/jevons-paradox-ai-future/">northeastern.edu · Feb 2025</a></div>
</div>

<div class="source-item" id="source-7">
  <span class="source-num">[7]</span>
  <div class="source-text">The margin caveat (Ed Zitron and inference-economics analysis). Newer reasoning models burn far more tokens per query, so per-query cost has in cases risen even as per-token price fell; AI-provider gross margins (~52%) run well below traditional SaaS (70–90%) because inference cost scales with every query. Why falling cost-per-token can deflate revenue and compress margin — the seam this case concedes.<a href="https://www.wheresyoured.at/ais-economics-dont-make-sense/">wheresyoured.at · Apr 2026</a></div>
</div>
  </div>
</section>
<section id="cta" class="cta">
  <div class="container layout-stack">
  <h2>Before you short the buildout, price the token it runs on falling 10x a year.</h2>
  <p>Every overbuild thesis has one assumption it never states. Find it. Here, it is that a token stays expensive.</p>
  <div class="btn-row"><a href="https://stratiqx.com/contact" class="recall-btn primary">Book a discovery call →</a><a href="https://uc-000.stratiqx.com" class="recall-btn ghost">Browse the case library →</a></div>
  </div>
</section>
<footer>
  <div class="container">
    <span class="footer-brand">StratIQX · Strategic Cascade Intelligence</span>
    <div class="footer-meta">Case UC-254 · July 2026 · v1.0 · The Efficiency Escape Hatch</div>
    <div class="footer-links">
      <a href="https://stratiqx.com">AI Platform</a>
      <a href="https://stratiqx.com/intelligence">6D Intelligence</a>
      <a href="https://6d.cormorantforaging.dev">6D Methodology</a>
      <a href="https://cormorantforaging.dev">Cormorant Foraging</a>
    </div>
    <p class="footer-disclosure">All dimension scores and FETCH calculations are derived from the 6D Foraging Methodology™ applied to cited primary sources. <span class="footer-legal"> · <a href="https://stratiqx.com/disclaimer">Disclaimer</a> · <a href="https://stratiqx.com/terms">Terms</a> · <a href="https://stratiqx.com/privacy">Privacy</a></span></p>
  </div>
</footer>
  <!-- generated by cal-workflow MCP 0.1.0@b54f8f1 (built 2026-06-18T21:20:41.185Z) · recall-components 0.1.0@7168ee3 (built 2026-06-20T23:19:57.984Z) · 2026-07-02T23:18:28.713Z -->
</body>
</html>