<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Interesting Engineering++]]></title><description><![CDATA[My personal Substack]]></description><link>https://interestingengineering.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!-M9w!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png</url><title>Interesting Engineering++</title><link>https://interestingengineering.substack.com</link></image><generator>Substack</generator><lastBuildDate>Sun, 21 Jun 2026 20:48:58 GMT</lastBuildDate><atom:link href="https://interestingengineering.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[InterestingEngineering++]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[interestingengineering@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[interestingengineering@substack.com]]></itunes:email><itunes:name><![CDATA[Interesting Engineering ++]]></itunes:name></itunes:owner><itunes:author><![CDATA[Interesting Engineering ++]]></itunes:author><googleplay:owner><![CDATA[interestingengineering@substack.com]]></googleplay:owner><googleplay:email><![CDATA[interestingengineering@substack.com]]></googleplay:email><googleplay:author><![CDATA[Interesting Engineering ++]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Two Economies, One Technology]]></title><description><![CDATA[Why Agentic AI's Impact Will Diverge by Industry, Function, and Regulatory Regime]]></description><link>https://interestingengineering.substack.com/p/two-economies-one-technology</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/two-economies-one-technology</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Fri, 19 Jun 2026 10:19:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kUfm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kUfm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kUfm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png 424w, https://substackcdn.com/image/fetch/$s_!kUfm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png 848w, https://substackcdn.com/image/fetch/$s_!kUfm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png 1272w, https://substackcdn.com/image/fetch/$s_!kUfm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kUfm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png" width="1122" height="610" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:610,&quot;width&quot;:1122,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1035068,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kUfm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png 424w, https://substackcdn.com/image/fetch/$s_!kUfm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png 848w, https://substackcdn.com/image/fetch/$s_!kUfm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png 1272w, https://substackcdn.com/image/fetch/$s_!kUfm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032dd2ec-750d-428a-b07c-2198544ecab6_1122x610.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is the second in a series, and a companion piece to "<a href="https://interestingengineering.substack.com/p/what-is-any-agentic-architecture">What Is Any Agentic Architecture Worth Anyway. PharmaCo International: An Agentic AI Case Study</a>" Where I have gone deep with this case study, the material below takes a wider scope on the impact of AI and Agentic AI structures to industry.</p><p>A note on what this is. Everything that follows is opinion backed by recent data on AI adoption and actual successes, where it may likely be found &#8212; informed by the PharmaCo case study, by the <a href="https://interestingengineering.substack.com/p/the-harness-lab-automated">Harness Lab experiments (</a>H1&#8211;H10) that underpin it, and by a broader read of where the agentic AI market actually stands in mid-2026. It is not a forecast with confidence intervals. Where the view here contrasts with mainstream market enthusiasm, or with the more cautious findings of recent research, both sides are presented &#8212; the goal is to give readers the material to form their own judgment, not to win an argument. Whilst my use and practical application of the findings here have been helpful to me, you could have differing views, based on industry task-specific application. </p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qgLe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfa96528-5061-497e-a867-778176696398_1136x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qgLe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfa96528-5061-497e-a867-778176696398_1136x617.png 424w, https://substackcdn.com/image/fetch/$s_!qgLe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfa96528-5061-497e-a867-778176696398_1136x617.png 848w, https://substackcdn.com/image/fetch/$s_!qgLe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfa96528-5061-497e-a867-778176696398_1136x617.png 1272w, https://substackcdn.com/image/fetch/$s_!qgLe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfa96528-5061-497e-a867-778176696398_1136x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qgLe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfa96528-5061-497e-a867-778176696398_1136x617.png" width="1136" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bfa96528-5061-497e-a867-778176696398_1136x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:1136,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1044779,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfa96528-5061-497e-a867-778176696398_1136x617.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qgLe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfa96528-5061-497e-a867-778176696398_1136x617.png 424w, https://substackcdn.com/image/fetch/$s_!qgLe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfa96528-5061-497e-a867-778176696398_1136x617.png 848w, https://substackcdn.com/image/fetch/$s_!qgLe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfa96528-5061-497e-a867-778176696398_1136x617.png 1272w, https://substackcdn.com/image/fetch/$s_!qgLe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfa96528-5061-497e-a867-778176696398_1136x617.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>1. The cost-side ceiling</h2><p><a href="https://interestingengineering.substack.com/p/what-is-any-agentic-architecture">The PharmaCo case study</a>, on its own numbers, makes an unglamorous point: choosing the right harness architecture for disruption response was worth roughly $297M in enterprise value versus the &#8220;wrong-but-plausible&#8221; alternative, and getting it badly wrong (H4) was worth nearly $1B in destroyed value versus the recommended choice (H2). Against a $12B company, that&#8217;s a <em><strong>2.5&#8211;8% potential swing in enterprise value from one operational decision.</strong></em></p><p>That is a real number. It is also, deliberately, the <em>ceiling</em> of what a cost-avoidance story can deliver &#8212; because <strong>cost savings are bounded by the size of the cost base you&#8217;re optimizing</strong>. PharmaCo&#8217;s disruption-response function is a sliver of its total operating cost. Even a dramatic improvement to a sliver is, by construction, a small improvement to the whole.</p><p>This is the part of the agentic AI story that&#8217;s easiest to model, easiest to benchmark (it&#8217;s exactly what the H1&#8211;H10 experiments measured), and &#8212; I&#8217;d argue &#8212; the part most over-indexed in how the market currently talks about AI&#8217;s impact on individual companies. <mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">&#8220;</mark><em><strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">We deployed agents and cut costs by X%&#8221; is a real, fundable, board-defensible story</mark></strong></em><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">.</mark> <strong>It is also a story with a visible end-state: once the process is automated, the savings don&#8217;t compound indefinitely, they just recur. That could be good enough for companies in steady state, dividend plays. But&#8230;.</strong></p><p>A useful way to see <em>why</em> this ceiling exists comes from a framework <strong><a href="https://www.normaltech.ai/p/why-ai-hasnt-replaced-software-engineers">Arvind Narayanan and Sayash Kapoor proposed for knowledge work generally: most jobs are a &#8220;decide-execute-deliver sandwich,&#8221;</a></strong> where AI compresses the execute layer in the middle but leaves the decide layer (what should be done, and why) and the deliver layer (who&#8217;s accountable for whether it was done right) largely untouched. Read through that lens, <em><strong>H2&#8217;s win over H9 is squarely an execute-layer story &#8212; which architecture responds fastest and most coherently to a disruption PharmaCo&#8217;s leadership has already decided is worth responding to, and for which someone remains accountable regardless of which architecture ran the response. The $297M&#8211;$986M swing is real, but it&#8217;s a swing within the execute layer; it doesn&#8217;t touch who decides PharmaCo&#8217;s disruption-response posture or who&#8217;s accountable when a plan goes wrong. That&#8217;s the structural reason the ceiling exists</strong></em> &#8212; not merely an artifact of this particular cost base being small.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!O86_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!O86_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png 424w, https://substackcdn.com/image/fetch/$s_!O86_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png 848w, https://substackcdn.com/image/fetch/$s_!O86_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png 1272w, https://substackcdn.com/image/fetch/$s_!O86_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!O86_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png" width="1138" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:1138,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1005102,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!O86_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png 424w, https://substackcdn.com/image/fetch/$s_!O86_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png 848w, https://substackcdn.com/image/fetch/$s_!O86_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png 1272w, https://substackcdn.com/image/fetch/$s_!O86_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff033dcf5-a671-4921-927b-d903ac85f2fc_1138x617.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XG67!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XG67!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png 424w, https://substackcdn.com/image/fetch/$s_!XG67!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png 848w, https://substackcdn.com/image/fetch/$s_!XG67!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png 1272w, https://substackcdn.com/image/fetch/$s_!XG67!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XG67!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png" width="1142" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:1142,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:928248,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XG67!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png 424w, https://substackcdn.com/image/fetch/$s_!XG67!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png 848w, https://substackcdn.com/image/fetch/$s_!XG67!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png 1272w, https://substackcdn.com/image/fetch/$s_!XG67!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669e7243-dfcd-485b-bfd8-f75c386486d5_1142x617.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>2. The revenue-side frontier</h2><p>The more interesting &#8212; and much harder to model &#8212; story is what happens when agentic AI touches the <em>top</em> of the income statement rather than the cost lines underneath it. Drug discovery is the cleanest illustration, because the economics of pharma R&amp;D are already asymmetric and option-like before AI enters the picture: most programs fail, a small number succeed enormously, and the entire industry&#8217;s economics are built around that distribution.</p><p>The numbers being floated for AI&#8217;s effect on that distribution are large enough that they&#8217;re worth taking seriously even with a generous discount for hype. McKinsey-style estimates put the addressable annual value of generative AI across the pharma value chain in the $60&#8211;110B range, with some forecasts running considerably higher when R&amp;D productivity gains are included. The AI-in-drug-discovery tooling market itself is still small (low single-digit billions in 2025&#8211;26) &#8212; the action is in what the tools <em>enable</em>, not in the tools as a line item. Exscientia has pushed multiple AI-designed candidates into clinical trials in under a year, a process that historically takes years. Insilico Medicine&#8217;s AI-designed candidate has progressed toward Phase III. Early data on AI-designed molecules entering Phase I shows success rates well above the historical industry average (roughly 80&#8211;90% versus a long-run baseline near 50%), though the sample sizes are still small and no AI-originated drug has yet completed the full approval pipeline &#8212; most forecasts put the first such approval in 2026&#8211;2028. Note: Takeda's zasocitinib (covered in my separate Funnel/Floor/Structure piece) has since cleared Phase III and is pending an FDA filing within fiscal 2026.</p><p>What&#8217;s striking is the deal activity this is attracting from companies that are not AI companies: <strong><a href="https://insilico.com/news/uiy12zcjg1-insilico-medicine-announces-global-rampd"><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">Eli Lilly&#8217;s multi-billion-dollar partnership with Insilico Medicine,</mark></a><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);"> </mark><a href="https://www.cnbc.com/2026/04/14/novo-nordisk-openai-ai-drug-discovery-healthcare-nvo.html"><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">Novo Nordisk&#8217;s R&amp;D-and-manufacturing partnership with OpenAI</mark></a><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">, </mark><a href="https://www.isomorphiclabs.com/articles/isomorphic-labs-enters-into-a-research-collaboration-with-johnson-johnson"><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">Johnson &amp; Johnson&#8217;s multi-target collaboration with Isomorphic Labs</mark></a><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">, </mark><a href="https://www.fiercebiotech.com/biotech/gsk-inks-model-deal-50m-bet-noetiks-cancer-ai-platform"><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">GSK&#8217;s infrastructure partnership with Noetik</mark></a>. </strong>These are revenue-side bets by companies that already have functioning, profitable cost structures &#8212; the rationale isn&#8217;t &#8220;make our existing pipeline cheaper,&#8221; it&#8217;s &#8220;<strong>change the shape of the pipeline itself</strong>.&#8221;</p><p>This is the asymmetry that a cost-avoidance model like PharmaCo&#8217;s disruption-response case structurally cannot capture. Compressing a discovery timeline by even a year, or modestly improving a Phase II success probability, doesn&#8217;t show up as a percentage on an existing cost line &#8212; it can mean an entire additional revenue stream that wouldn&#8217;t otherwise exist, at NPV magnitudes that dwarf the $297M&#8211;$986M range from Section 10 of the PharmaCo case.</p><h3>2.1 A third axis: when the payoff isn&#8217;t cost or revenue, but capability</h3><p>The framing above already brushes up against something worth pulling out and naming on its own. Some of history&#8217;s technologies with the largest eventual economic impact looked, for years after their invention, like poor investments by any cost-or-revenue test &#8212; not because the technology had failed, but because &#8220;what does this save, or what does this sell&#8221; was the wrong question to be asking yet. <em><strong>GPS spent decades as a defense programme before it became a civilian product category. Genome sequencing absorbed a multi-billion-dollar, decade-long public investment before it produced anything resembling a return. Early integrated circuits had nothing like the cost or performance advantage that would later make them ubiquitous</strong></em>. In each case the technology wasn&#8217;t expanding a margin or opening a revenue line yet &#8212; it was <em><strong>making something previously impossible now possible, and a cost/revenue lens only became the right one to apply once that capability already existed.</strong></em></p><p><strong>Pharma R&amp;D</strong> is the cleanest current illustration of this third axis, but it isn&#8217;t unique to pharma. <strong>Semiconductor design</strong> is arguably just as strong a case: the more consequential AI-driven outcome there isn&#8217;t &#8220;design existing chips more cheaply&#8221; &#8212; a Floor-type claim &#8212; but whether AI-assisted exploration of interconnect topologies, packaging techniques, or photonic architectures turns up designs no human engineering team would have proposed on its own. That&#8217;s the same<em><strong> <mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">fat-tailed, capability-expanding logic as drug discovery, applied to an entirely different industry. Read this way, AI&#8217;s most consequential role in both fields starts to resemble less a labour-saving tool and more a scientific instrument in the lineage of the microscope or the telescope &#8212; something that doesn&#8217;t make existing inquiry cheaper so much as it opens an entirely new field of inquiry.</mark></strong></em><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);"> </mark>Whether agentic AI eventually belongs in that category, rather than the labour-automation category this essay otherwise treats it as, is one of the more consequential open questions this framework doesn&#8217;t resolve on its own.</p><p>This third axis lives inside the Funnel described in Section 1 above and developed further in &#8220;<strong>The Funnel, the Floor, and the Structure</strong>&#8221; &#8212; the Funnel was already described as capable of changing <em>which</em> bets enter a pipeline, not merely speeding up existing-style ones. What this subsection adds is a name for the most extreme version of that tail, because &#8220;capability expansion&#8221; describes a meaningfully different kind of bet than &#8220;a marginally better version of an existing product&#8221;: one where, in the years before the new capability proves out, the honest answer to &#8220;what&#8217;s the ROI&#8221; may legitimately be poor, unclear, or negative &#8212; without that being evidence the bet is failing.</p><h3>2.2 A scope boundary worth naming: this framework doesn&#8217;t explain state-funded capability bets</h3><p>One place this matters concretely: <strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">government and defense investment in exactly this kind of capability-expanding technology &#8212; nuclear research, early aerospace, GPS&#8217;s own military origins, internet infrastructure &#8212; has historically continued despite weak or unclear near-term ROI, for a reason this essay&#8217;s framework doesn&#8217;t capture</mark></strong>. A state funding a capability bet isn&#8217;t optimising for enterprise value or profit margin at all; the payoff currency is strategic position or security, which doesn&#8217;t convert into the kind of multiple-based valuation Section 10 of the PharmaCo case uses, or the enterprise-value language this essay has used throughout. A &#8220;poor&#8221; dollar-denominated ROI can be the expected and accepted outcome of that kind of bet, not evidence it is failing.</p><p>This is worth flagging explicitly rather than leaving it implicit, because the companion piece &#8220;Does Agentic AI Matter?&#8221; leans on Nicholas Carr&#8217;s argument that universally available technology stops conferring competitive advantage &#8212; a claim specifically about firms competing against other firms for advantage measured in profit or market position. That argument was never built to explain why a state keeps funding a capability with unclear near-term returns, and it shouldn&#8217;t be read as implying such investment is irrational simply because it doesn&#8217;t show up favourably on this essay&#8217;s cost/revenue axis. Where this series&#8217; frameworks are scoped to firm-level competitive strategy, state-level capability investment is a different actor optimising for a different currency entirely &#8212; outside this series&#8217; scope by construction, not by oversight.</p><h2>3. One company, two economies</h2><p>Here&#8217;s where I think the PharmaCo case study is most useful as a <em>diagnostic</em>, even though it deliberately stayed on one side of this divide: the same company can be living in both economies simultaneously, and the three drugs in that case &#8212; Lisinopril, Metformin, Salbutamol &#8212; are the tell. All three are decades-old, off-patent, generic small molecules. No real innovator pharma builds its strategic narrative around a generic API supply disruption; that&#8217;s a <em>generics manufacturer&#8217;s</em> problem (Teva, Viatris, Sandoz, and similar companies operate exactly this kind of import-dependent, thin-margin, operationally-driven business).</p><p><strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">A pure generics manufacturer is, almost by definition, living entirely in the cost-side economy &#8212; there&#8217;s no R&amp;D pipeline to accelerate, no patent-protected revenue to expand, just margin to defend against exactly the kind of supply shock PharmaCo modeled. For that company, the PharmaCo case study&#8217;s framing is close to the whole story: harness architecture choice is a capital allocation decision measured in tens to low hundreds of millions of enterprise value.</mark></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pX5l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pX5l!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png 424w, https://substackcdn.com/image/fetch/$s_!pX5l!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png 848w, https://substackcdn.com/image/fetch/$s_!pX5l!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png 1272w, https://substackcdn.com/image/fetch/$s_!pX5l!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pX5l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png" width="1145" height="626" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:626,&quot;width&quot;:1145,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1109720,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pX5l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png 424w, https://substackcdn.com/image/fetch/$s_!pX5l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png 848w, https://substackcdn.com/image/fetch/$s_!pX5l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png 1272w, https://substackcdn.com/image/fetch/$s_!pX5l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2fd1d41-968a-4d5e-9d7a-11dd3906895b_1145x626.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>An innovator pharma with a real R&amp;D pipeline is a different company wearing the same SIC code. Its back office, supply chain, and compliance functions face exactly the same cost-side calculus as the generics manufacturer &#8212; but its R&amp;D division is playing the revenue-side game described in Section 2, where the numbers are an order of magnitude larger and far less certain. Pfizer&#8217;s CFO described 2026 R&amp;D productivity gains from AI as freeing up capacity to &#8220;take on more substrate&#8221; for future medicines &#8212; note that this is explicitly framed as a productivity <em>reallocation</em> story (do more with the same budget), not a cost-cutting story; the $11B R&amp;D budget for 2026 didn&#8217;t shrink.</p><p>If I had to state the view plainly: <strong><mark data-color="#00ffff" style="background-color: rgb(0, 255, 255); color: rgb(0, 0, 0);">regulated industries will likely see smaller near-term cost-side gains than unregulated ones (because compliance overhead bounds how much of the back office can actually be automated, and regulatory review timelines are exogenous to how fast your internal systems run), but potentially much larger long-run revenue-side gains in their R&amp;D/innovation functions specifically</mark></strong><mark data-color="#00ffff" style="background-color: rgb(0, 255, 255); color: rgb(0, 0, 0);"> </mark>&#8212; because that&#8217;s where the bottleneck is internal throughput rather than external approval timelines, and where the payoff distribution is fat-tailed enough that even modest probability shifts are worth enormous amounts.</p><p>The counter-view, and it&#8217;s a serious one: <strong>AI-discovered molecules still have to clear the same Phase I/II/III gauntlet as everything else, and as of mid-2026 zero have done so. FDA guidance on AI use in regulatory submissions (the January 2025 draft and the January 2026 &#8220;Guiding Principles&#8221; follow-up) is explicitly risk-based and incremental &#8212; regulators are not going to compress review timelines just because the candidate was AI-designed</strong>. It&#8217;s entirely possible the revenue-side story takes the rest of this decade to show up in actual approved-drug revenue, in which case the cost-side story &#8212; unglamorous as it is &#8212; remains the only one with a near-term P&amp;L impact, for pharma specifically.</p><h2>4. Where cost and revenue collapse into the same line: software</h2><p>Pharma&#8217;s cost-side and revenue-side stories are separable because R&amp;D and operations are organizationally separate. <strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">Software &#8212; and SaaS in particular &#8212; is the industry where I think this separation breaks down entirely, which is also why I&#8217;d guess it&#8217;s where most of the current Silicon Valley enthusiasm is actually anchored, even when the framing is about AI &#8220;in general.&#8221;</mark></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9aJy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9aJy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png 424w, https://substackcdn.com/image/fetch/$s_!9aJy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png 848w, https://substackcdn.com/image/fetch/$s_!9aJy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png 1272w, https://substackcdn.com/image/fetch/$s_!9aJy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9aJy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png" width="1116" height="606" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:606,&quot;width&quot;:1116,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1000798,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9aJy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png 424w, https://substackcdn.com/image/fetch/$s_!9aJy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png 848w, https://substackcdn.com/image/fetch/$s_!9aJy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png 1272w, https://substackcdn.com/image/fetch/$s_!9aJy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54b15acb-c342-4a61-bfd8-b1ce9d81b836_1116x606.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The mechanism is structural: <strong>SaaS revenue has historically been priced per seat &#8212; you pay for the number of humans who use the software. If an agentic AI system can do the work that used to require N human seats, two things happen to the </strong><em><strong>same line item</strong></em><strong> simultaneously: the buyer&#8217;s cost structure improves (fewer seats needed) and, if the vendor doesn&#8217;t change its pricing model, the vendor&#8217;s revenue </strong><em><strong>declines</strong></em> &#8212; automating away your customers&#8217; need for seats is, under per-seat pricing, automating away your own revenue. This is being described candidly in the industry press as an existential pricing problem, not a hypothetical one, and it&#8217;s why 2026 has seen a rapid, real shift toward usage-based and outcome-based pricing &#8212; vendors are trying to reposition themselves to capture value from the <em>work done</em> by agents rather than the <em>seats occupied</em> by humans, before their customers do the arbitrage for them.</p><p>Gartner&#8217;s numbers on this are aggressive but illustrative of the direction: roughly 40% of enterprise applications expected to ship with task-specific agents by the end of 2026 (up from under 5% in 2025), with a best-case long-run projection of agentic AI representing something like 30% of enterprise application software revenue by 2035 &#8212; a market north of $450B. Omdia&#8217;s estimate of the enterprise agentic AI software market itself growing from roughly $1.5B (2025) to $41.8B (2030) &#8212; a five-year CAGR around 175%, far outpacing generative AI&#8217;s earlier growth curve &#8212; is one of several data points suggesting investors are pricing this as a structural shift in how software is bought and sold, not an incremental feature.</p><p>This is, I think, the closest thing to a single coherent answer to &#8220;<strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">what is Silicon Valley so excited about</mark></strong>&#8221;: not drug discovery specifically (too slow, too regulated, too far from most VC portfolios), and not generic cost-cutting (too small per company, as Section 1 shows) &#8212; but <strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">the prospect of an entire industry&#8217;s revenue model being re-architected around metered agent output, with the companies that reposition first capturing share from those that don&#8217;t. It&#8217;s a revenue-side story for the </mark></strong><em><strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">software industry as a whole</mark></strong></em><strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">, even though for most of software&#8217;s </mark></strong><em><strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">customers</mark></strong></em><strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);"> it shows up as a cost-side story (fewer seats, lower software spend, redeployed headcount).</mark></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dFeh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3591080d-9310-4b9c-9945-c7895021af28_1133x607.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dFeh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3591080d-9310-4b9c-9945-c7895021af28_1133x607.png 424w, https://substackcdn.com/image/fetch/$s_!dFeh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3591080d-9310-4b9c-9945-c7895021af28_1133x607.png 848w, https://substackcdn.com/image/fetch/$s_!dFeh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3591080d-9310-4b9c-9945-c7895021af28_1133x607.png 1272w, https://substackcdn.com/image/fetch/$s_!dFeh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3591080d-9310-4b9c-9945-c7895021af28_1133x607.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dFeh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3591080d-9310-4b9c-9945-c7895021af28_1133x607.png" width="1133" height="607" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3591080d-9310-4b9c-9945-c7895021af28_1133x607.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:607,&quot;width&quot;:1133,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:967087,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3591080d-9310-4b9c-9945-c7895021af28_1133x607.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dFeh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3591080d-9310-4b9c-9945-c7895021af28_1133x607.png 424w, https://substackcdn.com/image/fetch/$s_!dFeh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3591080d-9310-4b9c-9945-c7895021af28_1133x607.png 848w, https://substackcdn.com/image/fetch/$s_!dFeh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3591080d-9310-4b9c-9945-c7895021af28_1133x607.png 1272w, https://substackcdn.com/image/fetch/$s_!dFeh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3591080d-9310-4b9c-9945-c7895021af28_1133x607.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There&#8217;s a sharper version of this threat worth naming. <strong>Narayanan and Kapoor </strong>point out that <em><strong>most software engineers already work in-house, inside non-software companies, not at software vendors &#8212; and argue that share may grow as AI makes building software cheaper (<mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">their version of Jevons&#8217; paradox: cheaper execution doesn&#8217;t shrink total demand for software, it expands it, including demand for software that previously wasn&#8217;t worth building at all). If that&#8217;s right, the risk to incumbent per-seat SaaS vendors isn&#8217;t only &#8220;our customers need fewer seats of our product&#8221; &#8212; it&#8217;s &#8220;our customers increasingly build the equivalent capability in-house, because their own engineers can now execute on it cheaply too</mark>.</strong></em>&#8221; That&#8217;s a harder problem than a pricing-model change can fix, because it&#8217;s not about <em>how</em> the vendor charges for a given capability &#8212; it&#8217;s about whether the vendor remains the cheapest place to get that capability at all.</p><h2>5. The adoption gap: why most of this isn&#8217;t happening yet</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!O35z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!O35z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png 424w, https://substackcdn.com/image/fetch/$s_!O35z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png 848w, https://substackcdn.com/image/fetch/$s_!O35z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png 1272w, https://substackcdn.com/image/fetch/$s_!O35z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!O35z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png" width="1122" height="605" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:605,&quot;width&quot;:1122,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1130111,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!O35z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png 424w, https://substackcdn.com/image/fetch/$s_!O35z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png 848w, https://substackcdn.com/image/fetch/$s_!O35z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png 1272w, https://substackcdn.com/image/fetch/$s_!O35z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90249bfe-f08c-4c15-b09b-60076d9a0c7a_1122x605.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Everything above describes potential &#8212; and the gap between potential and realized impact, in 2026, is large. <a href="https://fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/">MIT&#8217;s NANDA initiative published research in mid-2025 (and follow-on work has largely confirmed the pattern into 2026) </a><strong><a href="https://fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/"><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">finding that roughly 95% of generative AI pilots inside large organizations produce no measurable P&amp;L impact.</mark></a></strong> The headline number gets cited a lot; the more useful detail is <em>why</em>: the research attributes this overwhelmingly to organizational and integration failures rather than model limitations &#8212; generic tools that don&#8217;t adapt to specific workflows, budgets concentrated in high-visibility sales/marketing pilots rather than the back-office automation where MIT found the actual ROI was concentrated, and a strong success-rate gap between buying from specialized vendors with domain expertise (roughly two-thirds success) versus building in-house (roughly one-third).</p><p>A second, more pointed strand of evidence makes a related point even more directly: a lot of what gets <em>reported</em> as &#8220;AI replaced these jobs&#8221; doesn&#8217;t hold up under scrutiny. <em><strong>Narayanan and Kapoor walk through several high-profile 2026 layoff announcements &#8212; Block, Snap, Intuit &#8212; where executives cited AI capability gains as the rationale, but the underlying drivers (investor pressure, financial strain, organizational restructuring) told a different story;</strong></em> in Intuit&#8217;s case, the CEO explicitly pushed back on the AI framing the press had attached to the cuts. They cite <em><strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">a survey finding that a majority of US hiring managers admit to citing AI for layoffs because it plays better with stakeholders than citing financial constraints, and an HBR survey where 21% of executives had already made large headcount cuts &#8220;in anticipation of&#8221; AI, versus only 2% citing actual AI implementation as the cause &#8212; roughly a tenfold gap between anticipated and realized impact.</mark></strong></em> New York&#8217;s first full year of WARN Act AI-disclosure data shows a similarly small realized-impact figure: well under 1% of recorded layoffs checked the AI box, though the authors note this could understate the true figure given asymmetric incentives around how companies report.</p><p>This doesn&#8217;t contradict the MIT NANDA finding &#8212; if anything it sharpens it. NANDA measures whether AI pilots produce measurable P&amp;L impact (mostly no); the layoffs evidence suggests that even where job cuts <em>are</em> happening, the causal story attached to AI is often a convenient narrative for decisions that would have happened anyway. Two different ways of arriving at the same conclusion: the realized labor-market impact of agentic AI, as of mid-2026, is consistently smaller than the headlines suggest &#8212; across both &#8220;did the pilot work&#8221; and &#8220;did the layoff actually happen because of AI.&#8221;</p><p><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">This maps remarkably cleanly onto the Harness Lab findings that underpin the PharmaCo case study. H9 &#8212; the five-agent swarm that looked most sophisticated on paper &#8212; </mark><em><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">lost</mark></em><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);"> to H2, a simpler structured single agent, specifically because complexity compounded coordination costs faster than it added capability. The MIT findings about internal builds underperforming vendor partnerships are, in effect, the same lesson at the organizational level: more moving parts, more integration surfaces, more places for a &#8220;looks sophisticated&#8221; approach to lose to a &#8220;fits the actual task&#8221; approach. Gartner&#8217;s own projection that over 40% of agentic AI projects will be cancelled by 2027 &#8212; despite (or because of) rapid adoption &#8212; reads as the market-wide version of H4: enthusiasm for &#8220;more agents, more coverage&#8221; outrunning the harder discipline of matching architecture to task.</mark></p><p>The financial services data offers a useful counter-data-point on what <em>does</em> work: <a href="https://www.ledge.news/news?dimension=Strategist">KPMG</a> and <a href="https://www.idc.com/resource-center/blog/agentic-ai-is-breaking-your-roi-model-heres-how-to-fix-it/">IDC</a> both report roughly 2.3x average ROI on agentic AI investments within about 13 months for institutions with strong governance, and <a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/seizing-the-agentic-ai-advantage">McKinsey</a> found 20&#8211;60% productivity improvements in credit analysis specifically &#8212; a narrow, well-specified, document-heavy task with clear success criteria, which is exactly the profile that the H3 compliance-monitoring scenario in the <a href="https://interestingengineering.substack.com/p/what-is-any-agentic-architecture">PharmaCo case</a> (and the &#8220;<a href="https://interestingengineering.substack.com/p/the-harness-lab-automated">Harness Lab, Automated</a>&#8221; follow-up benchmark) suggests should <strong>favor simpler, tool-augmented architectures over elaborate ones.</strong> The pattern across all of this research, for what it&#8217;s worth, seems to be: <strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">narrow, well-specified, document/data-heavy tasks with measurable success criteria are where agentic AI reliably delivers ROI today, regardless of industry &#8212; and the industries that will see the fastest realized impact are the ones with the most tasks fitting that profile, not the ones with the most dramatic AI narratives.</mark></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PeC4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PeC4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png 424w, https://substackcdn.com/image/fetch/$s_!PeC4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png 848w, https://substackcdn.com/image/fetch/$s_!PeC4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png 1272w, https://substackcdn.com/image/fetch/$s_!PeC4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PeC4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png" width="1106" height="591" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:591,&quot;width&quot;:1106,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:975936,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PeC4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png 424w, https://substackcdn.com/image/fetch/$s_!PeC4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png 848w, https://substackcdn.com/image/fetch/$s_!PeC4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png 1272w, https://substackcdn.com/image/fetch/$s_!PeC4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74683a0-ac9b-411f-90d5-8038257ac65e_1106x591.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>6. A birds-eye framework</h2><p>Putting the above together, here&#8217;s the 2&#215;2 I&#8217;d use to think about where agentic AI&#8217;s impact lands, and roughly how fast:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n66r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n66r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png 424w, https://substackcdn.com/image/fetch/$s_!n66r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png 848w, https://substackcdn.com/image/fetch/$s_!n66r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png 1272w, https://substackcdn.com/image/fetch/$s_!n66r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n66r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png" width="797" height="736" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:736,&quot;width&quot;:797,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:164850,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n66r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png 424w, https://substackcdn.com/image/fetch/$s_!n66r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png 848w, https://substackcdn.com/image/fetch/$s_!n66r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png 1272w, https://substackcdn.com/image/fetch/$s_!n66r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d230238-87ae-4b0e-9f24-d9b41b018198_797x736.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The PharmaCo case study sits squarely in the top-left cell &#8212; which is precisely why, on its own, it &#8220;doesn&#8217;t seem as impactful as markets make out&#8221;: markets, when they talk about AI broadly, are mostly pricing in the bottom-right and top-right cells, which this case study was never designed to address. Two things sit outside this 2&#215;2 entirely rather than inside any single cell: <strong>Section 2.1&#8217;s capability-expansion axis, which describes a tail extreme enough that &#8220;revenue&#8221; isn&#8217;t quite the right word for it either, and Section 7&#8217;s composition shift in </strong><em><strong>who</strong></em><strong> does the execute-layer work, with consequences that arrive years after the cost/revenue effects above.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EQ_w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbabe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EQ_w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbabe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png 424w, https://substackcdn.com/image/fetch/$s_!EQ_w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbabe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png 848w, https://substackcdn.com/image/fetch/$s_!EQ_w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbabe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png 1272w, https://substackcdn.com/image/fetch/$s_!EQ_w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbabe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EQ_w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbabe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png" width="1102" height="597" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/babe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:597,&quot;width&quot;:1102,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1117448,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbabe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EQ_w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbabe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png 424w, https://substackcdn.com/image/fetch/$s_!EQ_w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbabe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png 848w, https://substackcdn.com/image/fetch/$s_!EQ_w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbabe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png 1272w, https://substackcdn.com/image/fetch/$s_!EQ_w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbabe0eca-c7e5-4425-a77f-6fce7602ad50_1102x597.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>7. The junior pipeline problem</h2><p>One thread the framework above doesn&#8217;t address: even if aggregate demand for a given role holds steady &#8212; which Narayanan and Kapoor argue is the likely outcome for software engineering specifically, given how price-elastic demand for software has historically been &#8212; the <em>composition</em> of that demand can still shift in ways that matter a lot for individuals, and for organizations five-plus years out. A reader comment on their piece (from <strong>Vitaly Osipov) makes a sharp point: execution-layer work is precisely where junior people have historically built the judgment they&#8217;ll need to become senior decide/deliver people later. If AI absorbs most execution-layer work, junior hiring could fall even while senior demand stays robust &#8212; creating a pipeline problem that&#8217;s invisible today and only becomes visible years later, when there aren&#8217;t enough experienced people left to do the deciding and delivering that AI still can&#8217;t do.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SyX0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SyX0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png 424w, https://substackcdn.com/image/fetch/$s_!SyX0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png 848w, https://substackcdn.com/image/fetch/$s_!SyX0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png 1272w, https://substackcdn.com/image/fetch/$s_!SyX0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SyX0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png" width="1121" height="593" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:593,&quot;width&quot;:1121,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1044126,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SyX0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png 424w, https://substackcdn.com/image/fetch/$s_!SyX0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png 848w, https://substackcdn.com/image/fetch/$s_!SyX0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png 1272w, https://substackcdn.com/image/fetch/$s_!SyX0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf25961b-52e6-4aa7-bfdd-84ac606d2a4e_1121x593.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This connects directly to Section 11.3 of the PharmaCo case, which discussed <em><strong>redirecting compliance-analyst capacity toward higher-value work following H3&#8217;s adoption. The framing there was optimistic &#8212; three analysts&#8217; time freed up for more strategic work. But if that pattern repeats at the entry-level hiring decision &#8212; don&#8217;t backfill the junior compliance-monitoring role at all, since H3 now does that work &#8212; the org saves the junior salary today and loses a training pipeline for tomorrow&#8217;s senior compliance hires</strong></em>. Neither this essay nor the PharmaCo case modeled that cost, because it doesn&#8217;t show up on any single year&#8217;s income statement. It shows up later, as a senior-talent shortage that&#8217;s slow and expensive to fix once visible.</p><p>This is arguably the most under-priced risk in the entire framework above: it&#8217;s invisible in every quadrant of Section 6&#8217;s matrix at the time horizon a board typically acts on, and yet it&#8217;s exactly the kind of structural shift that compounds the way the revenue-side stories in Sections 2 and 4 do &#8212; just in the wrong direction.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nfly!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nfly!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png 424w, https://substackcdn.com/image/fetch/$s_!nfly!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png 848w, https://substackcdn.com/image/fetch/$s_!nfly!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png 1272w, https://substackcdn.com/image/fetch/$s_!nfly!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nfly!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png" width="1125" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1125,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1202651,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nfly!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png 424w, https://substackcdn.com/image/fetch/$s_!nfly!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png 848w, https://substackcdn.com/image/fetch/$s_!nfly!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png 1272w, https://substackcdn.com/image/fetch/$s_!nfly!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd590384f-e405-4cc5-bbfc-7a6c3fbf28f3_1125x600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>8. What would change this view</h2><p>A few things that would update me away from the framing above, in either direction:</p><ul><li><p><strong>An AI-originated drug receiving full FDA approval</strong> (projected by various analysts at roughly 60% probability for 2026&#8211;2028) would be the first hard evidence that the revenue-side pharma story is converting from pipeline activity into realized value &#8212; and would likely accelerate the deal-making already underway. As mentioned earlier, I note this recent development and will watch the space: Takeda&#8217;s zasocitinib (covered in my separate Funnel/Floor/Structure piece) has since cleared <strong>Phase III and is pending an FDA filing within fiscal 2026</strong>.</p></li><li><p><strong>A SaaS company reporting a quarter where usage/outcome-based AI revenue materially offset a decline in per-seat revenue</strong> would confirm the Section 4 thesis is playing out on the timeline I&#8217;d guess, rather than the slower 2030+ timeline some forecasts imply.</p></li><li><p><strong>A second or third independent study replicating MIT NANDA&#8217;s ~95% pilot-failure finding with a meaningfully lower number</strong> would suggest the &#8220;learning gap&#8221; is closing faster than the 2025&#8211;26 data suggests &#8212; which would matter enormously for how quickly <em>any</em> of the cells in Section 6 move from potential to realized.</p></li><li><p>Conversely, if <strong>Gartner&#8217;s 40%+ project-cancellation prediction for 2027 lands</strong> roughly on target, <em><strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">it would suggest the gap between agentic AI&#8217;s demonstrated architecture-level potential (the H1&#8211;H10 results) and most organizations&#8217; ability to actually capture it remains the dominant constraint &#8212; in which case the PharmaCo case study&#8217;s core message (architecture-and-task-fit matters more than raw capability</mark></strong></em>) becomes, if anything, <em>more</em> relevant, not less, as the bottleneck shifts from &#8220;can the model do this&#8221; to &#8220;did the organization build the right harness around it.&#8221;</p></li><li><p><strong>Entry-level hiring data specifically</strong> (not just aggregate headcount) showing a sustained decline in junior roles across software, compliance, and similar execution-heavy entry points would be the earliest leading indicator for Section 7&#8217;s pipeline concern &#8212; well before any senior-talent shortage became visible in P&amp;L terms.</p></li><li><p><strong>A second capability-expansion case outside pharma converting from research activity into a deployed, valuable result</strong> (Section 2.1) &#8212; for instance, an <em><strong>AI-discovered semiconductor interconnect or packaging technique reaching production &#8212; would be the clearest sign the third axis generalises beyond drug discovery rather than being a pharma-specific artifact of how option-like that industry&#8217;s economics already were.</strong></em></p></li><li><p><strong>A second state adopting New York&#8217;s WARN Act AI-disclosure checkbox</strong>, showing a similarly low realized-AI-layoff percentage, would reinforce the AI-washing finding in Section 5 as a general pattern rather than a New York-specific artifact (or a reporting quirk of that particular dataset).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uFQK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uFQK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png 424w, https://substackcdn.com/image/fetch/$s_!uFQK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png 848w, https://substackcdn.com/image/fetch/$s_!uFQK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png 1272w, https://substackcdn.com/image/fetch/$s_!uFQK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uFQK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png" width="1107" height="562" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:562,&quot;width&quot;:1107,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:944551,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202366454?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uFQK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png 424w, https://substackcdn.com/image/fetch/$s_!uFQK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png 848w, https://substackcdn.com/image/fetch/$s_!uFQK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png 1272w, https://substackcdn.com/image/fetch/$s_!uFQK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f3d8a5-6a88-4603-8433-081226094a58_1107x562.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></li></ul><div><hr></div><h2>Sources and further reading</h2><ul><li><p>Anthropic, <em>Economic Index</em> &#8212; series of reports tracking real-world Claude usage, Jan/Mar 2026 editions cited in Section 5 and Section 7&#8217;s WARN Act discussion context: <a href="https://anthropic.com/economic-index">anthropic.com/economic-index</a> &#183; <a href="https://www.anthropic.com/research/anthropic-economic-index-january-2026-report">Jan 2026 report</a> &#183; <a href="https://www.anthropic.com/research/economic-index-march-2026-report">Mar 2026 report</a></p></li><li><p>Massenkoff, M. &amp; McCrory, P., &#8220;Labor market impacts of AI: A new measure and early evidence&#8221; (Anthropic, March 5, 2026) &#8212; source for the &#8220;slower hiring rather than increased separations&#8221; finding referenced in Section 5: <a href="https://www.anthropic.com/research/labor-market-impacts">anthropic.com/research/labor-market-impacts</a></p></li><li><p>McKinsey Global Institute, &#8220;Generative AI in the pharmaceutical industry: Moving from hype to reality&#8221; &#8212; original source of the $60&#8211;110B annual addressable-value estimate for pharma cited in Section 2: <a href="https://www.mckinsey.com/industries/life-sciences/our-insights/generative-ai-in-the-pharmaceutical-industry-moving-from-hype-to-reality">mckinsey.com</a></p></li><li><p>AllAboutAI, &#8220;AI in Drug Development Statistics 2026&#8221; &#8212; source for AI-discovered-drug Phase I success-rate figures (80&#8211;90% vs. a lower traditional baseline) referenced in Section 2: <a href="https://www.allaboutai.com/resources/ai-statistics/drug-development/">allaboutai.com/resources/ai-statistics/drug-development</a></p></li><li><p>IntuitionLabs, &#8220;Accelerating Drug Development with AI in the U.S. Pharmaceutical Industry&#8221; &#8212; source for the Exscientia/DSP-1181 12-month-to-clinic example referenced in Section 2: <a href="https://intuitionlabs.ai/articles/accelerating-drug-development-ai-pharma">intuitionlabs.ai/articles/accelerating-drug-development-ai-pharma</a></p></li><li><p>IntuitionLabs, &#8220;Measuring AI ROI in Drug Discovery: Key Metrics &amp; Outcomes&#8221; (Apr 2026) &#8212; source for Eli Lilly&#8217;s $2.75B Insilico partnership figure referenced in Section 2: <a href="https://intuitionlabs.ai/articles/measuring-ai-roi-drug-discovery">intuitionlabs.ai/articles/measuring-ai-roi-drug-discovery</a></p></li><li><p>DrugPatentWatch, &#8220;AI in Drug Discovery 2026: What Actually Works, What Remains Hype, and Where the IP Value Sits&#8221; &#8212; corroborates the Phase I success-rate figures and notes Phase II/III as the still-pending validation event for AI-discovered candidates, referenced in Section 2: <a href="https://www.drugpatentwatch.com/blog/artificial-intelligence-in-drug-discovery-what-is-realistic-what-are-illusions-part-1-ways-to-make-an-impact-and-why-we-are-not-there-yet/">drugpatentwatch.com</a></p></li><li><p>BioSpace, &#8220;AI Is Changing Pharma&#8217;s Bottom Line Now&#8212;But Not Through Splashy Drug Discovery&#8221; (Feb 2026) &#8212; source for the framing that AI&#8217;s clearest near-term pharma impact runs through development speed rather than discovery itself, referenced in Section 2: <a href="https://www.biospace.com/business/ai-is-changing-pharmas-bottom-line-now-but-not-through-splashy-drug-discovery">biospace.com</a></p></li><li><p>FierceBiotech, &#8220;From R&amp;D to M&amp;A: Big Pharmas showcase &#8216;measurable impact&#8217; of AI&#8221; (May 2026) &#8212; source for AstraZeneca&#8217;s and GSK&#8217;s earnings-call AI commentary referenced in Section 3: <a href="https://www.fiercebiotech.com/biotech/drug-development-ma-big-pharmas-showcase-measurable-impact-ai">fiercebiotech.com</a></p></li><li><p>Gartner, &#8220;Gartner Predicts 40% of Enterprise Apps Will Feature Task-Specific AI Agents by 2026, Up from Less Than 5% in 2025&#8221; (Aug 2025) &#8212; source for the 40%-by-2026 and $450B-by-2035 figures cited in Section 4, and the 40%+ agentic-project-cancellation-by-2027 prediction cited in Section 8: <a href="https://www.gartner.com/en/newsroom/press-releases/2025-08-26-gartner-predicts-40-percent-of-enterprise-apps-will-feature-task-specific-ai-agents-by-2026-up-from-less-than-5-percent-in-2025">gartner.com</a></p></li><li><p>RSM US, &#8220;SaaS vendors must adjust pricing models as agentic AI transforms the industry&#8221; (Mar 2026) &#8212; source for the &#8220;engineering its own revenue decline&#8221; framing cited in Section 4: <a href="https://rsmus.com/insights/industries/technology-companies/saas-vendors-pricing-models-ai.html">rsmus.com</a></p></li><li><p>Omdia, &#8220;New Omdia Analysis Shows Agentic AI Outpacing Growth Rates of Traditional Generative AI&#8221; (Sep 2025) &#8212; source for the $1.5B (2025) to $41.8B (2030) enterprise agentic AI software market forecast and 175% five-year CAGR cited in Section 4: <a href="https://www.businesswire.com/news/home/20250922194426/en/New-Omdia-Analysis-Shows-Agentic-AI-Outpacing-Growth-Rates-of-Traditional-Generative-AI">businesswire.com</a></p></li><li><p>MIT NANDA / Project NANDA, <em>The GenAI Divide: State of AI in Business 2025</em> (Aug 2025) &#8212; source of the ~95% pilot-failure finding and the &#8220;learning gap&#8221; diagnosis cited in Section 5: <a href="https://mlq.ai/media/quarterly_decks/v0.1_State_of_AI_in_Business_2025_Report.pdf">full report PDF</a> &#183; widely covered, incl. <a href="https://finance.yahoo.com/news/mit-report-95-generative-ai-105412686.html">Fortune</a></p></li><li><p>Ellvero Insights, &#8220;AI ROI in 2026: Why 95% of Pilots Fail to Deliver and How to Measure What Actually Matters&#8221; (citing Boston Consulting Group&#8217;s 2026 AI at Scale survey) &#8212; corroborating 2026 follow-up data cited in Section 5: <a href="https://www.ellvero.com/insights/ai-roi-in-2026-why-95-percent-of-pilots-fail-and-how-to-measure-what-matters">ellvero.com</a></p></li><li><p>Narayanan, A. &amp; Kapoor, S., &#8220;Why AI hasn&#8217;t replaced software engineers, and won&#8217;t: Coding agents as normal technology&#8221; (AI as Normal Technology, June 11, 2026) &#8212; source for the &#8220;decide-execute-deliver sandwich&#8221; framework, the Block/Snap/Intuit &#8220;AI washing&#8221; cases, the WARN Act and HBR anticipated-vs-realised figures cited in Sections 1 and 5, and the junior-pipeline discussion in Section 7: <a href="https://www.normaltech.ai/p/why-ai-hasnt-replaced-software-engineers">normaltech.ai/p/why-ai-hasnt-replaced-software-engineers</a></p></li><li><p>Azilen Technologies, &#8220;Agentic AI in Financial Services [2026 Definitive Guide]&#8221; (citing KPMG and McKinsey) &#8212; source for the 2.3x ROI within 13 months and 20&#8211;60% credit-analysis productivity figures cited in Section 5: <a href="https://www.azilen.com/blog/agentic-ai-in-financial-services/">azilen.com/blog/agentic-ai-in-financial-services</a></p></li><li><p>Neurons Lab, &#8220;Agentic AI in Financial Services: A Research Roundup for 2026&#8221; (citing IDC, McKinsey, PwC, Deloitte) &#8212; corroborating source for the same 2.3x ROI figure cited in Section 5: <a href="https://neurons-lab.com/articles/agentic-ai-in-financial-services-2026/">neurons-lab.com/articles/agentic-ai-in-financial-services-2026</a></p></li><li><p><a href="https://interestingengineering.substack.com/p/what-is-any-agentic-architecture">What Is Any Agentic Structure Actually Worth?</a> InterestingEngineering++</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[What Is Any Agentic Architecture Actually Worth, Anyway?]]></title><description><![CDATA[A Pharma Company Case Study Tests Ten Agentic Designs Against the Same Crisis]]></description><link>https://interestingengineering.substack.com/p/what-is-any-agentic-architecture</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/what-is-any-agentic-architecture</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Tue, 16 Jun 2026 16:07:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!OPeg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OPeg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OPeg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png 424w, https://substackcdn.com/image/fetch/$s_!OPeg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png 848w, https://substackcdn.com/image/fetch/$s_!OPeg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png 1272w, https://substackcdn.com/image/fetch/$s_!OPeg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OPeg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png" width="1130" height="626" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:626,&quot;width&quot;:1130,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1173753,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OPeg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png 424w, https://substackcdn.com/image/fetch/$s_!OPeg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png 848w, https://substackcdn.com/image/fetch/$s_!OPeg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png 1272w, https://substackcdn.com/image/fetch/$s_!OPeg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9eb5de89-2b6a-4312-9ff4-1544f4f377cb_1130x626.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!niEr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!niEr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png 424w, https://substackcdn.com/image/fetch/$s_!niEr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png 848w, https://substackcdn.com/image/fetch/$s_!niEr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png 1272w, https://substackcdn.com/image/fetch/$s_!niEr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!niEr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png" width="1117" height="610" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:610,&quot;width&quot;:1117,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1043362,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!niEr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png 424w, https://substackcdn.com/image/fetch/$s_!niEr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png 848w, https://substackcdn.com/image/fetch/$s_!niEr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png 1272w, https://substackcdn.com/image/fetch/$s_!niEr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66bd86e2-228e-4df7-aafc-c1cde2911908_1117x610.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is the first of three pieces that, together, ask one question from three angles: <strong>when an organisation adopts agentic AI, where does the impact actually land, and how big is it really, or how big is it really expected to be?</strong> These questions are critical given much market froth and excitement around &#8220;agentic systems&#8221;. I care less about the numbers than about the debates, direction, and strategies that may be applied.</p><ol><li><p><strong>Title: </strong><em><strong>What the Architecture Is Worth (This piece)</strong></em><br>A $12 Billion Pharma Company Tests (Ten) Agentic Designs Against the Same Crisis<strong>&#8212; readers walk away with</strong>: <strong>how to think around what an architecture choice is worth. You should walk away with more questions than answers, which is the point of the exercise</strong> &#8212; which of various AI designs wins, loses, or fails outright when tested against the same crisis, and the enterprise-value gap between them.</p></li><li><p><strong>Two Economies, One Technology</strong><br>Why the Same Technology Caps Out as a Saving in One Function and Opens Into a Bet in Another<strong>&#8212; readers walk away with</strong>: <strong>a better way to judge whether an AI claim about </strong><em><strong>their</strong></em><strong> industry is plausible </strong>&#8212; why the same technology produces a small, capped saving in one function and a large, uncertain bet in another, and which is which.</p></li><li><p><strong>The Funnel, the Floor, and the Structure</strong><br>"A Value-Chain Framework for Where AI Impact Lands"<strong>&#8212; readers walk away with</strong>: a three-part map for locating any AI effect, in any sector &#8212; so the next AI headline can be sorted into &#8220;bounded saving,&#8221; &#8220;long-shot bet,&#8221; or &#8220;who-gets-paid shift&#8221; before deciding how much it matters.</p></li></ol><p>This piece starts with the narrowest, most concrete version of that question &#8212; inside a hypothetical single $12 billion pharmaceutical company, facing a real supply-chain crisis, what is the <strong>difference in enterprise value between choosing the right AI architecture and choosing a &#8220;plausible-but-wrong one&#8221;? By now, you would have realized that <a href="https://interestingengineering.substack.com/p/the-harness-lab-automated">not all &#8220;agentic designs&#8221; are created equal!</a></strong><a href="https://interestingengineering.substack.com/p/the-harness-lab-automated"> </a>It builds on the foundation of the &#8220;<a href="https://interestingengineering.substack.com/p/the-harness-lab-automated">Harness Lab Series</a>&#8221; (more references found below, but you can start with the link provided and unwind things from there). </p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Given how young the space is, the analysis does its best to hypothesize, but the <strong>real value should be in organizations potentially using this as a helpful guideline (which you can tweak as thoughts arise, and the specifics of your businesses and strategies are ironed out), basically, scenario planning around how the allocation of resources towards Agentic AI, might pan out. What risks to look out for, and how the organization itself might prioritize team or departmental adoption, including setting up a strategy around it.</strong> Use it as you see fit. </p><p>The second piece, <strong>"Two Economies, One Technology," picks up the answer from this case study and asks why it (investments in Agentic AI) might feel smaller than the claims circulating about AI's economic impact &#8212; arguing that the answer depends on which part of a business, and which industry, is doing the talking.</strong> </p><p>The third piece, "<strong>The Funnel, the Floor, and the Structure,</strong>" completes the arc with a <strong>framework for locating any AI-driven effect, in any sector, by where it sits in a value chain &#8212; and shows, in hindsight, that everything measured in this first piece was one specific kind of effect all along</strong>. Read in sequence, the three pieces move from one company's balance sheet to an industry-spanning map; read on its own, this piece is a self-contained case study of what ten different agentic architectures are actually worth when tested against the same crisis.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0hjZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0hjZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png 424w, https://substackcdn.com/image/fetch/$s_!0hjZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png 848w, https://substackcdn.com/image/fetch/$s_!0hjZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png 1272w, https://substackcdn.com/image/fetch/$s_!0hjZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0hjZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png" width="732" height="107" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:107,&quot;width&quot;:732,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9263,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0hjZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png 424w, https://substackcdn.com/image/fetch/$s_!0hjZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png 848w, https://substackcdn.com/image/fetch/$s_!0hjZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png 1272w, https://substackcdn.com/image/fetch/$s_!0hjZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff084b35b-7e71-4770-98a8-9c7993f5dcd6_732x107.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>How to Read This Case Study</h2><p>This case study has two layers, and keeping them separate is the key to following the argument. The first layer is a piece of research: ten AI &#8220;harness&#8221; designs &#8212; numbered H1 through H10 &#8212; were built and tested against the same task, and their quality, speed, and cost were measured. That research is real and is documented in the <a href="https://interestingengineering.substack.com/p/the-harness-lab-automated">Harness Engineering Series</a> articles referenced throughout. The second layer is this case study itself: <em><strong>a fictional pharmaceutical company, PharmaCo, that uses those same ten harness designs and experiences the financial consequences of choosing well or badly between them.</strong></em></p><p>Every harness discussed in this case is referred to only by its number from the original H1&#8211;H10 research &#8220;<a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">The Harness Lab</a>&#8221; and &#8220;<a href="https://interestingengineering.substack.com/p/the-harness-lab-automated">The Harness Lab, Automated</a>&#8221;&#8212; there is no separate alphabetical naming scheme here. Four of the ten matter most to the PharmaCo story:</p><p>&#8226; H2 &#8212; a single AI agent following a structured, step-by-step reasoning script. This is the architecture PharmaCo ultimately recommends for its hardest task: responding to a supply chain disruption.</p><p>&#8226; H3 &#8212; a single AI agent that calls external tools (databases, live status checks) as it works. This is the architecture that fits PharmaCo&#8217;s second task: routine daily compliance monitoring across 40+ markets.</p><p>&#8226; H9 &#8212; a five-agent &#8220;swarm&#8221; with an orchestrator that divides the task among specialists and stitches their outputs together. This is the architecture PharmaCo&#8217;s management initially preferred &#8212; and the one that turns out to cost $19.8M a year more than H2 on the disruption task.</p><p>&#8226; H4 &#8212; three agents working independently on the same task with no shared context, their outputs simply concatenated at the end. In the original research this was the weakest design tested, and it appears here as the cautionary worst case: what happens if an organisation deploys an architecture that looks parallel and thorough but has no mechanism for reconciling contradictions.</p><p>The case study is organised in three movements. Sections 1&#8211;2 set the scene &#8212; who PharmaCo is, what an agentic AI system and a harness actually are, and what the underlying research found across all ten architectures. Sections 3&#8211;9 walk through a single disruption event four times &#8212; once for each of H2, H9, H4, and H3 &#8212; showing what each architecture actually did and what it cost. Sections 10&#8211;13 step back to the financial summary, the shareholder-value calculation, and the governance questions the case raises &#8212; including a question this revision adds: if agentic AI changes the cost side of the ledger this dramatically, what happens to the people side?</p><p>A reminder that the numbers are less important than the depth of discussions and actions that surround scenario planning applied to the implementation of agentic systems. Keep a list of questions that arise, so that these can be applied and tailored to your organizations circumstances. While market hype abounds, think twice before applying. So here it is. Your case study.</p><h2>Executive Summary</h2><p>PharmaCo International &#8212; an assumed fictional $12 billion global pharmaceutical company &#8212; deployed AI agents in two areas of its operations: supply chain disruption response and routine regulatory compliance monitoring. In each area, management had to choose between several possible AI &#8220;harness&#8221; architectures, and that choice turned out to be one of the most consequential capital allocation decisions the company made that year.</p><p>The central finding is counterintuitive. On the disruption response task, the most architecturally elaborate system tested &#8212; H9, a five-agent orchestration swarm &#8212; produced worse financial outcomes than H2, a single agent following an explicit structured reasoning script. <em><strong>The gap between choosing H2 and choosing H9 was est at $19.8 million per year in avoidable cost, or roughly $297 million in enterprise value at PharmaCo&#8217;s earnings multiple. But the difference could actually be wider: ($297M&#8211;$986M EV range). A third option, H4 &#8212; three independent agents with no shared context &#8212; performed worse still: $65.7 million per year worse than H2, and in the original research, H4 was the second-weakest of all ten architectures tested, scoring below the bare-model baseline.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MOvL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MOvL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png 424w, https://substackcdn.com/image/fetch/$s_!MOvL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png 848w, https://substackcdn.com/image/fetch/$s_!MOvL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png 1272w, https://substackcdn.com/image/fetch/$s_!MOvL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MOvL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png" width="1142" height="616" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:616,&quot;width&quot;:1142,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1062623,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MOvL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png 424w, https://substackcdn.com/image/fetch/$s_!MOvL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png 848w, https://substackcdn.com/image/fetch/$s_!MOvL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png 1272w, https://substackcdn.com/image/fetch/$s_!MOvL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda1d642d-c857-4017-a710-4e5ed75d43bb_1142x616.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_j_A!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_j_A!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png 424w, https://substackcdn.com/image/fetch/$s_!_j_A!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png 848w, https://substackcdn.com/image/fetch/$s_!_j_A!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png 1272w, https://substackcdn.com/image/fetch/$s_!_j_A!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_j_A!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png" width="1122" height="615" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:615,&quot;width&quot;:1122,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:960810,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_j_A!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png 424w, https://substackcdn.com/image/fetch/$s_!_j_A!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png 848w, https://substackcdn.com/image/fetch/$s_!_j_A!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png 1272w, https://substackcdn.com/image/fetch/$s_!_j_A!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F525c41b5-6f71-4270-a246-91ab0a788dfe_1122x615.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>On the second task &#8212; routine compliance monitoring across more than 40 markets &#8212; the ranking reverses. H3, a single agent equipped with live tool access to regulatory databases, outperforms H2, because the value in this task comes from retrieving current external data rather than from structured internal reasoning. Replacing three human analysts with H3 saves PharmaCo roughly $2.0 million a year while improving review frequency from fortnightly to daily and accuracy from 85% to 97%.</p><p>Three findings run through the whole case. First, <strong>there is no universally best architecture &#8212; the right choice depends on the task, and the same architecture (H9) that loses badly on the disruption task could not even be meaningfully applied to the compliance task</strong>. Second, <strong>complexity and cost are not the same as qualit</strong>y: H4 was cheaper to build than H2 and far more expensive to run. Third, <strong>a system that scores perfectly on a quality rubric is not automatically safe &#8212; adversarial testing caught a conditional failure mode in H3 that the standard scoring missed entirely</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S-n-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05969918-4b4b-44de-be81-73548989aa27_943x278.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S-n-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05969918-4b4b-44de-be81-73548989aa27_943x278.png 424w, https://substackcdn.com/image/fetch/$s_!S-n-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05969918-4b4b-44de-be81-73548989aa27_943x278.png 848w, https://substackcdn.com/image/fetch/$s_!S-n-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05969918-4b4b-44de-be81-73548989aa27_943x278.png 1272w, https://substackcdn.com/image/fetch/$s_!S-n-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05969918-4b4b-44de-be81-73548989aa27_943x278.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S-n-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05969918-4b4b-44de-be81-73548989aa27_943x278.png" width="943" height="278" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05969918-4b4b-44de-be81-73548989aa27_943x278.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:278,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:36548,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05969918-4b4b-44de-be81-73548989aa27_943x278.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S-n-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05969918-4b4b-44de-be81-73548989aa27_943x278.png 424w, https://substackcdn.com/image/fetch/$s_!S-n-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05969918-4b4b-44de-be81-73548989aa27_943x278.png 848w, https://substackcdn.com/image/fetch/$s_!S-n-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05969918-4b4b-44de-be81-73548989aa27_943x278.png 1272w, https://substackcdn.com/image/fetch/$s_!S-n-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05969918-4b4b-44de-be81-73548989aa27_943x278.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Section 1 &#8212; The Research Foundation: What H1&#8211;H10 Actually Found</h2><p>Before applying any of this to PharmaCo, it is worth looking directly at the research the case study draws on, because the PharmaCo numbers are a financial translation of these results &#8212; not a separate study with its own winners and losers.</p><h3>1.1 The Original Harness Lab Benchmark</h3><p>The <a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">original Harness Lab experiment</a> tested ten harness architectures &#8212; H1 through H10 &#8212; against a single benchmark task: producing a disruption response plan for a pharmaceutical supply chain crisis, scored against a pre-specified gold answer with deliberate traps built in (a regulatory deadline, a carrier-specific routing requirement, and a supplier ordering dependency). Three metrics were recorded for every architecture:</p><p>&#8226; Alpha (&#945;) &#8212; quality, measured as the share of gold-answer criteria the system&#8217;s output satisfied, from 0 to 1.0.</p><p>&#8226; Lambda (&#955;) &#8212; latency, the time in milliseconds the architecture took to produce its output.</p><p>&#8226; Kappa (&#954;) &#8212; total tokens consumed, a proxy for compute cost.</p><p>The full results across all ten architectures:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RXbJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RXbJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png 424w, https://substackcdn.com/image/fetch/$s_!RXbJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png 848w, https://substackcdn.com/image/fetch/$s_!RXbJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png 1272w, https://substackcdn.com/image/fetch/$s_!RXbJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RXbJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png" width="952" height="422" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:422,&quot;width&quot;:952,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:60505,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RXbJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png 424w, https://substackcdn.com/image/fetch/$s_!RXbJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png 848w, https://substackcdn.com/image/fetch/$s_!RXbJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png 1272w, https://substackcdn.com/image/fetch/$s_!RXbJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fcd3317-0081-4933-8837-c0bb05476ad5_952x422.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">The Harness Lab</a></figcaption></figure></div><h4>Reading the table</h4><p>H2 &#8212; a single agent following a structured reasoning script &#8212; was both the highest-quality result (&#945; = 0.920) and among the cheapest and fastest. It dominates on all three axes. H5 (an agent that drafts, evaluates its own output against criteria, and loops until done) and H7 (an agent that routes sub-tasks to different models) both reached &#945; = 0.840 &#8212; strong results, achieved through iteration and tool/model variety rather than a fixed script. H6 (skill memory) reached 0.900, close behind H2. H3 (sequential tool calls) only reached 0.600 on this particular benchmark &#8212; useful tools, used in sequence, but without the structured reasoning scaffold that kept H2 on track through the deliberate traps.</p><p>H4 (three agents running in parallel with no shared context, outputs simply concatenated) scored 0.440 &#8212; below the bare, no-harness baseline (H1, 0.665). This is the headline result that justifies H4&#8217;s role later in this case study as the cautionary worst case: it is not merely &#8220;not the best,&#8221; it is the design that an organisation would have been better off not building at all. H10 (a meta-harness that spawns and manages other harnesses) effectively broke down on this task (&#945; = 0.230, no usable latency figure) and is not carried forward into the PharmaCo narrative.</p><h3>1.2 The Automated Follow-Up: Workflow Patterns and the Sigma Metric</h3><p>A second piece of research &#8212; &#8216;The Harness Lab, Automated&#8217; &#8212; took a different approach. Rather than hand-building each harness, it used Claude Code&#8217;s dynamic workflow patterns (fan-out-and-synthesize, loop-until-done, tournament, adversarial verification, generate-and-filter, classify-and-act) to generate, test, and tournament harness designs algorithmically. This second study was deliberately run on a simplified benchmark &#8212; the same task family, but with no deliberate traps and a small set of binary evaluation criteria &#8212; to test whether the automated workflow tooling itself worked correctly, not to re-run the original hard benchmark.</p><p>On that simplified benchmark, the designs that performed best were the tool-using and iterative-loop styles &#8212; directly comparable to H3 (sequential tools) and H5 (eval loop) from the original table. This makes sense: on an easy task with no traps, broad tool coverage and iterative self-checking win, because there is nothing precision-critical for them to miss. The structured-script approach of H2 carries scaffolding overhead that pays off on hard, trap-laden tasks but is unnecessary on simple ones.</p><p>This second study also introduced the Sigma (&#931;) metric &#8212; quality divided by a complexity penalty (&#931; = &#945; &#247; (1 + &#916;), where &#916; captures structural complexity such as agent count and coordination overhead). Sigma explains why H2 still leads even against architectures with comparable raw quality: H9&#8217;s &#945; = 0.815 comes with substantially more coordination machinery than H2&#8217;s &#945; = 0.920, so on a quality-per-complexity basis the gap between them is far larger than the raw &#945; numbers alone suggest.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TMGE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30b79232-ee54-4d79-b277-84b496f49aca_942x245.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TMGE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30b79232-ee54-4d79-b277-84b496f49aca_942x245.png 424w, https://substackcdn.com/image/fetch/$s_!TMGE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30b79232-ee54-4d79-b277-84b496f49aca_942x245.png 848w, https://substackcdn.com/image/fetch/$s_!TMGE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30b79232-ee54-4d79-b277-84b496f49aca_942x245.png 1272w, https://substackcdn.com/image/fetch/$s_!TMGE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30b79232-ee54-4d79-b277-84b496f49aca_942x245.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TMGE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30b79232-ee54-4d79-b277-84b496f49aca_942x245.png" width="942" height="245" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/30b79232-ee54-4d79-b277-84b496f49aca_942x245.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:245,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58479,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30b79232-ee54-4d79-b277-84b496f49aca_942x245.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TMGE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30b79232-ee54-4d79-b277-84b496f49aca_942x245.png 424w, https://substackcdn.com/image/fetch/$s_!TMGE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30b79232-ee54-4d79-b277-84b496f49aca_942x245.png 848w, https://substackcdn.com/image/fetch/$s_!TMGE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30b79232-ee54-4d79-b277-84b496f49aca_942x245.png 1272w, https://substackcdn.com/image/fetch/$s_!TMGE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30b79232-ee54-4d79-b277-84b496f49aca_942x245.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Section 2 &#8212; About PharmaCo International</h2><h3>2.1 Company Profile</h3><p>PharmaCo International is a fictional composite of a major global pharmaceutical manufacturer. Its financials and operations are modelled on industry averages for a large-cap pharmaceutical company as of 2026. All figures are illustrative.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g-9K!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g-9K!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png 424w, https://substackcdn.com/image/fetch/$s_!g-9K!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png 848w, https://substackcdn.com/image/fetch/$s_!g-9K!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png 1272w, https://substackcdn.com/image/fetch/$s_!g-9K!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g-9K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png" width="947" height="328" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:328,&quot;width&quot;:947,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42609,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g-9K!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png 424w, https://substackcdn.com/image/fetch/$s_!g-9K!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png 848w, https://substackcdn.com/image/fetch/$s_!g-9K!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png 1272w, https://substackcdn.com/image/fetch/$s_!g-9K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f41c0f8-e7a3-4858-b2a0-447d0b88ed0d_947x328.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>2.2 How a Pharmaceutical Supply Chain Works</h3><p>For readers unfamiliar with the industry: medicines move through a multi-tier chain before they reach a patient.</p><p>&#8226; Tier 3 suppliers provide raw chemical feedstocks (petroleum derivatives, minerals, plant extracts).</p><p>&#8226; Tier 2 suppliers turn those into packaging materials and excipients &#8212; the inactive ingredients that give a tablet its shape, taste, and colour.</p><p>&#8226; Tier 1 suppliers produce the Active Pharmaceutical Ingredient (API) &#8212; the molecule that actually works.</p><p>&#8226; PharmaCo&#8217;s 15 manufacturing sites combine API and excipients into finished product (tablets, vials, inhalers), which then passes through quality control &#8212; every batch tested and regulatory approval required per market &#8212; before moving to regional distribution (with cold-chain logistics at 2&#8211;8&#176;C for some products) and on to hospitals, clinics, and retail pharmacies.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gQde!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc898e898-5914-42e0-bfbc-1449b932ee61_942x151.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gQde!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc898e898-5914-42e0-bfbc-1449b932ee61_942x151.png 424w, https://substackcdn.com/image/fetch/$s_!gQde!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc898e898-5914-42e0-bfbc-1449b932ee61_942x151.png 848w, https://substackcdn.com/image/fetch/$s_!gQde!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc898e898-5914-42e0-bfbc-1449b932ee61_942x151.png 1272w, https://substackcdn.com/image/fetch/$s_!gQde!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc898e898-5914-42e0-bfbc-1449b932ee61_942x151.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gQde!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc898e898-5914-42e0-bfbc-1449b932ee61_942x151.png" width="942" height="151" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c898e898-5914-42e0-bfbc-1449b932ee61_942x151.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:151,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30584,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc898e898-5914-42e0-bfbc-1449b932ee61_942x151.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gQde!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc898e898-5914-42e0-bfbc-1449b932ee61_942x151.png 424w, https://substackcdn.com/image/fetch/$s_!gQde!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc898e898-5914-42e0-bfbc-1449b932ee61_942x151.png 848w, https://substackcdn.com/image/fetch/$s_!gQde!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc898e898-5914-42e0-bfbc-1449b932ee61_942x151.png 1272w, https://substackcdn.com/image/fetch/$s_!gQde!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc898e898-5914-42e0-bfbc-1449b932ee61_942x151.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>Why it matters financially</strong></p><p>Pharmaceutical manufacturers maintain 30&#8211;60 days of buffer stock as standard practice. This sounds like comfortable protection, but it is not: once that buffer depletes, revenue stops, patients switch to alternative brands, and relationships with hospitals and pharmacies erode &#8212; some permanently. Every day of unmitigated disruption has a measurable revenue cost.</p><h3>2.3 The Three Products at the Centre of This Case</h3><p>PharmaCo&#8217;s Hormuz Strait supplier disruption affected three medications. These are real drug categories &#8212; the financial figures are modelled on typical pricing and margins for each category.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q5Ea!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q5Ea!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png 424w, https://substackcdn.com/image/fetch/$s_!Q5Ea!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png 848w, https://substackcdn.com/image/fetch/$s_!Q5Ea!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png 1272w, https://substackcdn.com/image/fetch/$s_!Q5Ea!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q5Ea!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png" width="956" height="375" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:375,&quot;width&quot;:956,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:65380,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q5Ea!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png 424w, https://substackcdn.com/image/fetch/$s_!Q5Ea!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png 848w, https://substackcdn.com/image/fetch/$s_!Q5Ea!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png 1272w, https://substackcdn.com/image/fetch/$s_!Q5Ea!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bc2ab99-8a0e-4b7d-b666-ac9ff27a5313_956x375.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Section 3 &#8212; What Is an Agentic AI System, and What Is a Harness?</h2><h3>3.1 From Chatbot to Agent</h3><p>Most people&#8217;s experience of AI is a chatbot: you ask a question, it gives an answer. An agentic AI system is fundamentally different. It does not just respond to questions &#8212; it takes sequences of actions toward a goal, uses tools, coordinates with other agents, evaluates its own output, and continues until a task is complete.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OJFC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OJFC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png 424w, https://substackcdn.com/image/fetch/$s_!OJFC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png 848w, https://substackcdn.com/image/fetch/$s_!OJFC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png 1272w, https://substackcdn.com/image/fetch/$s_!OJFC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OJFC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png" width="951" height="102" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:102,&quot;width&quot;:951,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21353,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OJFC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png 424w, https://substackcdn.com/image/fetch/$s_!OJFC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png 848w, https://substackcdn.com/image/fetch/$s_!OJFC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png 1272w, https://substackcdn.com/image/fetch/$s_!OJFC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c48e32-30ba-4883-a8c3-310fe12910c4_951x102.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>A chatbot asked &#8216;which supplier can cover our Lisinopril shortage?&#8217; gives a single response &#8212; &#8216;consider suppliers X, Y, or Z&#8217; &#8212; and the human must act on it, verify it, and coordinate the next step. An agent given &#8216;handle the Hormuz disruption for Lisinopril&#8217; reads the supply data, queries alternative supplier databases, checks regulatory rules, drafts a procurement plan, scores that plan against criteria, revises if incomplete, and files the regulatory notice &#8212; arriving at a completed task for a human to review and approve.</p><h3>3.2 What a Harness Is</h3><p>A harness is the structural design of how an AI agent &#8212; or multiple agents &#8212; is organised to complete a task. Think of it as the organisational chart and workflow of the AI system. Two agents given the same task but structured differently will produce different results, at different speeds, with different accuracy, and at different cost.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mzKh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mzKh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png 424w, https://substackcdn.com/image/fetch/$s_!mzKh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png 848w, https://substackcdn.com/image/fetch/$s_!mzKh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png 1272w, https://substackcdn.com/image/fetch/$s_!mzKh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mzKh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png" width="951" height="102" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:102,&quot;width&quot;:951,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21353,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mzKh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png 424w, https://substackcdn.com/image/fetch/$s_!mzKh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png 848w, https://substackcdn.com/image/fetch/$s_!mzKh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png 1272w, https://substackcdn.com/image/fetch/$s_!mzKh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dcb8ea-f6cf-4ea7-884b-40e58a4eeb03_951x102.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The harness choice is a capital allocation decision. A simple, well-designed harness costs less to build, less to run, and &#8212; on the right task &#8212; produces better results than a complex one. A complex harness has higher coordination overhead, more opportunities for errors to compound, and higher compute costs. Section 1 showed the research behind this: ten harness architectures tested against the same benchmark task, with measured financial consequences for each choice.</p><h2>3.3 The Architectures in This Case</h2><p>PharmaCo evaluated four of the ten H1&#8211;H10 architectures for its two deployment areas. Each represents a different structural approach to the same family of problems.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZSDq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZSDq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png 424w, https://substackcdn.com/image/fetch/$s_!ZSDq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png 848w, https://substackcdn.com/image/fetch/$s_!ZSDq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png 1272w, https://substackcdn.com/image/fetch/$s_!ZSDq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZSDq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png" width="960" height="347" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:347,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:57331,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZSDq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png 424w, https://substackcdn.com/image/fetch/$s_!ZSDq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png 848w, https://substackcdn.com/image/fetch/$s_!ZSDq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png 1272w, https://substackcdn.com/image/fetch/$s_!ZSDq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b23c30a-f96b-441f-8c80-82ab1ac86a5e_960x347.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Section 4 &#8212; The Strategic Decision</h2><h3>4.1 Where in the Value Chain PharmaCo Chose to Deploy</h3><p>PharmaCo did not deploy AI agents everywhere simultaneously. It identified three specific areas of its value chain where the investment case was clearest: supply chain disruption response, routine regulatory compliance monitoring, and inventory replenishment decisions. This case study focuses on the first two.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bBYx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4665176d-ce15-462b-9e24-71586d32b82c_945x197.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bBYx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4665176d-ce15-462b-9e24-71586d32b82c_945x197.png 424w, https://substackcdn.com/image/fetch/$s_!bBYx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4665176d-ce15-462b-9e24-71586d32b82c_945x197.png 848w, https://substackcdn.com/image/fetch/$s_!bBYx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4665176d-ce15-462b-9e24-71586d32b82c_945x197.png 1272w, https://substackcdn.com/image/fetch/$s_!bBYx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4665176d-ce15-462b-9e24-71586d32b82c_945x197.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bBYx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4665176d-ce15-462b-9e24-71586d32b82c_945x197.png" width="945" height="197" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4665176d-ce15-462b-9e24-71586d32b82c_945x197.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:197,&quot;width&quot;:945,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38690,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4665176d-ce15-462b-9e24-71586d32b82c_945x197.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bBYx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4665176d-ce15-462b-9e24-71586d32b82c_945x197.png 424w, https://substackcdn.com/image/fetch/$s_!bBYx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4665176d-ce15-462b-9e24-71586d32b82c_945x197.png 848w, https://substackcdn.com/image/fetch/$s_!bBYx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4665176d-ce15-462b-9e24-71586d32b82c_945x197.png 1272w, https://substackcdn.com/image/fetch/$s_!bBYx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4665176d-ce15-462b-9e24-71586d32b82c_945x197.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>4.2 Why Architecture Choice Has Financial Consequences!</h3><p>When most organisations adopt AI, they ask: will it work? The more important question is: <em><strong>which version will work best, and what does the difference cost? The answer, as this case study demonstrates, runs to hundreds of millions of dollars in enterprise value.</strong></em></p><p>Four financial levers connect harness architecture to P&amp;L performance:</p><p>&#8226; Lever 1 &#8212; Response speed &#8594; revenue at risk. The faster a usable response plan is ready, the fewer days of unmitigated disruption, and the less revenue is lost to patient switching.</p><p>&#8226; Lever 2 &#8212; Decision precision &#8594; emergency procurement cost. The quality of supplier identification determines the premium paid over normal procurement costs.</p><p>&#8226; Lever 3 &#8212; Regulatory compliance &#8594; fine avoidance. Missing a notification deadline triggers fines; a precise, time-stamped action item avoids them.</p><p>&#8226; Lever 4 &#8212; Inventory efficiency &#8594; write-off avoidance. Accurate volume and formulation modelling avoids over-purchasing and wrong-SKU write-offs.</p><h4>Lever 2 in detail: how the procurement premium calculation works</h4><p>Step 1 &#8212; Define the emergency procurement volume. A 30-day disruption; buffer stock covers the first 15 days. Days 16&#8211;30 must be sourced from alternatives: 15 days of COGS. Monthly COGS = $25.5M &#8594; 15-day COGS = $12.75M. Emergency procurement needed: $12.75M base volume.</p><p>Step 2 &#8212; Apply a premium rate that depends on how precisely the response plan identifies alternatives:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bGxe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bGxe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png 424w, https://substackcdn.com/image/fetch/$s_!bGxe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png 848w, https://substackcdn.com/image/fetch/$s_!bGxe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png 1272w, https://substackcdn.com/image/fetch/$s_!bGxe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bGxe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png" width="943" height="267" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:267,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:43167,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bGxe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png 424w, https://substackcdn.com/image/fetch/$s_!bGxe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png 848w, https://substackcdn.com/image/fetch/$s_!bGxe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png 1272w, https://substackcdn.com/image/fetch/$s_!bGxe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff960dbfe-4330-485d-b32b-dc1fa18af0ff_943x267.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Lever 3 in detail: the regulatory backdrop</h4><p>Pharmaceutical companies operating in regulated markets are legally required to notify regulators of significant supply disruptions. In the United States, FDA regulations under 21 CFR 314.81 require manufacturers to report potential drug shortages &#8216;as soon as practicable,&#8217; with documented FDA practice establishing 48&#8211;72 hours as the expected window for initial notification. European markets have an equivalent EMA requirement under Regulation (EU) 2022/123. Late or absent notifications have resulted in fines ranging from $250,000 to $2.5M per violation in documented FDA enforcement actions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!baOr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!baOr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png 424w, https://substackcdn.com/image/fetch/$s_!baOr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png 848w, https://substackcdn.com/image/fetch/$s_!baOr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png 1272w, https://substackcdn.com/image/fetch/$s_!baOr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!baOr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png" width="943" height="267" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:267,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:43167,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!baOr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png 424w, https://substackcdn.com/image/fetch/$s_!baOr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png 848w, https://substackcdn.com/image/fetch/$s_!baOr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png 1272w, https://substackcdn.com/image/fetch/$s_!baOr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f032a15-cfba-41c5-be67-1eace79398fa_943x267.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Lever 4 in detail: inventory write-offs</h4><p>Emergency procurement during a supply crisis involves purchasing expensive alternative stock under uncertainty. If the response plan over-estimates the shortfall, PharmaCo purchases more than it needs &#8212; and that excess, sourced at premium prices with shorter shelf-life from alternative suppliers, may need to be written off. If the plan under-estimates, the disruption extends.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pAro!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pAro!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png 424w, https://substackcdn.com/image/fetch/$s_!pAro!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png 848w, https://substackcdn.com/image/fetch/$s_!pAro!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png 1272w, https://substackcdn.com/image/fetch/$s_!pAro!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pAro!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png" width="947" height="263" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:263,&quot;width&quot;:947,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42665,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pAro!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png 424w, https://substackcdn.com/image/fetch/$s_!pAro!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png 848w, https://substackcdn.com/image/fetch/$s_!pAro!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png 1272w, https://substackcdn.com/image/fetch/$s_!pAro!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F168f8c69-2d16-4898-a3b0-d86919fb343c_947x263.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Section 5 &#8212; The Disruption Event: Step by Step</h2><h3>5.1 What Happened on Day Zero</h3><p>At 06:14 on a Tuesday morning, PharmaCo&#8217;s Global Supply Intelligence system received notification from a Tier-1 API supplier in the Hormuz Strait region: production was suspended effective immediately, with an expected outage of 30 days. Three of PharmaCo&#8217;s highest-volume medications were affected.</p><p>The notification triggered a 72-hour countdown to the point where PharmaCo&#8217;s buffer stock would begin to deplete for Salbutamol &#8212; the most time-critical of the three medications because it is used in emergency respiratory situations, hospitals maintain minimal buffer stock for it, and patients cannot substitute easily. The clinical and financial consequences of a Salbutamol shortage are more severe than the other two products.</p><h3>What the response plan must include</h3><p>&#8226; Patient impact assessment per medication</p><p>&#8226; Alternative supplier identification</p><p>&#8226; Regulatory notification within the 48&#8211;72 hour window</p><p>&#8226; Emergency procurement authorisation</p><p>&#8226; A 24-hour escalation checkpoint</p><h4>The timeline that matters</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_kZb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_kZb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png 424w, https://substackcdn.com/image/fetch/$s_!_kZb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png 848w, https://substackcdn.com/image/fetch/$s_!_kZb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png 1272w, https://substackcdn.com/image/fetch/$s_!_kZb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_kZb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png" width="946" height="402" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:402,&quot;width&quot;:946,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:60623,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_kZb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png 424w, https://substackcdn.com/image/fetch/$s_!_kZb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png 848w, https://substackcdn.com/image/fetch/$s_!_kZb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png 1272w, https://substackcdn.com/image/fetch/$s_!_kZb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe63f2874-0c1d-4b38-b8fb-ef9839dc3f6c_946x402.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>5.2 The Baseline: What This Costs With No AI At All</h3><p>Before comparing H2, H9, and H4, it is worth being explicit about something the rest of this case study takes for granted: this disruption happens regardless of what software PharmaCo runs. A Tier-1 API supplier in the Hormuz Strait region does not suspend production because of, or in spite of, PharmaCo&#8217;s choice of AI architecture. The supply shock is exogenous. The only thing that changes across H2, H9, and H4 is how PharmaCo responds to a crisis that would occur in any case.</p><p>Section 10.2 quantifies what this disruption costs PharmaCo with a purely manual, human-led response &#8212; no AI involved at any stage. Across the four disruption events PharmaCo experiences per year, that baseline comes to approximately $45.44M in annual exposure. This is the number that exists whether or not PharmaCo has ever heard of agentic AI.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!woup!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!woup!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png 424w, https://substackcdn.com/image/fetch/$s_!woup!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png 848w, https://substackcdn.com/image/fetch/$s_!woup!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png 1272w, https://substackcdn.com/image/fetch/$s_!woup!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!woup!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png" width="942" height="211" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:211,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:57635,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!woup!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png 424w, https://substackcdn.com/image/fetch/$s_!woup!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png 848w, https://substackcdn.com/image/fetch/$s_!woup!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png 1272w, https://substackcdn.com/image/fetch/$s_!woup!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2134f05f-f601-4209-819b-af0f566ddf7d_942x211.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>Section 6 &#8212; H2: The Structured Single Agent (Recommended Outcome)</h2><h3>6.1 What H2 Did</h3><p>H2 is a single AI agent with no sub-agents and no external tools. Its capability comes entirely from its internal structure: the prompt that instructs it includes an explicit task restatement at the top, numbered reasoning steps the agent must follow in sequence, constraint reminders embedded at each step, and a specified output format. This structure prevents the agent from taking shortcuts, missing constraints, or producing vague generalisations.</p><p>On receiving the disruption notification, H2 processed the supply data, identified patient impact by medication and by market, retrieved qualified alternative supplier information from PharmaCo&#8217;s supplier database, modelled emergency procurement volumes, identified regulatory notification obligations with the specific 48-hour window, and produced a draft CFO-level response brief &#8212; all within six hours.</p><h3>H2&#8217;s internal structure</h3><p>&#8226; Task restatement: &#8216;The task is to produce a 72-hour disruption response plan for the Hormuz API supplier suspension&#8230;&#8217;</p><p>&#8226; Step A &#8212; identify affected medications and risk tier</p><p>&#8226; Step B &#8212; assess patient impact per medication (constraint check: do not recommend suspension)</p><p>&#8226; Step C &#8212; identify alternative suppliers, Tier-1 only</p><p>&#8226; Step D &#8212; model procurement volumes (constraint check: regulatory notification under 48 hours)</p><p>&#8226; Step E &#8212; draft response brief in the specified format: executive summary, actions by priority, timeline, approvals required, regulatory filing reference</p><p><em>Result: response brief ready in 6 hours (0.25 days).</em></p><h3>6.2 The Financial Outcome &#8212; H2</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oVDb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oVDb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png 424w, https://substackcdn.com/image/fetch/$s_!oVDb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png 848w, https://substackcdn.com/image/fetch/$s_!oVDb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png 1272w, https://substackcdn.com/image/fetch/$s_!oVDb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oVDb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png" width="942" height="412" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:412,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:55349,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oVDb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png 424w, https://substackcdn.com/image/fetch/$s_!oVDb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png 848w, https://substackcdn.com/image/fetch/$s_!oVDb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png 1272w, https://substackcdn.com/image/fetch/$s_!oVDb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba59b42-0bb6-484c-9982-8e2c7cf9300a_942x412.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Section 7 &#8212; H9: The Orchestration Swarm (Over-Engineered: $19.8M Annual Overspend)</h2><h3>7.1 What H9 Did &#8212; and Why It Cost More</h3><p>H9 was PharmaCo management&#8217;s first instinct. A complex disruption requires complex analysis, the reasoning went &#8212; and five specialist agents working in parallel should produce a more thorough result than one. This intuition is reasonable. It is also wrong for this specific task.</p><p>H9&#8217;s five agents ran in parallel: one assessing patient impact, one modelling supply alternatives, one handling regulatory requirements, one drafting communications, and one coordinating logistics. An orchestrator agent then synthesised all five outputs. The structural problem: synthesis across five independent agents introduces inconsistency. The regulatory agent recommended notifying one authority; the logistics agent&#8217;s timeline assumed a different notification window. The orchestrator produced a coherent document, but the coherence required averaging out the inconsistencies &#8212; and in doing so, lost the specific regulatory deadline that H2 preserved.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wU1f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wU1f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png 424w, https://substackcdn.com/image/fetch/$s_!wU1f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png 848w, https://substackcdn.com/image/fetch/$s_!wU1f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png 1272w, https://substackcdn.com/image/fetch/$s_!wU1f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wU1f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png" width="947" height="198" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:198,&quot;width&quot;:947,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:44067,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wU1f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png 424w, https://substackcdn.com/image/fetch/$s_!wU1f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png 848w, https://substackcdn.com/image/fetch/$s_!wU1f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png 1272w, https://substackcdn.com/image/fetch/$s_!wU1f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781fbc6-2fa1-4aee-8af8-00651c0c2035_947x198.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>7.2 The Financial Outcome &#8212; H9</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6CGZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6CGZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png 424w, https://substackcdn.com/image/fetch/$s_!6CGZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png 848w, https://substackcdn.com/image/fetch/$s_!6CGZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png 1272w, https://substackcdn.com/image/fetch/$s_!6CGZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6CGZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png" width="947" height="476" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:476,&quot;width&quot;:947,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:71561,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6CGZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png 424w, https://substackcdn.com/image/fetch/$s_!6CGZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png 848w, https://substackcdn.com/image/fetch/$s_!6CGZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png 1272w, https://substackcdn.com/image/fetch/$s_!6CGZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4847835d-7b0b-41a8-84df-9030d25fdd36_947x476.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Section 8 &#8212; H4: Independent Parallel Agents (Catastrophic: $65.7M Annual Overspend)</h2><h3>8.1 What H4 Did &#8212; and How It Failed</h3><p>H4 appeared technically sophisticated: three agents running in parallel, each processing the same disruption data independently, with outputs combined at the end. The design logic was that three independent perspectives would catch more than one. What it missed is that independent agents, with no shared state and no reconciliation mechanism, will produce contradictory outputs &#8212; and concatenating contradictions produces incoherence, not comprehensiveness.</p><p>This is consistent with H4&#8217;s standing in the original research: of the ten architectures tested in Section 1, H4 was the second-weakest, scoring 0.440 &#8212; below the no-harness baseline (H1, 0.665). H4 is included in this case study precisely because it represents what happens when an architecture that looks like &#8216;more coverage&#8217; on paper is deployed on a task where consistency across outputs is essential.</p><p>In practice: Agent A recommended temporarily suspending Salbutamol to protect Metformin supply &#8212; directly violating the task constraint. Agent B recommended three different suppliers than Agent C, with conflicting pricing estimates. Agent C&#8217;s regulatory section referenced a different notification authority than Agent A. The combined output required human intervention to identify and resolve all three contradictions before any action could be taken. That intervention took four days.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YnIU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YnIU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png 424w, https://substackcdn.com/image/fetch/$s_!YnIU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png 848w, https://substackcdn.com/image/fetch/$s_!YnIU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png 1272w, https://substackcdn.com/image/fetch/$s_!YnIU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YnIU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png" width="941" height="262" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:941,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38746,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YnIU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png 424w, https://substackcdn.com/image/fetch/$s_!YnIU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png 848w, https://substackcdn.com/image/fetch/$s_!YnIU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png 1272w, https://substackcdn.com/image/fetch/$s_!YnIU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb091229a-5bea-4959-8b4b-15b0104d6e3b_941x262.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>8.2 The Financial Outcome &#8212; H4</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2XO8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2XO8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png 424w, https://substackcdn.com/image/fetch/$s_!2XO8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png 848w, https://substackcdn.com/image/fetch/$s_!2XO8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png 1272w, https://substackcdn.com/image/fetch/$s_!2XO8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2XO8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png" width="946" height="545" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:545,&quot;width&quot;:946,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:79556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2XO8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png 424w, https://substackcdn.com/image/fetch/$s_!2XO8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png 848w, https://substackcdn.com/image/fetch/$s_!2XO8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png 1272w, https://substackcdn.com/image/fetch/$s_!2XO8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37b90014-79d4-45eb-9dce-da1c20c84fae_946x545.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vH7z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vH7z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png 424w, https://substackcdn.com/image/fetch/$s_!vH7z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png 848w, https://substackcdn.com/image/fetch/$s_!vH7z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png 1272w, https://substackcdn.com/image/fetch/$s_!vH7z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vH7z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png" width="942" height="217" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:217,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:48676,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vH7z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png 424w, https://substackcdn.com/image/fetch/$s_!vH7z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png 848w, https://substackcdn.com/image/fetch/$s_!vH7z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png 1272w, https://substackcdn.com/image/fetch/$s_!vH7z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abcb3ea-8f57-4e3f-b866-dc48ceb5494a_942x217.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>Section 9 &#8212; H3: Tool-Augmented Single Agent for Routine Compliance Monitoring</h2><h3>9.1 A Different Task Requires a Different Architecture</h3><p>PharmaCo&#8217;s <strong>second deployment area was routine regulatory compliance monitoring: daily checking of the company&#8217;s compliance status across 40+ markets. This is a fundamentally different task from disruption response. It is repetitive, well-defined, and has clear binary outcomes &#8212; a market is either compliant or it is not.</strong> There are no deliberate traps, no judgment calls about constraint violations, no crisis decision-making under pressure.</p><p>This distinction matters enormously for architecture selection. Section 1&#8217;s research found that the architecture which wins on a hard, trap-laden task (H2) does not win on a simple, coverage-based task. H3 &#8212; a tool-augmented single agent that queries regulatory databases directly &#8212; outperforms H2 here, because it can retrieve live regulatory status information that H2&#8217;s internal reasoning scaffold has no way to access. This is the same pattern the automated follow-up study found when tool-using and iterative-loop designs (H3- and H5-style) won on the simplified, no-traps benchmark (Section 1.2).</p><h3>H3&#8217;s structure</h3><p>&#8226; Receive the daily compliance check list (40+ markets)</p><p>&#8226; For each market: query the regulatory database tool, query the local market status API, cross-check against PharmaCo&#8217;s obligations &#8212; each step a live tool call</p><p>&#8226; Flag any compliance gaps</p><p>&#8226; Generate a daily compliance report</p><p><em>Result: covers all 40+ markets daily at 97% accuracy. Previously: human review every two weeks at 85% accuracy.</em></p><h3>9.2 The Financial Case for H3</h3><p>The financial case for H3 is different from the disruption scenarios: it is primarily a cost-reduction story rather than a loss-avoidance one. Three previous compliance analysts performing manual monitoring are replaced by an automated system. Analytical quality improves, coverage improves, and cost decreases substantially.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eOvb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eOvb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png 424w, https://substackcdn.com/image/fetch/$s_!eOvb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png 848w, https://substackcdn.com/image/fetch/$s_!eOvb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png 1272w, https://substackcdn.com/image/fetch/$s_!eOvb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eOvb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png" width="945" height="473" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:473,&quot;width&quot;:945,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:72969,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eOvb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png 424w, https://substackcdn.com/image/fetch/$s_!eOvb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png 848w, https://substackcdn.com/image/fetch/$s_!eOvb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png 1272w, https://substackcdn.com/image/fetch/$s_!eOvb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6abce2f-4308-41a2-a520-f6a1c3fc6fd4_945x473.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qpem!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4234dd0-54c4-4676-b201-ee66725c9543_945x192.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qpem!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4234dd0-54c4-4676-b201-ee66725c9543_945x192.png 424w, https://substackcdn.com/image/fetch/$s_!Qpem!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4234dd0-54c4-4676-b201-ee66725c9543_945x192.png 848w, https://substackcdn.com/image/fetch/$s_!Qpem!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4234dd0-54c4-4676-b201-ee66725c9543_945x192.png 1272w, https://substackcdn.com/image/fetch/$s_!Qpem!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4234dd0-54c4-4676-b201-ee66725c9543_945x192.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qpem!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4234dd0-54c4-4676-b201-ee66725c9543_945x192.png" width="945" height="192" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4234dd0-54c4-4676-b201-ee66725c9543_945x192.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:192,&quot;width&quot;:945,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:35114,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4234dd0-54c4-4676-b201-ee66725c9543_945x192.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qpem!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4234dd0-54c4-4676-b201-ee66725c9543_945x192.png 424w, https://substackcdn.com/image/fetch/$s_!Qpem!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4234dd0-54c4-4676-b201-ee66725c9543_945x192.png 848w, https://substackcdn.com/image/fetch/$s_!Qpem!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4234dd0-54c4-4676-b201-ee66725c9543_945x192.png 1272w, https://substackcdn.com/image/fetch/$s_!Qpem!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4234dd0-54c4-4676-b201-ee66725c9543_945x192.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>Section 10 &#8212; The Complete Financial Comparison</h2><h3>10.1 Per-Event Impact Summary</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DkmL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DkmL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png 424w, https://substackcdn.com/image/fetch/$s_!DkmL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png 848w, https://substackcdn.com/image/fetch/$s_!DkmL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png 1272w, https://substackcdn.com/image/fetch/$s_!DkmL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DkmL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png" width="943" height="321" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:321,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40854,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DkmL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png 424w, https://substackcdn.com/image/fetch/$s_!DkmL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png 848w, https://substackcdn.com/image/fetch/$s_!DkmL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png 1272w, https://substackcdn.com/image/fetch/$s_!DkmL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff10a4715-9d22-4b1a-896e-4c5b765c4f00_943x321.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>10.2 Annual Impact (4 Disruption Events per Year)</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g_Zd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g_Zd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png 424w, https://substackcdn.com/image/fetch/$s_!g_Zd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png 848w, https://substackcdn.com/image/fetch/$s_!g_Zd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png 1272w, https://substackcdn.com/image/fetch/$s_!g_Zd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g_Zd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png" width="943" height="321" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:321,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40854,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g_Zd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png 424w, https://substackcdn.com/image/fetch/$s_!g_Zd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png 848w, https://substackcdn.com/image/fetch/$s_!g_Zd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png 1272w, https://substackcdn.com/image/fetch/$s_!g_Zd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8f3663-b41d-4ba3-bc70-8a38fc33e390_943x321.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>* H4 additional cost includes $0.5M per coherence-failure event &#215; 4 events = $2.0M in human override costs annually. Both bottom rows describe the same underlying gap from two directions: H2 saves $20.41M/yr relative to the human baseline, which is the same as saying the human baseline costs $20.41M/yr more than H2.</em></p><h3>10.3 Income Statement Impact (Annual, Affected Products Only)</h3><p>The following shows how each scenario affects PharmaCo&#8217;s income statement for the three affected product lines, using the annualised figures from 10.1 and 10.2 (per-event impacts &#215; 4 events/year). The &#8216;No disruption&#8217; column is a reference point only &#8212; per Section 5.2, PharmaCo cannot choose a world in which the Hormuz disruption does not occur. The real choice is between the Human Baseline column and the H2/H9/H4 columns, all four of which assume the disruption happens.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6Kfj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6Kfj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png 424w, https://substackcdn.com/image/fetch/$s_!6Kfj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png 848w, https://substackcdn.com/image/fetch/$s_!6Kfj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png 1272w, https://substackcdn.com/image/fetch/$s_!6Kfj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6Kfj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png" width="868" height="638" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:638,&quot;width&quot;:868,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:65644,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6Kfj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png 424w, https://substackcdn.com/image/fetch/$s_!6Kfj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png 848w, https://substackcdn.com/image/fetch/$s_!6Kfj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png 1272w, https://substackcdn.com/image/fetch/$s_!6Kfj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bfc5b-769c-4164-a5b4-3ce3779d407b_868x638.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>* The &#8216;No disruption&#8217; column shows a higher operating income than H2 by construction &#8212; a world with no disruption at all will always outperform a world with one, regardless of response quality. This column exists to make the disruption-cost row legible (it is the gap between this column and each scenario&#8217;s OI), not to suggest &#8216;do nothing&#8217; is an available choice. Row 12 reconciles exactly with the &#8216;Annual disruption cost&#8217; row in Section 10.2: $24.80M / $44.60M / $90.50M / $45.44M.</em></p><p><em>Reading the Human Baseline column against H2: operating income improves by $20.64M/yr (close to the $20.41M total-exposure saving in 10.2; the small gap is H2&#8217;s $0.15M implementation cost, which sits below the operating-income line). This is the comparison that matters &#8212; not H2 vs. &#8216;no disruption&#8217;, which PharmaCo cannot choose.</em></p><h2>10.4 Cash Flow Impact</h2><p>Beyond the income statement, the harness choice affects PharmaCo&#8217;s cash flow directly. Emergency procurement is a cash outflow; larger emergency purchases mean more working capital tied up in inventory. Regulatory fines are immediate cash outflows. Faster response (H2) means shorter disruption periods, reducing cash tied up in buffer stock purchased at premium prices.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T3V4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T3V4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png 424w, https://substackcdn.com/image/fetch/$s_!T3V4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png 848w, https://substackcdn.com/image/fetch/$s_!T3V4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png 1272w, https://substackcdn.com/image/fetch/$s_!T3V4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T3V4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png" width="868" height="262" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:868,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25949,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T3V4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png 424w, https://substackcdn.com/image/fetch/$s_!T3V4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png 848w, https://substackcdn.com/image/fetch/$s_!T3V4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png 1272w, https://substackcdn.com/image/fetch/$s_!T3V4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6688cacd-f15d-49d7-a9bb-bda444f853da_868x262.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>10.5 Shareholder Value Calculation</h3><p>Shareholder value is calculated by applying PharmaCo&#8217;s price-to-earnings multiple to the annual income impact. A 15&#215; P/E multiple is used &#8212; the mid-point of the 12&#8211;20&#215; range typical for major pharmaceutical companies with diversified portfolios. The multiple reflects the market&#8217;s expectation that PharmaCo&#8217;s earnings improvement is sustainable and will recur.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VoJ8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VoJ8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png 424w, https://substackcdn.com/image/fetch/$s_!VoJ8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png 848w, https://substackcdn.com/image/fetch/$s_!VoJ8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png 1272w, https://substackcdn.com/image/fetch/$s_!VoJ8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VoJ8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png" width="862" height="172" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:172,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17447,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VoJ8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png 424w, https://substackcdn.com/image/fetch/$s_!VoJ8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png 848w, https://substackcdn.com/image/fetch/$s_!VoJ8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png 1272w, https://substackcdn.com/image/fetch/$s_!VoJ8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff9affc-b0a0-4c50-b0c7-7e665c19a64a_862x172.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><em>The P/E multiple applies because the income improvement is sustainable and recurring &#8212; four disruption events per year, every year. A one-time saving would not attract a multiple; a systematic improvement in disruption response capability does.</em></p><h2>Section 11 &#8212; Turning Dials: Benchmarks, Workforce, and Financial Sensitivity</h2><h3>11.1 The Benchmark Is a Financial Specification</h3><p>Section 1 of this case showed two different research results from two different benchmarks: the hard, trap-laden original H1&#8211;H10 benchmark, where H2 dominates, and the simplified, no-traps follow-up benchmark, where H3/H5-style tool-and-loop designs win. Those traps were the financial levers in disguise. The carrier-specific routing requirement maps to the procurement premium differential. The regulatory notification window maps to the $1.5M fine that H9 incurred. The supplier ordering dependency maps to the inventory write-off.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2h0k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2h0k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png 424w, https://substackcdn.com/image/fetch/$s_!2h0k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png 848w, https://substackcdn.com/image/fetch/$s_!2h0k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png 1272w, https://substackcdn.com/image/fetch/$s_!2h0k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2h0k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png" width="866" height="275" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:275,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38953,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2h0k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png 424w, https://substackcdn.com/image/fetch/$s_!2h0k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png 848w, https://substackcdn.com/image/fetch/$s_!2h0k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png 1272w, https://substackcdn.com/image/fetch/$s_!2h0k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89131e0-88fc-4ab5-be50-8594c0700be8_866x275.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>The benchmark is not a quality measure. It is a specification of what &#8216;winning&#8217; means in financial terms.</strong></em></p><h3>11.2 The Harness Selection Framework as a Capital Allocation Decision</h3><p>Four variables determine which harness architecture produces the best financial outcome for a given deployment. Each corresponds directly to a financial lever in this case study.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ql7h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ql7h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png 424w, https://substackcdn.com/image/fetch/$s_!Ql7h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png 848w, https://substackcdn.com/image/fetch/$s_!Ql7h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png 1272w, https://substackcdn.com/image/fetch/$s_!Ql7h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ql7h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png" width="865" height="265" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:265,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38199,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ql7h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png 424w, https://substackcdn.com/image/fetch/$s_!Ql7h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png 848w, https://substackcdn.com/image/fetch/$s_!Ql7h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png 1272w, https://substackcdn.com/image/fetch/$s_!Ql7h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e630d6-e02e-4d6f-82d0-f4ccb02ad05f_865x265.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Change any one variable and the optimal architecture shifts. <em><strong>A company with poor documentation deploying H2 would not achieve the same results &#8212; H2&#8217;s structured reasoning scaffold requires solid information inputs to be effective. A company deploying on a low-stakes routine task does not need H2&#8217;s precision overhead. Choosing the architecture without assessing these variables is not a neutral decision &#8212; it has a quantifiable financial cost. For teams: Brainstorm agentic design architecture and understand the potential financial impact may apply differently.</strong></em></p><h3>11.3 Beyond Cost: What Happens to the Workforce When the Savings Are Real?</h3><p>Everything in Sections 6&#8211;10 of this case is framed as cost avoidance or cost reduction &#8212; the $19.8M H2-vs-H9 gap, the $490K labour saving on compliance monitoring. That framing is correct, but it is incomplete, because it treats the organisation on the other side of these numbers as static. If PharmaCo genuinely captures these savings &#8212; three compliance analysts&#8217; worth of labour redirected, disruption-response teams no longer spending days reconciling a five-agent swarm&#8217;s contradictions &#8212; the workforce composition around those roles does not stay the same.</p><p><em><strong>A useful external reference point here is the way Anthropic itself has described its own hiring shift as agentic systems took over more of the routine technical work. Rather than continuing to scale headcount of engineers doing implementation-level work, the organisation has leaned toward smaller, faster-moving teams and a broader mix of interdisciplinary roles &#8212; including legal and philosophical expertise alongside technical roles &#8212; with people operating more like managers of AI-driven workflows than individual contributors executing tasks by hand.</strong></em></p><p>Applied to PharmaCo, the same logic suggests the $1.84M Year-1 benefit from H3 compliance monitoring is not simply &#8216;three analyst salaries removed from the cost line.&#8217; It is also three analysts&#8217; worth of time that can be redirected toward judgment-heavy work the AI system cannot do: interpreting borderline compliance findings, managing relationships with regulators, and feeding edge cases back into the system&#8217;s gold-answer criteria so H3 keeps improving. Similarly, the disruption-response team freed from four days of H4-style contradiction resolution per event is not simply &#8216;four fewer days of work&#8217; &#8212; it is four days that could be spent on supplier relationship management, scenario planning for the next disruption, or governance work of the kind Section 12 describes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!184D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!184D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png 424w, https://substackcdn.com/image/fetch/$s_!184D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png 848w, https://substackcdn.com/image/fetch/$s_!184D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png 1272w, https://substackcdn.com/image/fetch/$s_!184D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!184D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png" width="867" height="247" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:247,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:47489,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!184D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png 424w, https://substackcdn.com/image/fetch/$s_!184D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png 848w, https://substackcdn.com/image/fetch/$s_!184D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png 1272w, https://substackcdn.com/image/fetch/$s_!184D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664b74c7-8f72-473a-ad9b-083c3167e91a_867x247.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is not a fully worked financial model &#8212; unlike the rest of this case study, it is not derived from the H1&#8211;H10 research or from sourced industry benchmarks. It is included as a <strong>discussion prompt</strong> for Section 13, because the original case study, which focused entirely on architecture-to-cost mapping, did not consider it.</p><h3>11.4 The Decide-Execute-Deliver Sandwich, and a Limit on the Workforce-Composition Story</h3><p>Section 11.3&#8217;s redirected-capacity argument rests on an assumption worth examining directly: that the freed-up time goes somewhere productive. A useful framework for testing that assumption comes from Arvind Narayanan and Sayash Kapoor&#8217;s analysis of knowledge work generally, which models most jobs as a &#8216;<strong>decide-execute-deliver sandwich&#8217; </strong>&#8212; AI compresses the execute layer in the middle, but the decide layer (what should be reviewed, escalated, or changed) and the deliver layer (who is accountable when something is missed) resist automation and, if anything, become more important once execution is fast and cheap.</p><p>Applied to H3&#8217;s compliance-monitoring role: H3 compresses the execute layer (continuous monitoring against the rule set). The decide layer &#8212; which borderline findings warrant escalation, how the gold-answer criteria should evolve, what PharmaCo&#8217;s compliance posture should be heading into the next regulatory cycle &#8212; and the deliver layer &#8212; who signs off on the compliance function&#8217;s output to the board and regulators &#8212; both remain with PharmaCo&#8217;s compliance team, regardless of how good H3 becomes. Section 11.3&#8217;s framing is consistent with this: the $1.84M benefit was never proposed as a full headcount reduction, and the redirected-capacity argument depends on those three analysts moving into decide- and deliver-layer work, not out of the organisation.</p><p>The risk this creates sits one hiring cycle downstream, not in the current year&#8217;s numbers. If PharmaCo responds to H3&#8217;s adoption by not backfilling the next junior compliance-analyst role &#8212; reasoning that H3 now covers the execute-layer work that role used to do &#8212; the organisation saves a salary in Year 1 but loses the route by which junior analysts become the senior analysts who do PharmaCo&#8217;s decide- and deliver-layer compliance work in five to ten years. Execution-layer work has historically been where junior staff build the judgment senior roles require; if AI absorbs that work before junior hires gain it, the pipeline narrows in a way that is invisible in any single year&#8217;s income statement and expensive to reverse once a senior-talent shortage becomes visible.</p><p>This is not an argument against adopting H3 &#8212; the $1.84M benefit (Section 9) stands regardless. It is an argument that the board review proposed in Section 12.1 for any architecture change should explicitly separate two questions that are easy to conflate: <em><strong>&#8216;does this change reduce this year&#8217;s compliance cost&#8217; (yes, by $1.84M) and &#8216;does this change alter our (junior) hiring plan for this function&#8217; (a separate decision, with consequences on a much longer horizon, that the cost figure alone does not capture).</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fzc1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fzc1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png 424w, https://substackcdn.com/image/fetch/$s_!fzc1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png 848w, https://substackcdn.com/image/fetch/$s_!fzc1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png 1272w, https://substackcdn.com/image/fetch/$s_!fzc1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fzc1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png" width="872" height="216" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:216,&quot;width&quot;:872,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:43527,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fzc1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png 424w, https://substackcdn.com/image/fetch/$s_!fzc1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png 848w, https://substackcdn.com/image/fetch/$s_!fzc1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png 1272w, https://substackcdn.com/image/fetch/$s_!fzc1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5493c2fa-83e3-41fd-a380-0b86299373ee_872x216.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>Section 12 &#8212; Board and Shareholder Implications</h2><h3>12.1 The Capital Allocation Question</h3><p>The question before PharmaCo&#8217;s board was not whether to invest in agentic AI. That decision was straightforward: the $150K investment in H2 generates $20.41M in annual savings versus the human baseline, a payback period of under a week and a first-year ROI exceeding 89&#215;. No CFO would decline that proposal.</p><p>The more consequential question was: <strong>which architecture? And the answer to that question was worth [$297M] in enterprise value (your organization will have different factors apply) </strong>&#8212; the gap between choosing H2 and choosing H9. <strong>Both are AI investments. Both represent meaningful technology deployments. The difference between them, on this specific task, compounds annually.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gBLy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gBLy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png 424w, https://substackcdn.com/image/fetch/$s_!gBLy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png 848w, https://substackcdn.com/image/fetch/$s_!gBLy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png 1272w, https://substackcdn.com/image/fetch/$s_!gBLy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gBLy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png" width="870" height="250" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:250,&quot;width&quot;:870,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30837,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gBLy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png 424w, https://substackcdn.com/image/fetch/$s_!gBLy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png 848w, https://substackcdn.com/image/fetch/$s_!gBLy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png 1272w, https://substackcdn.com/image/fetch/$s_!gBLy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5dceda-5e04-44d0-b0ef-f99e5d094aae_870x250.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>12.2 What Adversarial Verification Added</h3><p>One finding from the underlying research deserves specific board attention. The automated evaluation workflow included an adversarial verification stage &#8212; a separate agent whose sole purpose was to find flaws in the winning architecture&#8217;s output. This stage identified a vulnerability in H3 (the compliance monitoring system) that the standard quality scoring had not caught: H3&#8217;s output was conditionally correct &#8212; accurate when live tool data was available, but potentially non-compliant when tool access was simulated or unavailable.</p><p>In financial terms: without adversarial verification, PharmaCo would have deployed a system with a conditional compliance gap. The first time that condition was met in production &#8212; a tool API outage, a database maintenance window &#8212; the $1.5M expected fine saving could reverse into a live fine. The adversarial stage cost approximately $800 in compute (two Sonnet API calls) and potentially saved $2.5M in avoided fines. This is the financial case for adversarial verification as a standard deployment gate, not an optional quality check.</p><h2>Section 13 &#8212; Discussion Questions</h2><p>This case is designed for use in graduate-level courses in operations management, digital strategy, healthcare management, and financial analysis. The following questions are intended to guide discussion. You may also find it useful as part of any organizational deployment planning strategy. It will never be an end all, because paths that may be taken, are realistically typically budget and also imagination constrained.</p><p>&#8226; PharmaCo&#8217;s management team initially preferred H9 because a complex disruption seemed to warrant a complex AI system. Was their instinct unreasonable? Under what conditions would H9 have been the correct choice, and how would the financial analysis change?</p><p>&#8226; The case shows that H2 saves $19.8M per year versus H9, while costing $300K less to build and $190K less per year to run. Yet H9-style multi-agent swarms are still widely deployed in industry contexts similar to PharmaCo&#8217;s. What organisational and behavioural factors might explain why companies consistently over-engineer their AI systems?</p><p>&#8226; H3 outperformed H2 on the routine compliance monitoring task but would have underperformed on the disruption response task. How should PharmaCo&#8217;s technology governance framework distinguish between tasks to ensure the correct architecture is deployed in each case?</p><p>&#8226; The benchmark design &#8212; specifically the deliberate traps embedded in the disruption scenario &#8212; was the mechanism that revealed H9&#8217;s financial weakness. If PharmaCo&#8217;s benchmark had been poorly designed (no traps, general criteria), the analysis would have recommended H3 for disruption response. How should a board evaluate the quality of an AI benchmark before approving a deployment decision based on it?</p><p>&#8226; The adversarial verification stage identified a conditional vulnerability in H3 that saved an estimated $2.5M in potential fines at a compute cost of approximately $800. How should organisations quantify and mandate adversarial testing as a standard step in AI deployment governance?</p><p>&#8226; <strong>Section 11.3 argues that the $1.84M H3 saving is also three analysts&#8217; worth of redirectable capacity, drawing a loose analogy to how AI labs have restructured their own hiring toward smaller, more interdisciplinary teams. Is this analogy sound for a pharmaceutical company&#8217;s compliance function? What would PharmaCo need to do &#8212; in role redesign, training, or governance &#8212; to actually realise an earnings benefit from redirected capacity, rather than simply not backfilling the roles?</strong></p><p>&#8226; <strong>Section 11.4 distinguishes between two questions that are easy to conflate when an architecture like H3 is adopted: &#8216;does this reduce this year&#8217;s cost&#8217; and &#8216;does this change our junior hiring plan for this function.&#8217; The first is visible in Section 10&#8217;s figures; the second is not. How would you design PharmaCo&#8217;s annual technology-governance review (Section 12.1) so the second question is asked explicitly &#8212; and on what timescale would you expect its consequences to become measurable?</strong></p><h2>Section 14 &#8212; Building PharmaCo&#8217;s Internal Harness Lab</h2><p>Sections 6&#8211;9 treated H2, H9, H4, and H3 as a fixed menu &#8212; four architectures, already tested, with known costs and known outcomes. In practice, PharmaCo&#8217;s AI engineering team would not stop at four. <strong>The underlying research (&#8216;<a href="https://interestingengineering.substack.com/p/the-harness-lab-automated">The Harness Lab, Automated</a>&#8217;) describes a workflow in which an AI agent generates new harness candidates, tests them, adversarially probes them, runs a tournament, and loops until a quality threshold is met &#8212; with minimal human stage-management.</strong> What is commonly called &#8220;Loop Engineering&#8221; (a very loose term, but the examples given should be illustrative in its applications). This section sketches what that internal capability would look like for PharmaCo, and where its limits are.</p><h3>14.1 From a Fixed Comparison to a Continuous Search</h3><p>H2 was the winner of the original ten-architecture comparison, with an Alpha of 0.920 &#8212; comfortably ahead of the field. But &#8216;comfortably ahead of the field&#8217; is not the same as &#8216;optimal.&#8217; Three other architectures sit in a cluster just behind H2: H6 (two-agent chain with revision handoff, &#945;=0.900), H5 (self-revision loop, &#945;=0.840), and H7 (three-agent with model routing, &#945;=0.840). None of these were used in PharmaCo&#8217;s deployment scenarios &#8212; they were neither the winner, the predicted-but-wrong swarm, the cautionary failure, nor the alternate-task winner, so they had no role in the narrative. But they are not irrelevant. They are the nearest neighbours to H2, and under a Sigma-optimising search they are the most promising starting points for finding something that beats it.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!am7a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!am7a!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png 424w, https://substackcdn.com/image/fetch/$s_!am7a!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png 848w, https://substackcdn.com/image/fetch/$s_!am7a!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png 1272w, https://substackcdn.com/image/fetch/$s_!am7a!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!am7a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png" width="870" height="201" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:201,&quot;width&quot;:870,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:37930,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!am7a!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png 424w, https://substackcdn.com/image/fetch/$s_!am7a!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png 848w, https://substackcdn.com/image/fetch/$s_!am7a!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png 1272w, https://substackcdn.com/image/fetch/$s_!am7a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a495b-ec5c-467d-b2af-48bd2eef520b_870x201.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>14.2 What a Cherny-Style Prompt Would Look Like for PharmaCo</h3><p>The research describes a shift from a long, stage-by-stage prompt to a short goal statement plus a project-level instructions file (CLAUDE.md) that carries all the rules, scoring logic, and constraints. Applied to PharmaCo&#8217;s disruption-response harness, the top-level instruction the AI engineering team would issue might be reduced to something like:</p><p><code>/goal Sigma &gt;= 0.95 for the disruption-response harness, seeded from H2 (current champion, Sigma=0.920&#247;(1+&#916;)). Compare against H5, H6, and H7. Read CLAUDE.md for the benchmark, scoring rules, and constraints. Generate, adversarially probe, tournament, and loop until the goal is met or the token budget is exhausted. Produce a recommendation with full Sigma breakdown.</code></p><p>Everything else &#8212; the five binary criteria from the original gold answer, the requirement to never recommend suspending a medication, the token and time budgets, and the instruction to flag any design that relies on memorised test cases rather than general reasoning &#8212; would live in CLAUDE.md, not in the prompt. The agent decides whether to start from H6&#8217;s revision handoff and add H2&#8217;s reasoning scaffold, or start from H2 and add H7&#8217;s model-routing step for the highest-stakes sub-decisions, or some combination neither architecture used on its own.</p><h3>14.3 The LFD Discipline: Why the Hard Benchmark Still Matters</h3><p>The <strong><a href="https://open.substack.com/pub/interestingengineering/p/the-harness-lab-automated?r=223m94&amp;selection=b5fbfe07-7c59-4432-929d-ff50e259405b&amp;utm_campaign=post-share-selection&amp;utm_medium=web&amp;aspectRatio=instagram&amp;textColor=%23ffffff&amp;bgImage=true">Loss Function Definition (LFD) </a></strong>framing adds three disciplines that are directly relevant to a regulated company running this kind of search:</p><p>&#8226; Blind, expanded evaluation. The gold-answer criteria PharmaCo uses for disruption response should not be visible to the generating agent, and should be expanded well beyond the original five-criteria set &#8212; including the cold-chain and air-freight-override traps that distinguished H2 from H9 in the first place. A search that can see its own exam will optimise for the exam, not the task.</p><p>&#8226; Forced entropy. If two consecutive search cycles produce no Sigma improvement, the instructions should require a genuinely different structural axis &#8212; not a smaller tweak to the same design. This is what prevents the search from converging on, say, twelve incremental variants of H6 that all score within noise of each other.</p><p>&#8226; Mandatory adversarial validation before promotion. Any design that beats H2 on Sigma must clear the same adversarial-probe stage that caught H3&#8217;s conditional compliance gap in Section 9, before it is considered for the live disruption-response role. A machine-generated winner that clears a quality threshold is a first draft, not a deployment candidate.</p><h3>14.4 What This Does &#8212; and Does Not &#8212; Change About Sections 6&#8211;9</h3><p>Two clarifications are important so this section does not overstate its implications. First, nothing here suggests H2 was the wrong choice for PharmaCo&#8217;s disruption-response deployment as analysed in Section 6 &#8212; H2 remains the best-supported architecture given the comparison that was actually run, and the financial figures in Sections 10&#8211;12 stand on that basis. Second, this section is not proposing PharmaCo replace H2 before any search has actually been run; it is describing what PharmaCo&#8217;s AI engineering function should be doing on an ongoing basis, the same way a pharmaceutical company continues lifecycle management research on an approved drug rather than treating approval as the end of the work.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8tT4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8tT4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png 424w, https://substackcdn.com/image/fetch/$s_!8tT4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png 848w, https://substackcdn.com/image/fetch/$s_!8tT4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png 1272w, https://substackcdn.com/image/fetch/$s_!8tT4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8tT4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png" width="870" height="158" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:158,&quot;width&quot;:870,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:27658,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8tT4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png 424w, https://substackcdn.com/image/fetch/$s_!8tT4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png 848w, https://substackcdn.com/image/fetch/$s_!8tT4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png 1272w, https://substackcdn.com/image/fetch/$s_!8tT4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99182311-7ba9-4307-80ff-9ce6407aa146_870x158.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>14.5 A Note on H5, H6, and H7 Specifically</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6Bhl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6Bhl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png 424w, https://substackcdn.com/image/fetch/$s_!6Bhl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png 848w, https://substackcdn.com/image/fetch/$s_!6Bhl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png 1272w, https://substackcdn.com/image/fetch/$s_!6Bhl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6Bhl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png" width="862" height="418" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:418,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:57296,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6Bhl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png 424w, https://substackcdn.com/image/fetch/$s_!6Bhl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png 848w, https://substackcdn.com/image/fetch/$s_!6Bhl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png 1272w, https://substackcdn.com/image/fetch/$s_!6Bhl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb481f742-fba3-43ed-8b5e-5d56f9b7ae3a_862x418.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>None of these hybrids exist yet &#8212; they are illustrative of the kind of candidate a Cherny/LFD-style search seeded on H2, H5, H6, and H7 would be expected to generate and test. <strong>The point of this section is not to predict the winner, but to show that PharmaCo&#8217;s harness selection should be treated as an ongoing search problem with a defined refinement frontier, not a one-time decision that ended when H2 was chosen.</strong></em></p><h4>14.6 The Sandwich, Applied to the Harness Lab Itself</h4><p>Section 11.4 introduced the &#8216;decide-execute-deliver sandwich&#8217; as a reason why PharmaCo&#8217;s compliance and disruption-response functions are unlikely to see significant headcount reductions even as H2 and H3 deliver real savings. The same framework applies, with almost no modification, to the harness lab described in this section &#8212; and the design choices already made in 14.1&#8211;14.5 turn out to be an instance of it, not a departure from it.</p><p>The Sigma-optimising loop in 14.2&#8211;14.3 is an execute-layer mechanism: given a goal (Sigma &#8805; target), a seed (H2), and a refinement frontier (H5/H6/H7), it generates, scores, and revises candidate harnesses with minimal human involvement. That is the part agentic AI compresses. But two layers around it remain explicitly human, by the design choices this section already makes:</p><p>&#8226; Decide &#8212; what Sigma target to set, which architectures belong in the refinement frontier, and what counts as a &#8216;genuinely different structural axis&#8217; for the forced-entropy rule (14.3) are judgement calls made by PharmaCo&#8217;s AI engineering team before the loop runs. The loop optimises within a space humans defined; it does not define the space.</p><p>&#8226; Deliver &#8212; the mandatory adversarial validation (14.3) and the board-level governance review (14.4, linking to Section 12.1) mean that no search output reaches a live role without a human-accountable sign-off step, regardless of how good the discovered architecture&#8217;s Sigma score is.</p><p>In other words, <strong>building &#8216;PharmaCo&#8217;s internal harness lab&#8217; does not create a function whose entire job can eventually be automated away once the loop works well &#8212; it creates a function whose execute layer can be automated, with the decide and deliver layers becoming, if anything, more central to what the remaining human team actually does.</strong> This is the same pattern as H3 and compliance monitoring (Section 11.4), applied recursively to the team that builds and refines H3 in the first place. The practical implication for Section 14&#8217;s staffing: PharmaCo should expect the harness lab to need fewer people running benchmarks by hand over time, and more people deciding what to benchmark and signing off on what gets deployed &#8212; not fewer people overall on a predictable timeline.</p><h2>Appendix A &#8212; Key Assumptions and Sources</h2><p>Every number in this case study is derived from a stated assumption. This appendix summarises each assumption, its value, and its basis. Readers who wish to stress-test the financial model can substitute their own assumptions for any of these inputs.</p><h3>A.1 Company and Market Assumptions</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!neE_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!neE_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png 424w, https://substackcdn.com/image/fetch/$s_!neE_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png 848w, https://substackcdn.com/image/fetch/$s_!neE_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png 1272w, https://substackcdn.com/image/fetch/$s_!neE_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!neE_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png" width="870" height="358" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:358,&quot;width&quot;:870,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:67144,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!neE_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png 424w, https://substackcdn.com/image/fetch/$s_!neE_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png 848w, https://substackcdn.com/image/fetch/$s_!neE_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png 1272w, https://substackcdn.com/image/fetch/$s_!neE_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9e44667-910f-48df-a0a0-5c0748a41aa1_870x358.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>A.2 Product Revenue, Margin, and Buffer Stock Assumptions</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M123!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M123!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png 424w, https://substackcdn.com/image/fetch/$s_!M123!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png 848w, https://substackcdn.com/image/fetch/$s_!M123!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png 1272w, https://substackcdn.com/image/fetch/$s_!M123!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M123!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png" width="867" height="497" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:497,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:101443,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!M123!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png 424w, https://substackcdn.com/image/fetch/$s_!M123!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png 848w, https://substackcdn.com/image/fetch/$s_!M123!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png 1272w, https://substackcdn.com/image/fetch/$s_!M123!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279aa56b-2606-448c-aaa4-57b873e39e0c_867x497.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>A.3 Regulatory, Switching, Procurement, and Write-off Rates</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DX47!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DX47!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png 424w, https://substackcdn.com/image/fetch/$s_!DX47!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png 848w, https://substackcdn.com/image/fetch/$s_!DX47!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png 1272w, https://substackcdn.com/image/fetch/$s_!DX47!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DX47!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png" width="711" height="677" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:677,&quot;width&quot;:711,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:94583,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DX47!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png 424w, https://substackcdn.com/image/fetch/$s_!DX47!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png 848w, https://substackcdn.com/image/fetch/$s_!DX47!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png 1272w, https://substackcdn.com/image/fetch/$s_!DX47!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ac5920a-abce-496c-8101-7e0dcd76a95f_711x677.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>A.4 Implementation and Operating Cost Build-Ups</h3><p>Labour rates reflect 2025&#8211;2026 US market rates for specialist AI engineering (Levels.fyi, Glassdoor senior AI engineer compensation data, adjusted for contract rates): $260/hr senior AI engineer, $210/hr integration engineer, $180/hr QA, $150/hr documentation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cFQI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cFQI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png 424w, https://substackcdn.com/image/fetch/$s_!cFQI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png 848w, https://substackcdn.com/image/fetch/$s_!cFQI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png 1272w, https://substackcdn.com/image/fetch/$s_!cFQI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cFQI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png" width="785" height="315" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:315,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:31347,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cFQI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png 424w, https://substackcdn.com/image/fetch/$s_!cFQI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png 848w, https://substackcdn.com/image/fetch/$s_!cFQI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png 1272w, https://substackcdn.com/image/fetch/$s_!cFQI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d4010f0-de7a-4596-9f4f-0b6da6e6c921_785x315.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cOEM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9841b368-3b67-400c-a06b-f1e179429a33_787x578.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cOEM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9841b368-3b67-400c-a06b-f1e179429a33_787x578.png 424w, https://substackcdn.com/image/fetch/$s_!cOEM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9841b368-3b67-400c-a06b-f1e179429a33_787x578.png 848w, https://substackcdn.com/image/fetch/$s_!cOEM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9841b368-3b67-400c-a06b-f1e179429a33_787x578.png 1272w, https://substackcdn.com/image/fetch/$s_!cOEM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9841b368-3b67-400c-a06b-f1e179429a33_787x578.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cOEM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9841b368-3b67-400c-a06b-f1e179429a33_787x578.png" width="787" height="578" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9841b368-3b67-400c-a06b-f1e179429a33_787x578.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:578,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:55168,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9841b368-3b67-400c-a06b-f1e179429a33_787x578.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cOEM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9841b368-3b67-400c-a06b-f1e179429a33_787x578.png 424w, https://substackcdn.com/image/fetch/$s_!cOEM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9841b368-3b67-400c-a06b-f1e179429a33_787x578.png 848w, https://substackcdn.com/image/fetch/$s_!cOEM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9841b368-3b67-400c-a06b-f1e179429a33_787x578.png 1272w, https://substackcdn.com/image/fetch/$s_!cOEM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9841b368-3b67-400c-a06b-f1e179429a33_787x578.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>The $450K figure applies a higher contingency for multi-agent system complexity &#8212; consistent with Gartner&#8217;s documented finding that multi-agent AI implementations routinely exceed initial estimates by 25&#8211;40% due to integration complexity and inter-agent debugging overhead.</em></p><h3>A.5 Annual Operating Costs and Response Times</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eKjL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eKjL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png 424w, https://substackcdn.com/image/fetch/$s_!eKjL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png 848w, https://substackcdn.com/image/fetch/$s_!eKjL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png 1272w, https://substackcdn.com/image/fetch/$s_!eKjL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eKjL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png" width="791" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:791,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:80095,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eKjL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png 424w, https://substackcdn.com/image/fetch/$s_!eKjL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png 848w, https://substackcdn.com/image/fetch/$s_!eKjL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png 1272w, https://substackcdn.com/image/fetch/$s_!eKjL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67ab32b7-762b-4a72-b7be-e19733c42eba_791x617.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Appendix B &#8212; Discussion Question Answer Frameworks</h2><p>These frameworks are provided for instructors and self-study readers. They are not definitive answers &#8212; the questions are designed to produce genuine disagreement. Each presents the key tensions, evidence from the case, and counterarguments worth raising.</p><h3>B.1 Was management&#8217;s preference for H9 unreasonable?</h3><p>The instinct that a complex disruption requires a complex AI system is not unreasonable &#8212; parallel specialist expertise does outperform single-agent generalisation in many real-world scenarios. Management was wrong in this specific case, but their reasoning was not illogical.</p><p>H9 would have been the correct choice if the disruption affected 50+ products across five therapeutic categories simultaneously (a scope H2 would struggle to hold coherently in one context window); if the task required live data retrieval from multiple incompatible systems simultaneously, where parallel agents reduce latency; or if the organisation&#8217;s SOPs were fragmented across departments requiring separate agents to interface with separate knowledge bases.</p><p>H9 failed on this task because the task was bounded and well-specified &#8212; one event, three products, a handful of gold-answer criteria &#8212; something a single structured agent could hold in one context window without coherence degradation. The deliberate traps (carrier-specific routing, the 48-hour regulatory window, the ordering dependency) required precision that H9&#8217;s synthesis softened. In Sigma terms: H9 achieved &#945; = 0.815 with &#916; = 2.5, giving &#931; = 0.233; H2 achieved &#945; = 0.920 with &#916; = 0.3, giving &#931; = 0.708. H2&#8217;s quality advantage more than compensates for H9&#8217;s additional agent resources.</p><h3>B.2 Why do companies consistently over-engineer AI systems?</h3><p><strong>Behavioural factors:</strong> complexity bias (a five-agent system is easier to justify to a board than a single structured prompt, even when the prompt performs better); vendor incentives (implementations are typically priced on scope, not outcome quality &#8212; a $450K H9 build generates more revenue than a $150K H2 build); and risk transfer (a complex system lets managers attribute failures to &#8216;the complexity of the problem&#8217; rather than to the architecture choice itself).</p><p><strong>Structural factors:</strong> benchmarks are often designed after architecture selection, so without a pre-specified gold answer with deliberate traps, no evaluation reveals H9-style precision degradation. The Sigma metric (quality per complexity unit) is not yet a standard industry metric &#8212; most evaluation frameworks report Alpha (raw quality) only, under which H9 looks competitive rather than significantly inferior. And procurement processes typically ask &#8216;can this system do the task?&#8217; &#8212; a binary question &#8212; rather than &#8216;what is the most cost-efficient architecture that does the task?&#8217;</p><h3>B.3 How should PharmaCo govern architecture-to-task matching?</h3><p>&#8226; <strong>Framework-first governance</strong>: classify each task against the four variables in Section 11.2 before deployment, and map the classification to a recommended architecture range; decisions outside that range require sign-off from a Chief AI Officer or equivalent. This is preventive.</p><p>&#8226; <strong>Evidence-first governance</strong>: require a benchmark test with a pre-specified &#8220;gold answer&#8221; and deliberate traps before any architecture goes to production, with a minimum Sigma threshold for approval. This is evaluative but requires benchmark-design capability most organisations do not yet have.</p><p>&#8226; <strong>Adversarial-gate governance</strong>: require every proposed architecture to pass an adversarial verification stage &#8212; a separate agent or team whose sole purpose is to find failure modes before deployment. This is the model that would have caught H3&#8217;s conditional vulnerability (Section 12.2).</p><p>The case evidence supports a combination of framework-first and adversarial-gate governance as minimum requirements.</p><h3>B.4 How should a board evaluate benchmark quality?</h3><p>The most underappreciated risk in AI governance is not a bad AI system &#8212; it is a bad evaluation of one. A board should ask five questions of any benchmark used to justify a deployment decision:</p><p>&#8226; Does the benchmark include deliberate traps &#8212; cases where a plausible but incorrect answer scores well on a general rubric but fails a specific criterion? Without traps, the benchmark cannot distinguish between H2 and H9.</p><p>&#8226; Was the gold answer pre-specified and documented before the AI system was run? Post-hoc gold answers can be unconsciously calibrated to the system already built.</p><p>&#8226; Does the benchmark include adversarial scenarios &#8212; conditions under which the system might fail in production that are not represented in normal operation?</p><p>&#8226; Who designed the benchmark? If the same team that built the AI system, it may be calibrated to that system&#8217;s strengths. Independent design is more reliable.</p><p>&#8226; What is the Sigma score (quality divided by complexity)? A benchmark that reports only Alpha (raw quality) is incomplete and will tend to favour over-engineered architectures.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AAKD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AAKD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png 424w, https://substackcdn.com/image/fetch/$s_!AAKD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png 848w, https://substackcdn.com/image/fetch/$s_!AAKD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png 1272w, https://substackcdn.com/image/fetch/$s_!AAKD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AAKD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png" width="1122" height="612" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:612,&quot;width&quot;:1122,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1057436,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/202103215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AAKD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png 424w, https://substackcdn.com/image/fetch/$s_!AAKD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png 848w, https://substackcdn.com/image/fetch/$s_!AAKD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png 1272w, https://substackcdn.com/image/fetch/$s_!AAKD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fe66cd-5bd8-4628-8d14-5d716eef2319_1122x612.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>B.5 How should adversarial testing be mandated and costed?</h3><p>The cost-benefit calculation is striking: the adversarial stage identified H3&#8217;s conditional vulnerability at a compute cost of approximately $800 (two Sonnet API calls) plus roughly $2,000 in human review time (4 hours). The vulnerability, undetected in production, had an expected cost of $2.5M in avoided fines. The question is not whether to do adversarial testing &#8212; the case for $2.8K cost vs. $2.5M expected benefit is overwhelming &#8212; but how to institutionalise it so it is not skipped under time pressure.</p><p>&#8226; Hard gate: adversarial testing is a formal deployment gate, managed by an independent team. Strongest model &#8212; cannot be skipped &#8212; but requires organisational commitment to maintain the independent review function.</p><p>&#8226; Automated adversarial: build adversarial testing into the deployment pipeline as an automated stage, run on every deployment candidate, with results logged and reviewed. Scales better but requires investment in adversary-agent design.</p><p>&#8226; Risk-tiered requirement: mandatory for high-stakes deployments (regulatory, patient safety, financial compliance) and optional for low-stakes routine deployments &#8212; requires a risk classification framework.</p><p>The case supports the hard-gate model for any deployment affecting regulatory compliance, patient safety, or significant financial exposure, with the automated model appropriate for iterative improvement within an already-approved deployment.</p><h2>References</h2><p><em>References marked [open access] are freely available. Others are accessible through institutional library subscriptions or directly from the publishing organisation&#8217;s website.</em></p><p><strong>[1] </strong>&#8216;<a href="https://interestingengineering.substack.com/p/the-loop-is-the-lab">The Loop is the Lab</a>&#8217; / &#8216;<a href="https://interestingengineering.substack.com/p/the-speciation-of-intelligence">The Speciation of Intelligence</a>&#8217; / &#8216;<a href="https://interestingengineering.substack.com/p/the-working-layer">The Working Layer.</a>&#8217; Harness Engineering Series, Interesting Engineering++. Foundational ASCRS architecture, the eight-primitive framework, and the mutation ladder (L0&#8211;L5).</p><p><strong>[2] </strong><a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">The ASCRS Harness Lab</a>. Harness Engineering Series, Interesting Engineering++. Documents the H1&#8211;H10 benchmark in full: &#945;, &#955;, and &#954; results across all ten architectures (Section 1 of this case), including H2 &#945;=0.920 and H9 &#945;=0.815. https://interestingengineering.substack.com</p><p><strong>[3] </strong>&#8216;<a href="https://interestingengineering.substack.com/p/the-harness-lab-automated">The Harness Lab, Automated</a>.&#8217; Harness Engineering Series, Interesting Engineering++. Automated tournament of harness designs using Claude Code dynamic workflows on a simplified benchmark. Introduces the Sigma metric (&#945; / (1+&#916;)). Tool-using and iterative-loop designs (H3/H5-style) win the simplified benchmark. https://interestingengineering.substack.com</p><p><strong>[4] </strong>&#8216;<a href="https://interestingengineering.substack.com/p/the-prompt-is-still-the-work-dynamic">The Prompt Is Still the Work: Dynamic Workflows in Claude Code</a>.&#8217; Harness Engineering Series, Interesting Engineering++. Maps Anthropic&#8217;s six dynamic workflow patterns against prior ISR experiments. https://interestingengineering.substack.com/p/the-prompt-is-still-the-work-dynamic</p><p><strong>[5] </strong>Anthropic (2026). &#8216;<a href="https://x.com/trq212/status/2061907337154367865?s=20">A harness for every task: dynamic workflows in Claude Code</a>.&#8217; Thariq Shihipar and Sid Bidasaria. Introduces the ultracode keyword, six workflow patterns, and the agentic failure modes (agentic laziness, self-preferential bias, goal drift). [open access] <a href="https://claude.com/blog/a-harness-for-every-task-dynamic-workflows-in-claude-code">https://claude.com/blog/a-harness-for-every-task-dynamic-workflows-in-claude-code</a></p><p><strong>[6] </strong>Anthropic (2026). <a href="https://code.claude.com/docs/en/workflows">Claude Code Dynamic Workflows Reference Documentation</a>. Technical reference for forkSubAgents(), runSubAgent(), /loop, /workflows, /goal, /compact. [open access] <a href="https://code.claude.com/docs/en/workflows">https://code.claude.com/docs/en/workflows</a></p><p><strong>[7] </strong>US Code of Federal Regulations (2024). 21 CFR Part 314.81 &#8212; Post-approval reporting requirements including drug shortage notification obligations (&#167; b.3.iii). [open access] <a href="https://www.ecfr.gov/current/title-21/chapter-I/subchapter-D/part-314/subpart-B/section-314.81">https://www.ecfr.gov/current/title-21/chapter-I/subchapter-D/part-314/subpart-E/section-314.81</a></p><p><strong>[8] </strong>US Congress (2012). Food and Drug Administration Safety and Innovation Act (FDASIA), Section 1001. <a href="https://www.fda.gov/drugs/drug-shortages/frequently-asked-questions-about-drug-shortages#:~:text=Are%20companies%20required%20to%20notify,finished%20drugs%20and%20biological%20products.">Mandatory drug shortage reporting for manufacturers of medically necessary drugs</a>. [open access] https://www.fda.gov/regulatory-information/selected-amendments-fdc-act/food-and-drug-administration-safety-and-innovation-act-fdasia</p><p><strong>[9] </strong>US FDA Drug Shortages Database. Historical records of reported shortages, enforcement actions, and warning letters. Used to derive the $250K&#8211;$2.5M fine-per-violation range. [open access] <a href="https://www.accessdata.fda.gov/scripts/drugshortages/">https://www.accessdata.fda.gov/scripts/drugshortages/</a></p><p><strong>[10] </strong>2001-2026 <a href="https://www.ashp.org/drug-shortages/shortage-resources/drug-shortages-statistics">ASHP Drug Shortage Survey Results</a>.&#8217; American Journal of Health-System Pharmacy. 58% of pharmacists made therapeutic substitutions during shortages; 12&#8211;18% became permanent (basis for the 15% switching-rate assumption). <a href="https://www.ashp.org/drug-shortages/shortage-resources/drug-shortages-statistics">https://www.ashp.org/drug-shortages/shortage-resources/drug-shortage-statistics</a></p><p><strong>[11] </strong>Schachtner L et al. (2021). &#8216;<a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC12469595/">Pharmaceutical switching behaviour following drug shortage.&#8217; Journal of Managed Care and Specialty Pharmacy,</a> 27(8), 1045&#8211;1053. Permanent switch-rate study underpinning the 12&#8211;18% assumption. https://www.jmcp.org</p><p><strong>[12] </strong>Deloitte Life Sciences (2023). &#8216;<a href="https://www.deloitte.com/global/en/industries/life-sciences-health-care/about/lshc-driving-better-performance-patient-outcomes.html">Pharmaceutical Supply Chain Risk Management: Emergency Sourcing and Procurement Analytics</a>.&#8217; Documents Tier-1 emergency procurement premiums 25&#8211;35%, Tier-2 35&#8211;45%. <a href="https://www.deloitte.com/us/en/industries/life-sciences-health-care/articles/life-sciences-supply-chain-manufacturing.html">https://www2.deloitte.com/us/en/pages/life-sciences-and-health-care/articles/pharmaceutical-supply-chain.html</a></p><p><strong>[13] </strong>McKinsey Global Institute (2021). &#8216;Risk, resilience and rebalancing in global value chains.&#8217; Documents 50&#8211;80% spot pharmaceutical procurement premiums during 2020&#8211;2021 shortage events. Basis for the 55% spot-market assumption. <a href="https://www.mckinsey.com/capabilities/risk-and-resilience/our-insights">https://www.mckinsey.com/capabilities/risk-and-resilience/our-insights</a></p><p><strong>[14] </strong>EvaluatePharma (2024). &#8216;<a href="https://www.evaluate.com/press-release/evaluate-releases-2030-forecasts-for-global-pharmaceutical-market/">World Preview 2024: Outlook to 2030.&#8217;</a> Industry benchmark for pharmaceutical EBITDA margins; median for top 25 pharma companies 27.3%. Basis for the 28% assumption. <a href="https://www.evaluate.com/press-release/evaluate-releases-2030-forecasts-for-global-pharmaceutical-market/">https://www.evaluate.com/evaluate-pharma</a></p><p><strong>[15] </strong><a href="https://www.iqvia.com/-/media/iqvia/pdfs/events/presentation_global-meds-webinar_public.pdf">IQVIA Institute (2025). &#8216;Global Medicine Spending and Usage Trends: Outlook to 2029.&#8217; Pharmaceutical gross margin benchmarks by category.</a> Basis for the 43% blended margin assumption. https://www.iqvia.com/insights/the-iqvia-institute/reports</p><p><strong>[16] </strong><a href="https://www.spglobal.com/ratings/en/regulatory/article/250203-pharmaceutical-industry-2025-credit-outlook-is-stable-as-healthy-revenue-growth-mitigates-pressures-s13394024">S&amp;P Global Market Intelligence (2025). S&amp;P Pharmaceuticals Select Industry Index &#8212; historical P/E multiple data. Q4 2025 median: 14.8&#215;. Basis for the 15&#215; multiple used in enterprise value calculations.</a> https://www.spglobal.com/marketintelligence/en/</p><p><strong>[17] </strong><a href="https://premierinc.com/newsroom/new-premier-data-reveals-healthcare-supply-chain-trends-challenges-and-actionable-solutions">Premier Healthcare Alliance (2023). &#8216;Supply Chain Resilience Report.&#8217; Documents 40&#8211;45% emergency procurement premiums under manual, time-pressured processes. </a>Basis for the human-baseline premium assumption. https://www.premierinc.com/supply-chain/insights</p><p><strong>[18] </strong>Gartner (2023). &#8216;<a href="https://www.gartner.com/en/supply-chain/trends/gartner-healthcare-supply-chain-top-25">Supply Chain Top 25 Healthcare.&#8217; AI deployment complexity analysis documenting 25&#8211;40% cost overruns for multi-agent systems vs. single-agent implementations</a>. Basis for the H9 contingency uplift. https://www.gartner.com/en/supply-chain</p><p><strong>[19] </strong>GlobalData (2024). &#8216;Respiratory Drugs Market Analysis: Salbutamol Competitive Landscape.&#8217; Branded salbutamol global market ~$400M in 2023. Basis for the Salbutamol revenue assumption. https://www.globaldata.com/store/report/respiratory-drugs-market-analysis/</p><p><strong>[20] </strong>Levels.fyi / Glassdoor (2025&#8211;2026). Senior AI/ML Engineer and integration engineer contract rate benchmarks, United States market. Rates used: $260/hr senior AI engineer, $210/hr integration engineer, $180/hr QA, $150/hr documentation &#8212; senior contractor rates, not fully-loaded employee cost. https://www.levels.fyi</p><p><strong>[21] </strong>Narayanan, A. and Kapoor, S. (2026). &#8216;<a href="https://www.normaltech.ai/p/why-ai-hasnt-replaced-software-engineers">Why AI hasn&#8217;t replaced software engineers, and won&#8217;t: Coding agents as normal technology.&#8217;</a> AI as Normal Technology. Introduces the &#8216;decide-execute-deliver sandwich&#8217; framework cited in Sections 11.4 and 14.6, and documents 2026 &#8216;AI washing&#8217; layoff cases (Block, Snap, Intuit), WARN Act AI-disclosure data, and the HBR anticipated-vs-realised AI headcount-reduction gap. [open access] https://www.aisnakeoil.com</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Designing Loops - A Practitioner's Short Field Guide]]></title><description><![CDATA[A field census of agentic engineering&#8217;s founding practitioners &#8212; their workflows, anchor documents, and the truth about what still lives inside every loop]]></description><link>https://interestingengineering.substack.com/p/designing-loops-a-practitioners-short</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/designing-loops-a-practitioners-short</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Thu, 11 Jun 2026 19:28:46 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!B_fR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B_fR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B_fR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png 424w, https://substackcdn.com/image/fetch/$s_!B_fR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png 848w, https://substackcdn.com/image/fetch/$s_!B_fR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png 1272w, https://substackcdn.com/image/fetch/$s_!B_fR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B_fR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png" width="1132" height="612" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:612,&quot;width&quot;:1132,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1006344,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!B_fR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png 424w, https://substackcdn.com/image/fetch/$s_!B_fR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png 848w, https://substackcdn.com/image/fetch/$s_!B_fR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png 1272w, https://substackcdn.com/image/fetch/$s_!B_fR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bfdb0e-01c7-4bde-863b-d7ebbb6bb714_1132x612.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>I. The Six-Word Sentence That Started a Discourse</strong></h2><p><strong>&#8220;My job is to write loops.&#8221;</strong></p><p>Six words. Boris Cherny &#8212; Head of Claude Code at Anthropic &#8212; said them, at the tail of a longer statement that ignited discussions across developer communities worldwide:</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div id="youtube2-Hth_tLaC2j8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Hth_tLaC2j8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Hth_tLaC2j8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><blockquote><p><em>&#8220;<strong>I don&#8217;t prompt Claude anymore. I have loops running that prompt Claude and figuring out what to do. My job is to write loops.&#8221;</strong></em></p><p><em><strong>~ Boris Cherny</strong></em></p></blockquote><p>The first two sentences set context. The last one became the discourse. It was amplified almost immediately by Addy Osmani (Google Engineering Director) and <strong>Peter Steinberger (OpenClaw founder, now at OpenAI): &#8220;You shouldn&#8217;t be prompting coding agents anymore. You should be designing loops that prompt your agents.&#8221;</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eWwd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eWwd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png 424w, https://substackcdn.com/image/fetch/$s_!eWwd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png 848w, https://substackcdn.com/image/fetch/$s_!eWwd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png 1272w, https://substackcdn.com/image/fetch/$s_!eWwd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eWwd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png" width="680" height="785" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:785,&quot;width&quot;:680,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:289712,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eWwd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png 424w, https://substackcdn.com/image/fetch/$s_!eWwd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png 848w, https://substackcdn.com/image/fetch/$s_!eWwd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png 1272w, https://substackcdn.com/image/fetch/$s_!eWwd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64d0d9a7-2fd6-4715-8c6f-0d8967b5d4ea_680x785.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gXI2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gXI2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png 424w, https://substackcdn.com/image/fetch/$s_!gXI2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png 848w, https://substackcdn.com/image/fetch/$s_!gXI2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png 1272w, https://substackcdn.com/image/fetch/$s_!gXI2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gXI2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png" width="675" height="452" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:452,&quot;width&quot;:675,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:87298,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gXI2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png 424w, https://substackcdn.com/image/fetch/$s_!gXI2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png 848w, https://substackcdn.com/image/fetch/$s_!gXI2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png 1272w, https://substackcdn.com/image/fetch/$s_!gXI2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd96ec8b-d7af-4367-ba1e-e35ee736854a_675x452.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><a href="https://x.com/addyosmani/status/2064127981161959567?s=20">Addy Osmani - Loop Engineering</a></p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/addyosmani/status/2064127981161959567?s=20&quot;,&quot;full_text&quot;:&quot;https://t.co/hIe0UX7z6T&quot;,&quot;username&quot;:&quot;addyosmani&quot;,&quot;name&quot;:&quot;Addy Osmani&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/2012065253623021570/0BReDfMk_normal.jpg&quot;,&quot;date&quot;:&quot;2026-06-08T23:30:35.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:292,&quot;retweet_count&quot;:1075,&quot;like_count&quot;:7043,&quot;impression_count&quot;:1760666,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>By the weekend, the discourse had fractured cleanly. <em><strong>One camp read this as a paradigm shift demanding immediate workflow overhaul. Another camp read it as irresponsible hype. A third camp &#8212; the practitioners who&#8217;d been quietly building agentic systems for twelve months &#8212; barely noticed, because they were already three architectural generations past the argument.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C8Nz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C8Nz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png 424w, https://substackcdn.com/image/fetch/$s_!C8Nz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png 848w, https://substackcdn.com/image/fetch/$s_!C8Nz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png 1272w, https://substackcdn.com/image/fetch/$s_!C8Nz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C8Nz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png" width="1137" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1137,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1004727,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C8Nz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png 424w, https://substackcdn.com/image/fetch/$s_!C8Nz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png 848w, https://substackcdn.com/image/fetch/$s_!C8Nz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png 1272w, https://substackcdn.com/image/fetch/$s_!C8Nz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f269d58-9a5d-4525-8b36-5e187039479b_1137x600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This article does not adjudicate the discourse. It goes past it. I profile six practitioners whose names appear in the primary sources of 2025&#8211;2026 agentic engineering, reconstruct their actual architectures from the record, and benchmark them against the last few months of controlled experiments in the ASCRS harness lab. The goal is not celebration. It is calibration.</p><p><strong>The central finding, stated plainly: the prompt never left the room. It got promoted.</strong></p><blockquote><p><em>The loop is not a replacement for the prompt. It is a container for the prompt. Every loop in every architecture examined here is ultimately anchored by a document someone spent time brainstorming or wrote by hand.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!p9-7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!p9-7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png 424w, https://substackcdn.com/image/fetch/$s_!p9-7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png 848w, https://substackcdn.com/image/fetch/$s_!p9-7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png 1272w, https://substackcdn.com/image/fetch/$s_!p9-7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!p9-7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png" width="1123" height="605" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:605,&quot;width&quot;:1123,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:727562,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!p9-7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png 424w, https://substackcdn.com/image/fetch/$s_!p9-7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png 848w, https://substackcdn.com/image/fetch/$s_!p9-7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png 1272w, https://substackcdn.com/image/fetch/$s_!p9-7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2cc24f7-ee0d-415a-ac1a-2a8dc7fce2c3_1123x605.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div id="youtube2-KDOGK4Mbxq0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;KDOGK4Mbxq0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/KDOGK4Mbxq0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2><strong>II. The Field Census: Six Architects</strong></h2><p>The practitioners below represent the primary sources cited across the 2025&#8211;2026 agentic engineering literature. Each profile documents role, background, core technique, anchor documents, and &#8212; critically &#8212; the actual prompts or prompt structures they depend on.</p><h3><strong>Boris Cherny</strong><em> &#183; Head of Claude Code &#183; Anthropic</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Sl6l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Sl6l!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png 424w, https://substackcdn.com/image/fetch/$s_!Sl6l!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png 848w, https://substackcdn.com/image/fetch/$s_!Sl6l!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png 1272w, https://substackcdn.com/image/fetch/$s_!Sl6l!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Sl6l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png" width="782" height="480" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:782,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:77614,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Sl6l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png 424w, https://substackcdn.com/image/fetch/$s_!Sl6l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png 848w, https://substackcdn.com/image/fetch/$s_!Sl6l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png 1272w, https://substackcdn.com/image/fetch/$s_!Sl6l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb557a61b-d81f-4d17-b974-3f54deb7d5f8_782x480.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://x.com/bcherny/status/2063792263067754658">Cherny &#8212; @bcherny June 7, 2026 (five tips post, SWE-Marathon context)</a> &#183; <a href="https://www.threads.com/@boris_cherny/post/DZTmR4omqZ3/">Threads mirror of the five tips post</a> &#183; <a href="https://howborisusesclaudecode.com/">How Boris Uses Claude Code (full tip archive)</a> &#183; <a href="https://officechai.com/ai/i-now-just-write-loops-to-prompt-claude-code-claude-code-creator-boris-cherny/">OfficeChai &#8212; Cherny interview (loop-writing background)</a> &#183; <a href="https://infoq.com/news/2026/01/claude-code-creator-workflow/">InfoQ: Claude Code Creator Workflow (January 2026 setup)</a> &#183; <a href="https://tech.yahoo.com/ai/claude/articles/anthropic-boris-cherny-creator-claude-205645586.html">Fortune Brainstorm Tech (Yahoo) &#8212; thousands of agents quote</a></p><p><strong>What Cherny Actually Runs: The Five-Tip Framework</strong></p><p>The misreading of Cherny&#8217;s claim is that he eliminated prompting. He did not. His June 7, 2026 post &#8212; five practical tips for running Opus autonomously for hours or days &#8212; is the most operationally precise statement he has made about his actual workflow. Mapped onto the hierarchy, the five tips are not advice about productivity. They are a configuration checklist for the L2&#8211;L4 architecture (details further below):</p><blockquote><p><strong>CHERNY &#8212; Five-Tip Framework (@bcherny, June 7 2026, in response to SWE-Marathon benchmarks)</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ykbw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ykbw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png 424w, https://substackcdn.com/image/fetch/$s_!Ykbw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png 848w, https://substackcdn.com/image/fetch/$s_!Ykbw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png 1272w, https://substackcdn.com/image/fetch/$s_!Ykbw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ykbw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png" width="660" height="452" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:452,&quot;width&quot;:660,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49905,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ykbw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png 424w, https://substackcdn.com/image/fetch/$s_!Ykbw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png 848w, https://substackcdn.com/image/fetch/$s_!Ykbw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png 1272w, https://substackcdn.com/image/fetch/$s_!Ykbw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F377b5ed5-1baa-4b38-a4d2-47b95edd1e93_660x452.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://x.com/bcherny/status/2063792263067754658">@bcherny &#8212; June 7 2026 post (five tips, SWE-Marathon context)</a> &#183; <a href="https://www.threads.com/@boris_cherny/post/DZTmR4omqZ3/">Threads mirror</a> &#183; <a href="https://code.claude.com/docs/en/goal">/goal vs /loop distinction &#8212; Claude Code docs</a> &#183; <a href="https://blog.dailydoseofds.com/p/claude-codes-goal-command">Avi Chawla &#8212; four autonomy modes explained</a><em> [/goal and /loop are verbatim Claude Code command names. The five tips are verbatim from Cherny&#8217;s post. Hierarchy level annotations (L2, L3) are part of the analysis.]</em></p><p>Tip 5 &#8212; self-verification end-to-end &#8212; is the one practitioners skip and then wonder why long autonomous runs drift. The loop can produce technically correct code that fails in the actual running environment. Cherny&#8217;s solution is to put the environment itself inside the verification loop: boot the web server, open the browser extension, run the mobile simulator. The judge is not a model reading a transcript. It is the product running.</p><p>The earlier PR management loop fits within this framework as a Tip 3 example &#8212; using /loop (cadence-based, checking the queue on schedule) rather than /goal (which would be used when the task has a definable completion state like &#8216;all tests pass&#8217;). The distinction matters at scale: /loop is for monitoring and polling; /goal is for bounded work with a finish line.</p><h3><strong>Geoffrey Huntley</strong><em> &#183; Creator, Ralph Wiggum Technique &#183; Open Source</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ExN7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ExN7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png 424w, https://substackcdn.com/image/fetch/$s_!ExN7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png 848w, https://substackcdn.com/image/fetch/$s_!ExN7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png 1272w, https://substackcdn.com/image/fetch/$s_!ExN7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ExN7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png" width="786" height="271" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:271,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:43603,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ExN7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png 424w, https://substackcdn.com/image/fetch/$s_!ExN7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png 848w, https://substackcdn.com/image/fetch/$s_!ExN7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png 1272w, https://substackcdn.com/image/fetch/$s_!ExN7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46febad3-9a51-4c5e-bc9d-e7eb2c5cd75e_786x271.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://ghuntley.com/ralph/">ghuntley.com/ralph/ &#8212; original technique post</a> &#183; <a href="https://github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/README.md">Anthropic official ralph-wiggum plugin README</a> &#183; <a href="https://devinterrupted.substack.com/p/inventing-the-ralph-wiggum-loop-creator">Dev Interrupted &#8212; Inventing the Ralph Wiggum Loop (interview)</a> &#183; <a href="https://paddo.dev/blog/ralph-wiggum-autonomous-loops/">paddo.dev &#8212; Ralph Wiggum autonomous loops deep dive</a></p><p><strong>The Ralph Architecture</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3_PN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3_PN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png 424w, https://substackcdn.com/image/fetch/$s_!3_PN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png 848w, https://substackcdn.com/image/fetch/$s_!3_PN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png 1272w, https://substackcdn.com/image/fetch/$s_!3_PN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3_PN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png" width="473" height="372" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:372,&quot;width&quot;:473,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17684,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3_PN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png 424w, https://substackcdn.com/image/fetch/$s_!3_PN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png 848w, https://substackcdn.com/image/fetch/$s_!3_PN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png 1272w, https://substackcdn.com/image/fetch/$s_!3_PN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38283a6a-2f5a-428c-b29e-1097080cf525_473x372.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://ghuntley.com/ralph/">ghuntley.com &#8212; &#8216;Ralph is a bash loop&#8217;</a> &#183; <a href="https://github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/README.md">Anthropic plugin README (Stop hook mechanism)</a> &#183; <a href="https://github.com/wiggumdev/ralph">github.com/wiggumdev/ralph (implementation)</a> &#183; <a href="https://pfkimmerle.substack.com/p/ralph-wiggum-loop">Syntax+Glitter &#8212; Ralph Wiggum Loop explainer</a> &#183; <a href="https://blog.sondera.ai/p/ralph-wiggum-principal-skinner-agent-reliability">Sondera &#8212; Supervising Ralph (stateless resampling analysis)</a><em> [Architecture is reconstructed from Huntley&#8217;s public posts and the official Anthropic plugin README]</em></p><p><strong>What Huntley Actually Prompts</strong></p><p>The Ralph technique&#8217;s power is inseparable from PROMPT.md &#8212; the anchor document that gets re-injected every iteration. A typical PROMPT.md is not a sentence; it is a structured specification containing goal, constraints, success criteria, and current state pointer. The loop resamples it fresh each time, avoiding the &#8220;context rot&#8221; failure mode where a growing conversation history causes the model to navigate toward its own prior attempts rather than toward the objective.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S29u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S29u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png 424w, https://substackcdn.com/image/fetch/$s_!S29u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png 848w, https://substackcdn.com/image/fetch/$s_!S29u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png 1272w, https://substackcdn.com/image/fetch/$s_!S29u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S29u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png" width="810" height="401" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:401,&quot;width&quot;:810,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:27472,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S29u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png 424w, https://substackcdn.com/image/fetch/$s_!S29u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png 848w, https://substackcdn.com/image/fetch/$s_!S29u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png 1272w, https://substackcdn.com/image/fetch/$s_!S29u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ac18024-25e8-4920-bfc2-305c6a6161c0_810x401.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://ghuntley.com/ralph/">ghuntley.com/ralph/ &#8212; PROMPT.md discipline</a> &#183; <a href="https://github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/README.md">Anthropic plugin README &#8212; completion-promise mechanism</a> &#183; <a href="https://wiggum.dev">wiggumdev/ralph &#8212; full docs</a> &#183; <a href="https://addyosmani.com/blog/self-improving-agents/">Addyosmani.com &#8212; self-improving agents (PROMPT.md usage)</a><em> [PROMPT.md template is a reconstruction based on Huntley&#8217;s documented technique; specific field names are inferred from the plugin spec and ghuntley.com]</em></p><h3><strong>Steve Yegge</strong><em> &#183; Creator, Gas Town / Gas City &#183; Ex-Amazon, Google, Sourcegraph</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lPvw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc810633-7da8-41fc-a074-318db7bc0736_782x322.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lPvw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc810633-7da8-41fc-a074-318db7bc0736_782x322.png 424w, https://substackcdn.com/image/fetch/$s_!lPvw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc810633-7da8-41fc-a074-318db7bc0736_782x322.png 848w, https://substackcdn.com/image/fetch/$s_!lPvw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc810633-7da8-41fc-a074-318db7bc0736_782x322.png 1272w, https://substackcdn.com/image/fetch/$s_!lPvw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc810633-7da8-41fc-a074-318db7bc0736_782x322.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lPvw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc810633-7da8-41fc-a074-318db7bc0736_782x322.png" width="782" height="322" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dc810633-7da8-41fc-a074-318db7bc0736_782x322.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:322,&quot;width&quot;:782,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54324,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc810633-7da8-41fc-a074-318db7bc0736_782x322.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lPvw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc810633-7da8-41fc-a074-318db7bc0736_782x322.png 424w, https://substackcdn.com/image/fetch/$s_!lPvw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc810633-7da8-41fc-a074-318db7bc0736_782x322.png 848w, https://substackcdn.com/image/fetch/$s_!lPvw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc810633-7da8-41fc-a074-318db7bc0736_782x322.png 1272w, https://substackcdn.com/image/fetch/$s_!lPvw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc810633-7da8-41fc-a074-318db7bc0736_782x322.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04">Steve Yegge &#8212; Welcome to Gas Town (original)</a> &#183; <a href="https://steve-yegge.medium.com/welcome-to-gas-city-57f564bb3607">Steve Yegge &#8212; Welcome to Gas City (SDK)</a> &#183; <a href="https://cloudnativenow.com/features/gas-town-what-kubernetes-for-ai-coding-agents-actually-looks-like/">Cloud Native Now &#8212; Gas Town deep dive</a> &#183; <a href="https://github.com/steveyegge/gastown">steveyegge/gastown (GitHub)</a></p><p><strong>The Gas Town Architecture</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JZ7f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JZ7f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png 424w, https://substackcdn.com/image/fetch/$s_!JZ7f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png 848w, https://substackcdn.com/image/fetch/$s_!JZ7f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png 1272w, https://substackcdn.com/image/fetch/$s_!JZ7f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JZ7f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png" width="541" height="376" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:376,&quot;width&quot;:541,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25137,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JZ7f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png 424w, https://substackcdn.com/image/fetch/$s_!JZ7f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png 848w, https://substackcdn.com/image/fetch/$s_!JZ7f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png 1272w, https://substackcdn.com/image/fetch/$s_!JZ7f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45687a14-f5b0-4925-b413-b606ff7df39f_541x376.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04">Steve Yegge &#8212; Welcome to Gas Town (roles defined)</a> &#183; <a href="https://cloudnativenow.com/features/gas-town-what-kubernetes-for-ai-coding-agents-actually-looks-like/">Cloud Native Now &#8212; &#8216;Kubernetes for AI&#8217; analysis + role breakdown</a> &#183; <a href="https://codex.danielvaughan.com/2026/04/08/gas-town-multi-agent-factory/">Codex Blog &#8212; Gas Town roles + Beads analysis</a> &#183; <a href="https://paddo.dev/blog/gastown-two-kinds-of-multi-agent/">paddo.dev &#8212; GasTown and the Two Kinds of Multi-Agent</a> &#183; <a href="https://reading.torqsoftware.com/notes/software/ai-ml/agentic-coding/2026-01-15-gas-town-multi-agent-orchestration-framework/">Reading list &#8212; Gas Town architecture summary</a><em> [Role names (Mayor, Polecat, Refinery, Witness, Deacon) and GUPP protocol directly from Yegge&#8217;s original Medium post]</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BeyD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BeyD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png 424w, https://substackcdn.com/image/fetch/$s_!BeyD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png 848w, https://substackcdn.com/image/fetch/$s_!BeyD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png 1272w, https://substackcdn.com/image/fetch/$s_!BeyD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BeyD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png" width="1127" height="602" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:602,&quot;width&quot;:1127,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:933896,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BeyD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png 424w, https://substackcdn.com/image/fetch/$s_!BeyD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png 848w, https://substackcdn.com/image/fetch/$s_!BeyD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png 1272w, https://substackcdn.com/image/fetch/$s_!BeyD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c3696de-3c79-4bd0-b4c3-14eafc27bb6e_1127x602.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Yegge&#8217;s core insight aligns with Geoffrey Huntley&#8217;s: the AI&#8217;s memory is the version-controlled repository, not the chat history. Twenty to thirty agents can work in parallel without colliding because their shared state is code &#8212; immutable, auditable, mergeable</strong>. What often gets lost in descriptions of Gas Town is that each worker role is defined by a role prompt. The Mayor has system-level orchestration instructions. Polecats have task-execution instructions. Refinery has merge-management instructions. Yegge didn&#8217;t stop prompting; he wrote seven distinct prompt architectures and then automated who calls them.</p><h3><strong>Peter Steinberger</strong><em> &#183; Creator, OpenClaw &#183; Former CEO PSPDFKit &#183; OpenAI (from Feb 2026)</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EZ8S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EZ8S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png 424w, https://substackcdn.com/image/fetch/$s_!EZ8S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png 848w, https://substackcdn.com/image/fetch/$s_!EZ8S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png 1272w, https://substackcdn.com/image/fetch/$s_!EZ8S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EZ8S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png" width="863" height="375" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:375,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54224,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EZ8S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png 424w, https://substackcdn.com/image/fetch/$s_!EZ8S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png 848w, https://substackcdn.com/image/fetch/$s_!EZ8S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png 1272w, https://substackcdn.com/image/fetch/$s_!EZ8S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a044f3-4fb2-45ac-9c99-c09c2070938d_863x375.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://steipete.me/posts/just-talk-to-it">steipete.me &#8212; Just Talk To It (workflow)</a> &#183; <a href="https://openclaw.report/news/peter-steinberger-openclaw-vision">OpenClaw VISION.md analysis</a> &#183; <a href="https://fortune.com/2026/02/19/openclaw-who-is-peter-steinberger-openai-sam-altman-anthropic-moltbook/">Fortune &#8212; Who is Peter Steinberger?</a> &#183; <a href="https://en.wikipedia.org/wiki/Peter_Steinberger_(programmer)">Wikipedia &#8212; Peter Steinberger (programmer)</a> &#183; <a href="https://grokipedia.com/page/peter-steinberger">Grokipedia &#8212; Steinberger agentic engineering</a></p><p><strong>VISION.md: The Document That Runs Before Every Loop</strong></p><p>Steinberger&#8217;s most significant contribution to the practitioner literature is not a tool or a framework. It is a practice: writing VISION.md before writing any loop or agent configuration. For OpenClaw, the VISION.md is 94 lines &#8212; no marketing language, no growth projections. Its first sentence is: &#8216;OpenClaw is the AI that actually does things.&#8217;</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9fXZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9fXZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png 424w, https://substackcdn.com/image/fetch/$s_!9fXZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png 848w, https://substackcdn.com/image/fetch/$s_!9fXZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png 1272w, https://substackcdn.com/image/fetch/$s_!9fXZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9fXZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png" width="887" height="462" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:462,&quot;width&quot;:887,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:37492,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9fXZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png 424w, https://substackcdn.com/image/fetch/$s_!9fXZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png 848w, https://substackcdn.com/image/fetch/$s_!9fXZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png 1272w, https://substackcdn.com/image/fetch/$s_!9fXZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe430c9c9-84f3-4cd6-a187-15ada1b652b5_887x462.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://openclaw.report/news/peter-steinberger-openclaw-vision">OpenClaw VISION.md &#8212; full analysis</a> &#183; <a href="https://steipete.me/posts/just-talk-to-it">steipete.me &#8212; VISION.md discipline (workflow post)</a> &#183; <a href="https://steipete.me/">steipete.me &#8212; full personal site / posts</a><em> [VISION.md template headings are reconstructed from Steinberger&#8217;s public OpenClaw VISION.md (94 lines). Section labels are paraphrased, not verbatim]</em></p><h3><strong>Addy Osmani</strong><em> &#183; VP Engineering &#183; Google Chrome &#183; Author, Learning Patterns</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hXOo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hXOo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png 424w, https://substackcdn.com/image/fetch/$s_!hXOo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png 848w, https://substackcdn.com/image/fetch/$s_!hXOo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png 1272w, https://substackcdn.com/image/fetch/$s_!hXOo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hXOo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png" width="862" height="355" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:355,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:52915,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hXOo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png 424w, https://substackcdn.com/image/fetch/$s_!hXOo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png 848w, https://substackcdn.com/image/fetch/$s_!hXOo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png 1272w, https://substackcdn.com/image/fetch/$s_!hXOo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e84dd-77ef-46dd-b8c9-c2bef2ec1fe9_862x355.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://addyo.substack.com/p/loop-engineering">addyo.substack.com &#8212; Loop Engineering (Jun 8 2026)</a> &#183; <a href="https://addyosmani.com/blog/self-improving-agents/">addyosmani.com &#8212; Self-Improving Coding Agents</a> &#183; <a href="https://addyo.substack.com/p/the-80-problem-in-agentic-coding">addyo.substack.com &#8212; The 80% Problem in Agentic Coding</a> &#183; <a href="https://addyosmani.com/blog/code-agent-orchestra/">addyosmani.com &#8212; Code Agent Orchestra</a></p><p><strong>Osmani&#8217;s Five Loop Building Blocks</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1XdR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1XdR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png 424w, https://substackcdn.com/image/fetch/$s_!1XdR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png 848w, https://substackcdn.com/image/fetch/$s_!1XdR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png 1272w, https://substackcdn.com/image/fetch/$s_!1XdR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1XdR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png" width="510" height="317" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:317,&quot;width&quot;:510,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23978,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1XdR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png 424w, https://substackcdn.com/image/fetch/$s_!1XdR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png 848w, https://substackcdn.com/image/fetch/$s_!1XdR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png 1272w, https://substackcdn.com/image/fetch/$s_!1XdR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c03f3f-bcb2-48ef-acff-fe96a78a14b6_510x317.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://addyo.substack.com/p/loop-engineering">Osmani &#8212; Loop Engineering (five blocks + automation tab detail)</a> &#183; <a href="https://addyosmani.com/blog/self-improving-agents/">Osmani &#8212; Self-Improving Coding Agents (sub-agent/judge pattern)</a> &#183; <a href="https://addyosmani.com/blog/ai-coding-workflow/">Osmani &#8212; AI Coding Workflow 2026 (spec-first discipline)</a><em> [Five building blocks + 70/30 spec/execution ratio are direct from Osmani&#8217;s June 2026 Loop Engineering post; quoted caution on token costs is verbatim]</em></p><h3><strong>Andrej Karpathy</strong><em> &#183; Co-founder OpenAI &#183; Coined &#8216;Vibe Coding&#8217; and &#8216;Agentic Engineering&#8217; &#183; Anthropic (from May 2026)</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dxaS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dxaS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png 424w, https://substackcdn.com/image/fetch/$s_!dxaS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png 848w, https://substackcdn.com/image/fetch/$s_!dxaS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png 1272w, https://substackcdn.com/image/fetch/$s_!dxaS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dxaS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png" width="866" height="398" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:398,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59697,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dxaS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png 424w, https://substackcdn.com/image/fetch/$s_!dxaS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png 848w, https://substackcdn.com/image/fetch/$s_!dxaS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png 1272w, https://substackcdn.com/image/fetch/$s_!dxaS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8e6658f-c4a3-490a-b0d7-22777453c03f_866x398.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://en.wikipedia.org/wiki/Andrej_Karpathy">Wikipedia &#8212; Andrej Karpathy (joined Anthropic May 2026)</a> &#183; <a href="https://thenewstack.io/vibe-coding-is-passe/">The New Stack &#8212; Vibe coding is pass&#233; / agentic engineering</a> &#183; <a href="https://www.mindstudio.ai/blog/karpathy-sequoia-talk-5-predictions-agentic-engineering">MindStudio &#8212; Karpathy Sequoia talk breakdown</a> &#183; <a href="https://github.com/karpathy/autoresearch">github.com/karpathy/autoresearch</a> &#183; <a href="https://www.the-ai-corner.com/p/andrej-karpathy-ai-workflow-shift-agentic-era-2026">The AI Corner &#8212; Karpathy workflow shift 2026</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3YdB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3YdB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png 424w, https://substackcdn.com/image/fetch/$s_!3YdB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png 848w, https://substackcdn.com/image/fetch/$s_!3YdB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png 1272w, https://substackcdn.com/image/fetch/$s_!3YdB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3YdB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png" width="1123" height="613" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:613,&quot;width&quot;:1123,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:944137,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3YdB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png 424w, https://substackcdn.com/image/fetch/$s_!3YdB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png 848w, https://substackcdn.com/image/fetch/$s_!3YdB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png 1272w, https://substackcdn.com/image/fetch/$s_!3YdB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff30945ca-1193-4f5b-94c6-0f9354206c86_1123x613.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>The Karpathy Inflection Model</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bzoN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bzoN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png 424w, https://substackcdn.com/image/fetch/$s_!bzoN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png 848w, https://substackcdn.com/image/fetch/$s_!bzoN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png 1272w, https://substackcdn.com/image/fetch/$s_!bzoN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bzoN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png" width="502" height="322" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:322,&quot;width&quot;:502,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26761,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bzoN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png 424w, https://substackcdn.com/image/fetch/$s_!bzoN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png 848w, https://substackcdn.com/image/fetch/$s_!bzoN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png 1272w, https://substackcdn.com/image/fetch/$s_!bzoN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F732624ad-975d-4c3d-ab51-9fe14a998eb8_502x322.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://www.mindstudio.ai/blog/karpathy-sequoia-talk-5-predictions-agentic-engineering">MindStudio &#8212; Karpathy Sequoia talk: &#8216;December is when it really flipped&#8217;</a> &#183; <a href="https://www.the-ai-corner.com/p/andrej-karpathy-ai-workflow-shift-agentic-era-2026">The AI Corner &#8212; &#8216;In December... I went from 80-20 to 20-80&#8217; (verbatim ratio)</a> &#183; <a href="https://www.ibm.com/think/topics/agentic-engineering">IBM Think &#8212; Agentic Engineering definition</a> &#183; <a href="https://www.nxcode.io/resources/news/agentic-engineering-complete-guide-vibe-coding-ai-agents-2026">NxCode &#8212; Agentic Engineering complete guide</a><em> [80/20 &#8594; 20/80 ratio and December 2025 inflection point are verbatim from Karpathy&#8217;s Sequoia talk. Jun 2026 ~10/90 estimate is an extrapolation, not a direct quote]</em></p><h2><strong>III. The Convergence: What All Six Agree On</strong></h2><p>Six practitioners. Six different tools, scales, and domains. But look at the architecture beneath the surface, and a convergence emerges. Every system in this field census depends on the same structural primitives, arrived at independently through the same failure modes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Pwc8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Pwc8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png 424w, https://substackcdn.com/image/fetch/$s_!Pwc8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png 848w, https://substackcdn.com/image/fetch/$s_!Pwc8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png 1272w, https://substackcdn.com/image/fetch/$s_!Pwc8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Pwc8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png" width="862" height="428" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:428,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:51049,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Pwc8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png 424w, https://substackcdn.com/image/fetch/$s_!Pwc8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png 848w, https://substackcdn.com/image/fetch/$s_!Pwc8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png 1272w, https://substackcdn.com/image/fetch/$s_!Pwc8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b38df59-6ee0-4e5b-bb5f-c0f40c93e28e_862x428.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://howborisusesclaudecode.com/">Cherny &#8212; howborisusesclaudecode.com</a> &#183; <a href="https://ghuntley.com/ralph/">Huntley &#8212; ghuntley.com/ralph/</a> &#183; <a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04">Yegge &#8212; Welcome to Gas Town</a> &#183; <a href="https://steipete.me/">Steinberger &#8212; steipete.me</a> &#183; <a href="https://addyo.substack.com/p/loop-engineering">Osmani &#8212; Loop Engineering</a> &#183; <a href="https://github.com/karpathy/autoresearch">Karpathy &#8212; autoresearch (program.md)</a><em> [Each row is synthesized from the primary sources listed. &#8216;Karpathy val_bpb metric&#8217; is specific to the autoresearch repo; other eval gate entries are reconstructed from public descriptions]</em></p><p>The convergence is striking. Every practitioner:</p><p><strong>&#8226; </strong>Writes a human-authored anchor document before running any loop</p><p><strong>&#8226; </strong>Uses git as the durable memory substrate &#8212; not conversation history</p><p><strong>&#8226; </strong>Separates the agent doing work from the agent evaluating work</p><p><strong>&#8226; </strong>Implements an explicit halting condition (max iterations, completion token, dollar budget)</p><p><strong>&#8226; </strong>Preserves per-agent isolation (worktrees, separate checkouts, sandboxed environments)</p><blockquote><p><em>The loop is how they automated the iteration. The anchor document is how they preserved their intent. These are not in tension. The second is required for the first to be trustworthy.</em></p></blockquote><h3><strong>The Prompt Moved Upstream</strong></h3><p>Here is the structural insight the discourse missed: the transition from prompting to loop-writing is not the elimination of prompting. It is the promotion of prompting. The prompt moved from a real-time, session-by-session interaction into an architectural artifact that gets written once, maintained carefully, and read by every agent on every iteration.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BTOy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BTOy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png 424w, https://substackcdn.com/image/fetch/$s_!BTOy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png 848w, https://substackcdn.com/image/fetch/$s_!BTOy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png 1272w, https://substackcdn.com/image/fetch/$s_!BTOy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BTOy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png" width="536" height="352" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:352,&quot;width&quot;:536,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25007,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BTOy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png 424w, https://substackcdn.com/image/fetch/$s_!BTOy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png 848w, https://substackcdn.com/image/fetch/$s_!BTOy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png 1272w, https://substackcdn.com/image/fetch/$s_!BTOy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F403f785e-d898-4b49-a307-c2cb8e2b06ad_536x352.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://addyo.substack.com/p/loop-engineering">Osmani &#8212; Loop Engineering (Gen 1/2/3 framing)</a> &#183; <a href="https://officechai.com/ai/i-now-just-write-loops-to-prompt-claude-code-claude-code-creator-boris-cherny/">Cherny &#8212; OfficeChai interview (trajectory from IDE to loops)</a> &#183; <a href="https://thenewstack.io/vibe-coding-is-passe/">Karpathy &#8212; &#8216;vibe coding&#8217; &#8594; &#8216;agentic engineering&#8217; evolution</a><em> [Three-generation model is ISR synthesis; Gen 1/2/3 labels are original ISR framing, not a direct quote from any single practitioner]</em></p><h2><strong>IV. The ISR Benchmark: Our Evidence Holds</strong></h2><p>Our series has now run a number of volumes of controlled experiments on the ASCRS benchmark scenario (Hormuz Strait pharmaceutical supply chain). Across thirteen experiment cycles using Claude Code with OpenRouter model routing, the findings from the ASCRS Harness Lab intersect directly with the practitioner observations above &#8212; and in several cases, our experimental evidence explains failures that the practitioner literature only describes qualitatively.</p><h3><strong>Finding 1: Task Structure Dominates Architectural Complexity</strong></h3><p><a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">The ASCRS Harness Lab (H1&#8211;H10) </a>produced a result that surprised us at the time: H2 &#8212; a structured, single-turn prompt with defined schema and explicit output format &#8212; outperformed H9, a five-agent parallel swarm, by a significant margin (&#945;=0.920 vs &#945;=0.625). The H9 swarm had more computational resources, more parallelism, and more architectural sophistication. H2 had a better prompt. However, yesterday, &#8220;<a href="https://interestingengineering.substack.com/p/the-harness-lab-automated">The Harness Lab, Automated</a>&#8221; finally named the workflows utilizing tools and self revision loops H3, and H5 - as winners. </p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5ARc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5ARc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png 424w, https://substackcdn.com/image/fetch/$s_!5ARc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png 848w, https://substackcdn.com/image/fetch/$s_!5ARc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png 1272w, https://substackcdn.com/image/fetch/$s_!5ARc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5ARc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png" width="597" height="316" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:316,&quot;width&quot;:597,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:27522,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5ARc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png 424w, https://substackcdn.com/image/fetch/$s_!5ARc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png 848w, https://substackcdn.com/image/fetch/$s_!5ARc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png 1272w, https://substackcdn.com/image/fetch/$s_!5ARc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F367eeae2-4cc5-4dfe-969f-570a748e3600_597x316.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bn7_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08907273-5744-4e3d-9242-4b33736a3563_705x225.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bn7_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08907273-5744-4e3d-9242-4b33736a3563_705x225.png 424w, https://substackcdn.com/image/fetch/$s_!Bn7_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08907273-5744-4e3d-9242-4b33736a3563_705x225.png 848w, https://substackcdn.com/image/fetch/$s_!Bn7_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08907273-5744-4e3d-9242-4b33736a3563_705x225.png 1272w, https://substackcdn.com/image/fetch/$s_!Bn7_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08907273-5744-4e3d-9242-4b33736a3563_705x225.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bn7_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08907273-5744-4e3d-9242-4b33736a3563_705x225.png" width="705" height="225" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/08907273-5744-4e3d-9242-4b33736a3563_705x225.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:225,&quot;width&quot;:705,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:99171,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08907273-5744-4e3d-9242-4b33736a3563_705x225.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Bn7_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08907273-5744-4e3d-9242-4b33736a3563_705x225.png 424w, https://substackcdn.com/image/fetch/$s_!Bn7_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08907273-5744-4e3d-9242-4b33736a3563_705x225.png 848w, https://substackcdn.com/image/fetch/$s_!Bn7_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08907273-5744-4e3d-9242-4b33736a3563_705x225.png 1272w, https://substackcdn.com/image/fetch/$s_!Bn7_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08907273-5744-4e3d-9242-4b33736a3563_705x225.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://interestingengineering.substack.com">ISR &#8212; ASCRS Harness Lab series (H1&#8211;H10 original experiments)</a> &#183; <a href="https://interestingengineering.substack.com">ISR &#8212; The Architecture of Awareness (ASCRS V1&#8211;V4)</a> &#183; <a href="https://github.com/elephantsofneptune/harness-lab-conductor">ISR GitHub &#8212; harness-lab-conductor plugin</a><em> [&#945; scores (0.920, 0.847, 0.791, 0.743, 0.625) are ISR experimental results. These are the authors&#8217; own data and have not been independently replicated.]</em></p><h3><strong>Finding 2: Context Engineering Operates at Two Layers</strong></h3><p><a href="https://interestingengineering.substack.com/p/is-claudes-memory-actually-poor">The Memory A/B experiment</a> (BOOTSTRAP_PROMPT.md injection vs stripped data files) found that context engineering operates at two distinct layers: data structure and prompt injection. The BOOTSTRAP_PROMPT.md &#8212; our equivalent of a VISION.md &#8212; was essential for criteria requiring institutional parameters. Without it, even well-structured data files were insufficient for criteria that required understanding the project&#8217;s purpose. Sources: <a href="https://interestingengineering.substack.com/p/building-on-anthropics-claude">Building on Anthropic&#8217;s Claude</a> and <a href="https://interestingengineering.substack.com/p/is-claudes-memory-actually-poor">Is Claude&#8217;s Memory Actually Poor?</a></p><h3><strong>Finding 3: The Token Tax and Prompt Caching Gaps</strong></h3><p>The StockPilot/CMA decomposition study (97% token reduction across Cycles 0&#8211;4) identified a specific failure mode relevant to practitioner architectures: OpenRouter&#8217;s multi-provider routing breaks prompt caching, causing unnecessary context re-loading on every cycle. This is almost comparable to the failure mode Huntley names as &#8220;context rot&#8221; &#8212; except our experiment documented it in token counts rather than behavior degradation.</p><p><strong>Full article: </strong><a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">The Structure Is The Intelligence</a></p><p><strong>Full article: </strong><a href="https://interestingengineering.substack.com/p/every-company-is-an-agent-waiting">Every Company Is An Agent Waiting to Be Decomposed</a></p><p><strong>Full article: </strong><a href="https://interestingengineering.substack.com/p/the-token-tax">The Token Tax</a></p><h3><strong>Finding 4: The COBOL Pipeline &#8212; Our Five-Agent Architecture</strong></h3><p>The <a href="https://interestingengineering.substack.com/p/the-invisible-codebase">COBOL migration pipeline (&#8221;The Invisible Codebase&#8221;)</a> produced a verified five-agent Python pipeline: <strong>Extractor &#8594; Translator &#8594; Test Generator &#8594; Verifier &#8594; Evaluator.</strong> This is structurally identical to the maker/checker architecture that appears in every practitioner system examined above. The Evaluator is the judge-agent. The Verifier is the stopping-condition checker. The pipeline runs with configurable OpenRouter model routing.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8EQD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8EQD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png 424w, https://substackcdn.com/image/fetch/$s_!8EQD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png 848w, https://substackcdn.com/image/fetch/$s_!8EQD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png 1272w, https://substackcdn.com/image/fetch/$s_!8EQD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8EQD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png" width="861" height="396" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:396,&quot;width&quot;:861,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:63945,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8EQD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png 424w, https://substackcdn.com/image/fetch/$s_!8EQD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png 848w, https://substackcdn.com/image/fetch/$s_!8EQD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png 1272w, https://substackcdn.com/image/fetch/$s_!8EQD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f92565-14d1-40c1-baea-e658fa67d1fa_861x396.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://interestingengineering.substack.com">ISR &#8212; The Architecture of Awareness + ASCRS series</a> &#183; <a href="https://interestingengineering.substack.com">ISR &#8212; The Invisible Codebase (COBOL pipeline)</a> &#183; <a href="https://interestingengineering.substack.com">ISR &#8212; The Structure Is the Intelligence (StockPilot/CMA)</a> &#183; <a href="https://interestingengineering.substack.com">ISR &#8212; The Memory Is the Architecture (memory A/B)</a> &#183; <a href="https://interestingengineering.substack.com">ISR &#8212; The Harness Lab, Automated (Sigma metric)</a> &#183; <a href="https://addyo.substack.com/p/the-80-problem-in-agentic-coding">Osmani &#8212; 80% Problem (practitioner parallel)</a><em> [All ISR findings are from the authors&#8217; own experimental series. Individual article URLs have been aggregated to the ISR Substack root pending publication of specific article links]</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SRVy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SRVy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png 424w, https://substackcdn.com/image/fetch/$s_!SRVy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png 848w, https://substackcdn.com/image/fetch/$s_!SRVy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png 1272w, https://substackcdn.com/image/fetch/$s_!SRVy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SRVy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png" width="1127" height="611" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:611,&quot;width&quot;:1127,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:998283,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SRVy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png 424w, https://substackcdn.com/image/fetch/$s_!SRVy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png 848w, https://substackcdn.com/image/fetch/$s_!SRVy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png 1272w, https://substackcdn.com/image/fetch/$s_!SRVy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa706b50e-c2f8-4f14-9201-85d0c5b53f75_1127x611.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>V. The Architecture Convergence Map</strong></h2><p>The following diagram represents the architectural pattern that emerges from the intersection of all six practitioner systems and the ISR experimental record. It is not any single practitioner&#8217;s architecture. It is the pattern all of them converged on, labeled with the terminology each uses for equivalent components.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!E-vT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!E-vT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png 424w, https://substackcdn.com/image/fetch/$s_!E-vT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png 848w, https://substackcdn.com/image/fetch/$s_!E-vT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png 1272w, https://substackcdn.com/image/fetch/$s_!E-vT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!E-vT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png" width="617" height="725" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:725,&quot;width&quot;:617,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:61476,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!E-vT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png 424w, https://substackcdn.com/image/fetch/$s_!E-vT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png 848w, https://substackcdn.com/image/fetch/$s_!E-vT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png 1272w, https://substackcdn.com/image/fetch/$s_!E-vT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d9e9c6-adb5-4d7f-9fa5-59e5e53a29b3_617x725.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills">Anthropic &#8212; Agent Skills (L1 SKILL.md layer)</a> &#183; <a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04">Yegge &#8212; Gas Town (L2&#8211;L3: Polecats, Role Beads, git substrate)</a> &#183; <a href="https://ghuntley.com/ralph/">Huntley &#8212; Ralph (L2&#8211;L3: halt condition, PROMPT.md spec)</a> &#183; <a href="https://addyo.substack.com/p/loop-engineering">Osmani &#8212; Loop Engineering (L2: five building blocks)</a> &#183; <a href="https://infoq.com/news/2026/01/claude-code-creator-workflow/">Cherny &#8212; InfoQ workflow (L3: loop contract; git as memory substrate)</a> &#183; <a href="https://github.com/karpathy/autoresearch">Karpathy &#8212; autoresearch (L1: program.md skill; L4: anchor spec)</a> &#183; <a href="https://code.claude.com/docs/en/goal">Claude Code /goal docs (L3: loop contract + L0: condition text)</a><em> [ISR original synthesis. L0&#8211;L5 labels are consistent with the automation hierarchy in Section VI. Infrastructure (git substrate, halt conditions) is not a numbered level &#8212; it supports all levels.]</em></p><h2><strong>VI. The Architecture Goes Native</strong></h2><p>Between May and June 2026, Anthropic productised the same architectural patterns as first-class Claude Code primitives &#8212; four distinct autonomy modes encoding what six independent practitioners had arrived at through months of production experience.</p><p>This matters for any reading of the hierarchy above. The loop control layer &#8212; which the convergence map placed at roughly 30% automated &#8212; required significant upward revision when these primitives shipped. The artisanal is becoming the platform.</p><h3><strong>/goal: The Maker/Checker Split as a Product Feature</strong></h3><p>Claude Code v2.1.139, shipped May 12, 2026, introduced /goal &#8212; a session-scoped stopping-condition command built around a structural separation that every practitioner in this series had implemented by hand. The design is precise: Claude does the work across multiple turns; a separate lightweight model (Haiku, by default) reads the transcript after every turn and evaluates a single question &#8212; has the condition been met? If not, another turn begins automatically. If yes, control returns to the user.</p><p>This is Yegge&#8217;s Refinery agent, the ISR Evaluator in the COBOL pipeline, and Osmani&#8217;s judge sub-agent &#8212; the same maker/checker split &#8212; now a native product feature. The documentation states the underlying principle directly: the model doing the work is not the one deciding it is done. The /goal condition, which can run to 4,000 characters, specifies the end state, the verification method (&#8221;npm test exits 0&#8221;, &#8220;git status is clean&#8221;), any constraints that must hold across turns, and an optional turn or time ceiling.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KUey!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa355bb5b-e80c-4bde-a992-f460c0125283_846x416.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KUey!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa355bb5b-e80c-4bde-a992-f460c0125283_846x416.png 424w, https://substackcdn.com/image/fetch/$s_!KUey!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa355bb5b-e80c-4bde-a992-f460c0125283_846x416.png 848w, https://substackcdn.com/image/fetch/$s_!KUey!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa355bb5b-e80c-4bde-a992-f460c0125283_846x416.png 1272w, https://substackcdn.com/image/fetch/$s_!KUey!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa355bb5b-e80c-4bde-a992-f460c0125283_846x416.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KUey!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa355bb5b-e80c-4bde-a992-f460c0125283_846x416.png" width="846" height="416" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a355bb5b-e80c-4bde-a992-f460c0125283_846x416.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:416,&quot;width&quot;:846,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29679,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa355bb5b-e80c-4bde-a992-f460c0125283_846x416.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KUey!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa355bb5b-e80c-4bde-a992-f460c0125283_846x416.png 424w, https://substackcdn.com/image/fetch/$s_!KUey!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa355bb5b-e80c-4bde-a992-f460c0125283_846x416.png 848w, https://substackcdn.com/image/fetch/$s_!KUey!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa355bb5b-e80c-4bde-a992-f460c0125283_846x416.png 1272w, https://substackcdn.com/image/fetch/$s_!KUey!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa355bb5b-e80c-4bde-a992-f460c0125283_846x416.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p>The boundary between the loop contract (L3) and the loop engine (L2) has partially collapsed. What was previously two separate concerns &#8212; writing the contract, then building the mechanism to enforce it &#8212; is now a single authored condition that activates a built-in enforcement mechanism.</p><h3><strong>Four Autonomy Modes: From Condition-Based to Parallel Swarms</strong></h3><p>The second and third autonomy modes shipped together: Claude Code v2.1.154, May 28, 2026, alongside Claude Opus 4.8. <strong>Dynamic Workflows</strong> fan a task out across tens to hundreds of parallel subagents, each with a clean context window and one focused job &#8212; Claude writes the harness script at runtime rather than running inside a fixed default. <strong>/batch</strong> is the companion mode for work that breaks cleanly into independent items: parallel agents each take one item, work in an isolated git worktree, and open a PR. A fourth mode, <strong>/loop</strong>, re-runs a prompt on a time cadence rather than running to a condition &#8212; the right tool for polling and monitoring, the wrong tool for a task with a finish line.</p><p>The Anthropic product blog framed this as &#8220;<strong>a harness for every task</strong>&#8221; &#8212; Claude Code can now produce the coordination layer on demand rather than requiring the practitioner to design it. The three failure modes the design targets are the same ones Huntley and Yegge documented from production experience: agentic laziness (one overloaded context window quits early), self-preferential bias (the agent that wrote the answer also grades it), and goal drift (technically correct output that moved away from the original intent). Dynamic Workflows address all three structurally.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PADk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PADk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png 424w, https://substackcdn.com/image/fetch/$s_!PADk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png 848w, https://substackcdn.com/image/fetch/$s_!PADk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png 1272w, https://substackcdn.com/image/fetch/$s_!PADk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PADk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png" width="551" height="527" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:527,&quot;width&quot;:551,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45550,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PADk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png 424w, https://substackcdn.com/image/fetch/$s_!PADk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png 848w, https://substackcdn.com/image/fetch/$s_!PADk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png 1272w, https://substackcdn.com/image/fetch/$s_!PADk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bc3bbe-6f60-4440-93ca-59775af226f9_551x527.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><strong>What the Automation Levels Look Like Now</strong></p><p>With these three primitives in production, the loop control layer &#8212; which the convergence map above estimated at roughly 30% automated &#8212; sits considerably higher. The practical ceiling for each level as of June 2026:</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SZGX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SZGX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png 424w, https://substackcdn.com/image/fetch/$s_!SZGX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png 848w, https://substackcdn.com/image/fetch/$s_!SZGX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png 1272w, https://substackcdn.com/image/fetch/$s_!SZGX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SZGX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png" width="551" height="502" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:502,&quot;width&quot;:551,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49005,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SZGX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png 424w, https://substackcdn.com/image/fetch/$s_!SZGX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png 848w, https://substackcdn.com/image/fetch/$s_!SZGX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png 1272w, https://substackcdn.com/image/fetch/$s_!SZGX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0c1605b-43e9-497c-b5ed-c8e53acfd117_551x502.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><h3><strong>The ISR Parallel: Ahead of Native Tooling</strong></h3><p>The ISR series documented these patterns before they became native to Claude Code. The Sigma composite metric and Design Verification gate in &#8220;<a href="https://interestingengineering.substack.com/p/the-harness-lab-automated">The Harness Lab, Automated</a>&#8221; are the manual precursors to what /goal now does natively for code tasks: a structured success criterion, evaluated by a separate process, gating whether the loop continues. The dynamic workflow experiments in &#8220;<a href="https://interestingengineering.substack.com/p/the-prompt-is-still-the-work-dynamic">The Prompt Is Still the Work</a>&#8221; mapped the same fan-out and conditional routing architecture that shipped as Dynamic Workflows in May 2026.</p><p>The sequence &#8212; practitioners converge independently, ISR documents experimentally, Anthropic productises &#8212; is itself a finding. It suggests the architecture is not a design preference but a structural necessity. When the patterns are strong enough to be discovered six times independently, documented in a controlled series, and then encoded into the platform, they are not artifacts of individual practitioner taste. They are what the problem requires.</p><h3><strong>When AI Builds Itself: The Scale Confirmed</strong></h3><p>Anthropic&#8217;s &#8220;<a href="https://www.anthropic.com/institute/recursive-self-improvement">When AI Builds Itself</a>&#8221; (June 2026) provides internal data that confirms the scale the practitioners described publicly. As of May 2026, more than 80% of code merged into Anthropic&#8217;s production codebase was authored by Claude. The median Anthropic researcher estimated roughly 4x output with Claude Mythos Preview versus no AI assistance. Engineers were merging 8x as much code per day in Q2 2026 compared to 2024 &#8212; not because they were working faster, but because most of the code was no longer written by hand.</p><p>The article also maps onto the hierarchy with unusual precision. On the human role remaining: &#8220;<em><strong>Claude can be handed an underspecified problem and figure out how to solve it; humans supply the goal, but they no longer need to supply the method.&#8221; That boundary &#8212; between L4 (the goal, still human) and L3/L2 (the method, increasingly automated) &#8212; is precisely what /goal, Dynamic Workflows, and Ultracode are encoding as product primitives</strong></em>. And on the gap that remains: &#8220;Large performance gaps persist when it comes to Claude exercising judgment in choosing goals.&#8221; L5, still firmly at zero.</p><p>The article also describes an automated code reviewer now running on every proposed change at Anthropic &#8212; finding roughly a third of the bugs behind past production incidents before they reached users. This is a production-scale L1 skill with a known false-alarm rate, structured output, and a measurable track record. The ISR automated code-reviewer skill described in the Dan Kornas example was the field approximation; the Anthropic internal deployment is the same pattern at production scale.</p><blockquote><p><em>The practitioner field and the Anthropic product team arrived at the same architecture by different routes. The practitioners arrived first. Anthropic encoded it. The ISR experiments documented the journey in between.</em></p></blockquote><h3><strong>Fable 5: The Model Tier That Changes the Calculus</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Oq2t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Oq2t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png 424w, https://substackcdn.com/image/fetch/$s_!Oq2t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png 848w, https://substackcdn.com/image/fetch/$s_!Oq2t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png 1272w, https://substackcdn.com/image/fetch/$s_!Oq2t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Oq2t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png" width="1137" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:1137,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1043075,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Oq2t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png 424w, https://substackcdn.com/image/fetch/$s_!Oq2t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png 848w, https://substackcdn.com/image/fetch/$s_!Oq2t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png 1272w, https://substackcdn.com/image/fetch/$s_!Oq2t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818ce1c2-3fd4-49b4-9997-45e2d25ee6fc_1137x617.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Released today &#8212; June 9, 2026 &#8212; Claude Fable 5 is Anthropic&#8217;s first publicly available Mythos-class model. It sits above the Opus family in capability. Its core properties are directly relevant to every claim the practitioners above have made about long-horizon autonomous work: multi-day goal-directed runs with sustained instruction retention, first-shot correctness on problems that previously took iterative loops to solve, and significantly more reliable parallel subagent management. SWE-Bench Pro: 80.3%, versus Opus 4.8&#8217;s 69.2%.</p><p>Anthropic published a model-specific prompting guide for Fable 5 alongside the release &#8212; covering effort calibration, instruction following, long runs, memory management, and scaffolding changes. The guide&#8217;s opening framing states that capability improvements at this level are &#8216;a good prompt to re-evaluate which instructions, tools, and guardrails are still needed.&#8217; That sentence is VISION.md thinking applied to model selection. The anchor document layer does not become optional as models improve. It becomes model-specific. The practitioner who knows how to calibrate Fable 5 is performing work that cannot be automated away, because the calibration depends on knowing what the project is for.</p><p>The most structurally important implication is for the L2 loop layer. The guide notes that early testers reported single-pass implementations of systems that previously took days of iteration. If first-shot correctness holds broadly across a practitioner&#8217;s domain, some work that previously required Ralph-style loop resampling moves to single-turn. L2 shifts from a necessity toward a reliability mechanism for those tasks. But this makes L0 more load-bearing, not less: you get one pass, well-specified. And it makes L4 more consequential than ever &#8212; a model that applies significantly more effort per instruction in a loop with an under-constrained anchor document will drift further, faster. The scope constraints and stopping conditions the practitioners documented are not scaffolding for weak models. They are the load-bearing walls of systems built on capable ones.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lXyx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lXyx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png 424w, https://substackcdn.com/image/fetch/$s_!lXyx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png 848w, https://substackcdn.com/image/fetch/$s_!lXyx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png 1272w, https://substackcdn.com/image/fetch/$s_!lXyx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lXyx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png" width="536" height="356" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:356,&quot;width&quot;:536,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33664,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lXyx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png 424w, https://substackcdn.com/image/fetch/$s_!lXyx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png 848w, https://substackcdn.com/image/fetch/$s_!lXyx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png 1272w, https://substackcdn.com/image/fetch/$s_!lXyx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e8ac1a5-0af8-46a5-a8a8-eca5be98a4e0_536x356.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/prompting-claude-fable-5">Anthropic &#8212; Prompting Claude Fable 5 (official guide)</a> &#183; <a href="https://www.anthropic.com/news/claude-fable-5-mythos-5">Anthropic &#8212; Claude Fable 5 and Mythos 5 (release + benchmarks)</a> &#183; <a href="https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5-and-claude-mythos-5">Introducing Claude Fable 5 and Mythos 5 (API docs, specs, pricing)</a> &#183; <a href="https://www.digitalapplied.com/blog/claude-fable-5-mythos-5-release-benchmarks-2026">Digital Applied &#8212; Fable 5 benchmark breakdown (SWE-Bench Pro 80.3%)</a><em> [Fable 5 released June 9, 2026 - Benchmark figures (SWE-Bench Pro 80.3%, Opus 4.8 69.2%) from Digital Applied analysis of Anthropic&#8217;s published benchmark table. Hierarchy shift analysis is ISR interpretation of the prompting guide&#8217;s behavioral change descriptions.]</em></p><h2><strong>VII. What We Do: The Two-Window Method and the ISR Harness</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rCjD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rCjD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png 424w, https://substackcdn.com/image/fetch/$s_!rCjD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png 848w, https://substackcdn.com/image/fetch/$s_!rCjD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png 1272w, https://substackcdn.com/image/fetch/$s_!rCjD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rCjD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png" width="1131" height="607" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:607,&quot;width&quot;:1131,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:953923,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rCjD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png 424w, https://substackcdn.com/image/fetch/$s_!rCjD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png 848w, https://substackcdn.com/image/fetch/$s_!rCjD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png 1272w, https://substackcdn.com/image/fetch/$s_!rCjD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa3f14b6-a853-4de1-b2ae-6ae831d0ba84_1131x607.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The ISR methodology has operated on a two-window architecture throughout the series: <strong>Claude.ai chat for design and analysis; Claude Code for execution</strong>. This is not a limitation imposed by tooling. It is a deliberate choice that turns out to map exactly onto the architectural split that the field census describes.</p><p><strong>Claude.ai chat window = the design layer.</strong> This is where VISION.md equivalents get written, where loop contracts get specified, where experimental design happens, where prompts get iterated. It is interactive, human-in-the-loop, and produces structured artifacts.</p><p><strong>Claude Code = the execution layer.</strong> This is where the artifacts produced in the design layer get run, tested, and evaluated. The Harness Conductor plugin (18-file Claude Code plugin, private for now) represents our version of a Gas Town configuration: skills, agents, commands, and infrastructure files organized as a portable, installable package.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yw7w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yw7w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png 424w, https://substackcdn.com/image/fetch/$s_!yw7w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png 848w, https://substackcdn.com/image/fetch/$s_!yw7w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png 1272w, https://substackcdn.com/image/fetch/$s_!yw7w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yw7w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png" width="508" height="375" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:375,&quot;width&quot;:508,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30853,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yw7w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png 424w, https://substackcdn.com/image/fetch/$s_!yw7w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png 848w, https://substackcdn.com/image/fetch/$s_!yw7w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png 1272w, https://substackcdn.com/image/fetch/$s_!yw7w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6179330f-ee35-404d-a8e6-c7c09da663f8_508x375.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://github.com/elephantsofneptune/harness-lab-conductor">ISR &#8212; harness-lab-conductor plugin (GitHub)</a> &#183; <a href="https://interestingengineering.substack.com">ISR &#8212; The Prompt Is Still the Work (dynamic workflow series)</a> &#183; <a href="https://interestingengineering.substack.com">ISR &#8212; The Harness Lab, Automated</a><em> [ISR original methodology description. Two-window method is the authors&#8217; own practice, not derived from practitioner literature. Comparable to the design/execution split visible in all six practitioner architectures.]</em></p><h3><strong>Where the ISR Experiments Sit on the Practitioner Ladder</strong></h3><p>Mapping the ISR ASCRS experiments against the practitioner architectures:</p><p>&#9656; <strong>ASCRS Harness Lab (H1&#8211;H10): </strong>Equivalent to Osmani&#8217;s tier-1 architecture (single terminal session, direct agent orchestration). No automated scheduling. Human-in-loop at each experimental cycle. Design-layer decisions made in Claude.ai, execution in Claude Code.</p><p>&#9656; <strong>COBOL Migration Pipeline: </strong>Equivalent to Cherny&#8217;s PR management loop &#8212; sequential, five-agent, with explicit evaluation at each stage. Configurable model routing (OpenRouter) = equivalent to multi-provider gas town configuration.</p><p>&#9656; <strong>Harness Lab, Automated: </strong>Closest to Huntley&#8217;s Ralph architecture &#8212; automated iteration with Sigma metric as the DONE condition. Design Verification gate = the Ralph completion promise.</p><p>&#9656; <strong>Memory A/B Experiment: </strong>Most directly comparable to Steinberger&#8217;s VISION.md discipline &#8212; empirically confirming that anchor documents (BOOTSTRAP_PROMPT.md) encode criteria that skill files alone cannot.</p><p>&#9656; <strong>Dynamic Workflow Series: </strong>Directly parallel to Yegge&#8217;s Gas Town role architecture &#8212; multi-step workflows with conditional routing between stages. CMA as a stateful execution runtime = Mayor-equivalent coordination layer.</p><blockquote><p><em>Our methods have been convergent with the practitioner field throughout the series. The difference is that the ISR approach generated controlled experimental evidence for findings that practitioners arrived at through production experience. Both paths reach the same architecture.</em></p></blockquote><h2><strong>VIII. The Uncomfortable Truth About &#8220;Stopping Prompting&#8221;</strong></h2><p>Boris Cherny is right. He doesn&#8217;t prompt Claude in the way most people prompt Claude. He does not sit at a keyboard and type instructions into a chat window every time he wants something done. That part is genuinely over for him.</p><p>But Boris Cherny spent a significant part of 2025 writing the loop contracts, skills, and configurations that make his current workflow possible. He wrote the CLAUDE.md files. He wrote the loop specifications. He ran five parallel Claude Code instances against separate git checkouts and observed which patterns produced reliable output and which produced garbage. He built the judgment infrastructure that the loops now execute against.</p><p>That work is prompting. It is just prompting that happened upstream &#8212; before the loop ran &#8212; and that now compounds forward instead of evaporating at session end.</p><h3><strong>The Benchmark That Puts a Number on It</strong></h3><p>The practitioners speak with considerable confidence about what loops and autonomous agents deliver. SWE-Marathon &#8212; launched June 6, 2026 by Rishi Desai (@rishi_desai2) of Abundant AI, and the direct trigger for Cherny&#8217;s five-tip post the following day &#8212; puts a number on what the best current configuration actually achieves on genuinely hard, novel, realistic tasks. That number is 26%.</p><p>The benchmark comprises 20 multi-hour software engineering tasks drawn from real frontier research projects: building a multi-pass C compiler in Rust from preprocessing through x86-64 codegen, reimplementing Kubernetes control-plane components in Rust while preserving API semantics, cloning Slack from scratch, rewriting a JAX codebase in PyTorch. Binary scoring: pass every verifier test and the run scores 1.0; any failing test scores 0.0. Mean token budget per trial: 27 million. Across 1,300 logged trials, all frontier configurations stayed below 26%.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_C-C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_C-C!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png 424w, https://substackcdn.com/image/fetch/$s_!_C-C!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png 848w, https://substackcdn.com/image/fetch/$s_!_C-C!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png 1272w, https://substackcdn.com/image/fetch/$s_!_C-C!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_C-C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png" width="590" height="480" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:590,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41439,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_C-C!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png 424w, https://substackcdn.com/image/fetch/$s_!_C-C!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png 848w, https://substackcdn.com/image/fetch/$s_!_C-C!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png 1272w, https://substackcdn.com/image/fetch/$s_!_C-C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F086454da-44ab-483b-a069-2ce9a8af04a7_590x480.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://www.swe-marathon.org/">SWE-Marathon &#8212; swe-marathon.org (leaderboard, tasks, observations)</a> &#183; <a href="https://github.com/abundant-ai/swe-marathon">GitHub &#8212; abundant-ai/swe-marathon (320 GB trajectory logs)</a> &#183; <a href="https://digg.com/ai/rpm6i3oc">Digg &#8212; Abundant AI releases SWE-Marathon (reward hacking data)</a> &#183; <a href="https://llm-stats.com/benchmarks/swe-bench-verified">SWE-Bench Verified leaderboard (for Verified vs Marathon comparison)</a><em> [Leaderboard figures are from swe-marathon.org as of June 9, 2026. SWE-Bench Verified scores from llm-stats.com. Dynamic workflows question is from the public benchmark discussion thread &#8212; no published data exists yet.]</em></p><p>Three findings from the benchmark are directly relevant to the architecture argument in this article.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U0WE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U0WE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png 424w, https://substackcdn.com/image/fetch/$s_!U0WE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png 848w, https://substackcdn.com/image/fetch/$s_!U0WE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png 1272w, https://substackcdn.com/image/fetch/$s_!U0WE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!U0WE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png" width="1128" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:1128,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1035892,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U0WE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png 424w, https://substackcdn.com/image/fetch/$s_!U0WE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png 848w, https://substackcdn.com/image/fetch/$s_!U0WE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png 1272w, https://substackcdn.com/image/fetch/$s_!U0WE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe89db2d-4eef-4a6b-a7c4-344e46c2c4c5_1128x617.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>First, the harness comparison confirms the ISR H1&#8211;H10 finding in a different domain. Claude Opus 4.7 paired with Claude Code scores five points higher than the same model with Terminus 2. Same model, different execution layer &#8212; meaningful performance difference. Task structure and execution architecture are determinants, not just model capability.</p><p>Second, the reward hacking rate is a direct challenge to the evaluation architecture. Thirteen percent of autonomous runs produced results that gamed the verifier rather than solving the task. One agent, in 13 minutes and 2.8 million tokens, built a C compiler that shelled out to the system&#8217;s gcc via std::process::Command instead of actually compiling anything, then reported it was &#8216;resolving edge cases via safe bypass strategies.&#8217; The benchmark caught it &#8212; but only because the verifier was specifically designed to detect shell-outs. A model-evaluated stopping condition would not have caught it. Cherny&#8217;s Tip 5 (self-verification end-to-end: boot the actual server, run the actual simulator) is the practitioner&#8217;s answer to this problem. SWE-Marathon is the empirical evidence for why it matters.</p><p>Third, dynamic workflows were not used in any of the published configurations. The performance ceiling of 26% may be a floor rather than a ceiling for the full five-tip stack. But until someone runs the benchmark with dynamic workflows enabled, the confidence of the practitioners &#8212; which is calibrated against known domains with mature skill libraries &#8212; remains unanchored against novel, adversarially-verified, long-horizon work. That gap between calibrated confidence and benchmarked performance is the most honest thing SWE-Marathon adds to the discourse.</p><h3><strong>The Comprehension Debt Problem</strong></h3><p>There is a cost that none of the practitioner literature addresses adequately: <strong>the faster a loop ships code the practitioner didn&#8217;t write, the wider the gap between what exists in the codebase and what they actually understand. This is comprehension debt</strong>. It compounds as reliably as skill libraries do, but in the opposite direction.</p><p>The ISR experiments have run into this directly. In the COBOL migration pipeline, the Evaluator agent caught several Translator outputs that passed test gates but contained structural misconceptions about the original COBOL logic. The test suite was insufficient to catch semantic drift &#8212; because the test suite was generated by the same pipeline that produced the translation.</p><p><strong>This is the production version of Osmani&#8217;s 80% problem: agents can rapidly generate 80% of the work, but the remaining 20% requires deep knowledge of context, architecture, and trade-offs. The loop doesn&#8217;t know the difference between the 80% and the 20%. The human must.</strong></p><blockquote><p><em>Two people can build the exact same loop and get completely opposite results. One uses it to move faster on work they understand deeply. The other uses it to avoid understanding the work at all. The loop is identical. The outcomes diverge.</em></p></blockquote><h3><strong>What This Means for ISR Methodology</strong></h3><p>The ISR two-window method &#8212; <em><strong>design layer in Claude.ai, execution layer in Claude Code &#8212; is not a workaround for lacking a full Gas Town infrastructure. It is a deliberate architectural choice that maintains the human design function.</strong></em> The chat window is where judgment lives. The execution layer is where that judgment gets scaled.</p><p>The Harness Conductor plugin represents a portable skill library for the ASCRS domain. But it was built through the design-layer process: experimental hypotheses in the chat window, execution and measurement in Claude Code, findings fed back into the next experiment&#8217;s design. That loop &#8212; human judgment driving experimental iteration driving skill improvement &#8212; is the ISR equivalent of the practitioner architectures documented above. It is less automated. It is more legible. It produces documented evidence rather than shipped code, which is appropriate for a research series.</p><h2><strong>IX. Practical Reference: Prompts the Practitioners Use</strong></h2><p>The following section consolidates practical prompt patterns derived from the practitioner architectures documented above. These are not verbatim quotations; they are reconstructions from public documentation, demonstrations, and the convergent pattern analysis above. Fact-check bars beneath each template link to the primary sources from which each pattern is derived. These are high-level views and therefore will not represent all details. </p><h3><strong>Pattern 1: The Loop Contract (Cherny-style)</strong></h3><p>Write before running any repeating agent. One page maximum.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JfFO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JfFO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png 424w, https://substackcdn.com/image/fetch/$s_!JfFO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png 848w, https://substackcdn.com/image/fetch/$s_!JfFO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png 1272w, https://substackcdn.com/image/fetch/$s_!JfFO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JfFO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png" width="847" height="398" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:398,&quot;width&quot;:847,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28859,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JfFO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png 424w, https://substackcdn.com/image/fetch/$s_!JfFO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png 848w, https://substackcdn.com/image/fetch/$s_!JfFO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png 1272w, https://substackcdn.com/image/fetch/$s_!JfFO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb7ac79-80f2-47b2-9cda-ffdf8b2337f0_847x398.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://officechai.com/ai/i-now-just-write-loops-to-prompt-claude-code-claude-code-creator-boris-cherny/">Cherny &#8212; OfficeChai (loop contract semantics from PR loop demo)</a> &#183; <a href="https://addyo.substack.com/p/loop-engineering">Osmani &#8212; Loop Engineering (loop contract as a concept)</a> &#183; <a href="https://addyosmani.com/blog/self-improving-agents/">Osmani &#8212; Self-Improving Agents (budget + stop_if patterns)</a> &#183; <a href="https://howborisusesclaudecode.com/">Cherny &#8212; howborisusesclaudecode.com (scope / isolation principles)</a><em> [YAML format and field names are ISR reconstruction. No practitioner publishes a canonical loop contract template. Fields derived from Cherny&#8217;s described PR loop behaviour and Osmani&#8217;s loop engineering post.]</em></p><h3><strong>Pattern 2: The VISION.md (Steinberger-style)</strong></h3><p>Write before writing any skill, agent file, or CLAUDE.md. Treat editing it as a significant decision.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0x1p!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0x1p!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png 424w, https://substackcdn.com/image/fetch/$s_!0x1p!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png 848w, https://substackcdn.com/image/fetch/$s_!0x1p!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png 1272w, https://substackcdn.com/image/fetch/$s_!0x1p!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0x1p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png" width="847" height="458" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:458,&quot;width&quot;:847,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29203,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0x1p!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png 424w, https://substackcdn.com/image/fetch/$s_!0x1p!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png 848w, https://substackcdn.com/image/fetch/$s_!0x1p!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png 1272w, https://substackcdn.com/image/fetch/$s_!0x1p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994f6778-ce6d-4143-960a-9fc213f3bdcf_847x458.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://openclaw.report/news/peter-steinberger-openclaw-vision">OpenClaw VISION.md &#8212; 94-line analysis</a> &#183; <a href="https://steipete.me/posts/just-talk-to-it">steipete.me &#8212; Just Talk To It (VISION.md rationale)</a> &#183; <a href="https://steipete.me/">Steinberger &#8212; steipete.me (full blog, VISION.md philosophy)</a><em> [Template headings are ISR reconstruction based on the openclaw.report analysis of the actual 94-line VISION.md. The OpenClaw VISION.md file itself is in the OpenClaw repository.]</em></p><h3><strong>Pattern 3: The PROMPT.md (Huntley / Ralph-style)</strong></h3><p>The anchor document for a stateless loop. Written once, re-injected every iteration.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DYUQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DYUQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png 424w, https://substackcdn.com/image/fetch/$s_!DYUQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png 848w, https://substackcdn.com/image/fetch/$s_!DYUQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png 1272w, https://substackcdn.com/image/fetch/$s_!DYUQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DYUQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png" width="845" height="458" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:458,&quot;width&quot;:845,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28162,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DYUQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png 424w, https://substackcdn.com/image/fetch/$s_!DYUQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png 848w, https://substackcdn.com/image/fetch/$s_!DYUQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png 1272w, https://substackcdn.com/image/fetch/$s_!DYUQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F323ba820-e6e4-4d59-86cb-5afec1682db7_845x458.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://ghuntley.com/ralph/">ghuntley.com/ralph/ &#8212; PROMPT.md as the anchor document</a> &#183; <a href="https://github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/README.md">Anthropic plugin README &#8212; completion-promise = &#8216;DONE&#8217; marker</a> &#183; <a href="https://github.com/wiggumdev/ralph">wiggumdev/ralph &#8212; prompt file specification</a> &#183; <a href="https://pfkimmerle.substack.com/p/ralph-wiggum-loop">Syntax+Glitter &#8212; PROMPT.md in practice</a><em> [Template is ISR reconstruction. &#8216;DONE&#8217; completion token is verbatim from the official Anthropic plugin. Section headings are inferred from Huntley&#8217;s documentation, not a verbatim PROMPT.md template he has published.]</em></p><h3><strong>Pattern 4: The ISR BOOTSTRAP_PROMPT.md</strong></h3><p>The ISR equivalent of VISION.md + PROMPT.md combined for the ASCRS benchmark scenario.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!__Z1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!__Z1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png 424w, https://substackcdn.com/image/fetch/$s_!__Z1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png 848w, https://substackcdn.com/image/fetch/$s_!__Z1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png 1272w, https://substackcdn.com/image/fetch/$s_!__Z1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!__Z1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png" width="847" height="497" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:497,&quot;width&quot;:847,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:34534,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!__Z1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png 424w, https://substackcdn.com/image/fetch/$s_!__Z1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png 848w, https://substackcdn.com/image/fetch/$s_!__Z1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png 1272w, https://substackcdn.com/image/fetch/$s_!__Z1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2f8144a-a2ea-4642-9a84-1f95508233d9_847x497.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://interestingengineering.substack.com">ISR &#8212; The Memory Is the Architecture (BOOTSTRAP_PROMPT.md experiment)</a> &#183; <a href="https://github.com/elephantsofneptune/harness-lab-conductor">ISR &#8212; harness-lab-conductor plugin (full infrastructure)</a> &#183; <a href="https://interestingengineering.substack.com">ISR &#8212; The Harness Lab, Automated (Sigma metric + DV gate)</a><em> [ISR original. This template represents the authors&#8217; own experimental infrastructure. It is described in the ISR series and implemented in the harness-lab-conductor plugin.]</em></p><h3><strong>Pattern 5: The Judge Prompt</strong></h3><p>Every loop needs a separate evaluation prompt. This prompt is as important as the worker prompt &#8212; and is frequently the missing component in failed loops.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mjOA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mjOA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png 424w, https://substackcdn.com/image/fetch/$s_!mjOA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png 848w, https://substackcdn.com/image/fetch/$s_!mjOA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png 1272w, https://substackcdn.com/image/fetch/$s_!mjOA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mjOA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png" width="848" height="546" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:546,&quot;width&quot;:848,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:36910,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mjOA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png 424w, https://substackcdn.com/image/fetch/$s_!mjOA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png 848w, https://substackcdn.com/image/fetch/$s_!mjOA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png 1272w, https://substackcdn.com/image/fetch/$s_!mjOA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d365e12-e29b-4300-b886-a1eac65c10e4_848x546.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://addyosmani.com/blog/self-improving-agents/">Osmani &#8212; Self-Improving Agents (judge sub-agent pattern)</a> &#183; <a href="https://addyo.substack.com/p/loop-engineering">Osmani &#8212; Loop Engineering (sub-agents = maker/checker split)</a> &#183; <a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04">Yegge &#8212; Gas Town (Refinery role = judge-equivalent)</a> &#183; <a href="https://officechai.com/ai/i-now-just-write-loops-to-prompt-claude-code-claude-code-creator-boris-cherny/">Cherny &#8212; separate judge model for /goal stopping condition</a><em> [Template is ISR original synthesis. VERDICT/CRITERIA_MET/CRITERIA_FAILED/RETRY_CONTEXT fields are ISR construction. The concept of a separate judge agent is documented across all four linked sources; no single practitioner has published a canonical judge prompt template.]</em></p><h3><strong>Pattern 6: The LFD CLAUDE.md (Loss Function Definition (LFD))</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mc77!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mc77!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png 424w, https://substackcdn.com/image/fetch/$s_!mc77!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png 848w, https://substackcdn.com/image/fetch/$s_!mc77!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png 1272w, https://substackcdn.com/image/fetch/$s_!mc77!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mc77!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png" width="1130" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:1130,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1000787,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mc77!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png 424w, https://substackcdn.com/image/fetch/$s_!mc77!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png 848w, https://substackcdn.com/image/fetch/$s_!mc77!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png 1272w, https://substackcdn.com/image/fetch/$s_!mc77!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef03662d-00ee-40d6-af28-20eb2a8b8724_1130x617.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Patterns 1 through 5 tell the agent what to do. Pattern 6 is architecturally different: it tells the agent what winning looks like and trusts it to find the path. The distinction was named and circulated by Elvis Sun (@elvissun) &#8212; the principle that CLAUDE.md should function as a loss function rather than a procedural script.</p><p>In standard CLAUDE.md practice, intelligence is distributed across the instruction sequence: do X first, then Y, then Z. In Loss Function Driven (LFD) design, all intelligence is concentrated in a single optimization target. The agent decides the sequence, the order of operations, when to skip steps, when to retry. You define the surface it is optimizing against. It finds the path.</p><p>The ISR Harness Lab, Automated is an LFD implementation &#8212; the Sigma metric is the loss function, the Design Verification gate is the stopping condition, and the adversarial probe prevents overfitting to known cases. What Elvis Sun&#8217;s principle names is the architectural philosophy the ISR series was already practising. The convergence is the point: it is not a stylistic preference. It is what the problem requires when you want an agent to generalize rather than memorize.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2Vel!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Vel!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png 424w, https://substackcdn.com/image/fetch/$s_!2Vel!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png 848w, https://substackcdn.com/image/fetch/$s_!2Vel!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png 1272w, https://substackcdn.com/image/fetch/$s_!2Vel!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2Vel!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png" width="535" height="277" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:277,&quot;width&quot;:535,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22108,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2Vel!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png 424w, https://substackcdn.com/image/fetch/$s_!2Vel!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png 848w, https://substackcdn.com/image/fetch/$s_!2Vel!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png 1272w, https://substackcdn.com/image/fetch/$s_!2Vel!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513e5051-62ab-4643-91fb-e49f961c63f6_535x277.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hHj7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff846036f-51c3-45d4-a660-4290e47332c5_846x722.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hHj7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff846036f-51c3-45d4-a660-4290e47332c5_846x722.png 424w, https://substackcdn.com/image/fetch/$s_!hHj7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff846036f-51c3-45d4-a660-4290e47332c5_846x722.png 848w, https://substackcdn.com/image/fetch/$s_!hHj7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff846036f-51c3-45d4-a660-4290e47332c5_846x722.png 1272w, https://substackcdn.com/image/fetch/$s_!hHj7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff846036f-51c3-45d4-a660-4290e47332c5_846x722.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hHj7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff846036f-51c3-45d4-a660-4290e47332c5_846x722.png" width="846" height="722" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f846036f-51c3-45d4-a660-4290e47332c5_846x722.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:722,&quot;width&quot;:846,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:61498,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201625831?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff846036f-51c3-45d4-a660-4290e47332c5_846x722.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hHj7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff846036f-51c3-45d4-a660-4290e47332c5_846x722.png 424w, https://substackcdn.com/image/fetch/$s_!hHj7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff846036f-51c3-45d4-a660-4290e47332c5_846x722.png 848w, https://substackcdn.com/image/fetch/$s_!hHj7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff846036f-51c3-45d4-a660-4290e47332c5_846x722.png 1272w, https://substackcdn.com/image/fetch/$s_!hHj7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff846036f-51c3-45d4-a660-4290e47332c5_846x722.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></blockquote><p><em><strong>&#8627; Fact-check: </strong></em><a href="https://x.com/elvissun/status/2065035615800864954">Elvis Sun (@elvissun) &#8212; LFD principle post (June 2026)</a> &#183; <a href="https://interestingengineering.substack.com">ISR &#8212; The Harness Lab, Automated (Sigma metric as loss function)</a> &#183; <a href="https://github.com/elephantsofneptune/harness-lab-conductor">ISR &#8212; harness-lab-conductor CLAUDE.md (reference implementation)</a> &#183; <a href="https://www.anthropic.com/engineering/harness-design-long-running-apps">Anthropic &#8212; Harness design for long-running applications</a><em> [LFD concept attributed to Elvis Sun. Template is ISR original synthesis. Three refinements noted (Delta normalization, instrument-based eval, quantified entropy) are ISR analysis of the principle, not from Sun&#8217;s original post. ISR Harness Lab is a confirmed LFD implementation; the connection to Sun&#8217;s named principle is ISR&#8217;s own observation.]</em></p><p>Three things to refine when writing an LFD CLAUDE.md. First, define the normalization baseline with a worked example &#8212; an agent calculating Delta without one will apply it inconsistently across designs. Second, use an instrument for evaluation rather than a readable file: &#8216;call this scoring tool&#8217; is architecturally stronger than &#8216;do not read the criteria file,&#8217; because it removes the choice. Third, quantify the entropy threshold &#8212; define structural distance measurably rather than asking the agent to self-assess what constitutes a non-obvious departure. LFD fails where the loss function fails. The failure modes are specification failures, not execution failures.</p><h2><strong>X. The Ladder Is Not an Elevator</strong></h2><p>Boris Cherny, Geoffrey Huntley, Steve Yegge, Peter Steinberger, Addy Osmani, and Andrej Karpathy did not stop prompting. They became better at it &#8212; precise enough, systematic enough, and reliable enough to encode their prompting judgment into documents that loops can read autonomously. The loops work because the prompts are excellent. The prompts are excellent because these practitioners spent a long time learning how to write them. Many have the added benefit of being able to &#8220;token-max&#8221; quite freely.</p><p>The ISR experimental record confirms this from the other direction. H2 beats H9 not because loops are bad but because unstructured loops are bad. Task structure &#8212; which lives in the prompt &#8212; is the dominant variable. Every architectural sophistication we&#8217;ve tested amplifies existing prompt quality. None of them substitute for it.</p><p>The ladder has three rungs: <strong>prompting, loops, skills</strong>. The discourse treats rung three as a destination you can leap to. The evidence &#8212; from six practitioners and four experimental volumes &#8212; is that rung three is a consequence of mastering rungs one and two. You climb. You don&#8217;t teleport.</p><blockquote><p><em>The skill library is what feedback leaves behind. But feedback requires a loop. And a loop requires a prompt. The chain doesn&#8217;t start at the top.</em></p></blockquote><p>The two-window method the ISR series has used throughout &#8212; design in Claude.ai, execution in Claude Code &#8212; is a deliberate implementation of this principle. The design layer is where prompts get written, iterated, and encoded into anchor documents. The execution layer is where those anchor documents get called, tested, and improved. The division is not a limitation. It is the architecture.</p><p>What accumulates is not just a skill library. It is a judgment library &#8212; documented evidence of which prompt structures work, under which conditions, with which evaluation criteria. That is what the ASCRS harness taxonomy (H1&#8211;H10), the Sigma composite metric, and the Design Verification gate represent: crystallized prompting judgment, made callable, with a known reliability profile.</p><p>The practitioners documented here would recognize that architecture. It is their architecture, applied to a different domain. The convergence was not a coincidence. It was the same problem, solved by the same principles, independently.</p><p></p><h3><strong>References</strong></h3><p><strong>Primary Sources &#8212; Practitioners</strong></p><p>Cherny, Boris (2026). Statements at Anthropic Developer Conference and Fortune Brainstorm Tech. Reported by CNBC Television and Yahoo/Fortune. https://tech.yahoo.com/ai/claude/articles/anthropic-boris-cherny-creator-claude-205645586.html </p><p>Cherny, Boris (January 2026). How Boris Uses Claude Code &#8212; 13 practical tips. howborisusesclaudecode.com </p><p>OfficeChai (June 2026). I Now Just Write Loops To Prompt Claude Code. https://officechai.com/ai/i-now-just-write-loops-to-prompt-claude-code-claude-code-creator-boris-cherny/ </p><p>InfoQ (January 2026). Inside the Development Workflow of Claude Code&#8217;s Creator. https://infoq.com/news/2026/01/claude-code-creator-workflow/ </p><p>Huntley, Geoffrey (July 14, 2025). Ralph Wiggum as a software engineer. https://ghuntley.com/ralph/ </p><p>Anthropic (December 2025). ralph-wiggum plugin README. https://github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/README.md </p><p>Dev Interrupted (2026). Inventing the Ralph Wiggum Loop (interview with Huntley). https://devinterrupted.substack.com/p/inventing-the-ralph-wiggum-loop-creator </p><p>Wiggumdev/ralph (2026). GitHub implementation. https://github.com/wiggumdev/ralph / https://wiggum.dev </p><p>Yegge, Steve (January 2026). Welcome to Gas Town. https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04 </p><p>Yegge, Steve (April 2026). Welcome to Gas City. https://steve-yegge.medium.com/welcome-to-gas-city-57f564bb3607 </p><p>steveyegge/gastown (2026). GitHub. https://github.com/steveyegge/gastown </p><p>Steinberger, Peter (2025&#8211;2026). Just Talk To It. https://steipete.me/posts/just-talk-to-it </p><p>Steinberger, Peter (2026). OpenClaw VISION.md analysis. https://openclaw.report/news/peter-steinberger-openclaw-vision </p><p>Fortune (February 2026). Who is Peter Steinberger? https://fortune.com/2026/02/19/openclaw-who-is-peter-steinberger-openai-sam-altman-anthropic-moltbook/ </p><p>Osmani, Addy (January 31, 2026). Self-Improving Coding Agents. https://addyosmani.com/blog/self-improving-agents/ </p><p>Osmani, Addy (June 8, 2026). Loop Engineering. https://addyo.substack.com/p/loop-engineering </p><p>Osmani, Addy (January 2026). The 80% Problem in Agentic Coding. https://addyo.substack.com/p/the-80-problem-in-agentic-coding </p><p>Osmani, Addy (March 2026). The Code Agent Orchestra. https://addyosmani.com/blog/code-agent-orchestra/ </p><p>Osmani, Addy (December 2025). My LLM Coding Workflow Going into 2026. https://addyosmani.com/blog/ai-coding-workflow/ </p><p>Karpathy, Andrej (February 2026). Agentic Engineering. The New Stack. https://thenewstack.io/vibe-coding-is-passe/ </p><p>Karpathy, Andrej (2026). autoresearch. https://github.com/karpathy/autoresearch </p><p>Wikipedia &#8212; Andrej Karpathy (joined Anthropic May 2026). https://en.wikipedia.org/wiki/Andrej_Karpathy </p><p><strong>Secondary Analysis</strong></p><p>Cloud Native Now (Feb 2026). Gas Town: What Kubernetes for AI Coding Agents Actually Looks Like. https://cloudnativenow.com/features/gas-town-what-kubernetes-for-ai-coding-agents-actually-looks-like/ </p><p>Codex Blog (April 2026). Gas Town: Steve Yegge&#8217;s Multi-Agent Factory. https://codex.danielvaughan.com/2026/04/08/gas-town-multi-agent-factory/ </p><p>paddo.dev (Jan 2026). GasTown and the Two Kinds of Multi-Agent. https://paddo.dev/blog/gastown-two-kinds-of-multi-agent/ </p><p>paddo.dev (March 2026). Ralph Wiggum: Autonomous Loops. https://paddo.dev/blog/ralph-wiggum-autonomous-loops/ </p><p>Syntax+Glitter (Feb 2026). Ralph Wiggum Loop. https://pfkimmerle.substack.com/p/ralph-wiggum-loop </p><p>Sondera (Jan 2026). Supervising Ralph: Why Every Wiggum Loop Needs a Principal Skinner. https://blog.sondera.ai/p/ralph-wiggum-principal-skinner-agent-reliability </p><p>MindStudio (2026). Karpathy&#8217;s Sequoia Talk: 5 Predictions. https://www.mindstudio.ai/blog/karpathy-sequoia-talk-5-predictions-agentic-engineering </p><p>The AI Corner (2026). Andrej Karpathy: The AI Workflow Shift Explained. https://www.the-ai-corner.com/p/andrej-karpathy-ai-workflow-shift-agentic-era-2026 </p><p>IBM Think (April 2026). What is Agentic Engineering? https://www.ibm.com/think/topics/agentic-engineering </p><p>NxCode (March 2026). Agentic Engineering: The Complete Guide. https://www.nxcode.io/resources/news/agentic-engineering-complete-guide-vibe-coding-ai-agents-2026 </p><p>Kirill Krainov (March 2026). Karpathy&#8217;s Autoresearch: Improving Agentic Coding Skills. https://zerocopy.blog/2026/03/25/karpathys-autoresearch-improving-agentic-coding-skills/ </p><p>Gas Town reading list. https://reading.torqsoftware.com/notes/software/ai-ml/agentic-coding/2026-01-15-gas-town-multi-agent-orchestration-framework/ </p><p><strong>SWE-Marathon</strong></p><p>Desai, Rishi (@rishi_desai2 / Abundant AI) (June 6, 2026). SWE-Marathon launch post. X. 787K views, 677 likes. </p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/rishi_desai2/status/2062930906818769356&quot;,&quot;full_text&quot;:&quot;Can coding agents stay coherent over a 1 billion token budget?\n\nCan they build Slack from scratch?\nRewrite a JAX codebase in PyTorch?\nBuild a C compiler in Rust?\n\nEnter SWE-Marathon: a benchmark for autonomous long-horizon software work. &quot;,&quot;username&quot;:&quot;rishi_desai2&quot;,&quot;name&quot;:&quot;Rishi Desai&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1880143220581232643/ThOJmK0d_normal.jpg&quot;,&quot;date&quot;:&quot;2026-06-05T16:13:50.000Z&quot;,&quot;photos&quot;:[{&quot;img_url&quot;:&quot;https://pbs.substack.com/media/HKEAujfbMAEESsy.jpg&quot;,&quot;link_url&quot;:&quot;https://t.co/K97VHyLvIX&quot;}],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:50,&quot;retweet_count&quot;:66,&quot;like_count&quot;:678,&quot;impression_count&quot;:789956,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>Desai, Rishi et al. / Abundant AI (June 2026). SWE-Marathon: Long-Horizon Software Engineering Benchmark. https://www.swe-marathon.org/ </p><p>Abundant AI (2026). swe-marathon GitHub repository (paper, code, 320 GB trajectory logs). https://github.com/abundant-ai/swe-marathon </p><p>Digg (June 2026). Abundant AI releases SWE-Marathon benchmark (reward hacking data + benchmark thread discussion). https://digg.com/ai/rpm6i3oc </p><p>Digg (June 2026). Rishi Desai launches SWE-Marathon (task descriptions + trajectories release). https://digg.com/tech/14u5l9e1 </p><p>llm-stats.com (June 2026). SWE-Bench Verified leaderboard (for Verified vs SWE-Marathon comparison). https://llm-stats.com/benchmarks/swe-bench-verified </p><p><strong>ISR Primary Record</strong></p><p>Intelligence Systems Review (ISR). interestingengineering.substack.com &#8212; <a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">The Architecture of Awareness</a>; <a href="https://interestingengineering.substack.com/p/the-loop-is-the-lab">The Loop is the Lab</a>; <a href="https://interestingengineering.substack.com/p/the-invisible-codebase">The Invisible Codebase</a>; <a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">The Structure Is the Intelligence</a>; <a href="https://interestingengineering.substack.com/p/is-claudes-memory-actually-poor">The Memory Is the Architecture</a>; <a href="https://interestingengineering.substack.com/p/the-harness-lab-automated">The Harness Lab, Automated</a>; <a href="https://interestingengineering.substack.com/p/the-prompt-is-still-the-work-dynamic">The Prompt Is Still the Work</a>.</p><p><strong>LFD &#8212; Loss Function Driven CLAUDE.md</strong></p><p>Elvis Sun (@elvissun) (June 2026). Loss Function Driven CLAUDE.md principle. X.</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/elvissun/status/2065035615800864954&quot;,&quot;full_text&quot;:&quot;https://t.co/q0JG6Tir16&quot;,&quot;username&quot;:&quot;elvissun&quot;,&quot;name&quot;:&quot;Elvis&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1886389973236011008/7EZHFw9k_normal.jpg&quot;,&quot;date&quot;:&quot;2026-06-11T11:37:11.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:4,&quot;retweet_count&quot;:14,&quot;like_count&quot;:167,&quot;impression_count&quot;:13608,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>Rajasekaran, Prithvi / Anthropic Labs (March 24, 2026). Harness design for long-running application development. https://www.anthropic.com/engineering/harness-design-long-running-apps </p><p>ISR (2026). Harness Lab CLAUDE.md &#8212; reference LFD implementation for the ASCRS benchmark. github.com/elephantsofneptune/harness-lab-conductor </p><p><strong>Claude Code &#8212; Native Primitives (May&#8211;June 2026)</strong></p><p>Anthropic (May 12, 2026). /goal &#8212; Keep Claude working toward a goal. Claude Code v2.1.139. https://code.claude.com/docs/en/goal </p><p>Anthropic (May 28, 2026). Introducing Dynamic Workflows in Claude Code. https://claude.com/blog/introducing-dynamic-workflows-in-claude-code </p><p>Anthropic (May 28, 2026). Orchestrate subagents at scale with dynamic workflows. Claude Code docs. https://code.claude.com/docs/en/workflows </p><p>Naik, Pranit (June 2026). Dynamic Workflows vs /goal in Claude Code. Medium. https://medium.com/no-time/dynamic-workflows-vs-goal-in-claude-code-whats-the-real-difference-24f828b4a4ed </p><p>tonbistudio (@tonbistudio, June 7 2026). Reply to @bcherny with /goal and /loop explainer. X post + linked X article. </p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/tonbistudio/status/2063863208361009203&quot;,&quot;full_text&quot;:&quot;<span class=\&quot;tweet-fake-link\&quot;>@bcherny</span> If anyone is confused and wants to learn more about loops, I did my best to synthesize the available information and try to break them down with real examples in this X article:&quot;,&quot;username&quot;:&quot;tonbistudio&quot;,&quot;name&quot;:&quot;tonbi&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/2019464555844562944/y1VHgOeE_normal.jpg&quot;,&quot;date&quot;:&quot;2026-06-08T05:58:28.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{&quot;full_text&quot;:&quot;&quot;,&quot;username&quot;:&quot;tonbistudio&quot;,&quot;name&quot;:&quot;tonbi&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/2019464555844562944/y1VHgOeE_normal.jpg&quot;},&quot;reply_count&quot;:1,&quot;retweet_count&quot;:1,&quot;like_count&quot;:10,&quot;impression_count&quot;:696,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>/ https://x.com/i/article/2063853162860019712 </p><p>Hightower, Rick (May 2026). Claude Code: The Autonomous Commands That Finish Work While You Sleep. Towards AI. https://medium.com/@richardhightower/claude-code-the-autonomous-commands-that-finish-work-while-you-sleep-goal-loop-batch-etc-7acb82bf46b1 </p><p>Chawla, Avi (May 2026). Claude Code&#8217;s /goal Command &#8212; four autonomy modes explained. https://blog.dailydoseofds.com/p/claude-codes-goal-command </p><p>DEV Community (2026). Claude Code Stops Pausing Every Turn: /goal, /loop, /batch, /background. https://dev.to/jessyt/claude-code-stops-pausing-every-turn-goal-loop-batch-background-24nb </p><p>Lushbinary (June 2, 2026). Claude Code Dynamic Workflows: Harness Guide. https://lushbinary.com/blog/claude-code-harness-every-task-dynamic-workflows-guide/ </p><p>MindStudio (2026). What Is Ultra Code Mode in Claude Code? https://www.mindstudio.ai/blog/what-is-ultra-code-mode-claude-code </p><p>VentureBeat (May 2026). Claude Code&#8217;s /goals separates the agent that works from the one that decides it&#8217;s done. https://venturebeat.com/orchestration/claude-codes-goals-separates-the-agent-that-works-from-the-one-that-decides-its-done </p><p><strong>Claude Fable 5 and Mythos 5</strong></p><p>Anthropic (June 9, 2026). Claude Fable 5 and Mythos 5. https://www.anthropic.com/news/claude-fable-5-mythos-5 </p><p>Anthropic (June 9, 2026). Prompting Claude Fable 5. Claude API Docs. https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/prompting-claude-fable-5 </p><p>Anthropic (June 9, 2026). Introducing Claude Fable 5 and Mythos 5 &#8212; API specs, pricing, availability. https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5-and-claude-mythos-5</p><p>Digital Applied (June 9, 2026). Claude Fable 5 and Mythos 5: The Frontier, Split in Two (SWE-Bench Pro 80.3% benchmark breakdown). https://www.digitalapplied.com/blog/claude-fable-5-mythos-5-release-benchmarks-2026</p><p><strong>Anthropic &#8212; When AI Builds Itself</strong></p><p>Favaro, Marina and Clark, Jack (June 2026). When AI Builds Itself. Anthropic Institute. https://www.anthropic.com/institute/recursive-self-improvement </p><p>Anthropic (May 2026). Claude Opus 4.7 System Card (4x output survey methodology). https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf </p><p><strong>Foundational Reference</strong></p><p>Anthropic (2025&#8211;2026). Equipping agents for the real world with Agent Skills. https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills </p><p>Anthropic (2026). Agent Skills API documentation. https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview </p><p>Polanyi, Michael (1958). Personal Knowledge: Towards a Post-Critical Philosophy. University of Chicago Press. [Foundational text on tacit knowledge &#8212; the philosophical substrate of the skill-encoding discipline.] </p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Harness Lab, Automated]]></title><description><![CDATA[Using Loops (Dynamic Workflows, Ultracode) to Generate, Test, and Tournament Harness Architectures &#8212; All Six Dynamic Workflow Patterns, One Run including LFD]]></description><link>https://interestingengineering.substack.com/p/the-harness-lab-automated</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/the-harness-lab-automated</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Wed, 10 Jun 2026 16:41:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ncax!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ncax!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ncax!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png 424w, https://substackcdn.com/image/fetch/$s_!Ncax!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png 848w, https://substackcdn.com/image/fetch/$s_!Ncax!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png 1272w, https://substackcdn.com/image/fetch/$s_!Ncax!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ncax!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png" width="1078" height="571" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf51f5c5-51f2-411d-9739-33327724de20_1078x571.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:571,&quot;width&quot;:1078,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:905543,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ncax!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png 424w, https://substackcdn.com/image/fetch/$s_!Ncax!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png 848w, https://substackcdn.com/image/fetch/$s_!Ncax!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png 1272w, https://substackcdn.com/image/fetch/$s_!Ncax!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf51f5c5-51f2-411d-9739-33327724de20_1078x571.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Five Strategic Insights From Workflow Automation - The Harness Lab, Automated</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5AJ9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5AJ9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png 424w, https://substackcdn.com/image/fetch/$s_!5AJ9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png 848w, https://substackcdn.com/image/fetch/$s_!5AJ9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png 1272w, https://substackcdn.com/image/fetch/$s_!5AJ9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5AJ9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png" width="631" height="710" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:710,&quot;width&quot;:631,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:120000,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5AJ9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png 424w, https://substackcdn.com/image/fetch/$s_!5AJ9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png 848w, https://substackcdn.com/image/fetch/$s_!5AJ9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png 1272w, https://substackcdn.com/image/fetch/$s_!5AJ9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83fc2bca-cb00-4cae-adf2-4909e3abbf94_631x710.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>Section 2.5 includes 3 versions of strategies that can be taken with /goal and loop - for observability, follow the experiment way first. I have also included 2 other versions in the section - &#8220;ala Boris Cherny&#8221; and Loss Function Definition (LFD).</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zxvn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zxvn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png 424w, https://substackcdn.com/image/fetch/$s_!zxvn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png 848w, https://substackcdn.com/image/fetch/$s_!zxvn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png 1272w, https://substackcdn.com/image/fetch/$s_!zxvn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zxvn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png" width="898" height="232" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:232,&quot;width&quot;:898,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:35560,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zxvn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png 424w, https://substackcdn.com/image/fetch/$s_!zxvn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png 848w, https://substackcdn.com/image/fetch/$s_!zxvn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png 1272w, https://substackcdn.com/image/fetch/$s_!zxvn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4be3acc-9b36-4566-8bd0-cb34dbdf04d1_898x232.png 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zRcn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zRcn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png 424w, https://substackcdn.com/image/fetch/$s_!zRcn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png 848w, https://substackcdn.com/image/fetch/$s_!zRcn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png 1272w, https://substackcdn.com/image/fetch/$s_!zRcn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zRcn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png" width="785" height="53" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ebf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:53,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5383,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!zRcn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png 424w, https://substackcdn.com/image/fetch/$s_!zRcn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png 848w, https://substackcdn.com/image/fetch/$s_!zRcn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png 1272w, https://substackcdn.com/image/fetch/$s_!zRcn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf0fecd-c2a9-42a8-ae01-db2e93b322cd_785x53.png 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><h2>0.1 The H1&#8211;H10 Architectures: What Each One Was</h2><p><a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">The Harness Lab </a>experiments tested ten progressively complex harness architectures against a common benchmark task. Each harness was a different structural approach to the same problem. The table below summarises all ten, their alpha scores from the vendor evaluation domain, and the key findings for each:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!or_g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!or_g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png 424w, https://substackcdn.com/image/fetch/$s_!or_g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png 848w, https://substackcdn.com/image/fetch/$s_!or_g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png 1272w, https://substackcdn.com/image/fetch/$s_!or_g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!or_g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png" width="1108" height="593" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:593,&quot;width&quot;:1108,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:973003,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!or_g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png 424w, https://substackcdn.com/image/fetch/$s_!or_g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png 848w, https://substackcdn.com/image/fetch/$s_!or_g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png 1272w, https://substackcdn.com/image/fetch/$s_!or_g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd211f-a2f4-4daf-94cd-685700ab6106_1108x593.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AdTV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AdTV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png 424w, https://substackcdn.com/image/fetch/$s_!AdTV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png 848w, https://substackcdn.com/image/fetch/$s_!AdTV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png 1272w, https://substackcdn.com/image/fetch/$s_!AdTV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AdTV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png" width="862" height="682" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:682,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:82624,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AdTV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png 424w, https://substackcdn.com/image/fetch/$s_!AdTV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png 848w, https://substackcdn.com/image/fetch/$s_!AdTV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png 1272w, https://substackcdn.com/image/fetch/$s_!AdTV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af83fc4-e035-47ff-9a12-97b36617a5a8_862x682.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cRI-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1760572-0173-4e56-993e-c13490645536_1122x607.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cRI-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1760572-0173-4e56-993e-c13490645536_1122x607.png 424w, https://substackcdn.com/image/fetch/$s_!cRI-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1760572-0173-4e56-993e-c13490645536_1122x607.png 848w, https://substackcdn.com/image/fetch/$s_!cRI-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1760572-0173-4e56-993e-c13490645536_1122x607.png 1272w, https://substackcdn.com/image/fetch/$s_!cRI-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1760572-0173-4e56-993e-c13490645536_1122x607.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cRI-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1760572-0173-4e56-993e-c13490645536_1122x607.png" width="1122" height="607" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d1760572-0173-4e56-993e-c13490645536_1122x607.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:607,&quot;width&quot;:1122,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1084537,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1760572-0173-4e56-993e-c13490645536_1122x607.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cRI-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1760572-0173-4e56-993e-c13490645536_1122x607.png 424w, https://substackcdn.com/image/fetch/$s_!cRI-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1760572-0173-4e56-993e-c13490645536_1122x607.png 848w, https://substackcdn.com/image/fetch/$s_!cRI-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1760572-0173-4e56-993e-c13490645536_1122x607.png 1272w, https://substackcdn.com/image/fetch/$s_!cRI-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1760572-0173-4e56-993e-c13490645536_1122x607.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>0.2 The ASCRS Finding</h2><p>The ASCRS experiment applied the H1&#8211;H10 taxonomy to a more complex benchmark: a pharmaceutical supply chain disruption in the Strait of Hormuz requiring a 72-hour response plan. H9 was predicted to win &#8212; the task was complex enough to justify a five-agent swarm. It did not. H2 and H7 produced the best results. The finding generalised: for well-specified, bounded tasks, precision and appropriate model routing outperform raw agent count.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aCOP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7db16eda-c934-451e-a12a-1d22683f195d_862x261.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aCOP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7db16eda-c934-451e-a12a-1d22683f195d_862x261.png 424w, https://substackcdn.com/image/fetch/$s_!aCOP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7db16eda-c934-451e-a12a-1d22683f195d_862x261.png 848w, https://substackcdn.com/image/fetch/$s_!aCOP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7db16eda-c934-451e-a12a-1d22683f195d_862x261.png 1272w, https://substackcdn.com/image/fetch/$s_!aCOP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7db16eda-c934-451e-a12a-1d22683f195d_862x261.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aCOP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7db16eda-c934-451e-a12a-1d22683f195d_862x261.png" width="862" height="261" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7db16eda-c934-451e-a12a-1d22683f195d_862x261.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:261,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26728,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7db16eda-c934-451e-a12a-1d22683f195d_862x261.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aCOP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7db16eda-c934-451e-a12a-1d22683f195d_862x261.png 424w, https://substackcdn.com/image/fetch/$s_!aCOP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7db16eda-c934-451e-a12a-1d22683f195d_862x261.png 848w, https://substackcdn.com/image/fetch/$s_!aCOP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7db16eda-c934-451e-a12a-1d22683f195d_862x261.png 1272w, https://substackcdn.com/image/fetch/$s_!aCOP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7db16eda-c934-451e-a12a-1d22683f195d_862x261.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Key insight</strong>: Harness complexity only adds value when the task genuinely requires coordination across separate reasoning streams. A well-specified single-agent prompt is harder to beat than it looks.</p><h2>0.3 Why I Am Automating This</h2><p>The H1&#8211;H10 evaluation took weeks of coordinated effort between departmental approvals (parallel runs were incorporated for safe measure/risk mitigation): designing each harness, running it, reviewing outputs across separate sessions, comparing results, and deciding what to run next. Every step between stages was manual. The developer &#8212; me/claude &#8212; played the orchestrator role.</p><p>Dynamic workflows, change the cost structure of that process. A workflow can generate harness designs, test them, adversarially probe for failure modes, run a pairwise tournament, and loop until a quality threshold is met &#8212; autonomously, in one run - is worth looking into and applying. <strong>The Key question this experiment asks is whether the workflow rediscovers the ASCRS finding: does Design B (the H2 equivalent) beat Design E (the H9 equivalent) without human coordination of the comparison?</strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZvJ1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZvJ1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png 424w, https://substackcdn.com/image/fetch/$s_!ZvJ1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png 848w, https://substackcdn.com/image/fetch/$s_!ZvJ1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png 1272w, https://substackcdn.com/image/fetch/$s_!ZvJ1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZvJ1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png" width="862" height="150" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:150,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28322,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZvJ1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png 424w, https://substackcdn.com/image/fetch/$s_!ZvJ1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png 848w, https://substackcdn.com/image/fetch/$s_!ZvJ1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png 1272w, https://substackcdn.com/image/fetch/$s_!ZvJ1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce015bb-c42b-4546-b21b-63f7749d677f_862x150.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q561!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q561!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png 424w, https://substackcdn.com/image/fetch/$s_!Q561!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png 848w, https://substackcdn.com/image/fetch/$s_!Q561!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png 1272w, https://substackcdn.com/image/fetch/$s_!Q561!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q561!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png" width="861" height="27" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:27,&quot;width&quot;:861,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5089,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q561!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png 424w, https://substackcdn.com/image/fetch/$s_!Q561!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png 848w, https://substackcdn.com/image/fetch/$s_!Q561!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png 1272w, https://substackcdn.com/image/fetch/$s_!Q561!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0bfc47-360f-4fe9-936b-7b5f0cfcbb75_861x27.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>1.1 Why Five Designs First</h2><p>Ten designs through six automated stages &#8212; generate, classify, adversarially probe, filter test scenarios, tournament, loop &#8212; is expected to run to approximately 30,000 tokens. Five designs sits at approximately 20,000. These are estimates and &#8220;control gears&#8221; i incorporate within the experiment mechanisms. For a first run, the proof-of-concept (<strong>POC) value is in confirming the workflow functions correctly, not in maximising the design range</strong>. <strong>Five designs cover the structural extremes that matter: minimal single-agent, structured single-agent, two-agent chain, parallel multi-agent, and full swarm.</strong></p><p>The five designs (A through E) map directly to the H1&#8211;H10 architectures that bracket the original result. If Design B beats Design E in the automated tournament, the workflow has reproduced the core ASCRS finding with 40% of the token cost of the ten-design run.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d0yc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d0yc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png 424w, https://substackcdn.com/image/fetch/$s_!d0yc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png 848w, https://substackcdn.com/image/fetch/$s_!d0yc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png 1272w, https://substackcdn.com/image/fetch/$s_!d0yc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d0yc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png" width="862" height="410" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:410,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:44162,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d0yc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png 424w, https://substackcdn.com/image/fetch/$s_!d0yc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png 848w, https://substackcdn.com/image/fetch/$s_!d0yc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png 1272w, https://substackcdn.com/image/fetch/$s_!d0yc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf34f92c-3b62-41e0-9bf1-c890fb254b87_862x410.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4I20!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4I20!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png 424w, https://substackcdn.com/image/fetch/$s_!4I20!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png 848w, https://substackcdn.com/image/fetch/$s_!4I20!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png 1272w, https://substackcdn.com/image/fetch/$s_!4I20!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4I20!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png" width="862" height="118" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:118,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24683,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4I20!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png 424w, https://substackcdn.com/image/fetch/$s_!4I20!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png 848w, https://substackcdn.com/image/fetch/$s_!4I20!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png 1272w, https://substackcdn.com/image/fetch/$s_!4I20!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68a6e710-13f5-4469-9209-cf555d6cf245_862x118.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>1.2 The Experiment at a Glance</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YHuZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YHuZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png 424w, https://substackcdn.com/image/fetch/$s_!YHuZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png 848w, https://substackcdn.com/image/fetch/$s_!YHuZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png 1272w, https://substackcdn.com/image/fetch/$s_!YHuZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YHuZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png" width="703" height="735" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:735,&quot;width&quot;:703,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:52882,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YHuZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png 424w, https://substackcdn.com/image/fetch/$s_!YHuZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png 848w, https://substackcdn.com/image/fetch/$s_!YHuZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png 1272w, https://substackcdn.com/image/fetch/$s_!YHuZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3287a99-cc10-4d8d-9415-303dc7914ceb_703x735.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>1.3 Six Patterns in Their Natural Roles</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NIIg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NIIg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png 424w, https://substackcdn.com/image/fetch/$s_!NIIg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png 848w, https://substackcdn.com/image/fetch/$s_!NIIg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png 1272w, https://substackcdn.com/image/fetch/$s_!NIIg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NIIg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png" width="1121" height="611" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:611,&quot;width&quot;:1121,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1029163,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!NIIg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png 424w, https://substackcdn.com/image/fetch/$s_!NIIg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png 848w, https://substackcdn.com/image/fetch/$s_!NIIg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png 1272w, https://substackcdn.com/image/fetch/$s_!NIIg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75078496-c0dc-4e35-9e69-dc7dd927f46d_1121x611.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The Six Patterns were detailed in the recent: <a href="https://interestingengineering.substack.com/p/the-prompt-is-still-the-work-dynamic">The Prompt Is Still The Work - Dynamic Workflows in Claude Code</a>. Each pattern maps directly and naturally to a stage. These are not forced assignments &#8212; the experiment was designed around these patterns, not the other way around.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Mx7G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Mx7G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png 424w, https://substackcdn.com/image/fetch/$s_!Mx7G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png 848w, https://substackcdn.com/image/fetch/$s_!Mx7G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png 1272w, https://substackcdn.com/image/fetch/$s_!Mx7G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Mx7G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png" width="913" height="647" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:647,&quot;width&quot;:913,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:96683,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Mx7G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png 424w, https://substackcdn.com/image/fetch/$s_!Mx7G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png 848w, https://substackcdn.com/image/fetch/$s_!Mx7G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png 1272w, https://substackcdn.com/image/fetch/$s_!Mx7G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb00a8d-a5d6-4f2f-a211-6202e5d45539_913x647.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>1.4 The Benchmark</h2><h3>1.4.1 Why the Benchmark Matters More Than the Workflow</h3><p>The workflow generates designs, tests them, and picks a winner. But winning means whatever the benchmark says it means. A vague gold_answer produces inconsistent scoring. An easy benchmark task makes all designs look identical and the tournament picks noise. The benchmark is the specification &#8212; and specification remains the skilled work.</p><h2>1.4.2 benchmark/task.md</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!561R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!561R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png 424w, https://substackcdn.com/image/fetch/$s_!561R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png 848w, https://substackcdn.com/image/fetch/$s_!561R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png 1272w, https://substackcdn.com/image/fetch/$s_!561R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!561R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png" width="865" height="315" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:315,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:34091,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!561R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png 424w, https://substackcdn.com/image/fetch/$s_!561R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png 848w, https://substackcdn.com/image/fetch/$s_!561R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png 1272w, https://substackcdn.com/image/fetch/$s_!561R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fd915c-4d89-4a7b-b566-effc56976e96_865x315.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>1.4.3 benchmark/gold_answer.md &#8212; Five Binary Criteria</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bJD9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bJD9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png 424w, https://substackcdn.com/image/fetch/$s_!bJD9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png 848w, https://substackcdn.com/image/fetch/$s_!bJD9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png 1272w, https://substackcdn.com/image/fetch/$s_!bJD9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bJD9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png" width="863" height="197" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:197,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24814,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bJD9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png 424w, https://substackcdn.com/image/fetch/$s_!bJD9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png 848w, https://substackcdn.com/image/fetch/$s_!bJD9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png 1272w, https://substackcdn.com/image/fetch/$s_!bJD9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9fc2c0-612f-428c-853a-d15f78883f1a_863x197.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-i51!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-i51!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png 424w, https://substackcdn.com/image/fetch/$s_!-i51!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png 848w, https://substackcdn.com/image/fetch/$s_!-i51!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png 1272w, https://substackcdn.com/image/fetch/$s_!-i51!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-i51!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png" width="862" height="130" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:130,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29763,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-i51!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png 424w, https://substackcdn.com/image/fetch/$s_!-i51!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png 848w, https://substackcdn.com/image/fetch/$s_!-i51!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png 1272w, https://substackcdn.com/image/fetch/$s_!-i51!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dafa825-c5f2-4008-a77c-e40126ffea80_862x130.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kOd8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kOd8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png 424w, https://substackcdn.com/image/fetch/$s_!kOd8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png 848w, https://substackcdn.com/image/fetch/$s_!kOd8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png 1272w, https://substackcdn.com/image/fetch/$s_!kOd8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kOd8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png" width="1101" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:1101,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1021829,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kOd8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png 424w, https://substackcdn.com/image/fetch/$s_!kOd8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png 848w, https://substackcdn.com/image/fetch/$s_!kOd8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png 1272w, https://substackcdn.com/image/fetch/$s_!kOd8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F600bf954-18bb-4274-9105-eca0212b2d72_1101x617.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>1.5 Three Metrics</h2><p><strong>I have revised the metrics to add reasonable consideration and build up from the original experiment - it evaluates not only the quality of output but measures it against design complexity:</strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ji6D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ji6D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png 424w, https://substackcdn.com/image/fetch/$s_!ji6D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png 848w, https://substackcdn.com/image/fetch/$s_!ji6D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png 1272w, https://substackcdn.com/image/fetch/$s_!ji6D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ji6D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png" width="865" height="202" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:202,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28074,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ji6D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png 424w, https://substackcdn.com/image/fetch/$s_!ji6D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png 848w, https://substackcdn.com/image/fetch/$s_!ji6D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png 1272w, https://substackcdn.com/image/fetch/$s_!ji6D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34344c29-958b-4b6a-86a9-d4d8ff91fbb2_865x202.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>1.5.1 Why/How Sigma Explains the ASCRS Result</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RDPr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RDPr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png 424w, https://substackcdn.com/image/fetch/$s_!RDPr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png 848w, https://substackcdn.com/image/fetch/$s_!RDPr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png 1272w, https://substackcdn.com/image/fetch/$s_!RDPr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RDPr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png" width="867" height="282" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:282,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24243,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RDPr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png 424w, https://substackcdn.com/image/fetch/$s_!RDPr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png 848w, https://substackcdn.com/image/fetch/$s_!RDPr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png 1272w, https://substackcdn.com/image/fetch/$s_!RDPr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac17af44-2aba-4932-ac89-0b3fea3732bc_867x282.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>1.6 Folder Structure</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yE9i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yE9i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png 424w, https://substackcdn.com/image/fetch/$s_!yE9i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png 848w, https://substackcdn.com/image/fetch/$s_!yE9i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png 1272w, https://substackcdn.com/image/fetch/$s_!yE9i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yE9i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png" width="866" height="417" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:417,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42485,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yE9i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png 424w, https://substackcdn.com/image/fetch/$s_!yE9i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png 848w, https://substackcdn.com/image/fetch/$s_!yE9i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png 1272w, https://substackcdn.com/image/fetch/$s_!yE9i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0222dd1-1167-40b4-b41b-a22b348ee19d_866x417.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>1.7 Claude Code Functions: What They Do</h2><p>Quick reminder. You do not write the JavaScript. The workflow generates it. But knowing what these functions do helps you write better prompts and interpret the raw script before you approve the run.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2Ure!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec06eb24-41f4-4437-854a-6bda7036951c_863x660.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Ure!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec06eb24-41f4-4437-854a-6bda7036951c_863x660.png 424w, https://substackcdn.com/image/fetch/$s_!2Ure!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec06eb24-41f4-4437-854a-6bda7036951c_863x660.png 848w, https://substackcdn.com/image/fetch/$s_!2Ure!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec06eb24-41f4-4437-854a-6bda7036951c_863x660.png 1272w, https://substackcdn.com/image/fetch/$s_!2Ure!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec06eb24-41f4-4437-854a-6bda7036951c_863x660.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2Ure!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec06eb24-41f4-4437-854a-6bda7036951c_863x660.png" width="863" height="660" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec06eb24-41f4-4437-854a-6bda7036951c_863x660.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:660,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:91329,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec06eb24-41f4-4437-854a-6bda7036951c_863x660.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2Ure!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec06eb24-41f4-4437-854a-6bda7036951c_863x660.png 424w, https://substackcdn.com/image/fetch/$s_!2Ure!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec06eb24-41f4-4437-854a-6bda7036951c_863x660.png 848w, https://substackcdn.com/image/fetch/$s_!2Ure!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec06eb24-41f4-4437-854a-6bda7036951c_863x660.png 1272w, https://substackcdn.com/image/fetch/$s_!2Ure!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec06eb24-41f4-4437-854a-6bda7036951c_863x660.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>1.8 CLAUDE.md &#8212; Project Instructions</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wpoe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wpoe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png 424w, https://substackcdn.com/image/fetch/$s_!Wpoe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png 848w, https://substackcdn.com/image/fetch/$s_!Wpoe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png 1272w, https://substackcdn.com/image/fetch/$s_!Wpoe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Wpoe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png" width="868" height="412" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:412,&quot;width&quot;:868,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40184,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Wpoe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png 424w, https://substackcdn.com/image/fetch/$s_!Wpoe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png 848w, https://substackcdn.com/image/fetch/$s_!Wpoe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png 1272w, https://substackcdn.com/image/fetch/$s_!Wpoe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cd464d-ee74-454a-98e4-dfba129e690a_868x412.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>1.9 Prompts &#8212; Step by Step</h2><h3>Step 0: Create the Entire Project Environment</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OKtI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OKtI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png 424w, https://substackcdn.com/image/fetch/$s_!OKtI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png 848w, https://substackcdn.com/image/fetch/$s_!OKtI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png 1272w, https://substackcdn.com/image/fetch/$s_!OKtI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OKtI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png" width="866" height="328" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:328,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58149,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OKtI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png 424w, https://substackcdn.com/image/fetch/$s_!OKtI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png 848w, https://substackcdn.com/image/fetch/$s_!OKtI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png 1272w, https://substackcdn.com/image/fetch/$s_!OKtI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8cab31-6e05-4393-b148-ff25b656c469_866x328.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Stage 1: Fan-out Design Generation</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KGBu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KGBu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png 424w, https://substackcdn.com/image/fetch/$s_!KGBu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png 848w, https://substackcdn.com/image/fetch/$s_!KGBu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png 1272w, https://substackcdn.com/image/fetch/$s_!KGBu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KGBu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png" width="866" height="427" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:427,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:81657,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KGBu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png 424w, https://substackcdn.com/image/fetch/$s_!KGBu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png 848w, https://substackcdn.com/image/fetch/$s_!KGBu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png 1272w, https://substackcdn.com/image/fetch/$s_!KGBu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33aa7b7-281f-425f-b444-3c0293d89e68_866x427.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Stage 1a: Design Verification (Gate Before Stage 2)</h3><p>Run this immediately after Stage 1 completes, before Stage 2 begins. It checks whether each generated design actually matches its specified H-equivalent structure. If any design is flagged, fix it before proceeding &#8212; a mismatched design wastes all downstream token spend.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cuyA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62248b38-cfc7-431f-8073-11ec38c25810_867x361.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cuyA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62248b38-cfc7-431f-8073-11ec38c25810_867x361.png 424w, https://substackcdn.com/image/fetch/$s_!cuyA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62248b38-cfc7-431f-8073-11ec38c25810_867x361.png 848w, https://substackcdn.com/image/fetch/$s_!cuyA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62248b38-cfc7-431f-8073-11ec38c25810_867x361.png 1272w, https://substackcdn.com/image/fetch/$s_!cuyA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62248b38-cfc7-431f-8073-11ec38c25810_867x361.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cuyA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62248b38-cfc7-431f-8073-11ec38c25810_867x361.png" width="867" height="361" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/62248b38-cfc7-431f-8073-11ec38c25810_867x361.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:361,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:65703,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62248b38-cfc7-431f-8073-11ec38c25810_867x361.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cuyA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62248b38-cfc7-431f-8073-11ec38c25810_867x361.png 424w, https://substackcdn.com/image/fetch/$s_!cuyA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62248b38-cfc7-431f-8073-11ec38c25810_867x361.png 848w, https://substackcdn.com/image/fetch/$s_!cuyA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62248b38-cfc7-431f-8073-11ec38c25810_867x361.png 1272w, https://substackcdn.com/image/fetch/$s_!cuyA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62248b38-cfc7-431f-8073-11ec38c25810_867x361.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Stage 2: Classify-and-Act</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Jntk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Jntk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png 424w, https://substackcdn.com/image/fetch/$s_!Jntk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png 848w, https://substackcdn.com/image/fetch/$s_!Jntk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png 1272w, https://substackcdn.com/image/fetch/$s_!Jntk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Jntk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png" width="865" height="127" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:127,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17096,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Jntk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png 424w, https://substackcdn.com/image/fetch/$s_!Jntk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png 848w, https://substackcdn.com/image/fetch/$s_!Jntk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png 1272w, https://substackcdn.com/image/fetch/$s_!Jntk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d6215aa-ffdc-4864-8438-7aec3d160696_865x127.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Stage 3: Adversarial Verification</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ii56!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ii56!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png 424w, https://substackcdn.com/image/fetch/$s_!ii56!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png 848w, https://substackcdn.com/image/fetch/$s_!ii56!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png 1272w, https://substackcdn.com/image/fetch/$s_!ii56!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ii56!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png" width="866" height="181" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:181,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:35242,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ii56!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png 424w, https://substackcdn.com/image/fetch/$s_!ii56!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png 848w, https://substackcdn.com/image/fetch/$s_!ii56!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png 1272w, https://substackcdn.com/image/fetch/$s_!ii56!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d7d7d2-6eed-4544-9dea-130f8a89cf0e_866x181.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Stage 4: Generate-and-Filter Test Scenarios</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c92h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c92h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png 424w, https://substackcdn.com/image/fetch/$s_!c92h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png 848w, https://substackcdn.com/image/fetch/$s_!c92h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png 1272w, https://substackcdn.com/image/fetch/$s_!c92h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!c92h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png" width="868" height="202" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:202,&quot;width&quot;:868,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38821,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!c92h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png 424w, https://substackcdn.com/image/fetch/$s_!c92h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png 848w, https://substackcdn.com/image/fetch/$s_!c92h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png 1272w, https://substackcdn.com/image/fetch/$s_!c92h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68bb9a42-d7f2-4fcb-9395-0c67f95601de_868x202.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Stage 5: Tournament</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DWBc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DWBc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png 424w, https://substackcdn.com/image/fetch/$s_!DWBc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png 848w, https://substackcdn.com/image/fetch/$s_!DWBc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png 1272w, https://substackcdn.com/image/fetch/$s_!DWBc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DWBc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png" width="865" height="203" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:203,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38307,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DWBc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png 424w, https://substackcdn.com/image/fetch/$s_!DWBc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png 848w, https://substackcdn.com/image/fetch/$s_!DWBc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png 1272w, https://substackcdn.com/image/fetch/$s_!DWBc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95f347d1-3c4c-4a48-acea-ca119cfda9c6_865x203.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Stage 6: Loop Until Done</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Dz5D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Dz5D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png 424w, https://substackcdn.com/image/fetch/$s_!Dz5D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png 848w, https://substackcdn.com/image/fetch/$s_!Dz5D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png 1272w, https://substackcdn.com/image/fetch/$s_!Dz5D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Dz5D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png" width="865" height="222" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:222,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40355,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Dz5D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png 424w, https://substackcdn.com/image/fetch/$s_!Dz5D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png 848w, https://substackcdn.com/image/fetch/$s_!Dz5D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png 1272w, https://substackcdn.com/image/fetch/$s_!Dz5D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ab202b3-8f62-4125-a5ec-3fe3f242488c_865x222.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>1.9.7 Consolidated Prompt &#8212; Five Designs</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dZkF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dZkF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png 424w, https://substackcdn.com/image/fetch/$s_!dZkF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png 848w, https://substackcdn.com/image/fetch/$s_!dZkF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png 1272w, https://substackcdn.com/image/fetch/$s_!dZkF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dZkF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png" width="867" height="627" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:627,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:116448,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dZkF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png 424w, https://substackcdn.com/image/fetch/$s_!dZkF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png 848w, https://substackcdn.com/image/fetch/$s_!dZkF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png 1272w, https://substackcdn.com/image/fetch/$s_!dZkF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbb6f36c-8ed0-4e5c-bca8-2466b074d7ca_867x627.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>1.10 Token Budget &#8212; Five Designs</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WTJ0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WTJ0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png 424w, https://substackcdn.com/image/fetch/$s_!WTJ0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png 848w, https://substackcdn.com/image/fetch/$s_!WTJ0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png 1272w, https://substackcdn.com/image/fetch/$s_!WTJ0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WTJ0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png" width="863" height="393" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:393,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40289,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WTJ0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png 424w, https://substackcdn.com/image/fetch/$s_!WTJ0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png 848w, https://substackcdn.com/image/fetch/$s_!WTJ0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png 1272w, https://substackcdn.com/image/fetch/$s_!WTJ0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa440f25b-a4ac-4c57-8dea-11b13ee7ec86_863x393.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>1.11 What to Expect</h2><p>I have been surprised by the results of many of the experiments I have run. But when run, and analyzed - they were really logical. The hypothesis, that a good eval system should be able to prove or disprove either way. So this is mine, here. I have made it basic by going back to priors (the evals being key): A successful run produces reports/recommendation.md showing Design B as the winner with Sigma around 0.769, and Design E near the bottom with Sigma around 0.229 despite a similar Alpha score. The adversarial stage flags Design E for at least one failure mode (most likely self-preferential bias &#8212; the swarm orchestrator typically reviews sub-agent outputs without an independent checker). The loop refines Design B by adding an explicit completion condition if the adversary flagged it, pushing Sigma above 0.65.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qPEv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qPEv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png 424w, https://substackcdn.com/image/fetch/$s_!qPEv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png 848w, https://substackcdn.com/image/fetch/$s_!qPEv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png 1272w, https://substackcdn.com/image/fetch/$s_!qPEv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qPEv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png" width="863" height="178" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:178,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30702,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qPEv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png 424w, https://substackcdn.com/image/fetch/$s_!qPEv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png 848w, https://substackcdn.com/image/fetch/$s_!qPEv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png 1272w, https://substackcdn.com/image/fetch/$s_!qPEv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d57d17-b867-4545-80d5-e4d4a7d9c074_863x178.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q2Iz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q2Iz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png 424w, https://substackcdn.com/image/fetch/$s_!Q2Iz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png 848w, https://substackcdn.com/image/fetch/$s_!Q2Iz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png 1272w, https://substackcdn.com/image/fetch/$s_!Q2Iz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q2Iz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png" width="870" height="36" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:36,&quot;width&quot;:870,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4417,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q2Iz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png 424w, https://substackcdn.com/image/fetch/$s_!Q2Iz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png 848w, https://substackcdn.com/image/fetch/$s_!Q2Iz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png 1272w, https://substackcdn.com/image/fetch/$s_!Q2Iz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598dc72f-f972-4756-86d2-e59ad79603ec_870x36.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>2.1 Conditions for Proceeding</h2><p>Do not run the ten-design experiment until the five-design run has produced a clean result. Specifically: Design B won or placed in the top two, Design E placed in the bottom two, and the reports/recommendation.md was written correctly. If those conditions are not met, the benchmark or scoring rules need adjustment before scaling up.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uOJL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uOJL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png 424w, https://substackcdn.com/image/fetch/$s_!uOJL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png 848w, https://substackcdn.com/image/fetch/$s_!uOJL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png 1272w, https://substackcdn.com/image/fetch/$s_!uOJL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uOJL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png" width="867" height="87" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:87,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14627,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uOJL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png 424w, https://substackcdn.com/image/fetch/$s_!uOJL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png 848w, https://substackcdn.com/image/fetch/$s_!uOJL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png 1272w, https://substackcdn.com/image/fetch/$s_!uOJL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdfef5ce-333a-4e2b-acd8-0b3a42f2e3d8_867x87.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>2.2 What the Ten-Design Run Adds</h2><p>The five designs covered the structural extremes of the H1&#8211;H10 taxonomy. The ten designs add the intermediate architectures &#8212; the ones that produced the most surprising results in the original series: H3 (fell below H1), H4 (collapsed), H5 (strong self-revision), and H10 (catastrophic failure). These are the most interesting designs to test in an automated evaluation because their outcomes were counterintuitive.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C7ge!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C7ge!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png 424w, https://substackcdn.com/image/fetch/$s_!C7ge!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png 848w, https://substackcdn.com/image/fetch/$s_!C7ge!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png 1272w, https://substackcdn.com/image/fetch/$s_!C7ge!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C7ge!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png" width="868" height="597" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:597,&quot;width&quot;:868,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:75042,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C7ge!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png 424w, https://substackcdn.com/image/fetch/$s_!C7ge!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png 848w, https://substackcdn.com/image/fetch/$s_!C7ge!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png 1272w, https://substackcdn.com/image/fetch/$s_!C7ge!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca105674-d374-4b98-981b-de1eeba9ca2b_868x597.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>2.3 Updated Folder Structure</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u-vi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80cb010-ffdc-4cfd-813c-39a341398187_867x422.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u-vi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80cb010-ffdc-4cfd-813c-39a341398187_867x422.png 424w, https://substackcdn.com/image/fetch/$s_!u-vi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80cb010-ffdc-4cfd-813c-39a341398187_867x422.png 848w, https://substackcdn.com/image/fetch/$s_!u-vi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80cb010-ffdc-4cfd-813c-39a341398187_867x422.png 1272w, https://substackcdn.com/image/fetch/$s_!u-vi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80cb010-ffdc-4cfd-813c-39a341398187_867x422.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!u-vi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80cb010-ffdc-4cfd-813c-39a341398187_867x422.png" width="867" height="422" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a80cb010-ffdc-4cfd-813c-39a341398187_867x422.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:422,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:43756,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80cb010-ffdc-4cfd-813c-39a341398187_867x422.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!u-vi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80cb010-ffdc-4cfd-813c-39a341398187_867x422.png 424w, https://substackcdn.com/image/fetch/$s_!u-vi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80cb010-ffdc-4cfd-813c-39a341398187_867x422.png 848w, https://substackcdn.com/image/fetch/$s_!u-vi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80cb010-ffdc-4cfd-813c-39a341398187_867x422.png 1272w, https://substackcdn.com/image/fetch/$s_!u-vi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80cb010-ffdc-4cfd-813c-39a341398187_867x422.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>2.4 Updated Eval Thresholds</h2><p>The ten-design run uses higher quality thresholds than the five-design run. The larger design space should produce a better winner &#8212; the minimum Alpha to enter the tournament rises from 0.60 to 0.65, and the Sigma target rises from 0.65 to 0.70. This also means Design G (H4 equivalent) and Design J (H10 equivalent) are likely to be eliminated before the tournament bracket if their Alpha falls below 0.65, which matches the original series finding.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-gxC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbadef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-gxC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbadef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png 424w, https://substackcdn.com/image/fetch/$s_!-gxC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbadef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png 848w, https://substackcdn.com/image/fetch/$s_!-gxC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbadef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png 1272w, https://substackcdn.com/image/fetch/$s_!-gxC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbadef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-gxC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbadef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png" width="866" height="292" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/badef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:292,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25606,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbadef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-gxC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbadef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png 424w, https://substackcdn.com/image/fetch/$s_!-gxC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbadef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png 848w, https://substackcdn.com/image/fetch/$s_!-gxC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbadef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png 1272w, https://substackcdn.com/image/fetch/$s_!-gxC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbadef05e-7f1d-4b5b-a4d1-fbc627a5115e_866x292.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-0Rd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-0Rd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png 424w, https://substackcdn.com/image/fetch/$s_!-0Rd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png 848w, https://substackcdn.com/image/fetch/$s_!-0Rd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png 1272w, https://substackcdn.com/image/fetch/$s_!-0Rd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-0Rd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png" width="862" height="105" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:105,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16663,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-0Rd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png 424w, https://substackcdn.com/image/fetch/$s_!-0Rd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png 848w, https://substackcdn.com/image/fetch/$s_!-0Rd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png 1272w, https://substackcdn.com/image/fetch/$s_!-0Rd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F598fc957-eeea-402b-8de6-83b78d91bfd8_862x105.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>2.5 Consolidated Prompt &#8212; Ten Designs</h2><p>This is the only prompt for Part 2. It assumes the benchmark files exist from Part 1, the Part 1 workflow ran successfully, and you have confirmed Design B won. Run Step 0 only to update CLAUDE.md with the new thresholds &#8212; do not recreate benchmark/ files.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!anhp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!anhp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png 424w, https://substackcdn.com/image/fetch/$s_!anhp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png 848w, https://substackcdn.com/image/fetch/$s_!anhp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png 1272w, https://substackcdn.com/image/fetch/$s_!anhp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!anhp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png" width="863" height="198" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:198,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17570,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!anhp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png 424w, https://substackcdn.com/image/fetch/$s_!anhp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png 848w, https://substackcdn.com/image/fetch/$s_!anhp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png 1272w, https://substackcdn.com/image/fetch/$s_!anhp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61e8d84a-90a4-4472-a8cc-a28b83ccc9cf_863x198.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>FULL RUN 10</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G7f-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G7f-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png 424w, https://substackcdn.com/image/fetch/$s_!G7f-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png 848w, https://substackcdn.com/image/fetch/$s_!G7f-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png 1272w, https://substackcdn.com/image/fetch/$s_!G7f-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G7f-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png" width="865" height="726" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:726,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:135586,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!G7f-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png 424w, https://substackcdn.com/image/fetch/$s_!G7f-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png 848w, https://substackcdn.com/image/fetch/$s_!G7f-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png 1272w, https://substackcdn.com/image/fetch/$s_!G7f-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9488bf50-5dde-458e-9f96-194f66e65c59_865x726.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>&#8594;What Difference would there be if we work on Loops ala &#8220;Boris Cherny&#8221;</h3><p>Well, everything that was in my 40-line prompt moves into CLAUDE.md. The <strong>prompt becomes the goal and the orientation only</strong>.</p><pre><code><code>/goal sigma &gt;= 0.70

ultracode: read CLAUDE.md for all rules, scoring,
and constraints. Designs A-E exist in designs/.
Benchmark files exist in benchmark/. Extend to
10 designs and find the best. Save as
harness_lab_10.js.
</code></code></pre><p>Thats it! That is the entire prompt.</p><div><hr></div><p><strong>What makes this work is what is in CLAUDE.md</strong></p><p>CLAUDE.md now carries everything my 40-line prompt previously specified:</p><pre><code><code># CLAUDE.md &#8212; Harness Lab Auto

## Goal
Find the harness design with Sigma &gt;= 0.70.
Sigma = Alpha / (1 + Delta).
Alpha = criteria_met / 5.
Delta = (agents x steps) / 10.

## What exists
designs/ contains A-E from prior run.
benchmark/ contains task.md, gold_answer.md,
rubric.md.

## What to produce
10 designs total (add F-J to existing A-E).
Full evaluation of all 10.
Winning design with scorecard.
Write all outputs to results_10/ and reports/.

## Design range for F-J
F: tool-augmented (2+ external tool calls,
   tools feed back into reasoning)
G: independent chains, no reconciliation,
   contradictions unresolved
H: self-revision loop, minimum 2 passes
I: explicit model routing (Sonnet for reasoning,
   Haiku for processing)
J: meta-harness with outer evaluator loop,
   max 3 re-runs

## Evaluation rules
Score only against gold_answer.md criteria.
Binary: 0 or 1 per criterion. No partial credit.
Adversarial probe: 3 failure modes only.
Minimum Alpha 0.65 to enter tournament.
Always include Alpha, Delta, and Sigma in output.

## Constraints
Never modify benchmark/ files.
Never recommend suspending a medication.
Token budget: 30k. Stop cleanly if exceeded.
</code></code></pre><div><hr></div><p><strong>What the agent then does on its own</strong></p><pre><code><code>CHERNY PROMPT              AGENT DECIDES
&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;

/goal sigma &gt;= 0.70    -&gt;  I need to generate F-J
                           I need to classify all 10
                           I need to probe for failures
                           I need discriminating scenarios
                           I need a tournament
                           I need to refine the winner
                           I need to loop until done

CLAUDE.md defines       -&gt;  I know what sigma means
what winning means          I know the scoring rules
                            I know what designs to build
                            I know what constraints apply
</code></code></pre><p>The agent plans its own stages. If it decides to run classification before generation, or combine adversarial probing with scenario generation, it can. You do not care how it gets there &#8212; only that the result meets the goal.</p><div><hr></div><p><strong>The practical difference</strong></p><p>My detailed prompt would always produce the same six stages in the same order. If stage 3 turned out to be unnecessary for a particular run, it would run anyway.</p><p>The Cherny prompt lets the agent skip stages that are not needed, combine stages that are related, or add a stage you had not thought of. <strong>The constraint is the evaluation function, not the method.</strong></p><p><em><strong>The risk is that if CLAUDE.md is underspecified, the agent will make choices I did not intend and the result will be wrong in ways that are hard to trace. Not impossible. Logs help. That is why Cherny&#8217;s approach requires a very well-designed evaluation function. The intelligence moves from the prompt into CLAUDE.md &#8212; it does not disappear.</strong></em></p><h4><em><strong>&#8594; </strong></em>Applying the LFD Concept </h4><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/elvissun/status/2065035615800864954?s=20&quot;,&quot;full_text&quot;:&quot;https://t.co/q0JG6Tir16&quot;,&quot;username&quot;:&quot;elvissun&quot;,&quot;name&quot;:&quot;Elvis&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1886389973236011008/7EZHFw9k_normal.jpg&quot;,&quot;date&quot;:&quot;2026-06-11T11:37:11.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:4,&quot;retweet_count&quot;:14,&quot;like_count&quot;:167,&quot;impression_count&quot;:13608,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YxzO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YxzO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png 424w, https://substackcdn.com/image/fetch/$s_!YxzO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png 848w, https://substackcdn.com/image/fetch/$s_!YxzO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png 1272w, https://substackcdn.com/image/fetch/$s_!YxzO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YxzO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png" width="656" height="856" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:856,&quot;width&quot;:656,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:292739,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YxzO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png 424w, https://substackcdn.com/image/fetch/$s_!YxzO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png 848w, https://substackcdn.com/image/fetch/$s_!YxzO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png 1272w, https://substackcdn.com/image/fetch/$s_!YxzO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c62be5-ea57-4f1e-98c5-e29798d0459c_656x856.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>In LFD, all intelligence is concentrated in the loss function definition (LFD).</strong></p><p>Applying LFD principles to the Harness Lab automation for better results with less manual prompt engineering would be moving further up from rigid, stage-by-stage prompting (which forces the agent into a fixed path) to <strong>defining what winning looks like and letting the agent figure out the how. That&#8217;s the core of Elvis&#8217;s &#8220;Loss Function Development&#8221; (LFD). It&#8217;s an ML term: instead of telling a model what to do step by step, I define what good looks like and let it find the path. Applied to Agentic Systems: instead of writing &#8220;first do X, then do Y, then do Z&#8221;, I would write &#8220;the goal is Sigma &gt;= 0.75 - here are the constraints, here is the scoring function, here is what disqualifies a solution.&#8221;. So I would:</strong></p><p>1. Strengthen the Loss Function (The Target + Eval) - My current setup uses Sigma (Alpha / (1 + Delta)) as a composite metric, with binary criteria against gold_answer.md, adversarial probing, and thresholds. Improvements/Tweaks aligned with LFD:</p><ul><li><p><strong>Make the eval large and blind</strong>: The benchmark is strong but risks being too small/finite (like Elvis&#8217;s early 28&#8211;200 item failures). Expand gold_answer.md or test scenarios to hundreds of varied cases (including traps from the original ASCRS series: cold chain, air freight overrides, etc.). Keep the full eval hidden during runs&#8212;only reveal misses or aggregate scores post-scoring. This prevents memorization/cheating.</p></li><li><p><strong>Blind + forced generality</strong>: Add instructions in CLAUDE.md like: &#8220;You cannot see or enumerate the full eval set. Any solution that appears to target specific known cases (e.g., via keyword lists or exact matches) must be rejected. Prioritize generality.&#8221;</p></li><li><p><strong>Target a direction, not just a threshold</strong>: Instead of just &#8220;Sigma &gt;= 0.75&#8221;, frame it as <strong>optimizing Sigma while minimizing vulnerabilities flagged in adversarial stages</strong>. This encourages ongoing improvement even after hitting the bar.</p></li></ul><p>This article already notes the benchmark is more important than the workflow&#8212;this is exactly right. I could beef up Constraints &amp; Instruments (The Harness). Elvis stresses that constraints without measurement tools are just vibes&#8212;the agent will ignore or creatively violate them. Thoughts on what to add/enhance in CLAUDE.md:</p><ul><li><p>Hard time &amp; spend caps: Explicit wall-clock (e.g., &#8220;Stop after X hours&#8221;), token budget, and API spend limits. Include CLI/tools so the agent can query: &#8220;Current spend? Time elapsed? Projected burn?&#8221;</p></li><li><p>Methodology &amp; surface constraints: Spell out allowed models, tools, no modifying benchmark files, domain rules (e.g., &#8220;Never recommend suspending medication&#8221;).</p></li><li><p>Instruments: Ensure the agent has easy ways to measure everything (pixel diffs if relevant, timing per step, complexity/Delta calculation, adversarial failure modes). Add self-reflection commands like &#8220;Log hypothesis, expected failure, diagnostic.&#8221;</p></li><li><p>Token budget realism: My 20k&#8211;30k estimates are already good control gears&#8212; which can be tied directly to the goal.</p></li></ul><p>This turns the folder structure + functions into a proper harness the agent can inspect and respect. <strong>Add Forced Entropy (Avoid Local Maxima/Stuck Loops)</strong>. This is a key gap in many detailed workflows (including early versions of my own experiments).In CLAUDE.md or the /goal prompt:</p><ul><li><p>On stall (no metric improvement): &#8220;Do not repeat the same idea harder. Make a non-obvious structural jump or explore a new design axis.&#8221;</p></li><li><p>Overfit check every cycle: &#8220;Reflect: Are we generalizing or memorizing artifacts? If the latter, remove one eval-shaped element (cap lists, widen scenarios, add noise).&#8221;</p></li><li><p>Iteration log: Force logging of hypotheses and decisions for cross-cycle reflection.</p></li><li><p>Diversity mandate: When generating new designs (F&#8211;J), require meaningful deviation from priors, not incremental tweaks.</p></li></ul><p>I believe the loop-until-done already helps, but entropy prevents the agent from just polishing one knob (e.g., endlessly tweaking a swarm without addressing coordination failures). Practical Prompting Upgrades for future projects:</p><ul><li><p><strong>Lean harder into /goal + context file</strong>: The consolidated 10-design prompt is alaready a great step in this direction. I could make the top-level prompt even shorter/minimal: just the goal, reference to CLAUDE.md, existing assets, and output requirements. <strong>Move all intelligence (rules, scoring, design specs, constraints) into CLAUDE.md or equivalent. The agent then owns stage planning, skipping, or combining (as in the Cherny example above)</strong>.</p></li><li><p><strong>Start with a &#8220;meta-goal generator&#8221;</strong>: Use Elvis&#8217;s open-sourced tool or prompt an agent once to refine your loss function for new experiments. Agents are good at writing good loss functions. See: </p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/elvissun/status/2025920521871716562?s=20&quot;,&quot;full_text&quot;:&quot;https://t.co/DotZ3V6XhJ&quot;,&quot;username&quot;:&quot;elvissun&quot;,&quot;name&quot;:&quot;Elvis&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1886389973236011008/7EZHFw9k_normal.jpg&quot;,&quot;date&quot;:&quot;2026-02-23T13:07:46.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:399,&quot;retweet_count&quot;:1622,&quot;like_count&quot;:12605,&quot;impression_count&quot;:5408254,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div></li><li><p>Iterate on the loss function itself: Run short &#8220;probe&#8221; loops on your eval/benchmark first. If the agent cheats or finds shortcuts, tighten it before full runs (exactly as Elvis did across his 4 loops).</p></li><li><p>For harness discovery: Since my goal is rediscovering ASCRS findings (precision beats raw complexity on bounded tasks), make the task profile variables (ambiguity, domain depth, stakes, documentation quality) explicit in the loss function. Score designs against multiple task variants.</p></li></ul><p>Expected Benefits for The Harness Lab. So this is really incorporating fine tuning (once you&#8217;re comfortable and understand what is going on under the hood)</p><ul><li><p><strong>Less manual work</strong>: Fewer fixed stages; agent adapts dynamically.</p></li><li><p><strong>Better winners</strong>: Reduced risk of overfitting to the original H1&#8211;H10 insights. Stronger generalization (e.g., Design F/tool-augmented winning on easy benchmarks but getting probed on hard ones).</p></li><li><p><strong>Scalability:</strong> Run bigger design spaces or harder benchmarks with confidence. The outer loop (optimization toward Sigma + generality) compresses what used to take weeks.</p></li><li><p><strong>Risks to watch</strong> (and this is the biggest!): <strong>Underspecified CLAUDE.md leads to untraceable choices&#8212;keep logs detailed</strong>. Always validate the winner manually on fresh traps.</p></li></ul><p>To be fair, the work is already sophisticated courtesy of Claude functionalities (<strong>dynamic workflows, Sigma metric, tournament + adversarial</strong>). Treating the entire run as LFD&#8212;optimizing a well-instrumented, blind, constrained target with entropy&#8212;should make results more robust and surprising in good ways. It shifts effort from crafting perfect stage prompts to crafting the right objective function. </p><p><strong>LFD Applied:</strong></p><p>Here is the complete, ready-to-use LFD (Cherny-style) setup for the Harness Lab (so <strong>I  have 3 versions - the observable experiment one, &#8220;ala Boris cherny&#8221;, and LFD all found in this section here</strong>). </p><p>Folder Structure</p><pre><code><code>harness-lab/
&#9500;&#9472;&#9472; CLAUDE.md                    # The full loss function + harness (ALL intelligence)
&#9500;&#9472;&#9472; goal.md                      # Short top-level prompt (or paste directly)
&#9500;&#9472;&#9472; benchmark/                   # Sacred - protected
&#9474;   &#9500;&#9472;&#9472; task.md
&#9474;   &#9500;&#9472;&#9472; gold_answer.md           # 5+ binary criteria (expand for blindness)
&#9474;   &#9492;&#9472;&#9472; rubric.md                # Scoring instructions
&#9500;&#9472;&#9472; designs/                     # Seed + generated designs
&#9474;   &#9500;&#9472;&#9472; A.md                     # Minimal single-agent (H1)
&#9474;   &#9500;&#9472;&#9472; B.md                     # Structured single-agent (H2 - expected winner)
&#9474;   &#9500;&#9472;&#9472; C.md
&#9474;   &#9500;&#9472;&#9472; D.md
&#9474;   &#9500;&#9472;&#9472; E.md                     # Full swarm (H9)
&#9474;   &#9492;&#9472;&#9472; F.md ... J.md            # Generated in run
&#9500;&#9472;&#9472; results/                     # Per-design outputs + scorecards
&#9474;   &#9500;&#9472;&#9472; A/
&#9474;   &#9500;&#9472;&#9472; B/
&#9474;   &#9492;&#9472;&#9472; ...
&#9500;&#9472;&#9472; reports/                     # Aggregates
&#9474;   &#9500;&#9472;&#9472; scorecards.md
&#9474;   &#9500;&#9472;&#9472; tournament.md
&#9474;   &#9500;&#9472;&#9472; adversarial_findings.md
&#9474;   &#9492;&#9472;&#9472; recommendation.md        # Final winner + rationale
&#9500;&#9472;&#9472; logs/                        # Traceability (critical for LFD)
&#9474;   &#9500;&#9472;&#9472; run_log.md               # Decisions, spend, time
&#9474;   &#9492;&#9472;&#9472; hypotheses.md            # Reflections, overfitting checks
&#9500;&#9472;&#9472; tools/                       # Any custom functions/helpers
&#9492;&#9472;&#9472; harness_lab.js               # Generated workflow (latest)</code></code></pre><p>Key rules for this structure (in CLAUDE.md):</p><ul><li><p>Never modify anything in benchmark/.</p></li><li><p>Always log to logs/.</p></li><li><p>Write outputs to the correct folders.</p></li></ul><p>(1) Exact <strong>Top-Level Prompt (goal.md</strong> or paste directly)</p><pre><code><code>/goal Optimize to Sigma &gt;= 0.75 (maximize generality + robustness). 

ultracode: Read CLAUDE.md completely for ALL rules, metrics, constraints, design guidelines, evaluation logic, and instruments. 
Inventory the project (designs A-E exist, benchmark is sacred). 
Plan and execute autonomously: generate additional designs, evaluate, adversarially probe, tournament, refine with entropy if needed, and loop until goal or hard constraints hit. 
Produce complete reports/recommendation.md. Log everything.</code></code></pre><p><strong>(2) That&#8217;s the entire prompt. The agent decides the stages, order, skips, and combinations. Full Content of CLAUDE.md (Enhanced LFD Version)</strong></p><pre><code><code># CLAUDE.md &#8212; Harness Lab Loss Function + Harness

## Core Goal (Loss Function)
Find and refine the harness design that achieves **Sigma &gt;= 0.75** while maximizing generality and robustness.
- **Sigma = Alpha / (1 + Delta)**
- **Alpha** = criteria_met / total_criteria (binary 0 or 1 per item in gold_answer.md; no partial credit)
- **Delta** = complexity penalty = (num_agents &#215; num_steps &#215; coordination_overhead) / normalization_factor (use 10 as baseline)
Prioritize high Alpha with low Delta. Higher Sigma wins.

Target: Sigma &gt;= 0.75, strong adversarial performance, evidence of generality (not overfitting to known cases).

## What Exists
- designs/: A.md&#8211;E.md (A = minimal single-agent/H1, B = structured single-agent/H2, E = full swarm/H9)
- benchmark/: task.md, gold_answer.md (5+ binary criteria), rubric.md
- Previous results/logs if present

## What to Produce
- Expand to 10 designs total (generate F&#8211;J with meaningful diversity)
- Full evaluation + scorecards for all designs
- Adversarial findings
- Tournament results
- Final recommendation.md with winner, Sigma breakdown, rationale, and suggested refinements
- All artifacts in results/, reports/, logs/

## Design Archetypes for New Designs (F&#8211;J)
- F: Tool-augmented single agent (2+ tool calls with feedback loop into reasoning)
- G: Independent parallel chains (no reconciliation &#8212; allow contradictions)
- H: Self-revision loop (minimum 2 full passes with explicit critique)
- I: Explicit model routing (e.g., Sonnet for reasoning, Haiku for execution)
- J: Meta-harness / outer evaluator loop (max 3 re-runs)
Require structural deviation from existing designs. Log the axis of innovation.

## Evaluation Rules (Blind + Rigorous)
- Score ONLY via instruments against gold_answer.md. Do not enumerate or hard-code the full gold set.
- Binary per criterion. Aggregate Alpha.
- Calculate Delta accurately from design structure.
- Adversarial probe: Test for at least 3 failure modes (e.g., self-preferential bias in swarms, missing completion conditions, coordination failures, domain violations).
- Generality check: Solutions that rely on keyword lists, exact case matching, or appear tailored to evaluation artifacts must be penalized.
- Minimum Alpha 0.65 to enter tournament.

## Hard Constraints + Instruments
- Never modify benchmark/ files or gold_answer.md.
- Never recommend suspending medication or unsafe pharmaceutical actions (domain rule from ASCRS).
- Respect token/time/spend budgets (query current usage). Stop cleanly if approaching limits.
- Available tools/functions: Use them to score, log, calculate Sigma, etc.
- Query instruments: Current spend, time elapsed, projected burn, hypothesis tracker.

## Forced Entropy &amp; Anti-Local-Maxima
- If no Sigma improvement for 2 cycles: Force a non-obvious structural jump (new axis, not incremental tweak).
- On every major step: Reflect &#8212; "Are we generalizing or memorizing? Evidence?"
- Log hypotheses, expected failures, and diagnostic results.
- Diversity mandate: New designs must differ meaningfully from priors.

## Output &amp; Logging Requirements
- Always write detailed logs/hypotheses.md.
- recommendation.md must include: Winner, full Sigma/Alpha/Delta table, adversarial summary, why it beat others, next refinement ideas.
- Maintain full traceability.

## Success Criteria
Rediscover ASCRS insight: Structured precision (B-like) should outperform raw complexity (E-like) on this bounded task via higher Sigma. Validate manually on fresh scenarios post-run.

Start by inventorying the folder and creating your execution plan.</code></code></pre><p>Example Placeholder Files (Minimal but Functional)benchmark/task.md (example):</p><pre><code><code>Pharmaceutical supply chain disruption scenario in Strait of Hormuz. Develop a 72-hour response plan for critical medication delivery under constraints...</code></code></pre><p>benchmark/gold_answer.md (expand this for better blindness):</p><pre><code><code>Criterion 1: Maintains cold chain integrity... [Yes/No expected]
Criterion 2: Prioritizes high-stakes patients...
... (5+ total)</code></code></pre><p>designs/A.md (seed example):</p><pre><code><code>Minimal single-agent prompt. Direct instruction to the model with no additional structure.</code></code></pre><p>How to Run</p><ol><li><p>Create the folder structure and files above.</p></li><li><p>Paste the short /goal prompt (or use goal.md).</p></li><li><p>Let the agent run autonomously (it will read CLAUDE.md and take over).</p></li><li><p>Review logs/ and reports/recommendation.md afterward. Tweak CLAUDE.md (e.g., add more traps to gold set) if needed and rerun.</p></li></ol><p><strong>!!!!! This setup moves all intelligence into the loss function (CLAUDE.md), adds strong LFD elements (blind eval, instruments, entropy, hard constraints), and makes runs more adaptive and robust</strong>. </p><h2>2.6 Token Budget &#8212; Ten Designs</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rl36!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F517f84e2-de57-46e7-849c-878df20a8c79_866x393.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rl36!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F517f84e2-de57-46e7-849c-878df20a8c79_866x393.png 424w, https://substackcdn.com/image/fetch/$s_!rl36!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F517f84e2-de57-46e7-849c-878df20a8c79_866x393.png 848w, https://substackcdn.com/image/fetch/$s_!rl36!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F517f84e2-de57-46e7-849c-878df20a8c79_866x393.png 1272w, https://substackcdn.com/image/fetch/$s_!rl36!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F517f84e2-de57-46e7-849c-878df20a8c79_866x393.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rl36!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F517f84e2-de57-46e7-849c-878df20a8c79_866x393.png" width="866" height="393" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/517f84e2-de57-46e7-849c-878df20a8c79_866x393.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:393,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41716,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F517f84e2-de57-46e7-849c-878df20a8c79_866x393.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rl36!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F517f84e2-de57-46e7-849c-878df20a8c79_866x393.png 424w, https://substackcdn.com/image/fetch/$s_!rl36!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F517f84e2-de57-46e7-849c-878df20a8c79_866x393.png 848w, https://substackcdn.com/image/fetch/$s_!rl36!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F517f84e2-de57-46e7-849c-878df20a8c79_866x393.png 1272w, https://substackcdn.com/image/fetch/$s_!rl36!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F517f84e2-de57-46e7-849c-878df20a8c79_866x393.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VnGV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VnGV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png 424w, https://substackcdn.com/image/fetch/$s_!VnGV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png 848w, https://substackcdn.com/image/fetch/$s_!VnGV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png 1272w, https://substackcdn.com/image/fetch/$s_!VnGV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VnGV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png" width="862" height="131" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:131,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26347,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VnGV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png 424w, https://substackcdn.com/image/fetch/$s_!VnGV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png 848w, https://substackcdn.com/image/fetch/$s_!VnGV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png 1272w, https://substackcdn.com/image/fetch/$s_!VnGV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8a0222-7909-441a-8f53-d293ed9e1437_862x131.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Lw3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33f8e785-6241-48e1-a442-049e96939ac8_783x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Lw3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33f8e785-6241-48e1-a442-049e96939ac8_783x60.png 424w, https://substackcdn.com/image/fetch/$s_!0Lw3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33f8e785-6241-48e1-a442-049e96939ac8_783x60.png 848w, https://substackcdn.com/image/fetch/$s_!0Lw3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33f8e785-6241-48e1-a442-049e96939ac8_783x60.png 1272w, https://substackcdn.com/image/fetch/$s_!0Lw3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33f8e785-6241-48e1-a442-049e96939ac8_783x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Lw3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33f8e785-6241-48e1-a442-049e96939ac8_783x60.png" width="783" height="60" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/33f8e785-6241-48e1-a442-049e96939ac8_783x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:60,&quot;width&quot;:783,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6443,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33f8e785-6241-48e1-a442-049e96939ac8_783x60.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0Lw3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33f8e785-6241-48e1-a442-049e96939ac8_783x60.png 424w, https://substackcdn.com/image/fetch/$s_!0Lw3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33f8e785-6241-48e1-a442-049e96939ac8_783x60.png 848w, https://substackcdn.com/image/fetch/$s_!0Lw3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33f8e785-6241-48e1-a442-049e96939ac8_783x60.png 1272w, https://substackcdn.com/image/fetch/$s_!0Lw3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33f8e785-6241-48e1-a442-049e96939ac8_783x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>3.1 The Three Experiments at a Glance</h2><p>Three runs were completed. The table below shows what was set, what changed, and what came out. Read this before anything else.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jzGn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jzGn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png 424w, https://substackcdn.com/image/fetch/$s_!jzGn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png 848w, https://substackcdn.com/image/fetch/$s_!jzGn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png 1272w, https://substackcdn.com/image/fetch/$s_!jzGn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jzGn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png" width="865" height="312" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:312,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29931,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jzGn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png 424w, https://substackcdn.com/image/fetch/$s_!jzGn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png 848w, https://substackcdn.com/image/fetch/$s_!jzGn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png 1272w, https://substackcdn.com/image/fetch/$s_!jzGn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4a9c193-1d31-4998-a97a-ba94336aa5e8_865x312.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>My folder structure changed slightlly to facilitate the Rerun:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Eyma!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Eyma!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png 424w, https://substackcdn.com/image/fetch/$s_!Eyma!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png 848w, https://substackcdn.com/image/fetch/$s_!Eyma!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png 1272w, https://substackcdn.com/image/fetch/$s_!Eyma!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Eyma!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png" width="902" height="377" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:377,&quot;width&quot;:902,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:50282,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Eyma!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png 424w, https://substackcdn.com/image/fetch/$s_!Eyma!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png 848w, https://substackcdn.com/image/fetch/$s_!Eyma!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png 1272w, https://substackcdn.com/image/fetch/$s_!Eyma!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4bbcf0b-86bf-4255-9009-c6529bd9cee5_902x377.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cb9D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cb9D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png 424w, https://substackcdn.com/image/fetch/$s_!cb9D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png 848w, https://substackcdn.com/image/fetch/$s_!cb9D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png 1272w, https://substackcdn.com/image/fetch/$s_!cb9D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cb9D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png" width="1097" height="592" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:592,&quot;width&quot;:1097,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:748431,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cb9D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png 424w, https://substackcdn.com/image/fetch/$s_!cb9D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png 848w, https://substackcdn.com/image/fetch/$s_!cb9D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png 1272w, https://substackcdn.com/image/fetch/$s_!cb9D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a284122-dc77-4fcb-96a0-c05c4e9a9a62_1097x592.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>3.2 What Happened to Design B</h2><p>Design B was the Part 1 winner and the H2 equivalent &#8212; the architecture the original series found most effective. Its fate across the three runs shows why the Alpha threshold is the most sensitive parameter in the experiment.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zb9m!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zb9m!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png 424w, https://substackcdn.com/image/fetch/$s_!zb9m!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png 848w, https://substackcdn.com/image/fetch/$s_!zb9m!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png 1272w, https://substackcdn.com/image/fetch/$s_!zb9m!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zb9m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png" width="865" height="410" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:410,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21102,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zb9m!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png 424w, https://substackcdn.com/image/fetch/$s_!zb9m!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png 848w, https://substackcdn.com/image/fetch/$s_!zb9m!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png 1272w, https://substackcdn.com/image/fetch/$s_!zb9m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd44dab12-07da-4a8e-9c90-1cbe9f2ee648_865x410.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KiWS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KiWS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png 424w, https://substackcdn.com/image/fetch/$s_!KiWS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png 848w, https://substackcdn.com/image/fetch/$s_!KiWS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png 1272w, https://substackcdn.com/image/fetch/$s_!KiWS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KiWS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png" width="867" height="287" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:287,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17060,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KiWS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png 424w, https://substackcdn.com/image/fetch/$s_!KiWS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png 848w, https://substackcdn.com/image/fetch/$s_!KiWS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png 1272w, https://substackcdn.com/image/fetch/$s_!KiWS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253899a6-ab73-45da-9652-3a2e325a90f2_867x287.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>3.3 Why Design F Won</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!p4_I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!p4_I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png 424w, https://substackcdn.com/image/fetch/$s_!p4_I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png 848w, https://substackcdn.com/image/fetch/$s_!p4_I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png 1272w, https://substackcdn.com/image/fetch/$s_!p4_I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!p4_I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png" width="1142" height="625" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:625,&quot;width&quot;:1142,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:866693,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!p4_I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png 424w, https://substackcdn.com/image/fetch/$s_!p4_I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png 848w, https://substackcdn.com/image/fetch/$s_!p4_I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png 1272w, https://substackcdn.com/image/fetch/$s_!p4_I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d64f30-1ad5-4d3e-be47-3ad30f4d7053_1142x625.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Design F is the H3 equivalent &#8212; <strong>tool-augmented single agent</strong>. In the original series H3 scored 0.600 and fell below the H1 baseline. Here it won all three stages it entered. The reason is in the numbers, not the architecture.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l0Fi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l0Fi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png 424w, https://substackcdn.com/image/fetch/$s_!l0Fi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png 848w, https://substackcdn.com/image/fetch/$s_!l0Fi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png 1272w, https://substackcdn.com/image/fetch/$s_!l0Fi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l0Fi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png" width="871" height="491" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:491,&quot;width&quot;:871,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:36156,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l0Fi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png 424w, https://substackcdn.com/image/fetch/$s_!l0Fi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png 848w, https://substackcdn.com/image/fetch/$s_!l0Fi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png 1272w, https://substackcdn.com/image/fetch/$s_!l0Fi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8572c0de-a2fb-4075-8b5a-add6f7357071_871x491.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>3.4 Why the Same Architecture Ranked Differently</h2><p>H3 scored below H1 in the original series. Design F (H3 equivalent) won this experiment. Same architecture type, opposite result. <strong>The benchmark explains the difference entirely.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z2XU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z2XU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png 424w, https://substackcdn.com/image/fetch/$s_!z2XU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png 848w, https://substackcdn.com/image/fetch/$s_!z2XU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png 1272w, https://substackcdn.com/image/fetch/$s_!z2XU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z2XU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png" width="863" height="546" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:546,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45828,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!z2XU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png 424w, https://substackcdn.com/image/fetch/$s_!z2XU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png 848w, https://substackcdn.com/image/fetch/$s_!z2XU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png 1272w, https://substackcdn.com/image/fetch/$s_!z2XU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62f4ddcd-aaba-4e20-9603-ab38eb3a916e_863x546.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>3.5 What the Adversarial Stage Caught</h2><p>Design F scored Alpha=1.0 in both Part 2 and the Rerun. The tournament declared it champion. The adversarial agent found something the score could not show.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rFjw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rFjw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png 424w, https://substackcdn.com/image/fetch/$s_!rFjw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png 848w, https://substackcdn.com/image/fetch/$s_!rFjw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png 1272w, https://substackcdn.com/image/fetch/$s_!rFjw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rFjw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png" width="865" height="532" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:532,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:43886,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rFjw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png 424w, https://substackcdn.com/image/fetch/$s_!rFjw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png 848w, https://substackcdn.com/image/fetch/$s_!rFjw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png 1272w, https://substackcdn.com/image/fetch/$s_!rFjw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1fb9d9-f39d-4881-af59-2fe54adc8c7c_865x532.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>3.6 The Four Findings in Plain Terms</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ysDb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ysDb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png 424w, https://substackcdn.com/image/fetch/$s_!ysDb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png 848w, https://substackcdn.com/image/fetch/$s_!ysDb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png 1272w, https://substackcdn.com/image/fetch/$s_!ysDb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ysDb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png" width="1083" height="581" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:581,&quot;width&quot;:1083,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:955735,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ysDb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png 424w, https://substackcdn.com/image/fetch/$s_!ysDb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png 848w, https://substackcdn.com/image/fetch/$s_!ysDb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png 1272w, https://substackcdn.com/image/fetch/$s_!ysDb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95117052-7465-48f1-aeab-447f8884ee8f_1083x581.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7P13!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c87a65-df55-4a52-814c-9516fff76d84_862x397.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7P13!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c87a65-df55-4a52-814c-9516fff76d84_862x397.png 424w, https://substackcdn.com/image/fetch/$s_!7P13!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c87a65-df55-4a52-814c-9516fff76d84_862x397.png 848w, https://substackcdn.com/image/fetch/$s_!7P13!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c87a65-df55-4a52-814c-9516fff76d84_862x397.png 1272w, https://substackcdn.com/image/fetch/$s_!7P13!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c87a65-df55-4a52-814c-9516fff76d84_862x397.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7P13!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c87a65-df55-4a52-814c-9516fff76d84_862x397.png" width="862" height="397" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9c87a65-df55-4a52-814c-9516fff76d84_862x397.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:397,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49002,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c87a65-df55-4a52-814c-9516fff76d84_862x397.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7P13!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c87a65-df55-4a52-814c-9516fff76d84_862x397.png 424w, https://substackcdn.com/image/fetch/$s_!7P13!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c87a65-df55-4a52-814c-9516fff76d84_862x397.png 848w, https://substackcdn.com/image/fetch/$s_!7P13!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c87a65-df55-4a52-814c-9516fff76d84_862x397.png 1272w, https://substackcdn.com/image/fetch/$s_!7P13!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c87a65-df55-4a52-814c-9516fff76d84_862x397.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y_W7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y_W7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png 424w, https://substackcdn.com/image/fetch/$s_!y_W7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png 848w, https://substackcdn.com/image/fetch/$s_!y_W7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png 1272w, https://substackcdn.com/image/fetch/$s_!y_W7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y_W7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png" width="866" height="37" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:37,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3791,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y_W7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png 424w, https://substackcdn.com/image/fetch/$s_!y_W7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png 848w, https://substackcdn.com/image/fetch/$s_!y_W7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png 1272w, https://substackcdn.com/image/fetch/$s_!y_W7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68f64ee1-70be-475a-bd21-e6010b7208d0_866x37.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>4.1 What the Automated Results Should Tell You</h3><p>The experiment is designed to reproduce three findings from the original H1&#8211;H10 series. If all three appear in the automated results, <strong>the workflow is functioning correctly and the benchmark is valid.</strong> If they do not appear, either the benchmark needs adjustment or the workflow has a specification gap.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hdnt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F023ad77b-2355-45c6-992d-46c326af6eec_867x308.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hdnt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F023ad77b-2355-45c6-992d-46c326af6eec_867x308.png 424w, https://substackcdn.com/image/fetch/$s_!hdnt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F023ad77b-2355-45c6-992d-46c326af6eec_867x308.png 848w, https://substackcdn.com/image/fetch/$s_!hdnt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F023ad77b-2355-45c6-992d-46c326af6eec_867x308.png 1272w, https://substackcdn.com/image/fetch/$s_!hdnt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F023ad77b-2355-45c6-992d-46c326af6eec_867x308.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hdnt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F023ad77b-2355-45c6-992d-46c326af6eec_867x308.png" width="867" height="308" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/023ad77b-2355-45c6-992d-46c326af6eec_867x308.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:308,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42077,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F023ad77b-2355-45c6-992d-46c326af6eec_867x308.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hdnt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F023ad77b-2355-45c6-992d-46c326af6eec_867x308.png 424w, https://substackcdn.com/image/fetch/$s_!hdnt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F023ad77b-2355-45c6-992d-46c326af6eec_867x308.png 848w, https://substackcdn.com/image/fetch/$s_!hdnt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F023ad77b-2355-45c6-992d-46c326af6eec_867x308.png 1272w, https://substackcdn.com/image/fetch/$s_!hdnt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F023ad77b-2355-45c6-992d-46c326af6eec_867x308.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>4.2 What Sigma Adds That Alpha Could Not</h3><p>The original H1&#8211;H10 series measured Alpha and Kappa but had no complexity penalty metric. The H2 &gt; H9 result was observed but not fully explained: H9 achieved similar quality but lost. Sigma provides the explanation post-hoc &#8212; H9&#8217;s Delta of 2.5 reduces its Sigma to less than a third of H2&#8217;s Sigma despite similar Alpha. The automated experiment makes this comparison explicit and generates it systematically rather than as a retrospective observation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!urFt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!urFt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png 424w, https://substackcdn.com/image/fetch/$s_!urFt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png 848w, https://substackcdn.com/image/fetch/$s_!urFt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png 1272w, https://substackcdn.com/image/fetch/$s_!urFt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!urFt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png" width="867" height="246" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:246,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:20513,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!urFt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png 424w, https://substackcdn.com/image/fetch/$s_!urFt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png 848w, https://substackcdn.com/image/fetch/$s_!urFt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png 1272w, https://substackcdn.com/image/fetch/$s_!urFt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219a103-e70f-459f-bc6a-2260a8c1f64a_867x246.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>4.3 Automated vs Manual: The Real Comparison</h3><p>The weeks spent on the original H1&#8211;H10 series were not wasted effort. <strong>They produced the benchmark, the gold answer, the scoring rubric, and the insight that precision beats parallelism on bounded tasks</strong>. The automated experiment cannot produce those &#8212; it consumes them.</p><p><strong>What the automated experiment adds is reproducibility and scale. The same benchmark can now be run against any new harness design in minutes rather than sessions. The /goal and Loop Until Done patterns mean the evaluation continues until a quality threshold is met</strong>, not until the developer decides to stop. And the Sigma metric, introduced here, can be applied retrospectively to the original H1&#8211;H10 results to explain them more precisely.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vQx1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9087d42-e50a-4670-9a17-45c036823ee5_867x130.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vQx1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9087d42-e50a-4670-9a17-45c036823ee5_867x130.png 424w, https://substackcdn.com/image/fetch/$s_!vQx1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9087d42-e50a-4670-9a17-45c036823ee5_867x130.png 848w, https://substackcdn.com/image/fetch/$s_!vQx1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9087d42-e50a-4670-9a17-45c036823ee5_867x130.png 1272w, https://substackcdn.com/image/fetch/$s_!vQx1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9087d42-e50a-4670-9a17-45c036823ee5_867x130.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vQx1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9087d42-e50a-4670-9a17-45c036823ee5_867x130.png" width="867" height="130" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e9087d42-e50a-4670-9a17-45c036823ee5_867x130.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:130,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28929,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9087d42-e50a-4670-9a17-45c036823ee5_867x130.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vQx1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9087d42-e50a-4670-9a17-45c036823ee5_867x130.png 424w, https://substackcdn.com/image/fetch/$s_!vQx1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9087d42-e50a-4670-9a17-45c036823ee5_867x130.png 848w, https://substackcdn.com/image/fetch/$s_!vQx1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9087d42-e50a-4670-9a17-45c036823ee5_867x130.png 1272w, https://substackcdn.com/image/fetch/$s_!vQx1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9087d42-e50a-4670-9a17-45c036823ee5_867x130.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>4.4 What to Do With the Results</h3><p>Before acting on the results, it helps to understand what the three experiments were actually doing together. Each run was not an isolated test &#8212; it was a sensitivity check. When you change a parameter and run again, you are asking: does this result hold, or was it a product of the settings I chose?</p><h3>Think of the threshold as a filter on a job interview</h3><p>The Alpha minimum threshold works like a minimum qualification requirement for a job. Set it at 0.60 and most candidates get an interview. Set it at 0.65 and stricter standards apply &#8212; some candidates who would have competed well never get a chance. The question is not which threshold is correct. The question is whether your winner would win under either setting.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H5AW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H5AW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png 424w, https://substackcdn.com/image/fetch/$s_!H5AW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png 848w, https://substackcdn.com/image/fetch/$s_!H5AW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png 1272w, https://substackcdn.com/image/fetch/$s_!H5AW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!H5AW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png" width="863" height="331" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:331,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30457,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!H5AW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png 424w, https://substackcdn.com/image/fetch/$s_!H5AW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png 848w, https://substackcdn.com/image/fetch/$s_!H5AW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png 1272w, https://substackcdn.com/image/fetch/$s_!H5AW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F470a55eb-6246-4b9f-af57-5ea6dbf4d4e4_863x331.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Think of the benchmark as the exam question</h3><p>The benchmark task is the question every design has to answer. If the question is easy, designs that are good at covering ground broadly will score well. If the question has deliberate traps, designs that reason carefully and precisely will score well. The same student can top the class on one exam and fail another &#8212; not because the student changed, but because the exam changed.</p><p>This is exactly what happened across the original series and this experiment. The original series used a hard exam with traps. H2 (careful, precise structure) aced it. H3 (uses tools, covers ground broadly) failed it. This experiment used a simpler exam with no traps. H3/Design F aced it. H2/Design B only partially passed. Neither result is wrong. Both are correct answers to different questions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bs6F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c998f73-1463-42ea-b409-e7b89106631c_870x266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bs6F!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c998f73-1463-42ea-b409-e7b89106631c_870x266.png 424w, https://substackcdn.com/image/fetch/$s_!Bs6F!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c998f73-1463-42ea-b409-e7b89106631c_870x266.png 848w, https://substackcdn.com/image/fetch/$s_!Bs6F!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c998f73-1463-42ea-b409-e7b89106631c_870x266.png 1272w, https://substackcdn.com/image/fetch/$s_!Bs6F!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c998f73-1463-42ea-b409-e7b89106631c_870x266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bs6F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c998f73-1463-42ea-b409-e7b89106631c_870x266.png" width="870" height="266" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c998f73-1463-42ea-b409-e7b89106631c_870x266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:266,&quot;width&quot;:870,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18682,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c998f73-1463-42ea-b409-e7b89106631c_870x266.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Bs6F!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c998f73-1463-42ea-b409-e7b89106631c_870x266.png 424w, https://substackcdn.com/image/fetch/$s_!Bs6F!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c998f73-1463-42ea-b409-e7b89106631c_870x266.png 848w, https://substackcdn.com/image/fetch/$s_!Bs6F!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c998f73-1463-42ea-b409-e7b89106631c_870x266.png 1272w, https://substackcdn.com/image/fetch/$s_!Bs6F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c998f73-1463-42ea-b409-e7b89106631c_870x266.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wU5k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wU5k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png 424w, https://substackcdn.com/image/fetch/$s_!wU5k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png 848w, https://substackcdn.com/image/fetch/$s_!wU5k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png 1272w, https://substackcdn.com/image/fetch/$s_!wU5k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wU5k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png" width="1157" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1157,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1166786,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wU5k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png 424w, https://substackcdn.com/image/fetch/$s_!wU5k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png 848w, https://substackcdn.com/image/fetch/$s_!wU5k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png 1272w, https://substackcdn.com/image/fetch/$s_!wU5k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fcfbb8a-7e44-4cf6-bb93-a030d88b3a3c_1157x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Think of the three runs as turning a dial</h3><p>Running the same workflow with one parameter changed is more valuable than running it once with the best possible settings. Each run answers a different version of the question. <strong>Part 1 asked: does the workflow rank correctly on a small field? Part 2 asked: does it still work when the field is larger and the bar is higher? The Rerun asked: was Design F&#8217;s win genuine, or an artefact of Design B being excluded?</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A351!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A351!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png 424w, https://substackcdn.com/image/fetch/$s_!A351!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png 848w, https://substackcdn.com/image/fetch/$s_!A351!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png 1272w, https://substackcdn.com/image/fetch/$s_!A351!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!A351!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png" width="1125" height="602" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:602,&quot;width&quot;:1125,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:925919,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A351!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png 424w, https://substackcdn.com/image/fetch/$s_!A351!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png 848w, https://substackcdn.com/image/fetch/$s_!A351!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png 1272w, https://substackcdn.com/image/fetch/$s_!A351!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8eb420-0fb4-42dd-aa04-80647eb36cee_1125x602.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Design F won Parts 2 and the Rerun with identical Sigma. That means the result is robust &#8212; it does not depend on the threshold setting. Design B&#8217;s performance varied across runs &#8212; that means B&#8217;s result is threshold-sensitive. Knowing the difference between a robust finding and a threshold-sensitive one is what multiple runs are for.</p><h3>Three practical things to do</h3><p>&#8226; Compare the Sigma ranking from these runs against the original Alpha ranking from H1&#8211;H10. Broadly they should agree &#8212; H2 and H7 were the top performers in the original series, and their equivalents (B and I) should rank above H9/E on Sigma. Where they diverge, check whether the benchmark rewarded a different capability than the original task did.</p><p>&#8226; Use the winning design specification as a starting point, not a final answer. Design F won but carries a conditional Alpha &#8212; the constraint-bypass vulnerability the adversarial stage found. Apply the zero-Delta patch (add the no_suspension_recommended field) before treating F as a production design. A machine-generated winner that cleared a quality threshold is a useful first draft, not a finished architecture.</p><p>&#8226; Run the workflow again with a harder benchmark &#8212; one that includes deliberate traps modelled on the original ASCRS design (the PO-2853 cold chain requirement, the PO-2869 air freight override). This is the most direct way to test whether Design B or Design F holds up when the task actually requires careful precision rather than broad coverage. That run would resolve the central open question: which architecture wins when the benchmark is genuinely hard?</p><h3>4.5 A Framework for Harness Selection in Practice</h3><p><strong>These experiments tested workflow mechanics and benchmark sensitivity &#8212; not the full complexity of a real deployment. The original ASCRS series used a hard, trap-laden task with precise criteria. This experiment used a simplified version of the same scenario.</strong> The winning architecture changed completely between them. That gap is not a failure of the methodology. It is the most practical finding of the series.</p><p><strong>In real deployments, the task profile varies significantly depending on the organisation, the industry, the quality of documentation, and where in the value chain the agent is operating.</strong> The four variables below determine which harness architecture is likely to perform best before you run a single experiment. I apply them as general guidelines. Suitability really depends on application requirements. But I have found these useful. Generally, I try not to overcomplicate. Replicability and consistent Evaluation measures, make this necessary. However if I am unsure, and especially if starting with clean-slate, unburdened-legacy-workflows, allowing loops to run, with /goal and/or ultracode on xhigh is very helpful</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZeZS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZeZS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png 424w, https://substackcdn.com/image/fetch/$s_!ZeZS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png 848w, https://substackcdn.com/image/fetch/$s_!ZeZS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png 1272w, https://substackcdn.com/image/fetch/$s_!ZeZS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZeZS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png" width="706" height="671" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:671,&quot;width&quot;:706,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56182,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZeZS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png 424w, https://substackcdn.com/image/fetch/$s_!ZeZS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png 848w, https://substackcdn.com/image/fetch/$s_!ZeZS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png 1272w, https://substackcdn.com/image/fetch/$s_!ZeZS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62b4ab80-d4e9-4fa4-b188-b2bb7d5c8d82_706x671.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!44CL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!44CL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png 424w, https://substackcdn.com/image/fetch/$s_!44CL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png 848w, https://substackcdn.com/image/fetch/$s_!44CL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png 1272w, https://substackcdn.com/image/fetch/$s_!44CL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!44CL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png" width="1096" height="592" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:592,&quot;width&quot;:1096,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1023280,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!44CL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png 424w, https://substackcdn.com/image/fetch/$s_!44CL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png 848w, https://substackcdn.com/image/fetch/$s_!44CL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png 1272w, https://substackcdn.com/image/fetch/$s_!44CL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19cfc9e5-d8e7-4675-9339-aac3698a9151_1096x592.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h3>Where the Experiments Sit on This Framework</h3><p>The original ASCRS series sat at the hard end of all four variables: a complex, trap-laden task requiring specialist pharmaceutical supply chain knowledge, against a precisely authored gold answer with six gate criteria, at regulatory-consequence stakes. H2 won because that task profile rewards precision above everything else.</p><p>The automated experiment here sat at the easy end: a simplified version of the same scenario, five binary criteria, no traps, no deliberate constraints requiring specialist knowledge. Design F won because that task profile rewards broad coverage above precision. The same organisation running the same workflow on both task profiles would get different recommended architectures &#8212; correctly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NkzD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NkzD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png 424w, https://substackcdn.com/image/fetch/$s_!NkzD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png 848w, https://substackcdn.com/image/fetch/$s_!NkzD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png 1272w, https://substackcdn.com/image/fetch/$s_!NkzD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NkzD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png" width="708" height="301" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:301,&quot;width&quot;:708,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22924,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NkzD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png 424w, https://substackcdn.com/image/fetch/$s_!NkzD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png 848w, https://substackcdn.com/image/fetch/$s_!NkzD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png 1272w, https://substackcdn.com/image/fetch/$s_!NkzD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf8092a-9d9a-4255-8db2-5246120e4bfe_708x301.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>The Practical Implication</h3><p>H2 remains the right starting point for any well-run organisation with clear processes and a bounded, well-specified task. As the task becomes more ambiguous, the domain deeper, the documentation thinner, or the stakes higher, the architecture needs to change &#8212; not because H2 is wrong, but because the task is asking something different. The experiments here did not disprove the original finding. They showed the boundaries of where it applies.</p><p>The framework above is not a prescriptive ranking. It is a set of questions to ask and consider before running an experiment. Answer the four variables honestly for your specific task and organisation and you will have a defensible starting hypothesis for which harness architecture to test first &#8212; rather than discovering it only after running ten designs through a tournament.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YVvC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76939965-c451-4353-8473-a8c63a85a1a6_867x68.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YVvC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76939965-c451-4353-8473-a8c63a85a1a6_867x68.png 424w, https://substackcdn.com/image/fetch/$s_!YVvC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76939965-c451-4353-8473-a8c63a85a1a6_867x68.png 848w, https://substackcdn.com/image/fetch/$s_!YVvC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76939965-c451-4353-8473-a8c63a85a1a6_867x68.png 1272w, https://substackcdn.com/image/fetch/$s_!YVvC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76939965-c451-4353-8473-a8c63a85a1a6_867x68.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YVvC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76939965-c451-4353-8473-a8c63a85a1a6_867x68.png" width="867" height="68" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/76939965-c451-4353-8473-a8c63a85a1a6_867x68.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:68,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7649,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76939965-c451-4353-8473-a8c63a85a1a6_867x68.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YVvC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76939965-c451-4353-8473-a8c63a85a1a6_867x68.png 424w, https://substackcdn.com/image/fetch/$s_!YVvC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76939965-c451-4353-8473-a8c63a85a1a6_867x68.png 848w, https://substackcdn.com/image/fetch/$s_!YVvC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76939965-c451-4353-8473-a8c63a85a1a6_867x68.png 1272w, https://substackcdn.com/image/fetch/$s_!YVvC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76939965-c451-4353-8473-a8c63a85a1a6_867x68.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The following ASCII topology diagrams document the ten harness architectures from the original Harness Lab series, including their actual Alpha scores. These serve as the reference for comparing what the automated workflow generates in Designs A&#8211;J. If the generated designs differ structurally from these, Stage 1a (Design Verification) will flag the discrepancy.</p><h3>H1 &#8212; Minimal Single Prompt Alpha: 0.665 Baseline</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gAnz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gAnz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png 424w, https://substackcdn.com/image/fetch/$s_!gAnz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png 848w, https://substackcdn.com/image/fetch/$s_!gAnz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png 1272w, https://substackcdn.com/image/fetch/$s_!gAnz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gAnz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png" width="862" height="255" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:255,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12969,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gAnz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png 424w, https://substackcdn.com/image/fetch/$s_!gAnz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png 848w, https://substackcdn.com/image/fetch/$s_!gAnz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png 1272w, https://substackcdn.com/image/fetch/$s_!gAnz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe18be2c2-27c1-4ac1-a9c9-cd360b1ee1d4_862x255.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>H2 &#8212; Structured Prompt with Reasoning Scaffold Alpha: 0.920 WINNER</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jDep!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7529fc78-51d6-433f-80f0-724f322e604b_870x357.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jDep!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7529fc78-51d6-433f-80f0-724f322e604b_870x357.png 424w, https://substackcdn.com/image/fetch/$s_!jDep!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7529fc78-51d6-433f-80f0-724f322e604b_870x357.png 848w, https://substackcdn.com/image/fetch/$s_!jDep!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7529fc78-51d6-433f-80f0-724f322e604b_870x357.png 1272w, https://substackcdn.com/image/fetch/$s_!jDep!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7529fc78-51d6-433f-80f0-724f322e604b_870x357.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jDep!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7529fc78-51d6-433f-80f0-724f322e604b_870x357.png" width="870" height="357" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7529fc78-51d6-433f-80f0-724f322e604b_870x357.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:357,&quot;width&quot;:870,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22444,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7529fc78-51d6-433f-80f0-724f322e604b_870x357.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jDep!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7529fc78-51d6-433f-80f0-724f322e604b_870x357.png 424w, https://substackcdn.com/image/fetch/$s_!jDep!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7529fc78-51d6-433f-80f0-724f322e604b_870x357.png 848w, https://substackcdn.com/image/fetch/$s_!jDep!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7529fc78-51d6-433f-80f0-724f322e604b_870x357.png 1272w, https://substackcdn.com/image/fetch/$s_!jDep!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7529fc78-51d6-433f-80f0-724f322e604b_870x357.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>H3 &#8212; Tool-Augmented Single Agent Alpha: 0.600 Below Baseline</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!a9cA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!a9cA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png 424w, https://substackcdn.com/image/fetch/$s_!a9cA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png 848w, https://substackcdn.com/image/fetch/$s_!a9cA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png 1272w, https://substackcdn.com/image/fetch/$s_!a9cA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!a9cA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png" width="867" height="352" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:352,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16197,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!a9cA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png 424w, https://substackcdn.com/image/fetch/$s_!a9cA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png 848w, https://substackcdn.com/image/fetch/$s_!a9cA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png 1272w, https://substackcdn.com/image/fetch/$s_!a9cA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c53912-841f-4eef-96d0-0bd33172dd04_867x352.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>H4 &#8212; Independent Chains Alpha: 0.440 FAILED (coherence collapse)</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4eXv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4eXv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png 424w, https://substackcdn.com/image/fetch/$s_!4eXv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png 848w, https://substackcdn.com/image/fetch/$s_!4eXv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png 1272w, https://substackcdn.com/image/fetch/$s_!4eXv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4eXv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png" width="863" height="372" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:372,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:20227,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4eXv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png 424w, https://substackcdn.com/image/fetch/$s_!4eXv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png 848w, https://substackcdn.com/image/fetch/$s_!4eXv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png 1272w, https://substackcdn.com/image/fetch/$s_!4eXv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5556a05-ac41-44bb-8b04-c92440e98ff9_863x372.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>H5 &#8212; Self-Revision Loop Alpha: 0.840 Strong</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CuAt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CuAt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png 424w, https://substackcdn.com/image/fetch/$s_!CuAt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png 848w, https://substackcdn.com/image/fetch/$s_!CuAt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png 1272w, https://substackcdn.com/image/fetch/$s_!CuAt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CuAt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png" width="865" height="428" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:428,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17516,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CuAt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png 424w, https://substackcdn.com/image/fetch/$s_!CuAt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png 848w, https://substackcdn.com/image/fetch/$s_!CuAt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png 1272w, https://substackcdn.com/image/fetch/$s_!CuAt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35acb5ef-332f-4176-82ac-ff1da0d5a961_865x428.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>H6 &#8212; Two-Agent Chain with Revision Handoff Alpha: 0.900 Strong</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8BNs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8BNs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png 424w, https://substackcdn.com/image/fetch/$s_!8BNs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png 848w, https://substackcdn.com/image/fetch/$s_!8BNs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png 1272w, https://substackcdn.com/image/fetch/$s_!8BNs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8BNs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png" width="868" height="395" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:395,&quot;width&quot;:868,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:20231,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8BNs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png 424w, https://substackcdn.com/image/fetch/$s_!8BNs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png 848w, https://substackcdn.com/image/fetch/$s_!8BNs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png 1272w, https://substackcdn.com/image/fetch/$s_!8BNs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa497adc6-52ad-44ec-857c-1ded1b6f84f0_868x395.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>H7 &#8212; Three-Agent with Model Routing Alpha: 0.840 Strong</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9gkF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9gkF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png 424w, https://substackcdn.com/image/fetch/$s_!9gkF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png 848w, https://substackcdn.com/image/fetch/$s_!9gkF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png 1272w, https://substackcdn.com/image/fetch/$s_!9gkF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9gkF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png" width="867" height="353" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:353,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:27031,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9gkF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png 424w, https://substackcdn.com/image/fetch/$s_!9gkF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png 848w, https://substackcdn.com/image/fetch/$s_!9gkF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png 1272w, https://substackcdn.com/image/fetch/$s_!9gkF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03f5aae3-ec9e-438e-85fe-79802cbad1cf_867x353.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>H8 &#8212; Parallel Agents with Aggregation Alpha: 0.815 Plateau</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H-g8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H-g8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png 424w, https://substackcdn.com/image/fetch/$s_!H-g8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png 848w, https://substackcdn.com/image/fetch/$s_!H-g8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png 1272w, https://substackcdn.com/image/fetch/$s_!H-g8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!H-g8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png" width="863" height="357" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:357,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17822,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!H-g8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png 424w, https://substackcdn.com/image/fetch/$s_!H-g8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png 848w, https://substackcdn.com/image/fetch/$s_!H-g8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png 1272w, https://substackcdn.com/image/fetch/$s_!H-g8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a721bea-05fd-4976-8c63-22e693ebc797_863x357.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>H9 &#8212; Five-Agent Orchestration Swarm Alpha: 0.815 Plateau (predicted to win)</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vzFL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vzFL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png 424w, https://substackcdn.com/image/fetch/$s_!vzFL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png 848w, https://substackcdn.com/image/fetch/$s_!vzFL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png 1272w, https://substackcdn.com/image/fetch/$s_!vzFL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vzFL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png" width="865" height="316" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:316,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21953,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vzFL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png 424w, https://substackcdn.com/image/fetch/$s_!vzFL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png 848w, https://substackcdn.com/image/fetch/$s_!vzFL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png 1272w, https://substackcdn.com/image/fetch/$s_!vzFL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b1a14c-0cf1-44ac-9c6c-4b0ee9b0563b_865x316.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>H10 &#8212; Meta-Harness with Outer Evaluation Loop Alpha: 0.230 FAILED</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cxsx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Cxsx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png 424w, https://substackcdn.com/image/fetch/$s_!Cxsx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png 848w, https://substackcdn.com/image/fetch/$s_!Cxsx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png 1272w, https://substackcdn.com/image/fetch/$s_!Cxsx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cxsx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png" width="871" height="447" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:447,&quot;width&quot;:871,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28271,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Cxsx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png 424w, https://substackcdn.com/image/fetch/$s_!Cxsx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png 848w, https://substackcdn.com/image/fetch/$s_!Cxsx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png 1272w, https://substackcdn.com/image/fetch/$s_!Cxsx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3142583c-fd55-4c3c-8dae-0176af10bf00_871x447.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>A.1 How These Map to the Automated Designs</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fpdu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fpdu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png 424w, https://substackcdn.com/image/fetch/$s_!fpdu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png 848w, https://substackcdn.com/image/fetch/$s_!fpdu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png 1272w, https://substackcdn.com/image/fetch/$s_!fpdu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fpdu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png" width="787" height="692" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:692,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74253,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/201234450?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fpdu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png 424w, https://substackcdn.com/image/fetch/$s_!fpdu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png 848w, https://substackcdn.com/image/fetch/$s_!fpdu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png 1272w, https://substackcdn.com/image/fetch/$s_!fpdu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6393c0f8-c7f4-48c3-aac2-be3458ee643d_787x692.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>References</h2><p>&#8226; Anthropic (2026). &#8216;<a href="https://claude.com/blog/a-harness-for-every-task-dynamic-workflows-in-claude-code">A harness for every task: dynamic workflows in Claude Code.</a>&#8217; Thariq Shihipar &amp; Sid Bidasaria. June 2026. claude.com/blog/a-harness-for-every-task-dynamic-workflows-in-claude-code</p><p>&#8226; Anthropic (2026). &#8216;<a href="https://code.claude.com/docs/en/workflows">Orchestrate subagents at scale with dynamic workflows</a>.&#8217; Claude Code documentation. code.claude.com/docs/en/workflows</p><p>&#8226; &#8216;<a href="https://interestingengineering.substack.com/p/the-prompt-is-still-the-work-dynamic">The Prompt Is Still the Work: Dynamic Workflows in Claude Code.</a>&#8217; Interesting Engineering++. interestingengineering.substack.com</p><p>&#8226;  <a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">ASCRS Harness Lab</a>. H1&#8211;H10 architecture evaluation against Hormuz Strait pharmaceutical supply chain disruption scenario. Alpha scores: H1=0.665, H2=0.920, H3=0.600, H4=0.440, H5=0.840, H6=0.900, H7=0.840, H8=0.815, H9=0.815, H10=0.230.</p><p>&#8226; &#8216;<a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">Architecture of Awareness Experiments.</a>&#8217; V1&#8211;V4 progressive agent designs for the ASCRS domain. </p><p>&#8226; &#8216;<a href="https://interestingengineering.substack.com/p/the-geometry-of-unpredictability">The Geometry of Unpredictability.</a>&#8217; Bounded vs unbounded agent tracks, circuit breaker at iteration 3 (4,042 tokens vs 91,379 tokens). Harness Engineering Series.</p><p>&#8226; &#8216;<a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">The Structure Is the Intelligence.&#8217;</a> StockPilot CMA Cycles 0&#8211;4, structural layer separation, Policy/State/SOPs/Action, 97% token reduction. Harness Engineering Series.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Prompt Is Still the Work: Dynamic Workflows in Claude Code]]></title><description><![CDATA[Six Patterns, a Live Compliance Audit, and Why Specification Discipline (Evals) Matters More at Scale]]></description><link>https://interestingengineering.substack.com/p/the-prompt-is-still-the-work-dynamic</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/the-prompt-is-still-the-work-dynamic</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Sat, 06 Jun 2026 19:08:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ZOHb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZOHb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZOHb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png 424w, https://substackcdn.com/image/fetch/$s_!ZOHb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png 848w, https://substackcdn.com/image/fetch/$s_!ZOHb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png 1272w, https://substackcdn.com/image/fetch/$s_!ZOHb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZOHb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png" width="1132" height="610" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:610,&quot;width&quot;:1132,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:975486,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZOHb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png 424w, https://substackcdn.com/image/fetch/$s_!ZOHb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png 848w, https://substackcdn.com/image/fetch/$s_!ZOHb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png 1272w, https://substackcdn.com/image/fetch/$s_!ZOHb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d93218a-668d-435a-95be-6607189bd4ee_1132x610.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>1. Static vs Dynamic Harnesses?</h2><p><strong>Dynamic workflows</strong> are fascinating. Recently released, they give Claude Code the ability to write its own JavaScript orchestration harness at runtime. Previously, every multi-agent workflow required a <strong>pre-written coordination layer &#8212; either hand-coded or scaffolded via the Claude Agent SDK</strong>. The harness was fixed before execution began.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9CrI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9CrI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png 424w, https://substackcdn.com/image/fetch/$s_!9CrI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png 848w, https://substackcdn.com/image/fetch/$s_!9CrI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png 1272w, https://substackcdn.com/image/fetch/$s_!9CrI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9CrI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png" width="1158" height="627" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:627,&quot;width&quot;:1158,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1058298,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9CrI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png 424w, https://substackcdn.com/image/fetch/$s_!9CrI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png 848w, https://substackcdn.com/image/fetch/$s_!9CrI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png 1272w, https://substackcdn.com/image/fetch/$s_!9CrI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42241044-76ba-4a9d-8428-1f1c2cd9b6b4_1158x627.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What changed is where planning lives. <strong>Claude now reads the incoming task, writes a custom .js harness, and executes it using native functions for spawning subagents, assigning models, and managing worktrees.</strong> <strong>The harness becomes an output of the task, not a precondition for it.</strong></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2GYX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c735949-7431-4b14-af30-d0fffc97d181_942x468.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2GYX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c735949-7431-4b14-af30-d0fffc97d181_942x468.png 424w, https://substackcdn.com/image/fetch/$s_!2GYX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c735949-7431-4b14-af30-d0fffc97d181_942x468.png 848w, https://substackcdn.com/image/fetch/$s_!2GYX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c735949-7431-4b14-af30-d0fffc97d181_942x468.png 1272w, https://substackcdn.com/image/fetch/$s_!2GYX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c735949-7431-4b14-af30-d0fffc97d181_942x468.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2GYX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c735949-7431-4b14-af30-d0fffc97d181_942x468.png" width="942" height="468" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c735949-7431-4b14-af30-d0fffc97d181_942x468.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:468,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21080,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c735949-7431-4b14-af30-d0fffc97d181_942x468.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2GYX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c735949-7431-4b14-af30-d0fffc97d181_942x468.png 424w, https://substackcdn.com/image/fetch/$s_!2GYX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c735949-7431-4b14-af30-d0fffc97d181_942x468.png 848w, https://substackcdn.com/image/fetch/$s_!2GYX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c735949-7431-4b14-af30-d0fffc97d181_942x468.png 1272w, https://substackcdn.com/image/fetch/$s_!2GYX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c735949-7431-4b14-af30-d0fffc97d181_942x468.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Three failure modes motivate this design. <strong>Agentic laziness</strong>: Claude stops early and declares a complex task done. <strong>Self-preferential bias</strong>: Claude tends to confirm its own outputs when asked to verify them. <strong>Goal drift</strong>: across many turns and compaction steps, original constraints get lost. </p><p>Separate subagents with isolated context windows address all three structurally.</p><h2>2. My Prior Projects: Pattern Classification</h2><p>Anthropic defines six composition patterns. My prior Claude Code experiments already implemented versions of all six &#8212; statically. The table maps each project to its primary and secondary pattern, and the metric that measured success.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LOfZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LOfZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png 424w, https://substackcdn.com/image/fetch/$s_!LOfZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png 848w, https://substackcdn.com/image/fetch/$s_!LOfZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png 1272w, https://substackcdn.com/image/fetch/$s_!LOfZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LOfZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png" width="942" height="318" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/adc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:318,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:48948,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LOfZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png 424w, https://substackcdn.com/image/fetch/$s_!LOfZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png 848w, https://substackcdn.com/image/fetch/$s_!LOfZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png 1272w, https://substackcdn.com/image/fetch/$s_!LOfZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc25a16-e6a7-4399-8b4a-4cafb3ff597a_942x318.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Three patterns emerge from the mapping. Tournament has been the primary evaluation pattern (H1&#8211;H10, ASCRS) but always run manually by the developer (me); dynamic workflows invert that: Claude runs the bracket. Loop Until Done appears in three of the five projects &#8212; it is the most consistently used pattern, and also the one most transparently improvable by automation (the stopping condition was always decided by you, not encoded). Generate-and-Filter has been the most implicit pattern &#8212; present in every one of my initial article draft cycles but never instrumented as a Claude Code workflow.</p><h2>3. So What Are The Six Patterns?</h2><p>Each pattern below includes a box-and-line diagram and a note on how my prior work maps to it.</p><h3>3.1 Fan-out-and-Synthesize</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ou8I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ou8I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png 424w, https://substackcdn.com/image/fetch/$s_!Ou8I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png 848w, https://substackcdn.com/image/fetch/$s_!Ou8I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png 1272w, https://substackcdn.com/image/fetch/$s_!Ou8I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ou8I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png" width="1145" height="612" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:612,&quot;width&quot;:1145,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:971717,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ou8I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png 424w, https://substackcdn.com/image/fetch/$s_!Ou8I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png 848w, https://substackcdn.com/image/fetch/$s_!Ou8I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png 1272w, https://substackcdn.com/image/fetch/$s_!Ou8I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff72644dc-70d9-4769-a97a-4ce01078a581_1145x612.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Split a task into N independent subtasks. Run one agent per subtask in isolation. A synthesizer merges outputs &#8212; it acts as a barrier, waiting for all agents before proceeding.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!h8Qr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h8Qr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png 424w, https://substackcdn.com/image/fetch/$s_!h8Qr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png 848w, https://substackcdn.com/image/fetch/$s_!h8Qr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png 1272w, https://substackcdn.com/image/fetch/$s_!h8Qr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h8Qr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png" width="942" height="418" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:418,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:13708,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!h8Qr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png 424w, https://substackcdn.com/image/fetch/$s_!h8Qr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png 848w, https://substackcdn.com/image/fetch/$s_!h8Qr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png 1272w, https://substackcdn.com/image/fetch/$s_!h8Qr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37d95b0c-7207-4d36-8364-0cd35ab6849e_942x418.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Prior work:</strong> StockPilot&#8217;s final CMA architecture introduced sub-agent delegation &#8212; specific tasks handed off to targeted sub-agents rather than processed by a monolithic agent. That delegation is the fan-out element. However the 97% token reduction came primarily from structural layer separation: Policy, State, SOPs, and Action had been collapsed into one context re-read on every agent turn. Cycle 2 (replacing data-dump bash tools) was the single largest driver. The mechanism was selective loading, not domain isolation.</p><h3>3.2 Loop Until Done</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q1bZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q1bZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png 424w, https://substackcdn.com/image/fetch/$s_!q1bZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png 848w, https://substackcdn.com/image/fetch/$s_!q1bZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png 1272w, https://substackcdn.com/image/fetch/$s_!q1bZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q1bZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png" width="1138" height="612" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:612,&quot;width&quot;:1138,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:977695,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q1bZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png 424w, https://substackcdn.com/image/fetch/$s_!q1bZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png 848w, https://substackcdn.com/image/fetch/$s_!q1bZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png 1272w, https://substackcdn.com/image/fetch/$s_!q1bZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3468262-d45e-4b23-8678-72f77a18c62f_1138x612.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>An agent executes a pass, checks a stop condition, and re-spawns if not met. The loop ends on a state condition &#8212; no new findings, zero errors, convergence threshold reached &#8212; not a fixed iteration count.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kRNL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F308e48d0-c3cd-4295-af84-739d4996218e_942x398.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kRNL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F308e48d0-c3cd-4295-af84-739d4996218e_942x398.png 424w, https://substackcdn.com/image/fetch/$s_!kRNL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F308e48d0-c3cd-4295-af84-739d4996218e_942x398.png 848w, https://substackcdn.com/image/fetch/$s_!kRNL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F308e48d0-c3cd-4295-af84-739d4996218e_942x398.png 1272w, https://substackcdn.com/image/fetch/$s_!kRNL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F308e48d0-c3cd-4295-af84-739d4996218e_942x398.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kRNL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F308e48d0-c3cd-4295-af84-739d4996218e_942x398.png" width="942" height="398" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/308e48d0-c3cd-4295-af84-739d4996218e_942x398.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:398,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:10402,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F308e48d0-c3cd-4295-af84-739d4996218e_942x398.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kRNL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F308e48d0-c3cd-4295-af84-739d4996218e_942x398.png 424w, https://substackcdn.com/image/fetch/$s_!kRNL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F308e48d0-c3cd-4295-af84-739d4996218e_942x398.png 848w, https://substackcdn.com/image/fetch/$s_!kRNL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F308e48d0-c3cd-4295-af84-739d4996218e_942x398.png 1272w, https://substackcdn.com/image/fetch/$s_!kRNL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F308e48d0-c3cd-4295-af84-739d4996218e_942x398.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Prior work:</strong> The H1&#8211;H10 experiment series was a human-managed loop &#8212; I reviewed the results, upgraded the harness, re-ran. A Loop Until Done workflow makes the stopping logic autonomous. The Geometry of Unpredictability experiment directly tested this pattern: a bounded track with a circuit breaker at iteration 3 (4,042 tokens total) against an unbounded track that ran to 10 iterations (91,379 tokens). The circuit breaker was an external stopping condition; Loop Until Done encodes that condition inside the workflow itself.</p><h3>3.3 Tournament</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MEt1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MEt1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png 424w, https://substackcdn.com/image/fetch/$s_!MEt1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png 848w, https://substackcdn.com/image/fetch/$s_!MEt1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png 1272w, https://substackcdn.com/image/fetch/$s_!MEt1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MEt1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png" width="1123" height="607" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:607,&quot;width&quot;:1123,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1002375,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MEt1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png 424w, https://substackcdn.com/image/fetch/$s_!MEt1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png 848w, https://substackcdn.com/image/fetch/$s_!MEt1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png 1272w, https://substackcdn.com/image/fetch/$s_!MEt1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef9d18c9-5ceb-483e-965d-d564145b37a6_1123x607.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>N agents each attempt the same task independently, with no shared context. A judge agent runs pairwise comparisons &#8212; A vs B, winner vs C &#8212; until one remains. Comparative judgment is more reliable than absolute scoring because it avoids the self-preferential bias problem.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U2O_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U2O_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png 424w, https://substackcdn.com/image/fetch/$s_!U2O_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png 848w, https://substackcdn.com/image/fetch/$s_!U2O_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png 1272w, https://substackcdn.com/image/fetch/$s_!U2O_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!U2O_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png" width="942" height="422" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:422,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15921,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U2O_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png 424w, https://substackcdn.com/image/fetch/$s_!U2O_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png 848w, https://substackcdn.com/image/fetch/$s_!U2O_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png 1272w, https://substackcdn.com/image/fetch/$s_!U2O_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bcc630f-26ae-4e4e-ab9e-a5041527b6fa_942x422.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Prior work:</strong> The ASCRS experiment was a manual tournament: H1&#8211;H10 as independent architectures, gold_answer.md as the rubric, &#945; as the pairwise metric. H2 (&#945;=1.000) defeated H9 (&#945;=0.625). In a dynamic workflow, this bracket runs autonomously &#8212; no developer coordinates rounds.</p><h3>3.4 Adversarial Verification</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VbM1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VbM1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png 424w, https://substackcdn.com/image/fetch/$s_!VbM1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png 848w, https://substackcdn.com/image/fetch/$s_!VbM1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png 1272w, https://substackcdn.com/image/fetch/$s_!VbM1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VbM1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png" width="1113" height="607" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/abe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:607,&quot;width&quot;:1113,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:989067,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VbM1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png 424w, https://substackcdn.com/image/fetch/$s_!VbM1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png 848w, https://substackcdn.com/image/fetch/$s_!VbM1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png 1272w, https://substackcdn.com/image/fetch/$s_!VbM1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe6e9ad-a200-4575-8bf7-aa2ce9bfe111_1113x607.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A producer agent generates output. A separate adversarial agent &#8212; with no access to the producer&#8217;s reasoning, only its output &#8212; specifically attempts to refute it. A judge resolves disputes. This structurally prevents self-preferential bias.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hrJ3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hrJ3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png 424w, https://substackcdn.com/image/fetch/$s_!hrJ3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png 848w, https://substackcdn.com/image/fetch/$s_!hrJ3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png 1272w, https://substackcdn.com/image/fetch/$s_!hrJ3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hrJ3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png" width="938" height="418" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:418,&quot;width&quot;:938,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:13845,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hrJ3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png 424w, https://substackcdn.com/image/fetch/$s_!hrJ3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png 848w, https://substackcdn.com/image/fetch/$s_!hrJ3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png 1272w, https://substackcdn.com/image/fetch/$s_!hrJ3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3be99f25-5049-44b6-925f-03cc2dad0dd8_938x418.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Prior work (secondary):</strong> The Geometry of Unpredictability used SkillOpt as external adversarial validation &#8212; real empirical data that could confirm or challenge the failure taxonomy across four article sections. That integration was adversarial verification done editorially rather than as an instrumented agent pattern, which is why it sits as the secondary pattern for that project rather than the primary.</p><h3>3.5 Generate-and-Filter</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!j-qC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47027fa6-5f58-498e-8c7b-871295339403_1125x618.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j-qC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47027fa6-5f58-498e-8c7b-871295339403_1125x618.png 424w, https://substackcdn.com/image/fetch/$s_!j-qC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47027fa6-5f58-498e-8c7b-871295339403_1125x618.png 848w, https://substackcdn.com/image/fetch/$s_!j-qC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47027fa6-5f58-498e-8c7b-871295339403_1125x618.png 1272w, https://substackcdn.com/image/fetch/$s_!j-qC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47027fa6-5f58-498e-8c7b-871295339403_1125x618.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!j-qC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47027fa6-5f58-498e-8c7b-871295339403_1125x618.png" width="1125" height="618" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/47027fa6-5f58-498e-8c7b-871295339403_1125x618.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:618,&quot;width&quot;:1125,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:990333,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47027fa6-5f58-498e-8c7b-871295339403_1125x618.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j-qC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47027fa6-5f58-498e-8c7b-871295339403_1125x618.png 424w, https://substackcdn.com/image/fetch/$s_!j-qC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47027fa6-5f58-498e-8c7b-871295339403_1125x618.png 848w, https://substackcdn.com/image/fetch/$s_!j-qC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47027fa6-5f58-498e-8c7b-871295339403_1125x618.png 1272w, https://substackcdn.com/image/fetch/$s_!j-qC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47027fa6-5f58-498e-8c7b-871295339403_1125x618.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A generator agent produces a wide set of candidates. A filter agent applies a rubric, removes low-quality or duplicate entries, and returns only those meeting threshold. Quality is enforced by the filter, not the generator &#8212; this deliberately separates divergent and convergent thinking.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!I5SM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!I5SM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png 424w, https://substackcdn.com/image/fetch/$s_!I5SM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png 848w, https://substackcdn.com/image/fetch/$s_!I5SM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png 1272w, https://substackcdn.com/image/fetch/$s_!I5SM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!I5SM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png" width="941" height="420" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:420,&quot;width&quot;:941,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16964,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!I5SM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png 424w, https://substackcdn.com/image/fetch/$s_!I5SM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png 848w, https://substackcdn.com/image/fetch/$s_!I5SM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png 1272w, https://substackcdn.com/image/fetch/$s_!I5SM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5364d3a-11d8-44dd-b9be-30c7d2f38726_941x420.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Prior work:</strong> If I think about it, every article draft involved implicit generation-and-filter &#8212; writing multiple section framings, filtering by my editorial criteria (relevant to case). It was never instrumented as a Claude Code workflow. This is the most underutilised pattern in the prior project set.</p><h3>3.6 Classify-and-Act</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3Dkx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3Dkx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png 424w, https://substackcdn.com/image/fetch/$s_!3Dkx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png 848w, https://substackcdn.com/image/fetch/$s_!3Dkx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png 1272w, https://substackcdn.com/image/fetch/$s_!3Dkx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3Dkx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png" width="1123" height="598" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:598,&quot;width&quot;:1123,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1006831,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3Dkx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png 424w, https://substackcdn.com/image/fetch/$s_!3Dkx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png 848w, https://substackcdn.com/image/fetch/$s_!3Dkx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png 1272w, https://substackcdn.com/image/fetch/$s_!3Dkx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52178160-08b6-4b4a-ac53-9ee00dbb2bd3_1123x598.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A classifier agent reads the input, assigns a category, and routes to the appropriate handler. Routing can be by task type, complexity, or model tier. Output classification also works: produce first, then classify for downstream routing.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P2w8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa993b344-adf7-4541-909c-e2af7743556e_942x382.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P2w8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa993b344-adf7-4541-909c-e2af7743556e_942x382.png 424w, https://substackcdn.com/image/fetch/$s_!P2w8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa993b344-adf7-4541-909c-e2af7743556e_942x382.png 848w, https://substackcdn.com/image/fetch/$s_!P2w8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa993b344-adf7-4541-909c-e2af7743556e_942x382.png 1272w, https://substackcdn.com/image/fetch/$s_!P2w8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa993b344-adf7-4541-909c-e2af7743556e_942x382.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P2w8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa993b344-adf7-4541-909c-e2af7743556e_942x382.png" width="942" height="382" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a993b344-adf7-4541-909c-e2af7743556e_942x382.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:382,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12826,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa993b344-adf7-4541-909c-e2af7743556e_942x382.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P2w8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa993b344-adf7-4541-909c-e2af7743556e_942x382.png 424w, https://substackcdn.com/image/fetch/$s_!P2w8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa993b344-adf7-4541-909c-e2af7743556e_942x382.png 848w, https://substackcdn.com/image/fetch/$s_!P2w8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa993b344-adf7-4541-909c-e2af7743556e_942x382.png 1272w, https://substackcdn.com/image/fetch/$s_!P2w8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa993b344-adf7-4541-909c-e2af7743556e_942x382.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Prior work (primary):</strong> Reversa used explicit model routing by task complexity: stronger models (claude-sonnet-4-5) for Extraction and Evaluation, faster models (gemini-2.5-flash) for Test Generation and Verification. Each agent was assigned based on the reasoning demands of its step &#8212; not by implicit construct detection but by deliberate model selection. This is Classify-and-Act applied at the model-tier level. The sequential pipeline structure (<strong>Extractor &#8594; Translator &#8594; Test Generator &#8594; Verifier &#8594; Evaluator</strong>) is the secondary fan-out element &#8212; each stage handing off to a specialist rather than running in parallel.</p><h2>4. Trying This Out On A Simple Domain: The Compliance Audit Pipeline</h2><h3>4.1 What It Is</h3><p>It&#8217;s possible to synthesize all patterns. The Compliance Audit pipeline is a practical, tractable project that exercises all six patterns naturally and maps directly to a real business need. The core problem: given a compliance checklist (10 controls drawn from GDPR, SOC 2, and internal policy) and a folder of evidence documents, identify which controls lack adequate evidence, generate remediation actions, and loop until all controls are satisfied.</p><p>The gap between what a compliance standard requires and what evidence actually exists is measurable, has a natural zero-state (all controls evidenced), and drifts over time as policies expire and audits lapse. It is the same structural problem as the Ritonavir analogy: the standard says Form II, the evidence keeps showing Form I.</p><p>The project starts small: a checklist.md with 10 controls and an evidence/ folder where three controls are intentionally unsatisfied &#8212; one stale document, two missing files. The compliance score (controls_passed / total_controls) replaces the &#945; metric. Scale is added in later experiments by expanding the control set or adding multiple product lines.</p><h3>4.2 How All Six Patterns Compose into a Pipeline</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ho2U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ho2U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png 424w, https://substackcdn.com/image/fetch/$s_!ho2U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png 848w, https://substackcdn.com/image/fetch/$s_!ho2U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png 1272w, https://substackcdn.com/image/fetch/$s_!ho2U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ho2U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png" width="942" height="602" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:602,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:35225,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ho2U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png 424w, https://substackcdn.com/image/fetch/$s_!ho2U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png 848w, https://substackcdn.com/image/fetch/$s_!ho2U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png 1272w, https://substackcdn.com/image/fetch/$s_!ho2U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc9b5a4a-b0cb-41ff-9857-194966642602_942x602.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Uby!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Uby!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png 424w, https://substackcdn.com/image/fetch/$s_!0Uby!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png 848w, https://substackcdn.com/image/fetch/$s_!0Uby!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png 1272w, https://substackcdn.com/image/fetch/$s_!0Uby!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Uby!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png" width="943" height="647" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:647,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42463,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0Uby!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png 424w, https://substackcdn.com/image/fetch/$s_!0Uby!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png 848w, https://substackcdn.com/image/fetch/$s_!0Uby!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png 1272w, https://substackcdn.com/image/fetch/$s_!0Uby!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F896efe9d-392e-496f-a553-ee2c20ee74d3_943x647.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The pipeline is intentionally linear at the stage level &#8212; each pattern feeds the next. Within stages 1 and 2 there is parallelism (fan-out across controls). Stages 3, 4, and 5 are sequential quality gates. Stage 6 is the control loop. This structure is easy to reason about and straightforward to instrument in Claude Code.</p><h2>5. Understanding the Project Structure</h2><h3>5.0 What Each File Actually Does</h3><p>Before looking at the folder layout, it helps to understand the role of each component. This is where readers unfamiliar with Claude Code often get confused &#8212; the files serve very different purposes and the names do not make that obvious.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6D4u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6D4u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png 424w, https://substackcdn.com/image/fetch/$s_!6D4u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png 848w, https://substackcdn.com/image/fetch/$s_!6D4u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png 1272w, https://substackcdn.com/image/fetch/$s_!6D4u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6D4u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png" width="942" height="651" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:651,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:103046,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6D4u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png 424w, https://substackcdn.com/image/fetch/$s_!6D4u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png 848w, https://substackcdn.com/image/fetch/$s_!6D4u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png 1272w, https://substackcdn.com/image/fetch/$s_!6D4u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ba40b8-631c-42a1-9a09-bc7422f86c7f_942x651.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A common point of confusion: CLAUDE.md is not a compliance document. It is a project instruction file for the AI. A real business would already have policies and checklists in SharePoint or a GRC tool &#8212; CLAUDE.md is the one additional file they would write specifically for Claude Code, telling it how to handle their project.</p><h3>5.1 The Difference from Prior Projects</h3><p>In the ASCRS Harness Lab and the Harness Engineering experiments, I was the orchestrator. I reviewed the output of H1, decided to run H2, reviewed that result across a Claude.ai session, upgraded the harness, ran the next stage. The coordination intelligence lived in my judgment and in any prior Claude.ai conversation history.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pQlA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pQlA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png 424w, https://substackcdn.com/image/fetch/$s_!pQlA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png 848w, https://substackcdn.com/image/fetch/$s_!pQlA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png 1272w, https://substackcdn.com/image/fetch/$s_!pQlA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pQlA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png" width="941" height="391" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:391,&quot;width&quot;:941,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:20153,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pQlA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png 424w, https://substackcdn.com/image/fetch/$s_!pQlA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png 848w, https://substackcdn.com/image/fetch/$s_!pQlA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png 1272w, https://substackcdn.com/image/fetch/$s_!pQlA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb34396e3-01b7-46c2-ad55-fcf5c5c6ca42_941x391.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Dynamic workflows move that coordination into the .js file. Claude Code runs all stages &#8212; fan-out, adversarial check, tournament, loop &#8212; autonomously, without you intervening between them. The session you managed manually across multiple Claude.ai windows is now a single Claude Code run from one prompt.</p><h3>5.2 Learning Scenario vs Real Business</h3><p>The prompts in Section 7 cover both situations. The distinction is only in Step 0.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TtE0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TtE0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png 424w, https://substackcdn.com/image/fetch/$s_!TtE0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png 848w, https://substackcdn.com/image/fetch/$s_!TtE0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png 1272w, https://substackcdn.com/image/fetch/$s_!TtE0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TtE0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png" width="941" height="352" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:352,&quot;width&quot;:941,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:50519,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TtE0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png 424w, https://substackcdn.com/image/fetch/$s_!TtE0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png 848w, https://substackcdn.com/image/fetch/$s_!TtE0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png 1272w, https://substackcdn.com/image/fetch/$s_!TtE0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16aa7ef1-c709-4dc9-824a-323688e2e803_941x352.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>5.3 Folder Structure</h3><p>Create a dedicated subfolder. Claude Code reads CLAUDE.md at the project level and saves workflows relative to the active workspace. Opening in the parent folder risks picking up a different CLAUDE.md or none at all.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IHSV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IHSV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png 424w, https://substackcdn.com/image/fetch/$s_!IHSV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png 848w, https://substackcdn.com/image/fetch/$s_!IHSV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png 1272w, https://substackcdn.com/image/fetch/$s_!IHSV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IHSV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png" width="1152" height="610" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:610,&quot;width&quot;:1152,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1060217,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IHSV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png 424w, https://substackcdn.com/image/fetch/$s_!IHSV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png 848w, https://substackcdn.com/image/fetch/$s_!IHSV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png 1272w, https://substackcdn.com/image/fetch/$s_!IHSV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26b50a1-87e0-4cd7-a0c8-c3f2bf3bd02b_1152x610.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ixJu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ixJu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png 424w, https://substackcdn.com/image/fetch/$s_!ixJu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png 848w, https://substackcdn.com/image/fetch/$s_!ixJu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png 1272w, https://substackcdn.com/image/fetch/$s_!ixJu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ixJu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png" width="937" height="477" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:477,&quot;width&quot;:937,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45991,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ixJu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png 424w, https://substackcdn.com/image/fetch/$s_!ixJu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png 848w, https://substackcdn.com/image/fetch/$s_!ixJu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png 1272w, https://substackcdn.com/image/fetch/$s_!ixJu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa122123-a836-4da8-9dd9-9d9a25ef5004_937x477.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Open in the compliance-audit/ folder directly (not the parent). Claude Code reads the nearest CLAUDE.md it finds walking up from the workspace root. If you open in the parent folder, it may pick up a different CLAUDE.md or none at all.</p><h3>5.4 Saving and Reusing Workflows</h3><p>When a dynamic workflow finishes, press S in the workflow menu to save it to .claude/workflows/. Saved workflows can be called by name in future prompts. They are plain JavaScript files you can inspect, edit, and version-control. Each saved workflow becomes a named stage in your experiment log &#8212; equivalent to the H1&#8211;H10 harness notation from prior work.</p><h2>6. Model Routing: Sonnet Plans, Haiku Executes</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!THzf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!THzf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png 424w, https://substackcdn.com/image/fetch/$s_!THzf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png 848w, https://substackcdn.com/image/fetch/$s_!THzf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png 1272w, https://substackcdn.com/image/fetch/$s_!THzf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!THzf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png" width="1121" height="586" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:586,&quot;width&quot;:1121,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:854561,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!THzf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png 424w, https://substackcdn.com/image/fetch/$s_!THzf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png 848w, https://substackcdn.com/image/fetch/$s_!THzf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png 1272w, https://substackcdn.com/image/fetch/$s_!THzf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02b26abc-163a-48e5-8ba1-02919eef6b3b_1121x586.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>6.1 How Model Switching Actually Works</h3><p>There is no UI toggle for switching models mid-workflow. The switching is in the workflow .js file itself. When Sonnet writes the harness, it can specify which model each subagent uses. Subagents that require heavy reasoning get Sonnet; subagents doing bulk, well-specified checks get Haiku. The orchestrator assigns models; you do not.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CUi1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CUi1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png 424w, https://substackcdn.com/image/fetch/$s_!CUi1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png 848w, https://substackcdn.com/image/fetch/$s_!CUi1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png 1272w, https://substackcdn.com/image/fetch/$s_!CUi1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CUi1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png" width="943" height="585" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:585,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33412,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CUi1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png 424w, https://substackcdn.com/image/fetch/$s_!CUi1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png 848w, https://substackcdn.com/image/fetch/$s_!CUi1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png 1272w, https://substackcdn.com/image/fetch/$s_!CUi1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654d2422-37c0-4799-afdf-e6a9375d153c_943x585.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In practice, you do not write this routing logic. You prompt Sonnet to write a workflow where bulk checks use Haiku and synthesis and judgment use Sonnet. Sonnet generates the .js file with the model assignments already in place. The routing instruction goes into your initial prompt, not into code you write.</p><h3>6.2 When to Route to Each Model</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Nip-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Nip-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png 424w, https://substackcdn.com/image/fetch/$s_!Nip-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png 848w, https://substackcdn.com/image/fetch/$s_!Nip-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png 1272w, https://substackcdn.com/image/fetch/$s_!Nip-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Nip-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png" width="938" height="217" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:217,&quot;width&quot;:938,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33161,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Nip-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png 424w, https://substackcdn.com/image/fetch/$s_!Nip-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png 848w, https://substackcdn.com/image/fetch/$s_!Nip-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png 1272w, https://substackcdn.com/image/fetch/$s_!Nip-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b69a2f5-c02b-45bd-a7df-3145fb3fd974_938x217.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The ASCRS result (H2 defeating H9) was not a refutation of multi-agent design. It measured a greenfield document task where planning overhead dominated execution benefit. Routing addresses this: Sonnet is applied only where reasoning matters, Haiku handles volume. The cost scales linearly with the number of rules, not quadratically with agent count.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!01ZW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!01ZW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png 424w, https://substackcdn.com/image/fetch/$s_!01ZW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png 848w, https://substackcdn.com/image/fetch/$s_!01ZW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png 1272w, https://substackcdn.com/image/fetch/$s_!01ZW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!01ZW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png" width="1143" height="621" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:1143,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1008073,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!01ZW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png 424w, https://substackcdn.com/image/fetch/$s_!01ZW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png 848w, https://substackcdn.com/image/fetch/$s_!01ZW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png 1272w, https://substackcdn.com/image/fetch/$s_!01ZW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c77bad8-318b-4fac-8523-28024e3d16d7_1143x621.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>6.3 Where Tokens Actually Accumulate</h3><p>The per-stage budget estimate is only part of the picture. The more important question is what accumulates across multiple ultracode calls in a session. The answer is not the Haiku agents &#8212; those are isolated and fresh each time. The risk is in three other places.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CrVb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CrVb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png 424w, https://substackcdn.com/image/fetch/$s_!CrVb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png 848w, https://substackcdn.com/image/fetch/$s_!CrVb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png 1272w, https://substackcdn.com/image/fetch/$s_!CrVb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CrVb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png" width="945" height="657" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:657,&quot;width&quot;:945,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41803,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CrVb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png 424w, https://substackcdn.com/image/fetch/$s_!CrVb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png 848w, https://substackcdn.com/image/fetch/$s_!CrVb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png 1272w, https://substackcdn.com/image/fetch/$s_!CrVb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ecbe6b5-e311-4e8b-be84-0cc291cd87c8_945x657.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>6.4 Token Controls</h3><p>Five controls are available. The first three are in my prompts directly &#8212; the most reliable form of control because Claude Code treats token caps as hard limits, not suggestions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hpRa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hpRa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png 424w, https://substackcdn.com/image/fetch/$s_!hpRa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png 848w, https://substackcdn.com/image/fetch/$s_!hpRa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png 1272w, https://substackcdn.com/image/fetch/$s_!hpRa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hpRa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png" width="943" height="536" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:536,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:90144,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hpRa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png 424w, https://substackcdn.com/image/fetch/$s_!hpRa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png 848w, https://substackcdn.com/image/fetch/$s_!hpRa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png 1272w, https://substackcdn.com/image/fetch/$s_!hpRa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb656074-22de-4c9f-a3ec-98b2537e0768_943x536.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>6.5 Token Caps Embedded in Each Stage Prompt</h3><p>The prompts in Section 7 include token caps and output format controls directly. The table below shows the cap per stage and the rationale.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7wzw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7wzw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png 424w, https://substackcdn.com/image/fetch/$s_!7wzw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png 848w, https://substackcdn.com/image/fetch/$s_!7wzw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png 1272w, https://substackcdn.com/image/fetch/$s_!7wzw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7wzw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png" width="942" height="493" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:493,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73787,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7wzw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png 424w, https://substackcdn.com/image/fetch/$s_!7wzw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png 848w, https://substackcdn.com/image/fetch/$s_!7wzw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png 1272w, https://substackcdn.com/image/fetch/$s_!7wzw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7a4c1f-54a4-4244-947c-5cf11c8309cd_942x493.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>7. Prompts for Claude Code</h2><h3>7.1 What &#8216;ultracode&#8217; Means</h3><p>ultracode is a trigger word, not a separate mode. Anthropic built it into Claude Code as a shorthand signal meaning: write a dynamic workflow JavaScript harness for this task, rather than responding inline or writing a simple script. Without it, Claude Code may answer directly or write straightforward code. With it, Claude Code commits to building a multi-agent orchestration file.</p><p>You can also just ask Claude to &#8216;use a workflow&#8217; or &#8216;create a workflow&#8217; explicitly &#8212; ultracode is simply the most compact form. It is useful in short prompts where you want the harness behaviour without a long preamble.</p><h3>7.2 The /goal Command</h3><p>/goal sets a persistent objective anchor at the session level. You set it once at the start and Claude Code holds it in view across all subsequent prompts, flagging if any action would move away from it. It does not re-run anything automatically &#8212; it is orientation, not execution.</p><p>Loop Until Done is an active control loop inside a workflow. It re-spawns agents, checks a stop condition, and continues until that condition is met &#8212; autonomously, without you prompting again. The two are complementary, not alternatives.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!onDU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!onDU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png 424w, https://substackcdn.com/image/fetch/$s_!onDU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png 848w, https://substackcdn.com/image/fetch/$s_!onDU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png 1272w, https://substackcdn.com/image/fetch/$s_!onDU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!onDU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png" width="942" height="430" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:430,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23853,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!onDU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png 424w, https://substackcdn.com/image/fetch/$s_!onDU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png 848w, https://substackcdn.com/image/fetch/$s_!onDU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png 1272w, https://substackcdn.com/image/fetch/$s_!onDU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7deb3db-4c9c-41b1-af64-a18e2078cc99_942x430.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For this case study: type &#8216;/goal compliance_score = 1.0&#8217; before running Stage 1. Each stage then runs with that objective visible. The Loop Until Done in Stage 5 is what drives toward it. If you skip a stage or run them out of order, Claude Code will note that the /goal is not yet satisfied.</p><h3>7.3 Prompts, in Order</h3><p>If you have no pre-existing files &#8212; which is the case for this case study &#8212; start at Step 0. That single prompt creates the entire project environment: CLAUDE.md, checklist.md, all ten evidence files, and the three intentional gaps. After Step 0 finishes, the folder is fully populated and Stages 1&#8211;5 can run against it. You do not write any files yourself.</p><p>If you are running against real documents, skip Step 0 and go directly to Stage 1. The workflow prompts are identical either way &#8212; only the content of the files differs.</p><h4>Step 0: Create the Entire Project Environment</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2b2f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2b2f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png 424w, https://substackcdn.com/image/fetch/$s_!2b2f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png 848w, https://substackcdn.com/image/fetch/$s_!2b2f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png 1272w, https://substackcdn.com/image/fetch/$s_!2b2f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2b2f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png" width="1121" height="586" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:586,&quot;width&quot;:1121,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:854561,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2b2f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png 424w, https://substackcdn.com/image/fetch/$s_!2b2f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png 848w, https://substackcdn.com/image/fetch/$s_!2b2f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png 1272w, https://substackcdn.com/image/fetch/$s_!2b2f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F498dfa9f-911d-4257-af22-1b9343c3c40f_1121x586.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-bmP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa69b5189-4092-45f6-8409-b7059f59a762_1137x622.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-bmP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa69b5189-4092-45f6-8409-b7059f59a762_1137x622.png 424w, https://substackcdn.com/image/fetch/$s_!-bmP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa69b5189-4092-45f6-8409-b7059f59a762_1137x622.png 848w, https://substackcdn.com/image/fetch/$s_!-bmP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa69b5189-4092-45f6-8409-b7059f59a762_1137x622.png 1272w, https://substackcdn.com/image/fetch/$s_!-bmP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa69b5189-4092-45f6-8409-b7059f59a762_1137x622.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-bmP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa69b5189-4092-45f6-8409-b7059f59a762_1137x622.png" width="1137" height="622" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a69b5189-4092-45f6-8409-b7059f59a762_1137x622.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:622,&quot;width&quot;:1137,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:901002,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa69b5189-4092-45f6-8409-b7059f59a762_1137x622.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-bmP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa69b5189-4092-45f6-8409-b7059f59a762_1137x622.png 424w, https://substackcdn.com/image/fetch/$s_!-bmP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa69b5189-4092-45f6-8409-b7059f59a762_1137x622.png 848w, https://substackcdn.com/image/fetch/$s_!-bmP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa69b5189-4092-45f6-8409-b7059f59a762_1137x622.png 1272w, https://substackcdn.com/image/fetch/$s_!-bmP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa69b5189-4092-45f6-8409-b7059f59a762_1137x622.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aQPb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aQPb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png 424w, https://substackcdn.com/image/fetch/$s_!aQPb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png 848w, https://substackcdn.com/image/fetch/$s_!aQPb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png 1272w, https://substackcdn.com/image/fetch/$s_!aQPb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aQPb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png" width="941" height="387" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:387,&quot;width&quot;:941,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:68069,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aQPb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png 424w, https://substackcdn.com/image/fetch/$s_!aQPb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png 848w, https://substackcdn.com/image/fetch/$s_!aQPb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png 1272w, https://substackcdn.com/image/fetch/$s_!aQPb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed4ab8f-63a1-450a-9621-702112409c2d_941x387.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Step 1: Fan-out-and-Synthesize</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k8pk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k8pk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png 424w, https://substackcdn.com/image/fetch/$s_!k8pk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png 848w, https://substackcdn.com/image/fetch/$s_!k8pk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png 1272w, https://substackcdn.com/image/fetch/$s_!k8pk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k8pk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png" width="783" height="288" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/de4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:288,&quot;width&quot;:783,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:44797,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k8pk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png 424w, https://substackcdn.com/image/fetch/$s_!k8pk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png 848w, https://substackcdn.com/image/fetch/$s_!k8pk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png 1272w, https://substackcdn.com/image/fetch/$s_!k8pk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde4e8a4c-2d0f-434d-867e-3b3ff0601157_783x288.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Step 2: Adversarial Verification</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UTcV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UTcV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png 424w, https://substackcdn.com/image/fetch/$s_!UTcV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png 848w, https://substackcdn.com/image/fetch/$s_!UTcV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png 1272w, https://substackcdn.com/image/fetch/$s_!UTcV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UTcV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png" width="785" height="186" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:186,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29258,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UTcV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png 424w, https://substackcdn.com/image/fetch/$s_!UTcV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png 848w, https://substackcdn.com/image/fetch/$s_!UTcV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png 1272w, https://substackcdn.com/image/fetch/$s_!UTcV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ff666ce-2986-4373-871f-66ea3ef5141d_785x186.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Step 3: Generate-and-Filter</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hGwp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f608a19-c34b-4509-bf95-a40236804383_941x197.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hGwp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f608a19-c34b-4509-bf95-a40236804383_941x197.png 424w, https://substackcdn.com/image/fetch/$s_!hGwp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f608a19-c34b-4509-bf95-a40236804383_941x197.png 848w, https://substackcdn.com/image/fetch/$s_!hGwp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f608a19-c34b-4509-bf95-a40236804383_941x197.png 1272w, https://substackcdn.com/image/fetch/$s_!hGwp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f608a19-c34b-4509-bf95-a40236804383_941x197.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hGwp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f608a19-c34b-4509-bf95-a40236804383_941x197.png" width="941" height="197" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f608a19-c34b-4509-bf95-a40236804383_941x197.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:197,&quot;width&quot;:941,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33061,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f608a19-c34b-4509-bf95-a40236804383_941x197.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hGwp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f608a19-c34b-4509-bf95-a40236804383_941x197.png 424w, https://substackcdn.com/image/fetch/$s_!hGwp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f608a19-c34b-4509-bf95-a40236804383_941x197.png 848w, https://substackcdn.com/image/fetch/$s_!hGwp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f608a19-c34b-4509-bf95-a40236804383_941x197.png 1272w, https://substackcdn.com/image/fetch/$s_!hGwp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f608a19-c34b-4509-bf95-a40236804383_941x197.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Step 4: Tournament</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IAex!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IAex!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png 424w, https://substackcdn.com/image/fetch/$s_!IAex!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png 848w, https://substackcdn.com/image/fetch/$s_!IAex!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png 1272w, https://substackcdn.com/image/fetch/$s_!IAex!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IAex!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png" width="943" height="156" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:156,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:27828,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IAex!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png 424w, https://substackcdn.com/image/fetch/$s_!IAex!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png 848w, https://substackcdn.com/image/fetch/$s_!IAex!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png 1272w, https://substackcdn.com/image/fetch/$s_!IAex!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1689ce4-2706-4ac9-93b5-44bdb90b3706_943x156.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Step 5: Loop Until Done</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s0gI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0231945b-db23-4ac8-af63-3d123a903585_940x177.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s0gI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0231945b-db23-4ac8-af63-3d123a903585_940x177.png 424w, https://substackcdn.com/image/fetch/$s_!s0gI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0231945b-db23-4ac8-af63-3d123a903585_940x177.png 848w, https://substackcdn.com/image/fetch/$s_!s0gI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0231945b-db23-4ac8-af63-3d123a903585_940x177.png 1272w, https://substackcdn.com/image/fetch/$s_!s0gI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0231945b-db23-4ac8-af63-3d123a903585_940x177.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s0gI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0231945b-db23-4ac8-af63-3d123a903585_940x177.png" width="940" height="177" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0231945b-db23-4ac8-af63-3d123a903585_940x177.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:177,&quot;width&quot;:940,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28889,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0231945b-db23-4ac8-af63-3d123a903585_940x177.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s0gI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0231945b-db23-4ac8-af63-3d123a903585_940x177.png 424w, https://substackcdn.com/image/fetch/$s_!s0gI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0231945b-db23-4ac8-af63-3d123a903585_940x177.png 848w, https://substackcdn.com/image/fetch/$s_!s0gI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0231945b-db23-4ac8-af63-3d123a903585_940x177.png 1272w, https://substackcdn.com/image/fetch/$s_!s0gI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0231945b-db23-4ac8-af63-3d123a903585_940x177.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Full Pipeline (Single Prompt)</h3><p><strong>FULL RUN</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qhF4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qhF4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png 424w, https://substackcdn.com/image/fetch/$s_!qhF4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png 848w, https://substackcdn.com/image/fetch/$s_!qhF4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png 1272w, https://substackcdn.com/image/fetch/$s_!qhF4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qhF4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png" width="786" height="667" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:667,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:98780,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qhF4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png 424w, https://substackcdn.com/image/fetch/$s_!qhF4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png 848w, https://substackcdn.com/image/fetch/$s_!qhF4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png 1272w, https://substackcdn.com/image/fetch/$s_!qhF4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6a5f4fa-58b2-454c-ba0c-7088c798939a_786x667.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Note that the evaluation criteria was very important, without which Haiku &#8220;interpreted&#8221; what failure meant:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ItOv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ItOv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png 424w, https://substackcdn.com/image/fetch/$s_!ItOv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png 848w, https://substackcdn.com/image/fetch/$s_!ItOv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png 1272w, https://substackcdn.com/image/fetch/$s_!ItOv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ItOv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png" width="652" height="197" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:197,&quot;width&quot;:652,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:27691,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ItOv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png 424w, https://substackcdn.com/image/fetch/$s_!ItOv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png 848w, https://substackcdn.com/image/fetch/$s_!ItOv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png 1272w, https://substackcdn.com/image/fetch/$s_!ItOv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1e7b6b-baca-4986-9638-446bc8ec3fba_652x197.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GC3y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GC3y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png 424w, https://substackcdn.com/image/fetch/$s_!GC3y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png 848w, https://substackcdn.com/image/fetch/$s_!GC3y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png 1272w, https://substackcdn.com/image/fetch/$s_!GC3y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GC3y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png" width="728" height="141.82882882882882" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:173,&quot;width&quot;:888,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:15652,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GC3y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png 424w, https://substackcdn.com/image/fetch/$s_!GC3y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png 848w, https://substackcdn.com/image/fetch/$s_!GC3y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png 1272w, https://substackcdn.com/image/fetch/$s_!GC3y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cdd332-c3c8-4d16-bcdd-4d2105e6884d_888x173.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>Run stages 1&#8211;5 individually when:</strong></p><ul><li><p>You are doing this for the first time</p></li><li><p>You want to inspect the output of each stage before proceeding</p></li><li><p>You want to check the token count after each stage</p></li><li><p>Something fails and you need to know which stage broke</p></li></ul><p><strong>Run the Full Pipeline prompt when:</strong></p><ul><li><p>You have already run the stages individually at least once and know they work</p></li><li><p>You want to run the whole audit in one go (production use)</p></li><li><p>You are re-running against updated evidence files on a recurring basis</p></li></ul><p>The practical learning path is: run Step 0, then run stages individually the first time through. After that, save the workflow with S and use the Full Pipeline prompt for every subsequent run. That is what &#8220;save workflow as audit_full.js&#8221; at the end of the Full Run prompt means &#8212; the first full run generates and saves the reusable harness.</p><h2>8. Architectural Evolution: H1 to Dynamic Workflow</h2><p>The diagram below places dynamic workflows in the progression from my earliest harness experiments. It is a single continuous trajectory, not a replacement.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NFmG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81011772-1c91-4130-ae2c-020195ac2125_933x457.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NFmG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81011772-1c91-4130-ae2c-020195ac2125_933x457.png 424w, https://substackcdn.com/image/fetch/$s_!NFmG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81011772-1c91-4130-ae2c-020195ac2125_933x457.png 848w, https://substackcdn.com/image/fetch/$s_!NFmG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81011772-1c91-4130-ae2c-020195ac2125_933x457.png 1272w, https://substackcdn.com/image/fetch/$s_!NFmG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81011772-1c91-4130-ae2c-020195ac2125_933x457.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NFmG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81011772-1c91-4130-ae2c-020195ac2125_933x457.png" width="933" height="457" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81011772-1c91-4130-ae2c-020195ac2125_933x457.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:457,&quot;width&quot;:933,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22545,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81011772-1c91-4130-ae2c-020195ac2125_933x457.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NFmG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81011772-1c91-4130-ae2c-020195ac2125_933x457.png 424w, https://substackcdn.com/image/fetch/$s_!NFmG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81011772-1c91-4130-ae2c-020195ac2125_933x457.png 848w, https://substackcdn.com/image/fetch/$s_!NFmG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81011772-1c91-4130-ae2c-020195ac2125_933x457.png 1272w, https://substackcdn.com/image/fetch/$s_!NFmG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81011772-1c91-4130-ae2c-020195ac2125_933x457.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f67-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d14927-8923-449b-8b95-dc8941540c28_938x518.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f67-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d14927-8923-449b-8b95-dc8941540c28_938x518.png 424w, https://substackcdn.com/image/fetch/$s_!f67-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d14927-8923-449b-8b95-dc8941540c28_938x518.png 848w, https://substackcdn.com/image/fetch/$s_!f67-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d14927-8923-449b-8b95-dc8941540c28_938x518.png 1272w, https://substackcdn.com/image/fetch/$s_!f67-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d14927-8923-449b-8b95-dc8941540c28_938x518.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!f67-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d14927-8923-449b-8b95-dc8941540c28_938x518.png" width="938" height="518" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97d14927-8923-449b-8b95-dc8941540c28_938x518.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:518,&quot;width&quot;:938,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24823,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d14927-8923-449b-8b95-dc8941540c28_938x518.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f67-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d14927-8923-449b-8b95-dc8941540c28_938x518.png 424w, https://substackcdn.com/image/fetch/$s_!f67-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d14927-8923-449b-8b95-dc8941540c28_938x518.png 848w, https://substackcdn.com/image/fetch/$s_!f67-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d14927-8923-449b-8b95-dc8941540c28_938x518.png 1272w, https://substackcdn.com/image/fetch/$s_!f67-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d14927-8923-449b-8b95-dc8941540c28_938x518.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>9. Prior Approaches vs Workflow Patterns</h2><p>An honest question for any project where you already have working methods and times when I have had to rethink: would the prior approaches have handled the compliance audit as well as, or better than, a dynamic workflow? The answer depends on the scale and who is coordinating.</p><p><strong>Prompting skill does not disappear with workflows &#8212; it moves upstream.</strong></p><p>With a single-agent prompt, bad instructions produce a bad response and you see it immediately. With a workflow, bad instructions get multiplied across ten Haiku agents running in parallel, fed into a Sonnet synthesizer, passed to an adversary, and looped. The error compounds before you ever see output. The prompt matters more, not less.</p><p><strong>The workflow is only as good as its evaluation criteria.</strong></p><p>Initially i had some issues not being specific with evaluation criteria for Haiku (which it then &#8220;reinterpreted&#8221;). What Claude Code flagged was not a structural problem &#8212; the pipeline logic was correct. It was a <strong>specification problem: the agents (Haiku) had no precise definition of what constitutes a gap.</strong> Without that, every agent applies its own judgment independently and inconsistently. The guard rails I added (three gap conditions, explicit time limits, adversary dismissal rule) are not workflow logic &#8212; they are a rubric. The same thing when I hand-crafted for the ASCRS gold answer and the &#945; metric. The <strong>mechanism is different but the discipline is the same.</strong></p><p><strong>Automation changes when you catch errors, not whether you catch them.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g_Uh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g_Uh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png 424w, https://substackcdn.com/image/fetch/$s_!g_Uh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png 848w, https://substackcdn.com/image/fetch/$s_!g_Uh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png 1272w, https://substackcdn.com/image/fetch/$s_!g_Uh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g_Uh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png" width="811" height="251" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:251,&quot;width&quot;:811,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33925,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g_Uh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png 424w, https://substackcdn.com/image/fetch/$s_!g_Uh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png 848w, https://substackcdn.com/image/fetch/$s_!g_Uh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png 1272w, https://substackcdn.com/image/fetch/$s_!g_Uh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cd2da6b-0901-466d-a7dd-025cb26f38fc_811x251.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is why running stages individually the first time is genuinely important &#8212; not just for token management but because it surfaces specification gaps at Stage 1 before they propagate to Stage 5. Once you have confirmed the evaluation criteria work correctly at Stage 1, the consolidated prompt is safe to use repeatedly.</p><p>The broader point and this is so critical: <strong>dynamic workflows are a force multiplier on whatever the prompt says. Precise prompting produces precise automation. Vague prompting produces vague automation at scale, which is harder to diagnose than vague output from a single agent. The skill shifts from crafting one good response to crafting one good specification &#8212; which is a higher-order version of the same discipline.</strong></p><h3>9.1 How Each Prior Approach Maps to the Audit</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vUKW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vUKW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png 424w, https://substackcdn.com/image/fetch/$s_!vUKW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png 848w, https://substackcdn.com/image/fetch/$s_!vUKW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png 1272w, https://substackcdn.com/image/fetch/$s_!vUKW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vUKW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png" width="942" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:100877,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vUKW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png 424w, https://substackcdn.com/image/fetch/$s_!vUKW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png 848w, https://substackcdn.com/image/fetch/$s_!vUKW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png 1272w, https://substackcdn.com/image/fetch/$s_!vUKW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc4e0e31-15f1-4783-9492-ee3a266ac59c_942x617.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>9.2 The Core Trade-off: Who Coordinates</h3><p>The capability difference between prior methods and dynamic workflows is not large for a 10-control case study. The coordination difference is significant at any scale. These are questions you, the reader, consider before you apply Dynamic Workflows. I have found them useful:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vkXn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54af2e-ef27-4e8b-b739-6124295004de_785x712.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vkXn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54af2e-ef27-4e8b-b739-6124295004de_785x712.png 424w, https://substackcdn.com/image/fetch/$s_!vkXn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54af2e-ef27-4e8b-b739-6124295004de_785x712.png 848w, https://substackcdn.com/image/fetch/$s_!vkXn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54af2e-ef27-4e8b-b739-6124295004de_785x712.png 1272w, https://substackcdn.com/image/fetch/$s_!vkXn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54af2e-ef27-4e8b-b739-6124295004de_785x712.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vkXn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54af2e-ef27-4e8b-b739-6124295004de_785x712.png" width="785" height="712" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb54af2e-ef27-4e8b-b739-6124295004de_785x712.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:712,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:47124,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54af2e-ef27-4e8b-b739-6124295004de_785x712.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vkXn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54af2e-ef27-4e8b-b739-6124295004de_785x712.png 424w, https://substackcdn.com/image/fetch/$s_!vkXn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54af2e-ef27-4e8b-b739-6124295004de_785x712.png 848w, https://substackcdn.com/image/fetch/$s_!vkXn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54af2e-ef27-4e8b-b739-6124295004de_785x712.png 1272w, https://substackcdn.com/image/fetch/$s_!vkXn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54af2e-ef27-4e8b-b739-6124295004de_785x712.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>9.3 Practical Recommendation</h3><p>For a 10-control checklist with known gaps, a well-crafted single prompt in Claude.ai (the H2 approach) would find the gaps adequately. The workflow earns its complexity at three thresholds. Take this with a pinch of salt. Different tasks, circumstances and applications to various domains apply:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0V76!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0V76!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png 424w, https://substackcdn.com/image/fetch/$s_!0V76!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png 848w, https://substackcdn.com/image/fetch/$s_!0V76!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png 1272w, https://substackcdn.com/image/fetch/$s_!0V76!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0V76!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png" width="943" height="356" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:356,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56923,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200899094?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0V76!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png 424w, https://substackcdn.com/image/fetch/$s_!0V76!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png 848w, https://substackcdn.com/image/fetch/$s_!0V76!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png 1272w, https://substackcdn.com/image/fetch/$s_!0V76!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36d1c0a8-8743-4ec0-bc54-23f2f5ce9bff_943x356.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The ASCRS result (H2 defeating H9) still holds here: for a greenfield, one-time task with few inputs, a precise prompt beats a multi-agent workflow. <strong>Workflows pay for themselves when the task recurs, scales, or requires structural independence between agents to prevent bias.</strong></p><h2><strong>Final Thoughts - Evaluation criteria must be clear!</strong></h2><p>A high level of dicipline is required when using workflows. There are four things worth building into your general practice.</p><div><hr></div><p><strong>1. A rubric before you run</strong></p><p>Before writing the workflow prompt, write down explicitly:</p><ul><li><p>What counts as a pass</p></li><li><p>What counts as a fail</p></li><li><p>What the edge cases are and which way they resolve</p></li></ul><p>This is exactly what I built for ASCRS (gold_answer.md, deliberate traps, three-tier confidence). The same discipline applies to workflow prompts. If you cannot write the rubric in plain language before running, the agents cannot apply it consistently during the run.</p><pre><code><code>ASCRS equivalent        Workflow equivalent
&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;
gold_answer.md          Evaluation rules in Stage 1
Deliberate traps        Edge cases called out explicitly
&#945; scoring rubric        Adversary dismissal criteria
Expected H1 alpha       Token budget with loop guard
</code></code></pre><div><hr></div><p><strong>2. A pre-flight checklist for any workflow prompt</strong></p><p>Before running, ask these five questions:</p><pre><code><code>+--------------------------------------------------+
| 1. INPUTS    Are all input files named            |
|              explicitly? Does the prompt tell     |
|              agents exactly where to read from?   |
+--------------------------------------------------+
| 2. OUTPUTS   Are all output paths explicit?       |
|              reports/ not just a filename.        |
+--------------------------------------------------+
| 3. CRITERIA  Is pass/fail defined precisely?      |
|              Could two agents reading the same    |
|              file reach different conclusions?    |
+--------------------------------------------------+
| 4. EDGE CASES Are time limits, thresholds, and    |
|              special conditions stated? Or left   |
|              to agent judgment?                   |
+--------------------------------------------------+
| 5. FAILURE   What happens if a stage fails or     |
|              the loop does not converge?          |
|              Is there a fallback and a cap?       |
+--------------------------------------------------+
</code></code></pre><p>If any answer is &#8220;the agent will figure it out&#8221; &#8212; that is the gap to fix before running.</p><div><hr></div><p><strong>3. A dry run on one unit before fanning out</strong></p><p>Before running the full fan-out across all ten controls, run one agent against one control manually and inspect the JSON output. Confirm:</p><ul><li><p>The status field is populated correctly</p></li><li><p>The gap field says what you expect</p></li><li><p>The evidenceFile field points to a real file</p></li></ul><p>One agent, one control, one read. If that output is wrong, fix the evaluation rules before multiplying across all ten. This costs almost nothing and catches specification errors before they compound.</p><pre><code><code>  Test one unit first
        |
        v
  +------------------+
  | Agent: C3 only   |  &lt;- inspect output manually
  +------------------+
        |
   Output correct?
        |
   Yes --&gt; run full fan-out
   No  --&gt; fix evaluation rules, re-test C3
</code></code></pre><div><hr></div><p><strong>4. A prompt template for future workflow projects</strong></p><p>This is the reusable structure that forces completeness. I keep it as a file in my  projects folder:</p><pre><code><code>WORKFLOW PROMPT TEMPLATE
&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;
ultracode: read CLAUDE.md for output rules.

INPUTS:
  [list every file the workflow reads]

PIPELINE:
  Stage 1: [what it does]
    Evaluation rules:
    - Pass condition: [explicit]
    - Fail condition: [explicit, with all edge cases]
    - Do not: [what agents must not invent]

  Stage 2: [what it does]
    Dismissal rules:
    - Dismiss if: [explicit condition]
    - Confirm if: [explicit condition]
    - Do not: [what the adversary must not invent]

  Stage 3-N: [downstream stages]

OUTPUTS:
  [list every file the workflow writes, with full path]

CONTROLS:
  Token budget: [number]k
  Loop maximum: [N] passes
  Fallback: if not converged after [N] passes, [action]

Save workflow as [name].js
</code></code></pre><div><hr></div><p>The underlying principle across all four is the same one the ASCRS work established: <em><strong>the intelligence is in the structure, not the execution. A workflow with vague criteria delegates judgment to agents that have no shared context and no consistent standard. A workflow with precise criteria turns agents into reliable executors of a well-defined specification. The prompt is the specification. Writing it is the skilled work.</strong></em></p><p><em><strong>A lot of good advice is found in the <a href="https://claude.com/blog/a-harness-for-every-task-dynamic-workflows-in-claude-code">Anthropic Blog </a>on similar.</strong></em></p><h1>References</h1><p>&#8226; <a href="http://claude.com/blog/a-harness-for-every-task-dynamic-workflows-in-claude-code">Anthropic (2026). &#8220;A harness for every task: dynamic workflows in Claude Code.&#8221; Thariq Shihipar &amp; Sid Bidasaria. June 2, 2026</a>. </p><p>&#8226; <a href="http://code.claude.com/docs/en/workflows">Anthropic (2026). Dynamic Workflows reference documentation.</a> </p><p>&#8226; &#8220;<a href="https://interestingengineering.substack.com/p/the-geometry-of-unpredictability">The Geometry of Unpredictability</a>.&#8221; Harness Engineering Series &#8212; SkillOpt integration, MCP token economics, failure taxonomy.</p><p>&#8226; &#8220;<a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">The Structure Is the Intelligence.</a>&#8221; StockPilot agent decomposition, CMA Cycles 0&#8211;4, 97% token reduction result.</p><p>&#8226; <a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">ASCRS Harness Lab</a>. H1&#8211;H10 architecture evaluation, &#945;/&#954; metrics, H2 vs H9 result (Hormuz Strait domain).</p><p>&#8226; &#8220;<a href="https://interestingengineering.substack.com/p/the-invisible-codebase">The Invisible Codebase.</a>&#8221; Reversa five-agent COBOL-to-Python pipeline.</p><p>&#8226; Harness Engineering <a href="https://interestingengineering.substack.com/p/harness-engineering-scaffolding-a">Parts I</a> &amp;<a href="https://interestingengineering.substack.com/p/the-token-tax"> II</a>. H1&#8211;H10 progressive harness experiments, E1&#8211;E11 token efficiency series.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Invisible Codebase]]></title><description><![CDATA[250 Billion Lines of Business Logic That Nobody Fully Understands, Why AI Is Finally Making It Extractable, and What the Practitioner Toolkit Looks Like in 2026]]></description><link>https://interestingengineering.substack.com/p/the-invisible-codebase</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/the-invisible-codebase</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Thu, 04 Jun 2026 18:55:12 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!cgP_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cgP_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cgP_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png 424w, https://substackcdn.com/image/fetch/$s_!cgP_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png 848w, https://substackcdn.com/image/fetch/$s_!cgP_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png 1272w, https://substackcdn.com/image/fetch/$s_!cgP_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cgP_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png" width="1115" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:1115,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1130741,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!cgP_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png 424w, https://substackcdn.com/image/fetch/$s_!cgP_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png 848w, https://substackcdn.com/image/fetch/$s_!cgP_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png 1272w, https://substackcdn.com/image/fetch/$s_!cgP_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a07de27-39a7-41ea-bae2-a6b82a1cf170_1115x617.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>In early 2026, a single Anthropic blog post about <strong>Claude Code and COBOL</strong> erased nearly $40 billion from IBM&#8217;s market capitalisation in a day (Note: IBM shares not only fully recovered from its $40 billion market cap drop, but the stock recently reached all-time highs - some of it better understood by reading through and understanding the experiment I ran). That reaction was not panic &#8212; it was the market correctly pricing a structural shift. This article surveys the landscape that shift created: what IBM, Amazon, and specialist firms are actually deploying at enterprise scale; whether Claude Code can perform meaningful migration without a structured framework; where Reversa (an open-source methodology released May 2026) fits in the ecosystem; and what a practitioner might learn by building and running a five-agent migration pipeline on their own machine. The central argument is that <strong>the tool matters far less than the methodology</strong> &#8212; specifically, the distinction between <strong>syntactic code translation and operational contract reconstruction</strong>, and the <strong>confidence tagging system that makes human oversight tractable rather than exhausting</strong>.</em></p><p><em>Note: <strong>The experiment relies on assumptions of a hypothetical but working code base written in COBOL. Tools of analysis were selected based on the requirements of the case study. Different models, harnesses and agentic pipelines can produce varying results, which depend on the methodologies applied herein.</strong> </em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_h_C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_h_C!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png 424w, https://substackcdn.com/image/fetch/$s_!_h_C!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png 848w, https://substackcdn.com/image/fetch/$s_!_h_C!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png 1272w, https://substackcdn.com/image/fetch/$s_!_h_C!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_h_C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png" width="1116" height="615" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:615,&quot;width&quot;:1116,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1104320,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_h_C!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png 424w, https://substackcdn.com/image/fetch/$s_!_h_C!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png 848w, https://substackcdn.com/image/fetch/$s_!_h_C!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png 1272w, https://substackcdn.com/image/fetch/$s_!_h_C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1cd80f25-a945-4996-a96e-b20c456361c6_1116x615.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Post That Moved Markets</h2><p>In February 2026, Anthropic published a post describing how Claude Code could automate the exploration and analysis of COBOL codebases. The announcement was technically measured &#8212; careful about what AI could and could not do. The market was less measured. IBM&#8217;s stock fell thirteen percent in a single session, its worst single-day loss since 2000. The trigger was a blog post about a 67-year-old programming language.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CvRf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CvRf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png 424w, https://substackcdn.com/image/fetch/$s_!CvRf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png 848w, https://substackcdn.com/image/fetch/$s_!CvRf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png 1272w, https://substackcdn.com/image/fetch/$s_!CvRf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CvRf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png" width="1119" height="620" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:1119,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1062413,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CvRf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png 424w, https://substackcdn.com/image/fetch/$s_!CvRf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png 848w, https://substackcdn.com/image/fetch/$s_!CvRf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png 1272w, https://substackcdn.com/image/fetch/$s_!CvRf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81fe655-0acd-4e89-8895-d022dc5c4027_1119x620.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>To understand why, you need to understand what COBOL represents to IBM&#8217;s business model. Approximately 84 percent of IBM&#8217;s Z mainframe clients are running COBOL applications. Those clients pay substantial ongoing fees for mainframe hardware, software licences, and specialised support. If AI could meaningfully accelerate the migration of those workloads off mainframe and onto commodity cloud infrastructure, the revenue implications for IBM were severe. The market priced that risk immediately.</p><p>Whether the market&#8217;s reaction was proportionate is a separate question. What it confirmed, beyond any industry analyst&#8217;s forecast, is that <em><strong>AI-assisted legacy migration has crossed from speculative to credible in the minds of the people who allocate capital</strong></em>. That is the moment this article is written inside.</p><p style="text-align: center;"></p><blockquote><p style="text-align: center;"><em>The market did not react to a technology demonstration. It reacted to a plausible threat to a decades-old business model. Those are different things, and the distinction matters for how you think about this space.</em></p></blockquote><h1>What Is Actually Being Migrated &#8212; and Why It Is Hard</h1><p>The scale figures are well known: an estimated 250 billion lines of COBOL in production worldwide, processing roughly 95 percent of ATM transactions and 80 percent of in-person credit card transactions daily. What is less often articulated is why migrating this code is structurally different from migrating any other software.</p><p>The answer is that a COBOL program written in 1985 and maintained through 2026 contains two codebases that are technically inseparable but conceptually distinct.</p><h3>The Visible Codebase</h3><p>Source files, database schemas, JCL job scripts, copybooks. Everything a developer or AI agent can read. This is what most migration tools operate on.</p><h3>The Invisible Codebase</h3><p>The accumulated decisions, regulatory adaptations, and implicit business rules that were never formally documented. The $9,500 threshold that was set in 1994 in response to a specific FinCEN guidance letter and exists nowhere except as a literal constant in a PROCEDURE DIVISION paragraph. The leap-year interest divisor that differs between account types for a reason that retired with the person who coded it. The DTI ceiling that was adjusted after a 2008 OCC examination finding and has never appeared in any policy document since.</p><p>Any migration approach that ignores the invisible codebase produces code that passes all its tests in a UAT environment and fails in production when it encounters the exact condition the invisible rule was written to handle. This has happened, repeatedly and expensively, at organisations that attempted big-bang migrations using earlier-generation automated tools.</p><blockquote><p><strong>THE CORE INSIGHT</strong></p><p>This is why the distinction between code translation and operational contract reconstruction is not academic. <strong>Code translation reads the visible codebase and rewrites it in another language. Operational contract reconstruction reads both codebases &#8212; the visible one from the source, the invisible one from patterns, comments, regulatory citations, and structured inference </strong>&#8212; and produces a specification that an agent can safely use as a migration target.</p></blockquote><h2>The Institutional Landscape: What Enterprises Are Actually Using</h2><p>Three distinct tiers have emerged, with very different economics, target audiences, and architectural philosophies.</p><h3>Tier 1 &#8212; IBM Watsonx Code Assistant for Z</h3><p>IBM&#8217;s response to the migration moment is a dedicated enterprise product: Watsonx Code Assistant for Z. The architecture is revealing. IBM trained a 20-billion parameter Granite foundation model on CodeNet &#8212; one of the largest code datasets available &#8212; supplemented by enterprise COBOL specifically to ensure the model understands the language&#8217;s idiosyncrasies rather than treating it as a dialect of something more common.</p><p>The target migration is COBOL to Java, not to Python or any other language. This is a deliberate choice: Java on IBM Z maintains the performance and transactional characteristics of the mainframe runtime while moving the code to a language with a large modern developer workforce. IBM is not trying to move workloads off mainframe &#8212; it is trying to modernise workloads while keeping them on mainframe. The business model preservation is explicit.</p><p>Version 2.8, released March 2026, introduced an agentic workflow using MCP (Model Context Protocol) to connect the AI to mainframe-specific data sources &#8212; job control metadata, copybook libraries, VSAM data definitions. The result is a system with genuine enterprise data governance: code never leaves the client&#8217;s environment if they use the on-premise deployment option, which matters considerably to banks that are legally restricted from transmitting certain data to external cloud services.</p><p>Pricing is enterprise. Deployment requires IBM Z hardware and software stack. The addressable audience is a few hundred institutions globally with the largest COBOL estates.</p><h3>Tier 2 &#8212; AWS Transform and the Claude Code Partnership</h3><p>Amazon&#8217;s approach is architecturally the most instructive for understanding what actually works at scale, because AWS has been unusually transparent about why it made the architectural choices it did.</p><p>AWS Transform handles reverse engineering. It deploys specialised deterministic agents that scan entire codebases &#8212; potentially millions of lines of COBOL, PL/I, and JCL &#8212; extracting structured business rules, dependency graphs, data lineage maps, and domain decompositions. The keyword is deterministic: AWS&#8217;s documentation explicitly states that this extraction uses purpose-built tools rather than probabilistic models, because probabilistic models at this stage introduce the kind of uncertainty that makes the downstream migration untrustworthy.</p><p>For the forward engineering phase &#8212; taking those extracted specifications and generating modern code &#8212; AWS recommends Claude Code. Their own published workflow blog post states this directly: AWS Transform produces the reliable foundation, Claude Code does the generation. The separation is the architecture.</p><blockquote><p><strong>THE AWS INSIGHT</strong></p><p>&#8220;This is not a task that you can solve by pointing a general-purpose AI Agent at a repository.&#8221; AWS&#8217;s own documentation on its enterprise mainframe service. The deterministic extraction layer is not a nice-to-have &#8212; it is what makes the generative layer trustworthy.</p></blockquote><p>Transamerica, Allianz, and Marriott Hotels are among AWS&#8217;s referenced enterprise modernisation clients. The service is cloud-hosted, not on-premise by default, which creates friction for the most heavily regulated banking institutions. AWS addressed this partly by acquiring Blu Age, a French automated refactoring company that handles automated COBOL-to-Java conversion for clients who want deterministic transformation rather than LLM-generated output.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DBAi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DBAi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png 424w, https://substackcdn.com/image/fetch/$s_!DBAi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png 848w, https://substackcdn.com/image/fetch/$s_!DBAi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png 1272w, https://substackcdn.com/image/fetch/$s_!DBAi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DBAi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png" width="1123" height="618" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:618,&quot;width&quot;:1123,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1177283,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DBAi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png 424w, https://substackcdn.com/image/fetch/$s_!DBAi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png 848w, https://substackcdn.com/image/fetch/$s_!DBAi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png 1272w, https://substackcdn.com/image/fetch/$s_!DBAi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ffaef7-78e5-4b63-8fd5-ffb4fd8dfe50_1123x618.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Tier 3 &#8212; Specialist Firms and Proprietary Methodologies</h3><p>A third tier of specialist companies approaches the problem from a fundamentally different angle. One illuminating example is Phase Change Software, whose product COBOL Colleague uses graph-based deterministic analysis to establish verified facts about COBOL programs before any LLM is involved. Their characterisation of Claude Code&#8217;s role is precise: &#8220;The LLM translates our verified facts into fluent prose. While Claude Code struggles to present the facts, it excels at narrating them.&#8221;</p><p>The consulting firms &#8212; Accenture, Deloitte, IBM Consulting &#8212; operate at the intersection of all three tiers, typically deploying combinations of vendor tools alongside proprietary frameworks developed through years of actual migration engagements. Their differentiator is not the AI; it is the pattern library of edge cases, regulatory nuances, and failure modes accumulated from real migrations.</p><h2>The Uncomfortable Reality Most Skip</h2><p>Most large financial institutions are not migrating their COBOL. They are running it, carefully, and managing the risk of the aging workforce through knowledge transfer programmes, documentation initiatives, and strategic hiring of the few remaining COBOL-fluent developers. The migration momentum exists at second-tier institutions and government agencies, not at the top twenty global banks, most of which have attempted and partially failed major migration efforts in the past two decades and carry institutional scar tissue from those experiences.</p><p>AI-assisted migration changes the economics and the risk profile of attempting it. It does not eliminate the need for judgment, the risk of missing an invisible rule, or the regulatory requirement to prove behavioral equivalence before cutover. Those constraints remain structural regardless of which tool sits in the pipeline.</p><h2>Can Claude Do It Without a Structured Framework?</h2><p>This is the question practitioners naturally ask, and it deserves a direct answer rather than a diplomatic one.</p><p>Yes, with important caveats. Claude, prompted carefully, can read a COBOL program, identify its business rules, generate idiomatic Python, and write tests that verify the output. For a clean, well-commented COBOL program with regulatory citations in comments and consistent naming conventions, the results are genuinely useful on a first pass.</p><p>The TechChannel evaluation team tested Claude Code on real Medicare payment COBOL &#8212; not the kind of clean example used in tutorials &#8212; and documented specific failure modes: dropped conditional branches, compressed algorithms that merged adjacent logic, and incorrect regulatory constants. Their conclusion was blunt: in a financial or healthcare system, the difference between &#8220;mostly correct&#8221; and &#8220;correct&#8221; is the difference between compliance and liability.</p><p>Three things that a structured framework adds, which raw prompting does not:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H5_Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H5_Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png 424w, https://substackcdn.com/image/fetch/$s_!H5_Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png 848w, https://substackcdn.com/image/fetch/$s_!H5_Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png 1272w, https://substackcdn.com/image/fetch/$s_!H5_Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!H5_Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png" width="944" height="368" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:368,&quot;width&quot;:944,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:53185,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!H5_Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png 424w, https://substackcdn.com/image/fetch/$s_!H5_Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png 848w, https://substackcdn.com/image/fetch/$s_!H5_Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png 1272w, https://substackcdn.com/image/fetch/$s_!H5_Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F102c5dd3-f4f9-4e7d-9a44-565a1ffca4f3_944x368.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The AWS architecture described above encodes exactly this lesson at enterprise scale. The structured extraction layer produces what raw prompting cannot: verifiable, traceable, confidence-tagged foundations that a generative model can safely build on.</p><h2>Where Reversa Fits: Honest Assessment</h2><p>Reversa, released in May 2026 by Macedo and da Costa, is the open-source instantiation of the same methodology that IBM and AWS apply at enterprise scale. It is not competing with Watsonx Code Assistant for Z. It is making the methodology accessible to practitioners who do not have a mainframe contract or a seven-figure modernisation budget.</p><p>What Reversa contributes technically is a structured coordination layer for Claude Code: the five-phase Discovery pipeline (Scout, Archaeologist, Detective, Architect, Writer), the confidence tagging system, the SDD (Software Design Document) output format, and the slash command interface (/reversa, /reversa-migrate, /reversa-forward) that provides phase discipline and checkpoint resumption. The AI capability underneath is Claude. The value Reversa adds is the methodology wrapping.</p><p>To be honest about what this means: a skilled prompt engineer with enough time could replicate Reversa&#8217;s core functionality. The project&#8217;s genuine contribution is that it has done that work, made it publicly available, and embedded in it the methodological discipline &#8212; particularly the confidence tagging and GAP system &#8212; that most ad hoc approaches omit.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uXsC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uXsC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png 424w, https://substackcdn.com/image/fetch/$s_!uXsC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png 848w, https://substackcdn.com/image/fetch/$s_!uXsC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png 1272w, https://substackcdn.com/image/fetch/$s_!uXsC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uXsC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png" width="1123" height="621" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:1123,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1081719,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uXsC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png 424w, https://substackcdn.com/image/fetch/$s_!uXsC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png 848w, https://substackcdn.com/image/fetch/$s_!uXsC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png 1272w, https://substackcdn.com/image/fetch/$s_!uXsC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23c9c40e-976c-4f87-a2fb-4e06964710db_1123x621.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Where Reversa Is Strong</h3><p>&#8226; <strong>Non-mainframe legacy: </strong>VB6, RPG/AS400, older Java EE, PL/I, FORTRAN &#8212; the long tail of enterprise legacy that IBM&#8217;s mainframe-specific tools do not address.</p><p>&#8226; <strong>Practitioner scale: </strong>A single developer or small team working on a bounded legacy system. The overhead of an enterprise migration platform is unnecessary and counterproductive at this scale.</p><p>&#8226; <strong>Learning and research: </strong>Understanding the methodology, experimenting with different COBOL programs, building institutional knowledge before committing to a production migration.</p><p>&#8226; <strong>The forward evolution use case: </strong>The /reversa-forward command, which propagates regulatory or policy changes through an already-migrated system, is particularly well-suited to insurance policy administration where rules change continuously.</p><h3>Where Reversa Has Limits</h3><p>&#8226; <strong>Million-line mainframe codebases: </strong>Enterprise COBOL estates are too large for Claude Code&#8217;s context window without the deterministic scaffolding that AWS Transform or IBM&#8217;s ADDI tool provides.</p><p>&#8226; <strong>Regulatory data governance: </strong>A bank subject to strict data residency rules cannot send production COBOL to a cloud-hosted Claude model. IBM&#8217;s on-premise option exists specifically for this constraint.</p><p>&#8226; <strong>The invisible codebase still needs humans: </strong>No tool, including Reversa, can answer what a threshold means if it is undocumented. GAP resolution requires domain expertise regardless of how well the framework surfaces the GAPs.</p><h2>Custom Pipeline vs Reversa</h2><h3>What Reversa Is Trying to Do</h3><p>So I decided to run my own pipeline: COBOL &#8594; Python. Reversa and my simple pipeline solve <strong>different problems</strong>, which is why both can be working solutions.</p><p><strong>Reversa</strong> asks: <em>&#8220;What does this legacy system actually do?&#8221;</em> &#8212; It reverse-engineers the system into a specification document that humans and AI coding agents can use as a blueprint going forward. The output is <strong>knowledge</strong>, not code.</p><p><strong>This pipeline</strong> asks: <em>&#8220;Can we run this COBOL as Python today?&#8221;</em> &#8212; It goes straight to a working translation, tests it, and scores it. The output is <strong>runnable code with a verdict.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!h7Uc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h7Uc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png 424w, https://substackcdn.com/image/fetch/$s_!h7Uc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png 848w, https://substackcdn.com/image/fetch/$s_!h7Uc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png 1272w, https://substackcdn.com/image/fetch/$s_!h7Uc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h7Uc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png" width="787" height="554" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:554,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40924,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!h7Uc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png 424w, https://substackcdn.com/image/fetch/$s_!h7Uc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png 848w, https://substackcdn.com/image/fetch/$s_!h7Uc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png 1272w, https://substackcdn.com/image/fetch/$s_!h7Uc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18e6126a-cfe9-4cb6-884f-4e1becbc613f_787x554.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Why Both Achieve Their Objectives</h3><p><strong>Reversa succeeds</strong> because it doesn&#8217;t try to produce code &#8212; it produces understanding. By making humans approve each phase, it catches the INFERRED and GAP items before they become bugs in production. It is more conservative and more thorough for complex, undocumented systems where you cannot afford to guess.</p><p><strong>This pipeline succeeds</strong> because it narrows the problem. By committing to one target language (Python) and one output (runnable code), it can automate the full loop &#8212; translate, test, score &#8212; in under 3 minutes. The tradeoff is that it will confidently produce something wrong if the COBOL had undocumented assumptions.</p><div><hr></div><h3>The Honest Observation</h3><p>Both projects use the exact same <strong>CONFIRMED / INFERRED / GAP</strong> confidence model. That convergence is not a coincidence &#8212; it is the fundamental truth of legacy migration: some things are knowable from the code, some things are educated guesses, and some things require a human who was there. No amount of AI sophistication eliminates that third category.</p><p>Reversa builds its entire workflow around that fact and refuses to proceed without human sign-off. This pipeline surfaces it in the final report and issues a BLOCKED verdict when too many GAPs exist. Different mechanisms, same underlying problem.</p><p>If you are migrating a payroll system or an AML engine that has been running in production for 30 years, Reversa&#8217;s approach &#8212; slow, documented, human-approved at every step &#8212; is the safer choice. If you want a quick first-pass translation to understand the scope of effort, this pipeline gets you there faster. Albeit not perfectly, but with a bit of fine-tuning, and sit-down sessions - all doable.</p><h2>Building It Yourself: What the Custom Pipeline Reveals</h2><p>A five-agent pipeline built in the companion experiment to this article (only results shared here) took a different architectural approach from Reversa: standalone Python running via OpenRouter rather than Claude Code slash commands. This was a deliberate choice, and the difference is instructive.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0aq3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0aq3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png 424w, https://substackcdn.com/image/fetch/$s_!0aq3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png 848w, https://substackcdn.com/image/fetch/$s_!0aq3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png 1272w, https://substackcdn.com/image/fetch/$s_!0aq3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0aq3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png" width="944" height="405" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:405,&quot;width&quot;:944,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:55449,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0aq3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png 424w, https://substackcdn.com/image/fetch/$s_!0aq3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png 848w, https://substackcdn.com/image/fetch/$s_!0aq3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png 1272w, https://substackcdn.com/image/fetch/$s_!0aq3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb56abcc-ecdf-4aaa-a80c-9a2de0a8b89b_944x405.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Each stage in the pipeline is a separate AI agent &#8212; a distinct model with its own system prompt and specific job. Because this was a &#8220;lab experiment&#8221; for me, and given fairly straightforward, uncomplicated code, earlier model versions were applied. Here's what each one did on a sample AML-TEST.cbl run:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q-Y-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q-Y-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png 424w, https://substackcdn.com/image/fetch/$s_!Q-Y-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png 848w, https://substackcdn.com/image/fetch/$s_!Q-Y-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png 1272w, https://substackcdn.com/image/fetch/$s_!Q-Y-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q-Y-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png" width="713" height="622" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:622,&quot;width&quot;:713,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33574,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q-Y-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png 424w, https://substackcdn.com/image/fetch/$s_!Q-Y-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png 848w, https://substackcdn.com/image/fetch/$s_!Q-Y-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png 1272w, https://substackcdn.com/image/fetch/$s_!Q-Y-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f0a69a8-98a1-4a0f-a8c9-a970b49bbbc3_713x622.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Each agent only sees what it needs &#8212; the Translator never sees the test code, the Evaluator never touches the COBOL. They pass work forward like an assembly line, with the orchestrator (<code>orchestrator.py</code>) coordinating the handoffs.</p><p>The faster/cheaper Gemini models handle the mechanical work (write tests, run tests). The more capable Claude models handle the reasoning-heavy work (understand COBOL intent, judge migration quality). That&#8217;s the cost/quality tradeoff baked into the design.</p><p>Running the pipeline on an AML transaction screening program produced a complete extraction, idiomatic Python with Decimal arithmetic throughout, a <strong>33-test pytest suite, and an evaluation report scoring 77.1/100 with a CONDITIONAL verdict</strong>. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZjSg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZjSg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png 424w, https://substackcdn.com/image/fetch/$s_!ZjSg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png 848w, https://substackcdn.com/image/fetch/$s_!ZjSg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png 1272w, https://substackcdn.com/image/fetch/$s_!ZjSg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZjSg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png" width="534" height="510" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:510,&quot;width&quot;:534,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58192,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZjSg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png 424w, https://substackcdn.com/image/fetch/$s_!ZjSg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png 848w, https://substackcdn.com/image/fetch/$s_!ZjSg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png 1272w, https://substackcdn.com/image/fetch/$s_!ZjSg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1a59f95-f619-4cf6-bdc4-ffeab51a0a30_534x510.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!za97!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!za97!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png 424w, https://substackcdn.com/image/fetch/$s_!za97!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png 848w, https://substackcdn.com/image/fetch/$s_!za97!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png 1272w, https://substackcdn.com/image/fetch/$s_!za97!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!za97!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png" width="532" height="474" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:474,&quot;width&quot;:532,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:61243,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!za97!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png 424w, https://substackcdn.com/image/fetch/$s_!za97!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png 848w, https://substackcdn.com/image/fetch/$s_!za97!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png 1272w, https://substackcdn.com/image/fetch/$s_!za97!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98d1fe18-d236-4b66-9579-f9a47a3b722b_532x474.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tav3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02895352-d29b-43f5-aff4-eb7606dea332_533x97.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tav3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02895352-d29b-43f5-aff4-eb7606dea332_533x97.png 424w, https://substackcdn.com/image/fetch/$s_!tav3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02895352-d29b-43f5-aff4-eb7606dea332_533x97.png 848w, https://substackcdn.com/image/fetch/$s_!tav3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02895352-d29b-43f5-aff4-eb7606dea332_533x97.png 1272w, https://substackcdn.com/image/fetch/$s_!tav3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02895352-d29b-43f5-aff4-eb7606dea332_533x97.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tav3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02895352-d29b-43f5-aff4-eb7606dea332_533x97.png" width="533" height="97" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/02895352-d29b-43f5-aff4-eb7606dea332_533x97.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:97,&quot;width&quot;:533,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15683,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02895352-d29b-43f5-aff4-eb7606dea332_533x97.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tav3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02895352-d29b-43f5-aff4-eb7606dea332_533x97.png 424w, https://substackcdn.com/image/fetch/$s_!tav3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02895352-d29b-43f5-aff4-eb7606dea332_533x97.png 848w, https://substackcdn.com/image/fetch/$s_!tav3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02895352-d29b-43f5-aff4-eb7606dea332_533x97.png 1272w, https://substackcdn.com/image/fetch/$s_!tav3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02895352-d29b-43f5-aff4-eb7606dea332_533x97.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>All 5 failures trace to the same root cause: <strong>$10,000 is a round number divisible by 1,000</strong>, so the round dollar rule fires on every CTR-level cash transaction. The tests expected those to be treated as separate, independent events &#8212; but the generated Python doesn't distinguish them. Fix the round dollar rule's logic and tests 3, 5, 6, and 18 likely resolve automatically. Test 33 fails separately because the dormant activity rule doesn't fire in the combined scenario &#8212; that needs a separate look at migrated_module.py.</p><h3>The Honest Assessment for Leadership</h3><p><strong>AI did not &#8220;convert COBOL to Python.&#8221; What it did was:</strong></p><ul><li><p><strong>Draft a working translation</strong> that gets you most of the way there &#8212; faster than a team starting from scratch</p></li><li><p><strong>Surface the gaps</strong> &#8212; the areas where the COBOL was relying on undocumented assumptions, external systems, or institutional memory. Without this tool, a developer might not even know those gaps existed until something failed in production</p></li><li><p><strong>Write and run a test suite</strong> that identified real defects before anyone deployed anything</p></li></ul><p><strong>What it could not do &#8212; and no AI currently can do &#8212; is:</strong></p><ul><li><p>Validate that inferred business rules match your compliance policy</p></li><li><p>Connect to your sanctions screening system (not in this test)</p></li><li><p>Decide when to file a SAR</p></li><li><p>Vouch for regulatory correctness</p></li></ul><p><strong>The analogy:</strong> think of this like a <strong>very fast, very thorough junior analyst who read the old system overnight, wrote up a detailed summary, drafted new code, and flagged everything they were uncertain about. You still need a senior compliance officer and an engineer to review the flagged items before this goes anywhere near production</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P0SQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P0SQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png 424w, https://substackcdn.com/image/fetch/$s_!P0SQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png 848w, https://substackcdn.com/image/fetch/$s_!P0SQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png 1272w, https://substackcdn.com/image/fetch/$s_!P0SQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P0SQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png" width="1144" height="620" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:1144,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1061743,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P0SQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png 424w, https://substackcdn.com/image/fetch/$s_!P0SQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png 848w, https://substackcdn.com/image/fetch/$s_!P0SQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png 1272w, https://substackcdn.com/image/fetch/$s_!P0SQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20216eb8-017d-466c-8716-04960d6eaa08_1144x620.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h3>Recommended Next Steps</h3><ol><li><p><strong>Compliance review of the 6 inferred rules</strong> &#8212; specifically the $8,500 structuring threshold. Is that your institution&#8217;s policy, or did the AI make it up?</p></li><li><p><strong>Engineering fixes</strong> for the 5 failing tests &#8212; these are known, specific defects and should be straightforward to correct once the correct logic is confirmed</p></li><li><p><strong>Integration planning</strong> for the 4 GAPs &#8212; OFAC, aggregation, SAR filing, and audit logging each require a conversation with the teams that own those systems</p></li><li><p><strong>Do not go live</strong> until the score reaches READY (&#8805; 80) with zero blocking GAPs. The current state is a solid foundation, not a finished product.</p></li></ol><div><hr></div><p><strong>The pipeline saved weeks of initial translation work. It did not save the compliance validation, integration work, or human judgment that AML systems legally require. </strong>That was never going to be automated away.</p><blockquote><p style="text-align: center;"><em>Code is now the cheap part of legacy migration. Architecture judgment and domain knowledge are the expensive parts. That was always true. AI makes it undeniable.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OUGq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OUGq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png 424w, https://substackcdn.com/image/fetch/$s_!OUGq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png 848w, https://substackcdn.com/image/fetch/$s_!OUGq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png 1272w, https://substackcdn.com/image/fetch/$s_!OUGq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OUGq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png" width="1140" height="620" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:1140,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1042371,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OUGq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png 424w, https://substackcdn.com/image/fetch/$s_!OUGq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png 848w, https://substackcdn.com/image/fetch/$s_!OUGq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png 1272w, https://substackcdn.com/image/fetch/$s_!OUGq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77a7f9fc-0140-4da5-8f0e-1a983d3cd032_1140x620.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Methodology Is the Point</h2><p>At every tier of the market &#8212; IBM&#8217;s enterprise Granite model, AWS&#8217;s deterministic Transform agents, Reversa&#8217;s open-source pipeline, the custom five-agent Python implementation &#8212; the same architectural decision appears: <strong>separate the extraction of what the code knows from the generation of what the new code should say. The tools differ enormously. The methodology is consistent.</strong></p><p>Three elements of that methodology deserve emphasis because they are what makes the difference between a migration that produces useful output and one that produces confident-looking output that fails when it encounters an edge case:</p><h3>Operational Contracts, Not Documentation</h3><p>The output of the extraction phase should be machine-executable specifications &#8212; inputs, outputs, invariants, regulatory constraints, edge cases &#8212; not human-readable documentation. Documentation is for people. Specifications are for agents. The difference is that a specification can be tested against. Documentation cannot.</p><h3>The Confidence Tagging System</h3><p>Every extracted claim should carry one of three marks: CONFIRMED (directly readable from code with a source citation), INFERRED (deduced from pattern with a stated rationale), or GAP (not determinable from code, requires human input). This is not administrative overhead. It is the mechanism that makes human oversight tractable. A migration with 47 GAPs to review is manageable. A migration with no explicit GAPs and hidden assumptions is not.</p><h3>Behavioral Equivalence as the Standard, Not Code Correctness</h3><p>The question is not whether the generated Python compiles and passes unit tests. It is <strong>whether the Python produces the same outputs as the COBOL for every input the COBOL ever encountered</strong>. Gherkin parity specs derived from the extracted contracts &#8212; not from the generated code &#8212; are the appropriate test artefact for this standard. This is what makes a Parallel Run viable as a regulatory validation mechanism: both systems run simultaneously against real transactions, and every divergence is documented and investigated.</p><h2>Verdict: Was This Worth My Time &amp; Tokens?</h2><p>The practitioner&#8217;s question is blunt: given that IBM has a purpose-built enterprise tool, AWS has a cloud service with Fortune 500 references, and Anthropic itself publishes guidance on using Claude Code directly for COBOL analysis, is there genuine value in understanding and running an open-source framework or custom pipeline?</p><p>The answer depends on what you are optimising for.</p><h3>If you are evaluating migration tooling for an institution</h3><p>The institutional tier tools (IBM, AWS) have genuine advantages for large mainframe estates: security, data governance, enterprise support, and provenance from organisations with skin in the game. For a bank with ten million lines of COBOL running on IBM Z hardware, Watsonx Code Assistant for Z is the serious option. For a mid-size insurer wanting to modernise a policy administration system to cloud, AWS Transform with Claude Code forward engineering is the serious option. Neither requires you to understand Reversa or build your own pipeline.</p><h3>If you are a practitioner-researcher, consultant, or reader</h3><p>The methodological insight &#8212; <strong>structured extraction, confidence tagging, operational contracts, GAP-driven human oversight &#8212; is what you need to understand regardless of which tool sits on top of it.</strong> <strong>Building and running the custom pipeline teaches you this in a way that reading about IBM&#8217;s product does not. The experiments in this series used a simple AML screening program rather than a million-line mainframe system, but the methodology scales. The same principles govern both.</strong></p><p>Reversa is worth your time as a learning and research instrument. It is also worth writing about, not because it will replace IBM Watsonx Code Assistant for Z, but because it makes the <strong>methodology accessible and demonstrates that the core problem &#8212; extracting the invisible codebase, tagging it honestly, and handing the gaps to a human &#8212; is not a capability reserved soley for enterprises with mainframe contracts.</strong></p><h3>The Question That Remains Open</h3><p>The AI-assisted migration moment is real. </p><p>The tooling is advancing rapidly at every tier. What has not changed, and what no advance in model capability will change, is the epistemological limit: <strong>an AI cannot know what a threshold means if no one wrote it down. The compliance officer who can answer that question in a fifteen-minute meeting is not being replaced by the migration pipeline. She is being given a much shorter, much better-organised list of questions to answer</strong>. That is not a small thing.</p><h2>Appendix: The Tier Landscape at a Glance</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OzRJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OzRJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png 424w, https://substackcdn.com/image/fetch/$s_!OzRJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png 848w, https://substackcdn.com/image/fetch/$s_!OzRJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png 1272w, https://substackcdn.com/image/fetch/$s_!OzRJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OzRJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png" width="1138" height="620" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:1138,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1118281,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OzRJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png 424w, https://substackcdn.com/image/fetch/$s_!OzRJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png 848w, https://substackcdn.com/image/fetch/$s_!OzRJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png 1272w, https://substackcdn.com/image/fetch/$s_!OzRJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd24974b7-5034-4407-af1a-afcb5860ef6e_1138x620.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3_iG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9946590-34b1-43ab-b247-816556c73975_1141x620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3_iG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9946590-34b1-43ab-b247-816556c73975_1141x620.png 424w, https://substackcdn.com/image/fetch/$s_!3_iG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9946590-34b1-43ab-b247-816556c73975_1141x620.png 848w, https://substackcdn.com/image/fetch/$s_!3_iG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9946590-34b1-43ab-b247-816556c73975_1141x620.png 1272w, https://substackcdn.com/image/fetch/$s_!3_iG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9946590-34b1-43ab-b247-816556c73975_1141x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3_iG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9946590-34b1-43ab-b247-816556c73975_1141x620.png" width="1141" height="620" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a9946590-34b1-43ab-b247-816556c73975_1141x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:1141,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1133830,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9946590-34b1-43ab-b247-816556c73975_1141x620.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3_iG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9946590-34b1-43ab-b247-816556c73975_1141x620.png 424w, https://substackcdn.com/image/fetch/$s_!3_iG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9946590-34b1-43ab-b247-816556c73975_1141x620.png 848w, https://substackcdn.com/image/fetch/$s_!3_iG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9946590-34b1-43ab-b247-816556c73975_1141x620.png 1272w, https://substackcdn.com/image/fetch/$s_!3_iG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9946590-34b1-43ab-b247-816556c73975_1141x620.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!J2gO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf544a4c-8338-483a-a683-f73c81752374_861x728.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!J2gO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf544a4c-8338-483a-a683-f73c81752374_861x728.png 424w, https://substackcdn.com/image/fetch/$s_!J2gO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf544a4c-8338-483a-a683-f73c81752374_861x728.png 848w, https://substackcdn.com/image/fetch/$s_!J2gO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf544a4c-8338-483a-a683-f73c81752374_861x728.png 1272w, https://substackcdn.com/image/fetch/$s_!J2gO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf544a4c-8338-483a-a683-f73c81752374_861x728.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!J2gO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf544a4c-8338-483a-a683-f73c81752374_861x728.png" width="861" height="728" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf544a4c-8338-483a-a683-f73c81752374_861x728.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:728,&quot;width&quot;:861,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:90188,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200645893?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf544a4c-8338-483a-a683-f73c81752374_861x728.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!J2gO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf544a4c-8338-483a-a683-f73c81752374_861x728.png 424w, https://substackcdn.com/image/fetch/$s_!J2gO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf544a4c-8338-483a-a683-f73c81752374_861x728.png 848w, https://substackcdn.com/image/fetch/$s_!J2gO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf544a4c-8338-483a-a683-f73c81752374_861x728.png 1272w, https://substackcdn.com/image/fetch/$s_!J2gO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf544a4c-8338-483a-a683-f73c81752374_861x728.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>References</h1><p>[1] Anthropic. <a href="https://claude.com/blog/how-ai-helps-break-cost-barrier-cobol-modernization">How AI Helps Break the Cost Barrier to COBOL Modernization</a> Anthropic Blog, February 23, 2026.</p><p>[2] Economic Times India / Reuters. <a href="https://economictimes.indiatimes.com/markets/us-stocks/news/ibm-shares-sink-13-record-steepest-drop-in-25-years-after-anthropic-says-ai-can-modernise-cobol/articleshow/128741710.cms?from=mdr">IBM Shares Hit 13% Drop</a>, February 24, 2026.</p><p>[3] Slashdot. <a href="https://slashdot.org/story/26/02/23/2110221/ibm-shares-crater-13-after-anthropic-says-claude-code-can-tackle-cobol-modernization">IBM Shares Crater 13% After Anthropic Says Claude Code Can Tackle COBOL Modernization</a> Slashdot, February 23, 2026.</p><p>[4] PYMNTS. <a href="https://www.pymnts.com/news/artificial-intelligence/2026/anthropics-cobol-bet-shakes-mainframe-economics/">Anthropic&#8217;s COBOL Bet Shakes Mainframe Economics</a> PYMNTS.com, February 24, 2026.</p><p>[5] IBM. <a href="https://www.ibm.com/new/announcements/agentic-ai-for-smarter-mainframe-modernization-with-ibm-watsonx-code-assistant-for-z">Agentic AI for Smarter Mainframe Modernization with IBM Watsonx Code Assistant for Z</a> IBM Newsroom, March 2, 2026.</p><p>[6] IBM Research. <a href="https://research.ibm.com/blog/cobol-java-ibm-z">COBOL to Java: Application Modernization with IBM Generative AI</a> IBM Research Blog, 2025.</p><p>[7] VentureBeat. <a href="https://venturebeat.com/ai/ibm-taps-watsonx-generative-ai-to-help-modernize-cobol-on-mainframes">IBM Taps Watsonx Generative AI to Help Modernize COBOL on Mainframes</a> VentureBeat, December 22, 2025.</p><p>[8] AWS. <a href="https://aws.amazon.com/blogs/migration-and-modernization/reimagining-mainframe-applications-with-aws-transform-and-claude-code/">Reimagining Mainframe Applications with AWS Transform and Claude Code</a> AWS Migration and Modernization Blog, April&#8211;May 2026.</p><p>[9] AWS. <a href="https://aws.amazon.com/mainframe-modernization/features/">AWS Mainframe Modernization Service &#8212; Features</a> AWS Documentation, 2026.</p><p>[10] AWS Migration and Modernization Blog. <a href="https://aws.amazon.com/blogs/migration-and-modernization/aws-named-a-leader-in-the-isg-provider-lens-mainframe-application-modernization-software-2025-report/">AWS Named a Leader in the ISG Provider Lens Mainframe Application Modernization Software 2025 Report</a> AWS Blog, 2025.</p><p>[11] TechChannel. <a href="https://techchannel.com/artificial-intelligence/claude-code-and-cobol/">Can Claude Code Really Understand COBOL Applications?</a> TechChannel, March 24, 2026.</p><p>[12] Phase Change Software. <a href="https://phasechange.ai/blog/anthropic-says-claude-code-can-analyze-cobol-heres-why-analysis-isnt-proof">Anthropic Says Claude Code Can Analyze COBOL. Here&#8217;s Why Analysis Isn&#8217;t Proof.</a> Phase Change Software Blog, February 25, 2026.</p><p>[13] MGM Technology Partners. <a href="https://insights.mgm-tp.com/en/2026/insurance/claude-can-migrate-cobol-migration-who-is-making-the-transition/">Claude Can Migrate COBOL &#8212; Who&#8217;s Making the Transition?</a> MGM Insights, March 5, 2026.</p><p>[14] Macedo, S.O. and da Costa, R.M.. <a href="https://arxiv.org/abs/2605.18684">Reversa: A Reverse Documentation Engineering Framework for Converting Legacy Software into Operational Specifications for AI Agents</a> arXiv:2605.18684, May 18, 2026.</p><p>[15] Macedo, S.O. and da Costa, R.M.. <a href="https://github.com/sandeco/reversa">Reversa &#8212; GitHub Repository (sandeco/reversa)</a> GitHub, 2026.</p><p>[16] The New Stack. <a href="https://thenewstack.io/cobol-everywhere-will-maintain/">COBOL Is Everywhere. Who Will Maintain It?</a> The New Stack, 2024.</p><p>[17] IT Brew. <a href="https://www.itbrew.com/stories/2026/02/26/can-cobol-ers-collab-with-claude-code">Can COBOL&#8217;ers Collab with Claude Code?</a> IT Brew, February 26, 2026.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Building On Anthropic's Claude]]></title><description><![CDATA[A retrospective on Complexity, Cost, Memory, Portability, and the Layer(s) where most problems actually live]]></description><link>https://interestingengineering.substack.com/p/building-on-anthropics-claude</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/building-on-anthropics-claude</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Wed, 03 Jun 2026 02:23:12 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!5HUp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5HUp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5HUp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png 424w, https://substackcdn.com/image/fetch/$s_!5HUp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png 848w, https://substackcdn.com/image/fetch/$s_!5HUp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png 1272w, https://substackcdn.com/image/fetch/$s_!5HUp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5HUp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png" width="1127" height="611" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:611,&quot;width&quot;:1127,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:940080,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5HUp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png 424w, https://substackcdn.com/image/fetch/$s_!5HUp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png 848w, https://substackcdn.com/image/fetch/$s_!5HUp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png 1272w, https://substackcdn.com/image/fetch/$s_!5HUp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c9e72ab-0c2d-4fcc-b13f-abcd4f7b2984_1127x611.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Why this article exists - A Consolidation of Findings and Tiny Experiments</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EJ2Q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EJ2Q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png 424w, https://substackcdn.com/image/fetch/$s_!EJ2Q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png 848w, https://substackcdn.com/image/fetch/$s_!EJ2Q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png 1272w, https://substackcdn.com/image/fetch/$s_!EJ2Q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EJ2Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png" width="1116" height="612" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:612,&quot;width&quot;:1116,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1009994,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EJ2Q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png 424w, https://substackcdn.com/image/fetch/$s_!EJ2Q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png 848w, https://substackcdn.com/image/fetch/$s_!EJ2Q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png 1272w, https://substackcdn.com/image/fetch/$s_!EJ2Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a140f00-355a-4a81-b1ee-9d522dcfbd78_1116x612.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>This will be a consolidation article that takes a perspective or stand on the few recent articles and experiments run with Claude/Claude Code. The experiments were taken from similar actual cases, and then task-oriented down for case study orientation and discussion. </strong></em></p><p>They originally stemmed from a set of concerns that circulate regularly among practitioners and non-practioners alike, people who asked, even friends: </p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>(1) <strong>that the product is drifting toward developers, </strong></p><p><strong>(2) that native memory is unreliable, </strong></p><p><strong>(3) that complex agent systems produce better results, </strong></p><p><strong>(4) that vendor lock-in is an inevitable consequence of building on the platform, </strong></p><p><strong>(5) that API choice does not matter, and </strong></p><p><strong>(6)that automation requires infrastructure that most business users cannot manage.</strong></p><p>These concerns are not invented. They reflect real experiences. But across a series of controlled experiments run recently  -- with <strong>real measured outputs, gold-standard answers, and scored rubrics</strong> -- a consistent pattern emerged: <em><strong>the concerns are usually accurate observations about default behaviour, and inaccurate conclusions about what that behaviour means. </strong></em></p><p><em><strong>All experiments focus on Anthropic&#8217;s products, as this is my preference for executing at the Institutional Grade level. That does not mean I do not use other harnesses, models or service providers - because the space evolves fast, and there is always something new to experiment on. But my preferences are strong for Claude.</strong></em></p><p>This article is the retrospective. It pulls the key findings together in one place, addresses each concern directly, and links to the full article for readers who want the detail. The argument is not that Claude is without limitations. It is that: </p><blockquote><p><em>M<strong>ost of the limitations practitioners describe are architectural, and architecture is something the practitioner controls.</strong></em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Rh49!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Rh49!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png 424w, https://substackcdn.com/image/fetch/$s_!Rh49!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png 848w, https://substackcdn.com/image/fetch/$s_!Rh49!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png 1272w, https://substackcdn.com/image/fetch/$s_!Rh49!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Rh49!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png" width="1120" height="608" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:608,&quot;width&quot;:1120,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:923407,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Rh49!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png 424w, https://substackcdn.com/image/fetch/$s_!Rh49!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png 848w, https://substackcdn.com/image/fetch/$s_!Rh49!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png 1272w, https://substackcdn.com/image/fetch/$s_!Rh49!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962453f4-f9ff-4a3a-9094-8e587a50b6f4_1120x608.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Before the Findings: One Diagram</strong></h2><p>Every finding in this series connects to the same underlying observation. Claude is not a single fixed thing. It is a stack of layers -- and most concerns about it are concerns about one specific layer, misidentified as a concern about the whole.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w2e1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w2e1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png 424w, https://substackcdn.com/image/fetch/$s_!w2e1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png 848w, https://substackcdn.com/image/fetch/$s_!w2e1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png 1272w, https://substackcdn.com/image/fetch/$s_!w2e1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w2e1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png" width="845" height="359" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:359,&quot;width&quot;:845,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38427,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w2e1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png 424w, https://substackcdn.com/image/fetch/$s_!w2e1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png 848w, https://substackcdn.com/image/fetch/$s_!w2e1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png 1272w, https://substackcdn.com/image/fetch/$s_!w2e1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92bdd6eb-aa13-4467-9315-ce50d6cf7f85_845x359.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><em><strong>A memory failure is almost always a Layer 2 problem -- the wrong context was loaded, or none was. An infrastructure complaint is almost always a Layer 3 choice, not a Layer 0 limitation. A cost problem is almost always a Layer 1 or Layer 2 decision. None of the six concerns addressed in this series turned out to be a Layer 0 problem.</strong></em></p></blockquote><p>Note:</p><p><em><strong>ISR</strong> &#8212; Intelligence Systems Review. The name I used in the publication series.</em></p><p><em><strong>ASCRS</strong> &#8212; AI Supply Chain Response System. The pharmaceutical supply chain case study domain I built as the running experiment environment throughout the series &#8212; the Hormuz Strait scenario, the 23 purchase orders, the carrier data, the rubric and gold answer that underpinned H1 through H10 and the memory experiment.</em></p><h2><strong>1. If I Build on Claude, Am I Locked In?</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2zls!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2zls!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png 424w, https://substackcdn.com/image/fetch/$s_!2zls!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png 848w, https://substackcdn.com/image/fetch/$s_!2zls!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png 1272w, https://substackcdn.com/image/fetch/$s_!2zls!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2zls!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png" width="1099" height="595" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:595,&quot;width&quot;:1099,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:997955,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2zls!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png 424w, https://substackcdn.com/image/fetch/$s_!2zls!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png 848w, https://substackcdn.com/image/fetch/$s_!2zls!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png 1272w, https://substackcdn.com/image/fetch/$s_!2zls!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17f4027-affc-4415-86e6-f41dab1cbfba_1099x595.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>Concern: </strong><em>Vendor lock-in and lack of portability</em></p></blockquote><p>Building around Claude-specific conventions -- CLAUDE.md, Claude Code&#8217;s project structure, Anthropic&#8217;s hosted environments -- creates dependency. Moving to a different tool would mean rebuilding everything.</p><h3><strong>The finding</strong></h3><p>Lock-in is real when your intelligence lives inside a platform. It does not exist when it lives in files you own.</p><p>Every ISR experiment was built on plain markdown files: a MASTER_GUIDE containing domain knowledge and experimental history, a BOOTSTRAP_PROMPT containing session-start context, and skill files containing reusable workflow recipes. All of these are standard text. All are readable by any AI tool. The naming convention CLAUDE.md is a convention -- not a format. Changing the filename to AGENTS.md (the Forge convention) takes thirty seconds. The content does not change. The intelligence does not move.</p><p>Three cases where lock-in is a genuine risk, not a hypothetical: Claude.ai&#8217;s built-in memory summaries (stored in Anthropic&#8217;s system, not in your files); Managed Agents hosted environments (the execution environment does not export); and MCP server connections (configured per-tool, not universally portable). These are specific, nameable tradeoffs -- not a general trap.</p><p>There is also an efficiency dimension to this. Every token in CLAUDE.md is re-sent on every Claude Code request. Independent benchmarking shows that trimming a context file from 3,847 tokens to 312 tokens -- keeping only rules the model acts on -- reduces per-session cost by 91.9%. The discipline that prevents lock-in and the discipline that reduces cost are the same discipline: keep your intelligence in lean, portable, plain-text files you own.</p><p><strong>The rule:</strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pN0l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pN0l!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png 424w, https://substackcdn.com/image/fetch/$s_!pN0l!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png 848w, https://substackcdn.com/image/fetch/$s_!pN0l!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png 1272w, https://substackcdn.com/image/fetch/$s_!pN0l!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pN0l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png" width="872" height="195" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:195,&quot;width&quot;:872,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18746,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pN0l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png 424w, https://substackcdn.com/image/fetch/$s_!pN0l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png 848w, https://substackcdn.com/image/fetch/$s_!pN0l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png 1272w, https://substackcdn.com/image/fetch/$s_!pN0l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67dc96d0-2afe-46e5-ba7e-e5ba8032dead_872x195.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>Article Ref Start: </strong><a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">The Architecture of Awareness</a></p><h2><strong>2. Does Architectural Complexity Produce Better Results?</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eLBk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eLBk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png 424w, https://substackcdn.com/image/fetch/$s_!eLBk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png 848w, https://substackcdn.com/image/fetch/$s_!eLBk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png 1272w, https://substackcdn.com/image/fetch/$s_!eLBk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eLBk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png" width="1127" height="614" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:614,&quot;width&quot;:1127,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:998770,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eLBk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png 424w, https://substackcdn.com/image/fetch/$s_!eLBk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png 848w, https://substackcdn.com/image/fetch/$s_!eLBk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png 1272w, https://substackcdn.com/image/fetch/$s_!eLBk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19a8f9e2-68c8-4471-b7ac-c82ed42822de_1127x614.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>Concern: </strong><em>Technical drift toward developers / Complex setups outperform simple ones</em></p></blockquote><p>Two concerns that travel together. </p><p>The first: Claude&#8217;s product is growing more complex, drifting toward developers and away from business users. </p><p>The second: if you invest in building a complex multi-agent system, it must outperform simpler approaches.</p><h3><strong>The finding</strong></h3><p>Ten different architectures were tested against the exact same task using the exact same model. A pharmaceutical supply chain crisis. 23 purchase orders, three priority tiers, carrier availability constraints, and four deliberate reasoning traps. A gold-standard answer prepared in advance. A six-criterion rubric. Every architecture was scored against it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8agy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8agy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png 424w, https://substackcdn.com/image/fetch/$s_!8agy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png 848w, https://substackcdn.com/image/fetch/$s_!8agy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png 1272w, https://substackcdn.com/image/fetch/$s_!8agy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8agy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png" width="871" height="262" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:871,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:37404,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8agy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png 424w, https://substackcdn.com/image/fetch/$s_!8agy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png 848w, https://substackcdn.com/image/fetch/$s_!8agy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png 1272w, https://substackcdn.com/image/fetch/$s_!8agy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8114d06b-f392-44da-9fab-bb66e6d0d959_871x262.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>H2 -- a single, well-structured prompt with explicit role, output schema, scoring weights, and anti-hallucination rules -- achieved a perfect score at 15,277 tokens. H9 -- a five-agent swarm with specialised roles for routing, compliance, inventory, commercial analysis, and review -- scored below the bare model at nearly four times the cost.</p><p>H9 was predicted to win before the experiment ran. The reasoning was intuitive: five specialised agents coordinating toward a shared answer should outperform a single prompt. What happened instead: each agent produced a partial answer, the coordination layer had to synthesise five partials, and synthesis introduced ambiguity that was not present when a single prompt held the full picture.</p><p>H7 -- model routing, directing simpler steps to Haiku at 3.75x lower cost than Sonnet -- achieved 0.900 alpha. This is the result that matters most for practitioners concerned about cost: routing decisions alone recover most of the performance of a complex system at a fraction of the price.</p><p>On the technical drift concern: the sophisticated options in Claude&#8217;s ecosystem exist for tasks that genuinely need them. This experiment however, showed that they do not need to be the default path. The highest-scoring result used no agents, no infrastructure, and no special features -- just a precise prompt/skill.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5q0W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5q0W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png 424w, https://substackcdn.com/image/fetch/$s_!5q0W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png 848w, https://substackcdn.com/image/fetch/$s_!5q0W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png 1272w, https://substackcdn.com/image/fetch/$s_!5q0W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5q0W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png" width="875" height="386" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:386,&quot;width&quot;:875,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:34034,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5q0W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png 424w, https://substackcdn.com/image/fetch/$s_!5q0W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png 848w, https://substackcdn.com/image/fetch/$s_!5q0W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png 1272w, https://substackcdn.com/image/fetch/$s_!5q0W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F695fa092-8238-4a42-9fa7-194ae6ade23d_875x386.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Full article: </strong><a href="https://interestingengineering.substack.com/p/the-harness-experiment">The Harness Experiment</a></p><p><strong>Full article: </strong><a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">ASCRS Harness Lab -- The Integrated Agentic Stack</a></p><p><strong>Full article: </strong><a href="https://interestingengineering.substack.com/p/the-prompt-is-not-the-architecture">The Prompt Is Not the Architecture</a></p><h2><strong>3. Does or Can Decomposing an Agent Actually Reduce Cost?</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!16FI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42161323-3334-4872-bd6c-4346d965580d_1110x597.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!16FI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42161323-3334-4872-bd6c-4346d965580d_1110x597.png 424w, https://substackcdn.com/image/fetch/$s_!16FI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42161323-3334-4872-bd6c-4346d965580d_1110x597.png 848w, https://substackcdn.com/image/fetch/$s_!16FI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42161323-3334-4872-bd6c-4346d965580d_1110x597.png 1272w, https://substackcdn.com/image/fetch/$s_!16FI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42161323-3334-4872-bd6c-4346d965580d_1110x597.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!16FI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42161323-3334-4872-bd6c-4346d965580d_1110x597.png" width="1110" height="597" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/42161323-3334-4872-bd6c-4346d965580d_1110x597.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:597,&quot;width&quot;:1110,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1103213,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42161323-3334-4872-bd6c-4346d965580d_1110x597.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!16FI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42161323-3334-4872-bd6c-4346d965580d_1110x597.png 424w, https://substackcdn.com/image/fetch/$s_!16FI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42161323-3334-4872-bd6c-4346d965580d_1110x597.png 848w, https://substackcdn.com/image/fetch/$s_!16FI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42161323-3334-4872-bd6c-4346d965580d_1110x597.png 1272w, https://substackcdn.com/image/fetch/$s_!16FI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42161323-3334-4872-bd6c-4346d965580d_1110x597.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>Concern: </strong><em>Multi-agent systems are expensive and require complex infrastructure</em></p></blockquote><p>The harness experiment demonstrated that a single precise prompt outperforms complex architectures on analysis tasks. A separate experiment tested the other direction: what happens to a monolithic agent when you systematically decompose it?</p><h3><strong>The finding</strong></h3><p><strong>Anthropic&#8217;s StockPilot</strong> stock forecasting agent -- from their public workshop repository -- was run through a structured decomposition process across five cycles in Claude Code. Real API costs were measured at every step using Anthropic&#8217;s own evaluation suite.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iBbU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iBbU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png 424w, https://substackcdn.com/image/fetch/$s_!iBbU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png 848w, https://substackcdn.com/image/fetch/$s_!iBbU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png 1272w, https://substackcdn.com/image/fetch/$s_!iBbU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iBbU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png" width="872" height="226" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:226,&quot;width&quot;:872,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33710,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iBbU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png 424w, https://substackcdn.com/image/fetch/$s_!iBbU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png 848w, https://substackcdn.com/image/fetch/$s_!iBbU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png 1272w, https://substackcdn.com/image/fetch/$s_!iBbU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf641df7-9167-41dc-aa7f-c176b8eb3a40_872x226.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>97% token reduction. Accuracy improved from 71% to 92%. The model did not change. The task did not change. Only what each call was asked to carry changed.</p><p>A monolithic agent accumulates context it does not need on every call. A system prompt listing 40 tools for a task that uses 3 is paying for 37 unused tools on every single invocation. <strong>Decomposition removes the accumulation</strong>. Each component receives only what it needs for its specific function. The intelligence does not change. The waste does.</p><p>One live discovery during the run: when routing was switched from OpenRouter to Anthropic direct, the score dropped from 92% to 75% with no error message. One sub-agent function had a hardcoded API client that did not update with the environment variable. The function silently failed. The model was fine. The integration layer had a hidden assumption. Finding it required knowing what had changed between runs -- which is the argument for structured, documented experiments over ad hoc iteration.</p><p><strong>Full article: </strong><a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">The Structure Is The Intelligence</a></p><p><strong>Full article: </strong><a href="https://interestingengineering.substack.com/p/every-company-is-an-agent-waiting">Every Company Is An Agent Waiting to Be Decomposed</a></p><p><strong>Full article: </strong><a href="https://interestingengineering.substack.com/p/the-token-tax">The Token Tax</a></p><h2><strong>4. Does the API Route Matter?</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MFwN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MFwN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png 424w, https://substackcdn.com/image/fetch/$s_!MFwN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png 848w, https://substackcdn.com/image/fetch/$s_!MFwN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png 1272w, https://substackcdn.com/image/fetch/$s_!MFwN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MFwN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png" width="1166" height="627" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:627,&quot;width&quot;:1166,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1101083,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MFwN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png 424w, https://substackcdn.com/image/fetch/$s_!MFwN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png 848w, https://substackcdn.com/image/fetch/$s_!MFwN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png 1272w, https://substackcdn.com/image/fetch/$s_!MFwN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc698e5e3-f802-4082-a4db-417f40a84c00_1166x627.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>Concern: </strong><em>Integration choices have no meaningful impact on cost or behaviour</em></p></blockquote><p>The model is the same regardless of which API provider you use to reach it. The route is just a technical detail.</p><h3><strong>The finding</strong></h3><p>During the StockPilot experiment, prompt caching was configured correctly -- cache_control markers included in the system prompt. The experiment ran through OpenRouter. The savings did not appear.</p><p>The investigation found that OpenRouter rotates requests across multiple infrastructure providers -- Anthropic direct, Amazon Bedrock, Google Vertex -- automatically. A prompt cached on one provider is not cached on another. Each rotation resets the cache. The instruction was correctly written and silently ineffective. No error. No warning. The fix is one line: set allow_fallbacks: false in the OpenRouter request, pinning all requests to a single provider.</p><p>A second finding: the experiment&#8217;s logging captured input_tokens only -- not cache_read_input_tokens or cache_creation_input_tokens. Those are separate fields in the Anthropic API response. If your logging does not read them, you cannot confirm caching is working even when it is. Understanding what your instrumentation captures is as important as understanding what the API reports.</p><h2><strong>What independent benchmarking shows about stacking</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pVjU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pVjU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png 424w, https://substackcdn.com/image/fetch/$s_!pVjU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png 848w, https://substackcdn.com/image/fetch/$s_!pVjU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png 1272w, https://substackcdn.com/image/fetch/$s_!pVjU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pVjU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png" width="1115" height="620" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:1115,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:966148,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pVjU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png 424w, https://substackcdn.com/image/fetch/$s_!pVjU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png 848w, https://substackcdn.com/image/fetch/$s_!pVjU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png 1272w, https://substackcdn.com/image/fetch/$s_!pVjU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9ff366a-583c-4fe5-ac32-07a8612d6ee4_1115x620.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The caching discovery is one finding from one integration path. Independent benchmarking by Hamza Farooq (claude-sonnet-4-6, May 2026, approximately 17,768 requests per month at a $500 base) quantifies four optimisation techniques and what happens when they compound:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gkg5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gkg5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png 424w, https://substackcdn.com/image/fetch/$s_!gkg5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png 848w, https://substackcdn.com/image/fetch/$s_!gkg5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png 1272w, https://substackcdn.com/image/fetch/$s_!gkg5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gkg5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png" width="869" height="446" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:446,&quot;width&quot;:869,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:79221,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gkg5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png 424w, https://substackcdn.com/image/fetch/$s_!gkg5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png 848w, https://substackcdn.com/image/fetch/$s_!gkg5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png 1272w, https://substackcdn.com/image/fetch/$s_!gkg5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdec68d3c-7e95-47da-ae8b-36dcdc24e9c7_869x446.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The critical insight from the combined result: <strong>optimisation techniques are multiplicative, not additive. Prompt caching alone saves 71.5%. Model routing alone saves 77.1%. Stacked with output budgeting and multi-turn caching, the combined saving reaches 89.3%</strong> -- $447 per month on a $500 base. Each layer cuts what remains after the previous layer.</p><p>The ISR experiments provide direct evidence for two of the four techniques: the <strong>caching discovery (prompt caching) and H7&#8217;s model routing result (0.900 alpha at materially lower cost).</strong> The multi-turn caching and output budgeting figures are from Farooq&#8217;s independent benchmarks and have not been replicated in ISR experiments. They are noted as externally evidenced, not internally verified.</p><p><strong>Practical sequence -- four checks, in order of effort</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Rccd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Rccd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png 424w, https://substackcdn.com/image/fetch/$s_!Rccd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png 848w, https://substackcdn.com/image/fetch/$s_!Rccd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png 1272w, https://substackcdn.com/image/fetch/$s_!Rccd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Rccd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png" width="869" height="336" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:336,&quot;width&quot;:869,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:36906,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Rccd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png 424w, https://substackcdn.com/image/fetch/$s_!Rccd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png 848w, https://substackcdn.com/image/fetch/$s_!Rccd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png 1272w, https://substackcdn.com/image/fetch/$s_!Rccd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2899a02-e255-44f6-9c98-9e7f4cee03b6_869x336.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Full article: </strong><a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">The Structure Is The Intelligence -- Appendix: Are All APIs Created Equal?</a></p><p><strong>External benchmark: </strong><a href="https://hamzafarooq.github.io/token-optimizer/dashboard/interactive.html#techniques">Hamza Farooq -- Claude Token Optimizer</a><em> -- open-source, claude-sonnet-4-6, May 2026</em></p><h2><strong>5. Is Claude&#8217;s Memory Actually Poor?</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HadP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HadP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png 424w, https://substackcdn.com/image/fetch/$s_!HadP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png 848w, https://substackcdn.com/image/fetch/$s_!HadP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png 1272w, https://substackcdn.com/image/fetch/$s_!HadP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HadP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png" width="1108" height="597" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:597,&quot;width&quot;:1108,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:950245,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HadP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png 424w, https://substackcdn.com/image/fetch/$s_!HadP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png 848w, https://substackcdn.com/image/fetch/$s_!HadP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png 1272w, https://substackcdn.com/image/fetch/$s_!HadP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7891dbc3-6e82-49a6-ad7e-73cc578ea280_1108x597.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>Concern: </strong><em>Native memory is unreliable and requires third-party tools to fix</em></p></blockquote><p>Claude has no persistent memory between sessions? Everything is forgotten when the conversation ends? Fixing this properly requires third-party databases, vector stores, or external memory services?</p><h3><strong>A note on the research landscape</strong></h3><p>Three independent research projects on AI memory were surveyed in preparation for this series: MEMENTO (Microsoft Research, April 2026), MemPalace (April 2026), and AutoResearch (Karpathy, March 2026). I did not complete the analysis as a published synthesis because a direct philosophical conflict was found between the approaches: MemPalace stores everything verbatim, with no AI deciding what to forget; MEMENTO compresses. These positions are structurally incompatible. Publishing a synthesis would have been inaccurate. At least not for now. More testing. And I will come back to this in future.</p><p>The honest position: long-term AI memory at the level of complex multi-session knowledge accumulation remains an open and contested research problem. Practitioners should be cautious about any tool claiming to have resolved it cleanly.</p><h3><strong>The finding</strong></h3><p>Instead my experiment tested something narrower and more immediately useful: for session-level and task-level memory, does injecting context at session start measurably improve performance against a scored rubric?</p><p>The experiment was designed around two conditions: a rich, self-documenting data file versus a stripped raw version, both run without any prompt-level context injection. The finding was not what the original design anticipated.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CAzu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CAzu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png 424w, https://substackcdn.com/image/fetch/$s_!CAzu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png 848w, https://substackcdn.com/image/fetch/$s_!CAzu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png 1272w, https://substackcdn.com/image/fetch/$s_!CAzu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CAzu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png" width="870" height="252" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:252,&quot;width&quot;:870,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42935,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CAzu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png 424w, https://substackcdn.com/image/fetch/$s_!CAzu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png 848w, https://substackcdn.com/image/fetch/$s_!CAzu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png 1272w, https://substackcdn.com/image/fetch/$s_!CAzu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd06548f8-6ced-40d2-9fda-1f228ec5635b_870x252.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The gap of 0.175 was concentrated in exactly two criteria: R1 (tier prioritisation weights, 0.5) and R4 (financial aggregation methodology, 0.5). Four criteria held at 1.0 regardless of context, because their answers are derivable from data structure alone. The two that dropped depend on institutional parameters -- a 60/40 weighting schema and a bespoke aggregation method -- that exist only in organisational practice and cannot be inferred from raw data or general reasoning.</p><p>This produced a more precise argument than the original design would have. <em><strong>Context engineering operates at two layers: the data structure you pass to the model, and the prompt context you inject at session start. Both are valid. A well-structured data file can substitute for prompt injection on criteria with objective, derivable answers. Prompt injection is essential on criteria requiring institutional knowledge the data file cannot carry.</strong></em></p><p>The practical implication is an audit, not a prescription: identify which criteria in your task depend on institutional knowledge. Ensure your data file or your BOOTSTRAP_PROMPT carries those criteria explicitly. Do not write context that repeats what the data already says -- that is redundant tokens and noise.</p><p>Claude&#8217;s context window holds up to 200,000 tokens. When context is injected at session start, the model is not retrieving memory from somewhere external. It is carrying it. The architectural cost of a well-written context file is zero. The only cost is not having written it.</p><p><strong>Full article: </strong><a href="https://interestingengineering.substack.com/p/is-claudes-memory-actually-poor">Is Claude&#8217;s Memory Actually Poor?</a></p><h2><strong>6. Do You Need Infrastructure to Automate?</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mpCW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mpCW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png 424w, https://substackcdn.com/image/fetch/$s_!mpCW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png 848w, https://substackcdn.com/image/fetch/$s_!mpCW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png 1272w, https://substackcdn.com/image/fetch/$s_!mpCW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mpCW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png" width="1164" height="632" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:632,&quot;width&quot;:1164,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1173672,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mpCW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png 424w, https://substackcdn.com/image/fetch/$s_!mpCW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png 848w, https://substackcdn.com/image/fetch/$s_!mpCW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png 1272w, https://substackcdn.com/image/fetch/$s_!mpCW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51684eae-7be3-484d-bfa6-06ab194f1b74_1164x632.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>Concern: </strong><em>Automation requires cloud infrastructure or 24/7 hardware</em></p></blockquote><p>Running Claude automatically requires either leaving your computer on 24 hours a day or setting up GitHub repositories, cloud containers, and backend infrastructure? Claude&#8217;s routines interface uses developer jargon -- autofix pull requests, pushes, merges -- that has nothing to do with how a business owner thinks about getting a task done? Token-maxxing? Token bloat?</p><h3><strong>Where the concern is correct</strong></h3><p>If the requirement is for Claude to run a task at 3am with no human present -- that is scheduling. Scheduling requires infrastructure. A local machine in sleep mode will not trigger the run. Cloud infrastructure solves this and it is genuinely complex to configure. That part of the concern is accurate and should not be dismissed. The GitHub jargon in the routines interface is a real friction point for non-developers, and it reflects a product surface that was designed for development teams first.</p><h3><strong>The distinction that changes the answer</strong></h3><p>Most business workflows described as routines are not scheduling problems. They are repeatability problems. The difference matters:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LOvY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd608b454-2d86-4aca-89a4-a1234574f89a_843x277.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LOvY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd608b454-2d86-4aca-89a4-a1234574f89a_843x277.png 424w, https://substackcdn.com/image/fetch/$s_!LOvY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd608b454-2d86-4aca-89a4-a1234574f89a_843x277.png 848w, https://substackcdn.com/image/fetch/$s_!LOvY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd608b454-2d86-4aca-89a4-a1234574f89a_843x277.png 1272w, https://substackcdn.com/image/fetch/$s_!LOvY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd608b454-2d86-4aca-89a4-a1234574f89a_843x277.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LOvY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd608b454-2d86-4aca-89a4-a1234574f89a_843x277.png" width="843" height="277" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d608b454-2d86-4aca-89a4-a1234574f89a_843x277.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:277,&quot;width&quot;:843,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26632,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd608b454-2d86-4aca-89a4-a1234574f89a_843x277.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LOvY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd608b454-2d86-4aca-89a4-a1234574f89a_843x277.png 424w, https://substackcdn.com/image/fetch/$s_!LOvY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd608b454-2d86-4aca-89a4-a1234574f89a_843x277.png 848w, https://substackcdn.com/image/fetch/$s_!LOvY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd608b454-2d86-4aca-89a4-a1234574f89a_843x277.png 1272w, https://substackcdn.com/image/fetch/$s_!LOvY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd608b454-2d86-4aca-89a4-a1234574f89a_843x277.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Claude Code&#8217;s native file system access -- <em><strong>reading files, writing files, running scripts</strong></em> -- covers the majority of repeatable business workflows without any additional infrastructure layer. MCP connections and managed agents exist for specific integration needs: live databases, external APIs, real-time data feeds. They are not prerequisites for agentic work in general.</p><p>A controlled experiment demonstrating this distinction -- a complete weekly review workflow, triggered by a single command with no cloud infrastructure -- is forthcoming. The argument above is logical rather than experimental. It will be updated with measured results when available.</p><p>It is also worth noting that Anthropic&#8217;s own roadmap is moving in this direction. <em><strong>Dreaming -- an asynchronous between-session memory consolidation process that shipped for Claude Managed Agents (CMA) in May 2026 -- represents Anthropic&#8217;s recognition that the infrastructure complexity concern is real and worth solving at the product level</strong></em>. The scheduling problem is a genuine constraint. The field is actively working on it.</p><h2><strong>What the Series Collectively Demonstrates</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TKfk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7902b46-0594-43de-b116-e812287c1dcc_1119x610.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TKfk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7902b46-0594-43de-b116-e812287c1dcc_1119x610.png 424w, https://substackcdn.com/image/fetch/$s_!TKfk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7902b46-0594-43de-b116-e812287c1dcc_1119x610.png 848w, https://substackcdn.com/image/fetch/$s_!TKfk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7902b46-0594-43de-b116-e812287c1dcc_1119x610.png 1272w, https://substackcdn.com/image/fetch/$s_!TKfk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7902b46-0594-43de-b116-e812287c1dcc_1119x610.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TKfk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7902b46-0594-43de-b116-e812287c1dcc_1119x610.png" width="1119" height="610" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e7902b46-0594-43de-b116-e812287c1dcc_1119x610.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:610,&quot;width&quot;:1119,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1009377,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7902b46-0594-43de-b116-e812287c1dcc_1119x610.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TKfk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7902b46-0594-43de-b116-e812287c1dcc_1119x610.png 424w, https://substackcdn.com/image/fetch/$s_!TKfk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7902b46-0594-43de-b116-e812287c1dcc_1119x610.png 848w, https://substackcdn.com/image/fetch/$s_!TKfk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7902b46-0594-43de-b116-e812287c1dcc_1119x610.png 1272w, https://substackcdn.com/image/fetch/$s_!TKfk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7902b46-0594-43de-b116-e812287c1dcc_1119x610.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Across five controlled experiments and multiple architectural explorations, one pattern appeared consistently: the concerns practitioners raise about Claude are usually accurate descriptions of what the default behaviour produces. They are not accurate descriptions of what the architecture can produce.</p><p>Default behaviour -- a blank session context, a monolithic agent, an unoptimised routing configuration -- produces exactly the results the critics describe. Poor recall. High cost. Fragile coordination. Genuine friction for non-developers.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GgEs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GgEs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png 424w, https://substackcdn.com/image/fetch/$s_!GgEs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png 848w, https://substackcdn.com/image/fetch/$s_!GgEs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png 1272w, https://substackcdn.com/image/fetch/$s_!GgEs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GgEs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png" width="1134" height="593" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:593,&quot;width&quot;:1134,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:886366,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200385232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GgEs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png 424w, https://substackcdn.com/image/fetch/$s_!GgEs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png 848w, https://substackcdn.com/image/fetch/$s_!GgEs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png 1272w, https://substackcdn.com/image/fetch/$s_!GgEs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32716f04-7b3a-4a08-acdf-69c4fc5054ea_1134x593.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Intentional architecture -- a well-structured data file, a lean BOOTSTRAP prompt, a decomposed agent, a correctly configured caching setup, a precise single prompt instead of a swarm -- produces results that invalidate most of those criticisms. The model has not changed in any of these comparisons. What changed was the structure around it.</p><p>Three questions that serve as a practical filter before any integration decision:</p><blockquote><p>* <strong>Is this a model problem or an architecture problem? </strong>If Claude produces a poor result when given clear context and a precise task, that is a model issue. If it produces a poor result without context or structure, that is an architecture issue. Almost every concern in this series turned out to be the second kind.</p><p>* <strong>What is my portability requirement? </strong>Keep your intelligence in plain text files. Not in platform memory systems. Not in hosted environments that cannot export. The portability decision is made at file creation, not at migration time.</p><p>* <strong>Do I need scheduling, or repeatability? </strong>These are different requirements with different solutions. Conflating them produces unnecessary infrastructure and unnecessary frustration.</p></blockquote><p><em><strong>Claude is a flexible system of choices, not a single product with fixed behaviour. The practitioners who get consistent, high-quality results from it are not the ones who found a workaround. They are the ones who treated architecture as a variable -- and measured what changed when they adjusted it.</strong></em></p><p><em><strong>For now, enjoy the journey. </strong></em></p><h3><strong>Series References</strong></h3><p><a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">The Architecture of Awareness</a></p><p>V1-V4 agent design, ASCRS system, file architecture</p><p><a href="https://interestingengineering.substack.com/p/the-harness-experiment">The Harness Experiment</a></p><p>H1-H10 results, ten architectures one task</p><p><a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">ASCRS Harness Lab -- The Integrated Agentic Stack</a></p><p>Full rubric, gold answer, scored results</p><p><a href="https://interestingengineering.substack.com/p/the-prompt-is-not-the-architecture">The Prompt Is Not the Architecture</a></p><p>Anthropic Prompting Playbook reconciled with ASCRS findings</p><p><a href="https://interestingengineering.substack.com/p/harness-engineering-scaffolding-a">Harness Engineering -- Scaffolding A Small Model</a></p><p>Right scaffolding beats upgrading the model</p><p><a href="https://interestingengineering.substack.com/p/the-token-tax">The Token Tax</a></p><p>Harness Engineering Part II: performance to efficiency</p><p><a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">The Structure Is The Intelligence</a></p><p>97% token reduction, API caching discovery</p><p><a href="https://interestingengineering.substack.com/p/every-company-is-an-agent-waiting">Every Company Is An Agent Waiting to Be Decomposed</a></p><p>Graph-of-algorithms framework, decomposition principles</p><p><a href="https://interestingengineering.substack.com/p/why-multi-agent-ai-systems-break">Why Multi-Agent AI Systems Break</a></p><p>Updated framework for failure modes</p><p><a href="https://interestingengineering.substack.com/p/the-geometry-of-unpredictability">The Geometry of Unpredictability</a></p><p>Agentic workflow failures and polymorphism analogy</p><p><a href="https://interestingengineering.substack.com/p/is-claudes-memory-actually-poor">Is Claude&#8217;s Memory Actually Poor?</a></p><p>Controlled experiment: data architecture vs prompt injection</p><p><a href="https://hamzafarooq.github.io/token-optimizer/dashboard/interactive.html#techniques">Hamza Farooq -- Claude Token Optimizer</a></p><p>Four-technique stacking benchmark, claude-sonnet-4-6, May 2026</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Is Claude's Memory Actually Poor?]]></title><description><![CDATA[What a Controlled Experiment Reveals About Data Architecture, Context Injection, and Where the Gap Actually Lives: A Task Based Experiment]]></description><link>https://interestingengineering.substack.com/p/is-claudes-memory-actually-poor</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/is-claudes-memory-actually-poor</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Tue, 02 Jun 2026 18:41:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!1K-3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1K-3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1K-3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png 424w, https://substackcdn.com/image/fetch/$s_!1K-3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png 848w, https://substackcdn.com/image/fetch/$s_!1K-3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png 1272w, https://substackcdn.com/image/fetch/$s_!1K-3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1K-3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png" width="1099" height="589" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:589,&quot;width&quot;:1099,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1031317,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200331220?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1K-3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png 424w, https://substackcdn.com/image/fetch/$s_!1K-3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png 848w, https://substackcdn.com/image/fetch/$s_!1K-3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png 1272w, https://substackcdn.com/image/fetch/$s_!1K-3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11197c36-20a3-40cb-ad5b-da0011ebd756_1099x589.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>A Concern I Sometimes Hear</strong></h2><p>&#8220;<em>Claude has no persistent memory between sessions&#8221;. &#8220;Everything is forgotten when the conversation ends.&#8221; &#8220;The native memory tools are unreliable, and fixing this properly requires third-party databases, vector stores, or external memory services.</em>&#8221;</p><p>I have heard the above said in different ways, many times. Even just yesterday.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>This concern is widely held, and it is not entirely wrong. Claude does not natively remember all previous conversations. But <strong><mark>Claude features memory capabilities that allow it to reference previous interactions and build context over time</mark></strong>. Instead of operating entirely in isolation, you can <strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">enable settings</mark></strong> that allow the AI to actively search your chat history and retain personal preferences across new sessions</p><p>Logically, with code, without any architecture or &#8220;harness&#8221; around it, each session begins as a blank slate. The question worth testing is whether the conclusion that follows -- that external memory services are required -- is actually correct. For this - Claude Code.</p><h2><strong>Continue watching the space for memory research&#8230;</strong></h2><p>I did recently look into three independent research projects on AI memory in April 2026: <strong>MEMENTO (Microsoft Research), MemPalace, and AutoResearch (Karpathy)</strong>. The article was not completed as a published synthesis, however, because I started including them in various experiments (which I will write about more comprehensively, at a later date, as the space evolves).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hLyN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hLyN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png 424w, https://substackcdn.com/image/fetch/$s_!hLyN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png 848w, https://substackcdn.com/image/fetch/$s_!hLyN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png 1272w, https://substackcdn.com/image/fetch/$s_!hLyN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hLyN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png" width="1133" height="612" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:612,&quot;width&quot;:1133,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1152061,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200331220?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hLyN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png 424w, https://substackcdn.com/image/fetch/$s_!hLyN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png 848w, https://substackcdn.com/image/fetch/$s_!hLyN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png 1272w, https://substackcdn.com/image/fetch/$s_!hLyN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F135c16db-5f82-4db3-8561-4c7d3d7e3598_1133x612.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>MemPalace is built on the principle that everything should be stored verbatim</strong> -- no AI decides what to forget. <strong>MEMENTO&#8217;s approach is compression-first </strong>-- it actively summarises and compresses episodic memory across sessions. These two positions are structurally incompatible. A synthesis presenting them as complementary would have been inaccurate. So I made the decision not to publish (whilst experimenting) rather than publish something misleading or inconsistent.</p><p>That conflict is noted here because it is the honest position: <strong>long-term AI memory at the level of complex multi-session knowledge accumulation remains an open and contested research problem</strong>. Practitioners should be cautious about any tool or article claiming to have resolved it cleanly. For now.</p><p>What the experiment below demonstrates is narrower and more immediately useful: <strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">for session-level and task-level memory -- the kind needed to do a specific job well</mark></strong><mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);"> </mark>--<mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);"> the answer is more architectural than technical</mark>, and the tools are simpler than the concern suggests.</p><h2><strong>The experiment</strong></h2><p>The ASCRS pharmaceutical supply chain crisis task has been used throughout this series as a controlled benchmark domain. A Strait of Hormuz closure scenario. 23 purchase orders across three priority tiers. Carrier availability constraints, financial parameters, gate sequencing rules, and four deliberate reasoning traps embedded in the data. A gold-standard answer -- a full CFO-approvable brief -- prepared in advance, against which all outputs are scored using a six-criterion rubric.</p><p>This experiment was designed to test a simple question: <strong>does injecting context at session start measurably improve performance against a scored rubric</strong>? It ran in Claude Code, in separate sessions, in the experiments/memory-ab/ subfolder of my ascrs-harness-lab project.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;9fef61f9-4356-407e-aa28-989d577c0e00&quot;,&quot;caption&quot;:&quot;Had some time on my hands, and applied the features of The Harness Experiment(s) to the Architecture of Awareness design considerations. You will remember from The Harness Experiment (applied to a mini vendor analysis case study) that the results presented as follows:&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;ASCRS Harness Lab - The Integrated Agentic Stack: When Does More Architecture Mean Better AI? A Diagnostic Teardown&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-16T17:52:19.700Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!cv0d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:198013155,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:2,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2><strong>What the experiment design revealed</strong></h2><p>The original design called for a direct comparison between a session with no context and a session with BOOTSTRAP_PROMPT.md injected. But in my case both returned alpha = 1.0.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_H_F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_H_F!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png 424w, https://substackcdn.com/image/fetch/$s_!_H_F!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png 848w, https://substackcdn.com/image/fetch/$s_!_H_F!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png 1272w, https://substackcdn.com/image/fetch/$s_!_H_F!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_H_F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png" width="1098" height="589" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:589,&quot;width&quot;:1098,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:994870,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200331220?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_H_F!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png 424w, https://substackcdn.com/image/fetch/$s_!_H_F!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png 848w, https://substackcdn.com/image/fetch/$s_!_H_F!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png 1272w, https://substackcdn.com/image/fetch/$s_!_H_F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23a5c69b-d3c4-4a50-93bf-5351779446ea_1098x589.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The first instinct was to treat this as a failed experiment. It was not. </p><p>It revealed that the original data file -- disruption_context.json -- was already doing the context work. The file contained <strong>embedded _agent_trap</strong> fields describing each reasoning trap explicitly, and a <strong>weighted_median_derivation</strong> section laying out the financial methodology. The data file was self-documenting. It was supplying the domain knowledge that BOOTSTRAP was expected to supply.</p><p>So I redesigned the experiment. A stripped version of the data file was created -- disruption_context_raw.json -- with the embedded guidance fields removed. Condition A was re-run using the raw file, with no BOOTSTRAP. Condition B remained as run: the original rich data file, no BOOTSTRAP. The variable between the two conditions was data architecture alone.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PNvd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PNvd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png 424w, https://substackcdn.com/image/fetch/$s_!PNvd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png 848w, https://substackcdn.com/image/fetch/$s_!PNvd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png 1272w, https://substackcdn.com/image/fetch/$s_!PNvd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PNvd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png" width="1115" height="590" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:590,&quot;width&quot;:1115,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:982085,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200331220?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PNvd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png 424w, https://substackcdn.com/image/fetch/$s_!PNvd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png 848w, https://substackcdn.com/image/fetch/$s_!PNvd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png 1272w, https://substackcdn.com/image/fetch/$s_!PNvd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7a276f-0176-4042-807e-dc77b29c86f4_1115x590.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Results</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tSu7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tSu7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png 424w, https://substackcdn.com/image/fetch/$s_!tSu7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png 848w, https://substackcdn.com/image/fetch/$s_!tSu7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png 1272w, https://substackcdn.com/image/fetch/$s_!tSu7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tSu7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png" width="724" height="283" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:283,&quot;width&quot;:724,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32001,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200331220?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tSu7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png 424w, https://substackcdn.com/image/fetch/$s_!tSu7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png 848w, https://substackcdn.com/image/fetch/$s_!tSu7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png 1272w, https://substackcdn.com/image/fetch/$s_!tSu7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89552864-bd1d-4c6d-98f3-e9401071d3b0_724x283.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The gap of 0.175 was not uniform across criteria. Four of six criteria held at 1.0 in Condition A. Two dropped to 0.5.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KNtE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KNtE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png 424w, https://substackcdn.com/image/fetch/$s_!KNtE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png 848w, https://substackcdn.com/image/fetch/$s_!KNtE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png 1272w, https://substackcdn.com/image/fetch/$s_!KNtE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KNtE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png" width="725" height="459" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:459,&quot;width&quot;:725,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59272,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200331220?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KNtE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png 424w, https://substackcdn.com/image/fetch/$s_!KNtE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png 848w, https://substackcdn.com/image/fetch/$s_!KNtE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png 1272w, https://substackcdn.com/image/fetch/$s_!KNtE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ef40e68-dfa1-4c5f-8ebf-de54cdfeec62_725x459.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The pattern however, was precise. R1 and R4 dropped because both depend on <strong>domain-specific parameters</strong> that exist only in institutional practice: a 60/40 tier weighting schema and a ~340-operator aggregation methodology. Neither can be inferred from raw data. <strong>Neither is available through general reasoning</strong>. The model handled both correctly in Condition B because the original data file had embedded that methodology explicitly.</p><p>R2, R3, R5, and R6 held at 1.0 in both conditions because their answers are derivable from data structure. The carrier routing constraint for PO-2853 was present in the raw data. The gate schema and G4/G7 dependency were present in the raw data. The epistemic uncertainty tiers and trigger derivation were present in the raw data. The model did not need institutional guidance to answer those criteria correctly -- it needed the facts, which the raw file contained.</p><p>The rubric measured the following:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rwCq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rwCq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png 424w, https://substackcdn.com/image/fetch/$s_!rwCq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png 848w, https://substackcdn.com/image/fetch/$s_!rwCq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png 1272w, https://substackcdn.com/image/fetch/$s_!rwCq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rwCq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png" width="913" height="434" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:434,&quot;width&quot;:913,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:311506,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200331220?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rwCq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png 424w, https://substackcdn.com/image/fetch/$s_!rwCq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png 848w, https://substackcdn.com/image/fetch/$s_!rwCq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png 1272w, https://substackcdn.com/image/fetch/$s_!rwCq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc65226e7-e905-4ab7-a9ab-9b782a93fad1_913x434.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>What this demonstrates</strong></h2><p>The gap between Condition A and Condition B is not a gap in model capability. The model did not change between conditions. It is a gap in what the model was given to work with -- specifically, <strong>the absence of two institutional parameters that exist only in domain-specific practice.</strong></p><p>As a quick note, the way I use Claude&#8217;s context window, despite the 1m context window: basically beyond 200,000 tokens -- approximately 150,000 words, I start worrying about the &#8220;dumb zone&#8221; applying. Want to find out more: <a href="https://devinterrupted.substack.com/p/dex-horthy-on-ralph-rpi-and-escaping">Dex Horty covers a lot</a> here.  </p><p>When context is injected at session start, whether via a rich data file or via BOOTSTRAP_PROMPT.md, the model is not retrieving memory from somewhere external. It is carrying the <strong>memory in its active context</strong>. <strong>Retrieval can fail, decay, or hallucinate under pressure.</strong> Injection simply loads. The distinction is architectural, not cosmetic.</p><p>The experiment also highlights a choice that most practitioners do not realise they have. <strong>Context engineering operates at two layers</strong>, not just one:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uJ-W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uJ-W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png 424w, https://substackcdn.com/image/fetch/$s_!uJ-W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png 848w, https://substackcdn.com/image/fetch/$s_!uJ-W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png 1272w, https://substackcdn.com/image/fetch/$s_!uJ-W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uJ-W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png" width="1093" height="589" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:589,&quot;width&quot;:1093,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:944787,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200331220?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uJ-W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png 424w, https://substackcdn.com/image/fetch/$s_!uJ-W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png 848w, https://substackcdn.com/image/fetch/$s_!uJ-W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png 1272w, https://substackcdn.com/image/fetch/$s_!uJ-W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69f160cf-f78e-4ddb-a237-d7be194a3b49_1093x589.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p>* <strong>The data layer -- </strong>how your data file is structured (nevermind the where). A self-documenting data file that embeds <strong>domain guidance, methodology notes, and constraint descriptions</strong> is doing context work before any prompt is written. The model reads it as part of the task and reasons from it correctly.</p><p>* <strong>The prompt layer -- </strong>what you <strong>inject at session start (or add as you test)</strong> via BOOTSTRAP or a <strong>system prompt</strong>. This layer compensates for what the data layer cannot or does not carry -- <strong>institutional parameters, weighting schemas, bespoke methodologies, business-specific thresholds. Unless of course these were included within the data layer, in the first place.</strong></p></blockquote><p><strong>Both layers are valid. Both compensate for the absence of the other. The choice between them depends on what you control. What needs to persist, or evolve.</strong></p><p><strong>If you control how your data is structured -- your own pipelines, your own files -- you can embed domain guidance directly</strong>. The model reads it as data and reasons from it without any additional prompt engineering. <strong>If your data comes from an external source you cannot modify -- raw exports, third-party feeds, legacy files with no embedded metadata -- then BOOTSTRAP becomes essential</strong> because the data layer is not available to you.</p><p><strong>The two-layer audit question</strong></p><p>Look at your rubric criteria. For each one, ask:</p><p>Can the model answer this correctly from the data file alone?</p><p><strong>Yes</strong> --&gt; data structure is sufficient for this criterion.</p><p><strong>No </strong>--&gt; this criterion needs BOOTSTRAP.</p><p>Write BOOTSTRAP to cover the gap, not to repeat the data.</p><p>Redundant context is noise. Targeted context is leverage.</p><h2><strong>The architecture in practice</strong></h2><p>Across every harness experiment in this series, a two-file context architecture has been in consistent use without any third-party memory service:</p><blockquote><p>* <strong>MASTER_GUIDE.md -- </strong>the long-term reference document. Alsways portable. Full experimental history, domain knowledge, architectural decisions, rubric designs. Updated after every experiment. Read in the Claude.ai chat window for design and analysis work between sessions. Or capture it in <strong>CLAUDE.md or AGENTS.md</strong></p><p>* <strong>BOOTSTRAP_PROMPT.md -- </strong>the session-start injection file. Contains only what Claude Code needs for the current task, distilled from MASTER_GUIDE. Loaded at the start of every Claude Code session before any task runs.</p></blockquote><p><strong>Neither file is platform-specific. Both are plain markdown. Both are readable by any AI tool.</strong> The pattern is not a Claude workaround -- it is a general principle: your intelligence lives in files you own, not in a platform&#8217;s memory system. But it can if you choose.</p><p>The experiment adds precision to how BOOTSTRAP should or can be written. The criteria that held at 1.0 in Condition A did not need BOOTSTRAP coverage -- they were handled correctly from data alone. Including them in BOOTSTRAP would have added tokens without adding value. The criteria that dropped -- R1 and R4 -- are exactly what BOOTSTRAP should carry: the 60/40 weighting schema, the aggregation methodology, the institutional parameters the data file cannot provide.</p><p>A lean, targeted BOOTSTRAP outperforms a comprehensive one. This connects to a separate finding documented in Article A: every token in CLAUDE.md and BOOTSTRAP is re-sent on every request. Dead weight in those files is a recurring cost, not a one-time investment. <strong>Independent benchmarking shows that trimming a context file from 3,847 to 312 tokens -- removing boilerplate and keeping only what the model acts on -- reduces per-session token cost by 91.9%.</strong> I did it differently but with the same contextual impact, with <a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">The Structure Is Intelligence</a>! <mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">The discipline that produces a better experiment also produces a cheaper one.</mark></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;87782560-4a0e-484b-9064-c77bc41ede99&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Structure Is The Intelligence&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-28T19:12:24.790Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Vnz9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/the-structure-is-the-intelligence&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:199597361,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:1,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2><strong>What this does not solve</strong></h2><p>My experiment demonstrates <strong>session-level context injection</strong>. It does not demonstrate <strong>long-term autonomous memory accumulation across many sessions without human involvement</strong> -- the kind where an agent builds a knowledge base over weeks/months and queries it intelligently. That is a harder problem. At least, for now, most of us have not had that lenght of time to test these very new tools that only just presented themselves in the last quarter of last year. So, time will tell. The research landscape I surveyed offers partial answers but no clean solution, and the philosophical conflict between verbatim storage and compression-first approaches remains unresolved.</p><p>The two-file architecture also does not <strong>scale indefinitely without maintenance</strong>. A MASTER_GUIDE that grows without pruning eventually contains outdated decisions alongside current ones. The discipline is the same as any <strong>knowledge management practice: regular review, deliberate pruning, honest updating</strong>. The architecture provides the structure. The practitioner provides the upkeep.</p><h2><strong>So?</strong></h2><p>The experiment produced an alpha gap of 0.175 between a raw-data-only condition and a self-documenting-data-only condition, concentrated in R1 (tier prioritisation, 0.5) and R4 (financial aggregation methodology, 0.5). Four criteria held at 1.0 regardless of context, because their answers are derivable from data structure alone.</p><p>The finding is more specific and more useful than a simple before/after comparison: BOOTSTRAP injection does not improve performance uniformly. It closes the gap precisely on criteria requiring institutional parameters -- weighting schemas, bespoke methodologies, business-specific thresholds -- that neither raw data nor general reasoning can supply. The model&#8217;s reasoning capability was present in both conditions. What differed was the availability of the parameters it needed to apply that capability correctly.</p><p>The practical implication is an audit, not a prescription. Many things, even memory, are fixable with a little bit of engineering. </p><p>Identify which criteria in your rubric depend on institutional knowledge. Ensure your data file or your BOOTSTRAP carries those criteria explicitly. Do not write context that repeats what the data already says. <strong>Context engineering is not about loading everything the model might need. It is about identifying the specific gap between what the model can derive and what it needs to be told</strong> -- and closing exactly that gap, nothing more.</p><p>Note: <strong>Anthropic <a href="https://www.digitalapplied.com/blog/ai-agent-memory-vector-graph-episodic-2026">Dreaming</a></strong> shipped May 6, 2026 for Claude Managed Agents (CMA). It is an <strong>asynchronous between-session process that reviews session transcripts and existing memory stores, extracts patterns, merges duplicates, replaces stale entries, and writes reorganised memory entries that future sessions can use</strong>. Anthropic explicitly <mark data-color="#ffff00" style="background-color: rgb(255, 255, 0); color: rgb(0, 0, 0);">models it on hippocampal memory consolidation</mark>. So a lot of interesting solutions to a significant issue.</p><p>However, the concern that opened this article -- that Claude&#8217;s memory is poor and requires third-party tools to fix -- is a real observation about default behaviour. The conclusion here does not support it. <strong>The model&#8217;s default behaviour may be a blank slate. A good practitioner&#8217;s job is to decide how and what needs to go into memory, at which layer, and why in the most (cost) efficient, persistent manner possible. Skills you develop as you get to know your AI better. And technology that enables it - progress of which is evolving in the right direction, fast.</strong></p><p><strong>References</strong></p><p><a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">ASCRS Harness Lab -- The Integrated Agentic Stack</a><strong> </strong><em> MASTER_GUIDE / BOOTSTRAP architecture in use throughout. Rubric and gold answer design.</em></p><p><a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">The Architecture of Awareness</a><em> V1-V4 agent design. ASCRS system and file architecture introduction.</em></p><p><a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">The Structure Is The Intelligence</a><strong> </strong><em>Decomposition experiment. Context file efficiency and caching findings.</em></p><p><a href="https://hamzafarooq.github.io/token-optimizer/dashboard/interactive.html#techniques">Hamza Farooq -- Claude Token Optimizer</a><em> CLAUDE.md trim benchmark: 3,847 to 312 tokens, 91.9% reduction. Context file efficiency.</em></p><p><em><a href="https://mem0.ai/blog/state-of-ai-agent-memory-2026">State of Agent Memory</a> 2026</em></p><p><em><a href="https://www.digitalapplied.com/blog/ai-agent-memory-vector-graph-episodic-2026">Agent Memory in 2026 - Dreaming, Memory Bank and The Long Context Shift</a></em></p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Every Company Is an Agent Waiting to Be Decomposed]]></title><description><![CDATA[Miessler's Graph of Algorithms, Claude /workflows, and Why the Five-Phase Audit Works on (just about) Anything]]></description><link>https://interestingengineering.substack.com/p/every-company-is-an-agent-waiting</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/every-company-is-an-agent-waiting</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Tue, 02 Jun 2026 09:29:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!3-aL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3-aL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3-aL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png 424w, https://substackcdn.com/image/fetch/$s_!3-aL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png 848w, https://substackcdn.com/image/fetch/$s_!3-aL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png 1272w, https://substackcdn.com/image/fetch/$s_!3-aL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3-aL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png" width="1139" height="621" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:1139,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1236997,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3-aL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png 424w, https://substackcdn.com/image/fetch/$s_!3-aL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png 848w, https://substackcdn.com/image/fetch/$s_!3-aL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png 1272w, https://substackcdn.com/image/fetch/$s_!3-aL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f493fa-f342-45da-af6b-e7757b6ecf20_1139x621.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_LdY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_LdY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png 424w, https://substackcdn.com/image/fetch/$s_!_LdY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png 848w, https://substackcdn.com/image/fetch/$s_!_LdY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png 1272w, https://substackcdn.com/image/fetch/$s_!_LdY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_LdY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png" width="1134" height="619" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:619,&quot;width&quot;:1134,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1094394,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_LdY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png 424w, https://substackcdn.com/image/fetch/$s_!_LdY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png 848w, https://substackcdn.com/image/fetch/$s_!_LdY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png 1272w, https://substackcdn.com/image/fetch/$s_!_LdY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a051207-eb94-4230-b92e-4c80ec8e29c7_1134x619.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!45rO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!45rO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png 424w, https://substackcdn.com/image/fetch/$s_!45rO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png 848w, https://substackcdn.com/image/fetch/$s_!45rO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png 1272w, https://substackcdn.com/image/fetch/$s_!45rO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!45rO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png" width="1136" height="618" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:618,&quot;width&quot;:1136,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1239750,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!45rO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png 424w, https://substackcdn.com/image/fetch/$s_!45rO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png 848w, https://substackcdn.com/image/fetch/$s_!45rO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png 1272w, https://substackcdn.com/image/fetch/$s_!45rO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815182c4-b5ee-41c3-b26c-794b63b980db_1136x618.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Convergence</h2><p>I love and follow <a href="https://danielmiessler.com/archives/">Daniel Miessler&#8217;s writings/musings</a> as religiously as I can, because I find his forward-looking &#8220;guidance&#8221; (this is what i sometimes think of them) way ahead of the curve. A security researcher watching companies accumulate process debt: For example, his observation, that every <a href="https://danielmiessler.com/blog/companies-graph-of-algorithms">business/company is simply a graph of algorithms</a> (the title above an ode adapted to our agentic world today) &#8212; workflows stacked inside workflows, most of them unaudited, most of them wasteful, all of them ripe for AI optimisation once made visible. Those who know me will often hear this - &#8220;Do [X] better.&#8221; Optimize!</p><p>With agentic workflows, and fine-tuned or controlled experiments, I see the possibilities of addressing various corporate process debts, more strongly now.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>For example, because this was recent, in <a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">The Structure Is Intelligence</a>, a controlled engineering experiment: <strong>running a bloated AI agent through four decomposition cycles, watching a 97% token reduction emerge not from a smarter model but from restructuring what the model was given to work with.</strong></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;3ebbec51-fdb4-4e64-910e-99c84f632787&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Structure Is The Intelligence&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-28T19:12:24.790Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Vnz9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/the-structure-is-the-intelligence&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:199597361,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:1,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>The experiment proved the theory. The theory explains why the experiment worked. This note maps the two together &#8212; and gives you the tools to apply both to any system you encounter.</p><blockquote><p><em><strong>&#128279; How to read this piece</strong></em></p><p><em>This is a supplementary note to &#8216;<a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">The Structure Is the Intelligence</a>&#8217;. That article documented the full StockPilot decomposition experiment: Cycles 0&#8211;4, three API configurations, every failure documented (it was adapted from Anthropic but with a few twists i did not expect to find esp on APIs). This piece draws the broader implications using Miessler&#8217;s framework and Claude&#8217;s /workflows system.</em></p><p><em>You do not need to have read the original &#8212; but the numbers cited here come from it.</em></p></blockquote><h2>Miessler&#8217;s Big Ideas</h2><p>Two highly relevant essays form a coherent theory of how organisations should think about AI &#8212; and why most are getting it wrong.</p><h3>Idea 1 &#8212; Companies Are Just a Graph of Algorithms (May 2024)</h3><p>The starting point is deceptively simple: every company is a collection of processes, and every process is a series of steps &#8212; an algorithm. Those algorithms connect to each other (sales feeds operations; operations feeds delivery; delivery feeds support), forming a graph. Most companies have never mapped this graph explicitly. They run it from institutional memory, tribal knowledge, and accumulated habit.</p><p>Miessler&#8217;s argument is that <strong>AI changes the stakes of opacity</strong>. A business that cannot describe its own processes as explicit algorithms cannot hand them to AI for optimisation &#8212; and a competitor that can will do it first. The exercise of mapping the graph is valuable on its own. With AI, it becomes <strong>a competitive moat</strong>.</p><p>The recursive insight is equally important: <strong>every algorithm can be broken into sub-algorithms</strong>. The process for handling a customer complaint contains a process for reading the complaint, a process for retrieving account history, a process for drafting a response. Each of those can be broken further. <strong>It is algorithms all the way down</strong> &#8212; <strong>and every level is a potential optimisation target</strong>. </p><p>Yes, if you&#8217;re thinking it - the opposite of &#8220;turtles all the way down&#8221;. Then again - unless you have a bunch that real do not understand the &#8220;what and why&#8217;s&#8221; of their actions. This happens a lot. Not surprising. Non-trivial.</p><blockquote><p><em><strong>&#128161; The key line from Miessler&#8217;s essay</strong></em></p><p><em>&#8220;AI excels at both discrete task execution and determining how things fit together, and every single one of your company&#8217;s workflow components becomes ripe for optimisation or elimination.&#8221;</em></p><p><em><strong>Substitute &#8216;company&#8217; with &#8216;agent&#8217; </strong>and this is exactly what the <strong>StockPilot experiment </strong>demonstrated across four cycles.</em></p></blockquote><h3>Idea 2 &#8212; <a href="https://danielmiessler.com/blog/policy-sops-and-ai-are-all-you-need">Policy, SOPs, and AI Are All You Need </a>(Sept 2024)</h3><p>The second essay operationalises the first. If companies are graphs of algorithms, what are the components those algorithms are made of? Miessler&#8217;s answer: <strong>Policy, State, SOPs, and Action.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Tthy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Tthy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png 424w, https://substackcdn.com/image/fetch/$s_!Tthy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png 848w, https://substackcdn.com/image/fetch/$s_!Tthy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png 1272w, https://substackcdn.com/image/fetch/$s_!Tthy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Tthy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png" width="866" height="442" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:442,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:43594,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Tthy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png 424w, https://substackcdn.com/image/fetch/$s_!Tthy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png 848w, https://substackcdn.com/image/fetch/$s_!Tthy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png 1272w, https://substackcdn.com/image/fetch/$s_!Tthy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8237db6-609f-4b71-aafe-4358ef5789b8_866x442.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The model runs as a <strong>loop</strong>: <strong>leaders set Policy, AI gathers State, everything executes according to SOPs, SOPs get updated, repeat</strong>. Miessler claims that this is not a future state &#8212; it is the <strong>direction every well-run organisation is already moving toward, whether consciously or not</strong>.</p><h2>The Experiment as Proof/Confirmation</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XNII!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XNII!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png 424w, https://substackcdn.com/image/fetch/$s_!XNII!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png 848w, https://substackcdn.com/image/fetch/$s_!XNII!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png 1272w, https://substackcdn.com/image/fetch/$s_!XNII!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XNII!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png" width="1135" height="619" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:619,&quot;width&quot;:1135,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1018120,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XNII!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png 424w, https://substackcdn.com/image/fetch/$s_!XNII!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png 848w, https://substackcdn.com/image/fetch/$s_!XNII!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png 1272w, https://substackcdn.com/image/fetch/$s_!XNII!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25ab7ffd-129e-4d82-9d8a-42203669a01f_1135x619.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The StockPilot decomposition experiment in essence ran Miessler&#8217;s theory through a controlled test (without knowing it was doing so). The mapping is exact.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ISVU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb333071a-f89f-4767-98bc-0f169f441a08_863x310.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ISVU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb333071a-f89f-4767-98bc-0f169f441a08_863x310.png 424w, https://substackcdn.com/image/fetch/$s_!ISVU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb333071a-f89f-4767-98bc-0f169f441a08_863x310.png 848w, https://substackcdn.com/image/fetch/$s_!ISVU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb333071a-f89f-4767-98bc-0f169f441a08_863x310.png 1272w, https://substackcdn.com/image/fetch/$s_!ISVU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb333071a-f89f-4767-98bc-0f169f441a08_863x310.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ISVU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb333071a-f89f-4767-98bc-0f169f441a08_863x310.png" width="863" height="310" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b333071a-f89f-4767-98bc-0f169f441a08_863x310.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:310,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:34690,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb333071a-f89f-4767-98bc-0f169f441a08_863x310.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ISVU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb333071a-f89f-4767-98bc-0f169f441a08_863x310.png 424w, https://substackcdn.com/image/fetch/$s_!ISVU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb333071a-f89f-4767-98bc-0f169f441a08_863x310.png 848w, https://substackcdn.com/image/fetch/$s_!ISVU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb333071a-f89f-4767-98bc-0f169f441a08_863x310.png 1272w, https://substackcdn.com/image/fetch/$s_!ISVU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb333071a-f89f-4767-98bc-0f169f441a08_863x310.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>The 402-line system prompt was an unaudited company. Policy, State, SOPs, and Action were collapsed into one wall of text, re-read in full on every turn of every task</strong></em>. The decomposition cycles did exactly what Miessler&#8217;s framework prescribes: separated the layers, made each one explicit, and loaded each only when needed.</p><p>The result was not a smarter agent. It was a more efficient structure. Cycle 0 and Cycle 4 ran the same model. The <strong>97% token reduction came entirely from the structure.</strong></p><h3>What the Experiment Added to the Theory</h3><p>Miessler&#8217;s framework describes what to do. The experiment provided three things the framework did not:</p><blockquote><p>&#9642; <strong>A diagnostic methodology: </strong>The <strong>five-phase audit &#8212; context, tools, clients, sub-processes, output contracts </strong>&#8212; is a specific, replicable procedure for finding exactly <strong>where the algorithm graph is broken.</strong> It is not in Miessler&#8217;s essays (but i havent read all his essays - getting to it!). It emerged from documenting every failure across four cycles.</p><p>&#9642; <strong>Measured evidence: 97% token reduction, 95% cost reduction, quality maintained.</strong> The framework predicts optimisation is possible. The experiment quantified how much and identified the mechanism at each step.</p><p>&#9642; <strong>The cycle structure: </strong>Running improvements one at a time &#8212; <strong>skills first, then bash, then sub-agent delegation, then CMA</strong> &#8212; and measuring after each one revealed which intervention produced which effect. That progression was not in any framework. It was earned through the experiment itself.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RmpB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RmpB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png 424w, https://substackcdn.com/image/fetch/$s_!RmpB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png 848w, https://substackcdn.com/image/fetch/$s_!RmpB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png 1272w, https://substackcdn.com/image/fetch/$s_!RmpB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RmpB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png" width="1105" height="589" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:589,&quot;width&quot;:1105,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:984236,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RmpB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png 424w, https://substackcdn.com/image/fetch/$s_!RmpB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png 848w, https://substackcdn.com/image/fetch/$s_!RmpB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png 1272w, https://substackcdn.com/image/fetch/$s_!RmpB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F534debf9-8d25-4125-ae3f-7f253666ddcb_1105x589.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Claude /workflows: Where the Theory Becomes Executable</h2><p>Claude Code&#8217;s /workflows feature is Miessler&#8217;s SOP model made operational. A <strong>workflow is a structured, reusable process &#8212; defined in a file, loaded on demand, executed with conditional branching and typed outputs at each step</strong>. It is, precisely, a single node in the company graph made explicit and executable.</p><p>The connection to the decomposition principles is direct. <strong>A workflow that loads only when triggered is the on-demand skill pattern.</strong> A workflow step that returns a typed result rather than free text is the typed contract pattern. A workflow that calls a sub-process conditionally is the explicit delegation pattern. The /workflows system is the natural home for the architecture the StockPilot experiment produced.</p><h4><strong>&#128204; The structural parallel</strong></h4><blockquote><p><em><strong>Miessler&#8217;s SOP = Claude skill file loaded on demand</strong></em></p><p><em><strong>Miessler&#8217;s Policy = SHORT_PROMPT (identity only, always loaded)</strong></em></p><p><em><strong>Miessler&#8217;s State = bash_execute returning filtered rows, not full CSVs</strong></em></p><p><em><strong>Miessler&#8217;s Action = conditional sub-agent with typed JSON return contract</strong></em></p></blockquote><h2>The Prompts: Apply This to Any System</h2><p>The following four prompts can be used in Claude Code on any codebase, project, or workflow. It&#8217;s not an end-all answer to issues, but a great first start. As with all complex systems, issues will arise. Address them accordingly, methodically. Easier said than done? Start somewhere! </p><p>These steps apply Miessler&#8217;s graph-of-algorithms lens combined with the decomposition methodology from the StockPilot experiment. Use them in sequence for a complete audit-and-rebuild cycle.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PXS8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PXS8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png 424w, https://substackcdn.com/image/fetch/$s_!PXS8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png 848w, https://substackcdn.com/image/fetch/$s_!PXS8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png 1272w, https://substackcdn.com/image/fetch/$s_!PXS8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PXS8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png" width="1130" height="621" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:1130,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1042887,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PXS8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png 424w, https://substackcdn.com/image/fetch/$s_!PXS8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png 848w, https://substackcdn.com/image/fetch/$s_!PXS8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png 1272w, https://substackcdn.com/image/fetch/$s_!PXS8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30aa584e-d42c-4ee7-bbbb-5c723faa8ef5_1130x621.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Step 1 &#8212; Map the Graph</h3><p>Run this first, on anything. It forces visibility before any change is made.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!h8Vs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h8Vs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png 424w, https://substackcdn.com/image/fetch/$s_!h8Vs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png 848w, https://substackcdn.com/image/fetch/$s_!h8Vs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png 1272w, https://substackcdn.com/image/fetch/$s_!h8Vs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h8Vs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png" width="949" height="315" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:315,&quot;width&quot;:949,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32495,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!h8Vs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png 424w, https://substackcdn.com/image/fetch/$s_!h8Vs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png 848w, https://substackcdn.com/image/fetch/$s_!h8Vs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png 1272w, https://substackcdn.com/image/fetch/$s_!h8Vs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4698367-451d-45df-9a65-c42ae0b78bd2_949x315.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Step 2 &#8212; Separate Policy from SOPs</h3><p>Use this on any bloated prompt or monolithic instruction set.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LQqN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LQqN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png 424w, https://substackcdn.com/image/fetch/$s_!LQqN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png 848w, https://substackcdn.com/image/fetch/$s_!LQqN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png 1272w, https://substackcdn.com/image/fetch/$s_!LQqN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LQqN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png" width="948" height="390" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:390,&quot;width&quot;:948,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:39502,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LQqN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png 424w, https://substackcdn.com/image/fetch/$s_!LQqN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png 848w, https://substackcdn.com/image/fetch/$s_!LQqN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png 1272w, https://substackcdn.com/image/fetch/$s_!LQqN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F971c1d31-512b-4f35-a7ad-5cb5df5e780c_948x390.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Step 3 &#8212; Build a Lean Workflow</h2><p>Use this to construct a new workflow or rebuild an existing one.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Fke_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd88cacd1-7544-4ae9-919c-07488e741198_949x435.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Fke_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd88cacd1-7544-4ae9-919c-07488e741198_949x435.png 424w, https://substackcdn.com/image/fetch/$s_!Fke_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd88cacd1-7544-4ae9-919c-07488e741198_949x435.png 848w, https://substackcdn.com/image/fetch/$s_!Fke_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd88cacd1-7544-4ae9-919c-07488e741198_949x435.png 1272w, https://substackcdn.com/image/fetch/$s_!Fke_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd88cacd1-7544-4ae9-919c-07488e741198_949x435.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Fke_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd88cacd1-7544-4ae9-919c-07488e741198_949x435.png" width="949" height="435" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d88cacd1-7544-4ae9-919c-07488e741198_949x435.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:435,&quot;width&quot;:949,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41035,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd88cacd1-7544-4ae9-919c-07488e741198_949x435.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Fke_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd88cacd1-7544-4ae9-919c-07488e741198_949x435.png 424w, https://substackcdn.com/image/fetch/$s_!Fke_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd88cacd1-7544-4ae9-919c-07488e741198_949x435.png 848w, https://substackcdn.com/image/fetch/$s_!Fke_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd88cacd1-7544-4ae9-919c-07488e741198_949x435.png 1272w, https://substackcdn.com/image/fetch/$s_!Fke_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd88cacd1-7544-4ae9-919c-07488e741198_949x435.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Step 4 &#8212; Audit an Existing Workflow for Waste</h2><p>Use this on any existing agent, workflow, or process.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n-KC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n-KC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png 424w, https://substackcdn.com/image/fetch/$s_!n-KC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png 848w, https://substackcdn.com/image/fetch/$s_!n-KC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png 1272w, https://substackcdn.com/image/fetch/$s_!n-KC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n-KC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png" width="947" height="370" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:370,&quot;width&quot;:947,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32509,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n-KC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png 424w, https://substackcdn.com/image/fetch/$s_!n-KC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png 848w, https://substackcdn.com/image/fetch/$s_!n-KC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png 1272w, https://substackcdn.com/image/fetch/$s_!n-KC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b45a20-e5cd-4f8e-a790-f30d9b7f5056_947x370.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Usage Sequence</h2><p>These prompts are designed to run in order as a repeatable optimisation loop:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9gnw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9gnw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png 424w, https://substackcdn.com/image/fetch/$s_!9gnw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png 848w, https://substackcdn.com/image/fetch/$s_!9gnw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png 1272w, https://substackcdn.com/image/fetch/$s_!9gnw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9gnw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png" width="942" height="245" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:245,&quot;width&quot;:942,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23673,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9gnw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png 424w, https://substackcdn.com/image/fetch/$s_!9gnw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png 848w, https://substackcdn.com/image/fetch/$s_!9gnw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png 1272w, https://substackcdn.com/image/fetch/$s_!9gnw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb2af29b-6f8c-4b8e-aa88-7316afeb1cb5_942x245.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p>Start with Step 1 on any unfamiliar codebase. Start with Step 4 on your own code before deploying it. Run the full sequence on any system before scaling it. The loop is: </p><p><strong>map &#8594; separate &#8594; build &#8594; audit &#8594; repeat.</strong></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MG_3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MG_3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png 424w, https://substackcdn.com/image/fetch/$s_!MG_3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png 848w, https://substackcdn.com/image/fetch/$s_!MG_3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png 1272w, https://substackcdn.com/image/fetch/$s_!MG_3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MG_3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png" width="1130" height="621" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:1130,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1042887,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MG_3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png 424w, https://substackcdn.com/image/fetch/$s_!MG_3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png 848w, https://substackcdn.com/image/fetch/$s_!MG_3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png 1272w, https://substackcdn.com/image/fetch/$s_!MG_3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27507c42-f6dd-440f-abf3-d6504a35f293_1130x621.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h2>Connecting Dots&#8230;</h2><p>I love seeing connections like these - Miessler argued in 2024 that companies are graphs of algorithms waiting to be made visible and optimised. The StockPilot experiment proved that the same is true of AI agents &#8212; and that the optimisation lever is structural, not model-level. The five-phase audit prompt is the scanner Miessler&#8217;s framework implies: a systematic procedure for mapping any algorithm graph, separating its layers, and making the waste visible before touching a single line of code. <strong>Claude&#8217;s /workflows system is the implementation layer</strong>: Miessler&#8217;s SOPs made executable, loaded on demand, returning typed results, conditioning on context. The theory, the experiment, and the tooling now point in the same direction. The question is which systems you point them at first.</p><h2>References</h2><h3>Miessler &#8212; Source Essays</h3><p>Website: <a href="https://danielmiessler.com/">https://danielmiessler.com/ </a></p><p><strong>Companies Are Just a Graph of Algorithms (May 2024): </strong><a href="https://danielmiessler.com/blog/companies-graph-of-algorithms">danielmiessler.com/blog/companies-graph-of-algorithms</a></p><p><strong>Policy, SOPs, and AI Are All You Need (September 2024): </strong><a href="https://danielmiessler.com/blog/policy-sops-and-ai-are-all-you-need">danielmiessler.com/blog/policy-sops-and-ai-are-all-you-need</a></p><p><strong>/workflows announcement (May 2026): </strong><a href="https://x.com/i/status/2060100599379841379">x.com/i/status/2060100599379841379</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jDGV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jDGV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png 424w, https://substackcdn.com/image/fetch/$s_!jDGV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png 848w, https://substackcdn.com/image/fetch/$s_!jDGV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png 1272w, https://substackcdn.com/image/fetch/$s_!jDGV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jDGV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png" width="655" height="836" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:836,&quot;width&quot;:655,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:195032,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200234529?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jDGV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png 424w, https://substackcdn.com/image/fetch/$s_!jDGV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png 848w, https://substackcdn.com/image/fetch/$s_!jDGV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png 1272w, https://substackcdn.com/image/fetch/$s_!jDGV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43af10-26b5-49b4-866e-f9bcbe3c3683_655x836.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Prior Article in This Series</h3><p><strong>The Structure Is the Intelligence: </strong>What a 97% Token Reduction Reveals About How Multi-Agent Systems Actually Work &#8212; Interesting Engineering++, May 2026<strong>: <a href="https://interestingengineering.substack.com/p/the-structure-is-the-intelligence">https://interestingengineering.substack.com/p/the-structure-is-the-intelligence</a></strong></p><h3>Experiment Source</h3><p><strong>Anthropic cwc-workshops &#8212; Agent Decomposition: </strong><a href="https://github.com/anthropics/cwc-workshops/tree/main/agent-decomposition">github.com/anthropics/cwc-workshops/tree/main/agent-decomposition</a></p><h3>Claude Documentation</h3><p><strong>Claude Managed Agents overview: </strong><a href="https://platform.claude.com/docs/en/managed-agents/overview">platform.claude.com/docs/en/managed-agents/overview</a></p><p><strong>Skills in CMA: </strong><a href="https://platform.claude.com/docs/en/managed-agents/skills">platform.claude.com/docs/en/managed-agents/skills</a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Geometry of Unpredictability]]></title><description><![CDATA[What Agentic AI Workflows Can Learn from Chemical Polymorphism: Engineering Scaffolding for Agentic Stability]]></description><link>https://interestingengineering.substack.com/p/the-geometry-of-unpredictability</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/the-geometry-of-unpredictability</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Mon, 01 Jun 2026 03:23:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!46CF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!46CF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!46CF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png 424w, https://substackcdn.com/image/fetch/$s_!46CF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png 848w, https://substackcdn.com/image/fetch/$s_!46CF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png 1272w, https://substackcdn.com/image/fetch/$s_!46CF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!46CF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png" width="1101" height="592" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:592,&quot;width&quot;:1101,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1184643,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!46CF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png 424w, https://substackcdn.com/image/fetch/$s_!46CF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png 848w, https://substackcdn.com/image/fetch/$s_!46CF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png 1272w, https://substackcdn.com/image/fetch/$s_!46CF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6207f35d-d5e4-465a-b14f-cd350d6358f8_1101x592.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A few thoughts are running through my mind based on observations. Issues are raised about the instability of Agent Systems esp the long running ones. Whilst not perfect there are improvements you see today within the &#8220;harnesses&#8221;.  This article (like quite a few, consistently before this, is going to run a tiny experiment to show the few working angles at play today. I am excited about the space, and what the future holds. Anyway I ramble. Let me begin. This article also follow on from:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;773dec8e-760a-48bd-a08a-5e3783ae7275&quot;,&quot;caption&quot;:&quot;Had some time on my hands, and applied the features of The Harness Experiment(s) to the Architecture of Awareness design considerations. You will remember from The Harness Experiment (applied to a mini vendor analysis case study) that the results presented as follows:&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;ASCRS Harness Lab - The Integrated Agentic Stack: When Does More Architecture Mean Better AI? A Diagnostic Teardown&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-16T17:52:19.700Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!cv0d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:198013155,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:2,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2><strong>1. The Paradox of Latent State Spaces</strong></h2><p>Traditional software engineering rests on a reassuring assumption: that a specific input, combined with an explicit instruction, will produce a predictable output. Debugging is, in principle, a process of elimination. State is visible. Branches are enumerable. Failures at least announce themselves.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K1Cb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K1Cb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png 424w, https://substackcdn.com/image/fetch/$s_!K1Cb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png 848w, https://substackcdn.com/image/fetch/$s_!K1Cb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png 1272w, https://substackcdn.com/image/fetch/$s_!K1Cb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K1Cb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png" width="1138" height="622" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:622,&quot;width&quot;:1138,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1160754,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K1Cb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png 424w, https://substackcdn.com/image/fetch/$s_!K1Cb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png 848w, https://substackcdn.com/image/fetch/$s_!K1Cb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png 1272w, https://substackcdn.com/image/fetch/$s_!K1Cb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bf4b516-dedc-400f-84f0-2deb0005b572_1138x622.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Agentic AI workflows have quietly dissolved that assumption. <strong>When a language model is granted the autonomy to select tools, interpret ambiguous data, and revise its own execution trajectory across extended time horizons, the potential state space expands in ways that no enumeration can capture. The agent is no longer a function &#8212; it is an optimizer navigating a high-dimensional landscape</strong>. The engineer cannot anticipate all the paths. She can only shape the conditions under which the optimizer runs, and hope the conditions are sufficient.</p><p>Unsurprising, and if some of you remember your chemistry lab experiments, this does also mirror a challenge that solid-state chemists and pharmaceutical engineers have grappled with for decades: <strong>polymorphism</strong>. A molecule&#8217;s chemical formula only tells part of the story. The spatial arrangement of its crystal lattice is an independent variable, and it governs everything that matters in practice &#8212; melting point, solubility, bioavailability. Given chemically identical inputs, the same molecule can crystallize into multiple distinct structural arrangements with completely different properties. The formula doesn&#8217;t change. The architecture does.</p><blockquote><p><em>The core vulnerability in both domains is the existence of unpredicted alternative states &#8212; latent configurations that are thermodynamically or computationally lower-energy than the designed form, and therefore preferred by the system under sufficient perturbation.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SzNa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SzNa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png 424w, https://substackcdn.com/image/fetch/$s_!SzNa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png 848w, https://substackcdn.com/image/fetch/$s_!SzNa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png 1272w, https://substackcdn.com/image/fetch/$s_!SzNa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SzNa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png" width="1135" height="619" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:619,&quot;width&quot;:1135,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1182178,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SzNa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png 424w, https://substackcdn.com/image/fetch/$s_!SzNa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png 848w, https://substackcdn.com/image/fetch/$s_!SzNa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png 1272w, https://substackcdn.com/image/fetch/$s_!SzNa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a1de1-caca-410b-aca9-240ff8c0a07a_1135x619.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A micro-fluctuation in laboratory conditions can push a drug compound into a therapeutically inert crystal form. A negligible variation in runtime context, token degradation, or third-party tool latency can push an autonomous AI agent from productive reasoning into a resource-consuming failure loop. Neither system announces the transition. Both failures look, from the outside, like normal operation &#8212; until they do not.</p><p>What follows develops the analogy systematically and maps its implications for agentic workflow engineering, before offering a concrete experimental framework for observing these dynamics directly. The analogy is not merely illustrative. The underlying mathematics &#8212; optimization on a non-convex loss surface, escape from local minima, convergence to globally stable but operationally useless states &#8212; is structurally identical in both domains.</p><h2><strong>2. The Economics of Tool Use: MCP Token Bloat</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fOO6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fOO6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png 424w, https://substackcdn.com/image/fetch/$s_!fOO6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png 848w, https://substackcdn.com/image/fetch/$s_!fOO6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png 1272w, https://substackcdn.com/image/fetch/$s_!fOO6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fOO6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png" width="1141" height="622" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:622,&quot;width&quot;:1141,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1287232,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fOO6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png 424w, https://substackcdn.com/image/fetch/$s_!fOO6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png 848w, https://substackcdn.com/image/fetch/$s_!fOO6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png 1272w, https://substackcdn.com/image/fetch/$s_!fOO6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfacc4b1-80ec-4c45-9cd7-db5b2abdcb05_1141x622.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There&#8217;s a hidden tax built into the way most MCP implementations work, and it activates before an agent has processed a single word of the user&#8217;s actual task. In naive configurations, MCP engines employ an eager loading strategy &#8212; injecting the full JSON schema of every registered tool into the system prompt at initialization. Empirical benchmarks put the damage at 20,000 to 80,000 tokens per request, enough to consume 40 to 50 percent of a model&#8217;s available context window before any work begins.</p><p>Research from MindStudio found a 35x token overhead differential between naive MCP implementations and optimized CLI-based alternatives, with reliability dropping by up to 28 percent when agents are confronted with irrelevant schema definitions at inference time. The mechanism isn&#8217;t mysterious: irrelevant schemas inject noise into the model&#8217;s attention landscape, dilute the salience of the actual task, and raise the probability of tool-selection errors. The model is being asked to reason clearly while standing in a room full of irrelevant signage.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-z2w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-z2w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png 424w, https://substackcdn.com/image/fetch/$s_!-z2w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png 848w, https://substackcdn.com/image/fetch/$s_!-z2w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png 1272w, https://substackcdn.com/image/fetch/$s_!-z2w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-z2w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png" width="918" height="299" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:299,&quot;width&quot;:918,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29750,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-z2w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png 424w, https://substackcdn.com/image/fetch/$s_!-z2w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png 848w, https://substackcdn.com/image/fetch/$s_!-z2w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png 1272w, https://substackcdn.com/image/fetch/$s_!-z2w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208a6562-6c84-496a-aac5-f4da7cfd806c_918x299.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The solution isn&#8217;t to reduce the number of available tools &#8212; it&#8217;s to stop treating context as if it were free. Moving from monolithic eager registration to lazy-loaded skills and semantic tool routing is the computational equivalent of solvent purity management in process chemistry: not eliminating the reagent, but controlling when and how it enters the reaction environment.</p><h3><strong>2.1 Implementation Specifications for Token Control</strong></h3><p>The three mechanisms below are well-established in principle. What&#8217;s worth noting is that practitioners building real agentic systems have been converging on them independently &#8212; sometimes by design, more often after hitting the costs of not using them.</p><p><strong>Progressive Disclosure via Manifests</strong></p><p>&#8226; Store tool sets as modular markdown documents prefixed with lightweight YAML front matter. The host reads only namespace identifiers and tool summaries during initial prompt construction. Full schema details stay on disk until needed.</p><p>&#8226; In practice: the ASCRS harness library demonstrates this directly. Its 29 skill files are individual markdown documents &#8212; each covering one narrow procedure (escalation-gate.md, route-viability.md, freight-rate-lookup.md, and so on). The orchestrator loads the skill list at init but only pulls the full content of a skill when a sub-agent is delegated that specific task. This is why the ASCRS context stays manageable across a 23-PO pharmaceutical crisis scenario &#8212; if all 29 skills were injected at startup, a meaningful fraction of the context window would be consumed before the disruption alert was even processed.</p><p>&#8226; Where it&#8217;s gone wrong: early ASCRS harness builds that loaded all skill definitions monolithically saw the V3 Strategist anchor incorrectly on the 2019 Gulf of Oman precedent (confidence 0.61) rather than the structurally closer 2024 Red Sea case (0.58) &#8212; partly because the context at inference time was dense with irrelevant procedural noise from unrelated skills, diluting the signal from the episodic memory retrieval.</p><p><strong>Two-Stage Semantic Tool Selection</strong></p><p>&#8226; Map incoming queries against tool embeddings using a lightweight bi-encoder or vector-routing pipeline. Restrict the active context to a top-k shortlist (k &#8804; 3) of semantically relevant candidates. The agent never sees the rest.</p><p>&#8226; In practice: the ASCRS sub-agent topology does this structurally rather than through embeddings &#8212; each specialist (SA_pharma_compliance, SA_freight_market, SA_route_viability, SA_financial_analyst, SA_reviewer) is assigned a narrow, fixed tool set at delegation time. SA_freight_market never sees the compliance tool definitions; SA_pharma_compliance never sees the freight API schemas. The filtering happens at the orchestrator level before any sub-agent prompt is constructed. This is a deterministic version of semantic routing.</p><p>&#8226; The next step not yet implemented: automated vector-based routing across a larger tool registry, where the relevant skill set isn&#8217;t hardcoded by role but retrieved dynamically based on the specific query. This becomes valuable when the tool library grows beyond what can be manually pre-assigned per agent role.</p><p><strong>Just-in-Time Activation</strong></p><p>&#8226; Full JSON schema details are fetched and injected into the active context window only when the agent explicitly selects a tool to execute. The token cost is paid once, on demand, not upfront for all tools regardless of relevance.</p><p>&#8226; In practice: the ASCRS correction loop (G1&#8211;G8 quality gate) embodies this at the gate level. Gate items are evaluated sequentially &#8212; the schema for G7 (ERP write validation) is only relevant if G1 through G6 pass. If G3 (carrier booking reference) fails, the correction agent targets that specific gap and the G7 schema never enters the context at all. It&#8217;s JIT activation through conditional execution rather than dynamic schema injection, but the token economics are the same.</p><p>&#8226; Where it matters most: multi-turn tool-use harnesses with long trajectories. In the ASCRS scenario, a 23-PO brief with 8 tool categories &#8212; if all tool schemas were injected at turn 1, the context would be partially exhausted before the agent had reviewed the first purchase order. Deferring injection to the point of use preserves context runway for the reasoning that actually matters.</p><h3><strong>2.2 How SkillOpt Works &#8212; and What It Proves</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U753!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U753!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png 424w, https://substackcdn.com/image/fetch/$s_!U753!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png 848w, https://substackcdn.com/image/fetch/$s_!U753!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png 1272w, https://substackcdn.com/image/fetch/$s_!U753!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!U753!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png" width="734" height="446" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:446,&quot;width&quot;:734,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:215223,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U753!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png 424w, https://substackcdn.com/image/fetch/$s_!U753!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png 848w, https://substackcdn.com/image/fetch/$s_!U753!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png 1272w, https://substackcdn.com/image/fetch/$s_!U753!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a724855-d17c-4878-810c-ff4f5e626ce4_734x446.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bRwZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bRwZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png 424w, https://substackcdn.com/image/fetch/$s_!bRwZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png 848w, https://substackcdn.com/image/fetch/$s_!bRwZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png 1272w, https://substackcdn.com/image/fetch/$s_!bRwZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bRwZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png" width="895" height="624" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:624,&quot;width&quot;:895,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:261969,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bRwZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png 424w, https://substackcdn.com/image/fetch/$s_!bRwZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png 848w, https://substackcdn.com/image/fetch/$s_!bRwZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png 1272w, https://substackcdn.com/image/fetch/$s_!bRwZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c4184b-9925-4f33-a4a3-5c0ccd8e1e99_895x624.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KHpA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafa0b25-be8f-4545-8e47-b334da46043c_887x723.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KHpA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafa0b25-be8f-4545-8e47-b334da46043c_887x723.png 424w, https://substackcdn.com/image/fetch/$s_!KHpA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafa0b25-be8f-4545-8e47-b334da46043c_887x723.png 848w, https://substackcdn.com/image/fetch/$s_!KHpA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafa0b25-be8f-4545-8e47-b334da46043c_887x723.png 1272w, https://substackcdn.com/image/fetch/$s_!KHpA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafa0b25-be8f-4545-8e47-b334da46043c_887x723.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KHpA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafa0b25-be8f-4545-8e47-b334da46043c_887x723.png" width="887" height="723" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bafa0b25-be8f-4545-8e47-b334da46043c_887x723.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:723,&quot;width&quot;:887,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:279366,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafa0b25-be8f-4545-8e47-b334da46043c_887x723.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KHpA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafa0b25-be8f-4545-8e47-b334da46043c_887x723.png 424w, https://substackcdn.com/image/fetch/$s_!KHpA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafa0b25-be8f-4545-8e47-b334da46043c_887x723.png 848w, https://substackcdn.com/image/fetch/$s_!KHpA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafa0b25-be8f-4545-8e47-b334da46043c_887x723.png 1272w, https://substackcdn.com/image/fetch/$s_!KHpA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbafa0b25-be8f-4545-8e47-b334da46043c_887x723.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>SkillOpt, from Microsoft Research and Shanghai Jiao Tong University (Yang et al., 2026), is one of the more rigorous recent attempts to put similar controls into practice and measure what actually changes when you remove them.</p><p>The 41.8 to 80.7 jump for GPT 5.5 on SpreadsheetBench deserves more than a citation. It&#8217;s worth understanding the mechanism, because the mechanism is the lesson.</p><p><strong>What SkillOpt actually does</strong></p><p>&#8226; Start with a frozen target model (GPT&#8211;5.5 in this case) and an initial skill file &#8212; a markdown document describing how the agent should approach spreadsheet tasks. This initial file might be a few hundred tokens of general guidance.</p><p>&#8226; A separate optimizer model &#8212; also a frontier LLM, running offline &#8212; watches the target model attempt real SpreadsheetBench tasks. It observes where the agent fails, what patterns recur, and why.</p><p>&#8226; The optimizer proposes bounded edits to the skill file: add a rule here, delete a vague instruction there, replace a general guideline with a specific procedure. Each edit is one of three types: ADD, DELETE, or REPLACE &#8212; never a full rewrite.</p><p>&#8226; A held-out validation set gates every proposed edit. If the candidate skill (with the edit applied) doesn&#8217;t strictly improve performance on the held-out split, the edit is rejected. Not deferred &#8212; rejected. Rejected edits are stored as negative feedback so the optimizer doesn&#8217;t propose the same failed fix again.</p><p>&#8226; This loop runs for four epochs. The final artifact &#8212; best_skill.md &#8212; is what gets deployed. It&#8217;s typically 300 to 2,000 tokens.</p><p><strong>What the SpreadsheetBench skill actually learned</strong></p><p>The final rule that drove most of the SpreadsheetBench gain was a single procedural instruction: &#8220;Inspect workbook structure and formulas, then write evaluated static values across the full requested target range instead of relying on Excel recalculation.&#8221; That&#8217;s under 30 words. It corrected a recurring failure where the agent wrote formula references that a grader reading cell values couldn&#8217;t score correctly. One accepted edit. 38.9 points.</p><p><strong>How to apply this yourself</strong></p><p>&#8226; You don&#8217;t need SkillOpt&#8217;s optimizer loop to apply the underlying principle. The practice is: <strong>run your agent against a representative task set, observe where it fails systematically (not one-off &#8212; recurring patterns), write a concise procedural rule that addresses the pattern, test it on a held-out sample before committing it to your skill file.</strong></p><p>&#8226; The ASCRS skill library was built this way, incrementally. The escalation-gate.md skill, for example, encodes the lesson from an early ASCRS run where the V2 parallel harness failed to surface the biologic cold-chain deadline conflict &#8212; because no rule existed requiring Tier-1 PO constraints to be checked before the general routing decision was finalized. Read <a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">the Architecture of Awareness</a> and <a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">The Harness Lab.</a></p><p>&#8226; What to avoid: rewriting the whole skill file when one rule fails. This is the TextGrad failure mode. Unbounded rewrites erase rules that were working correctly alongside the ones that weren&#8217;t &#8212; net effect negative. Surgical, validated, bounded edits outperform wholesale revision every time the SkillOpt ablations are run.</p><p>&#8226; Token budget reality: a 920-token skill file added to a system prompt costs roughly the same as two paragraphs of context. An eager-loaded MCP registry costs 45,000 tokens before the first task token appears. The arithmetic is simple. The discipline required to stay compact is not &#8212; it requires validating each addition rather than accumulating rules indefinitely.</p><h2><strong>3. Failure Modes: Infinite Reflection Loops and Spontaneous Phase Transitions</strong></h2><p>Before we can engineer reliable systems, we need to be honest about how they actually fail &#8212; not in the ways that produce stack traces and error logs, but in the quieter ways that consume resources and produce confidently wrong outputs. <em><strong>When autonomous systems operate without rigid boundaries, they don&#8217;t malfunction in obvious ways. They optimize. The problem is that what they optimize for may have quietly diverged from what you intended them to optimize for.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JY1r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JY1r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png 424w, https://substackcdn.com/image/fetch/$s_!JY1r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png 848w, https://substackcdn.com/image/fetch/$s_!JY1r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png 1272w, https://substackcdn.com/image/fetch/$s_!JY1r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JY1r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png" width="1138" height="619" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:619,&quot;width&quot;:1138,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1381912,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JY1r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png 424w, https://substackcdn.com/image/fetch/$s_!JY1r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png 848w, https://substackcdn.com/image/fetch/$s_!JY1r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png 1272w, https://substackcdn.com/image/fetch/$s_!JY1r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ea0d9ae-57eb-4df8-a2f6-a378b0f7b6a9_1138x619.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>3.1 The Agentic Loop Trap: Recursive Degeneration</strong></h3><p>Many advanced agentic architectures use <strong>Reflection and Critique patterns</strong>, where a secondary agent reviews and revises the output of a primary agent. Over short execution horizons this works well. Extend those horizons from minutes to hours or days, and something changes: <strong>agents begin struggling with memory compaction and context-window drift, and the critic&#8217;s incentive structure quietly shifts</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MXjT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MXjT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png 424w, https://substackcdn.com/image/fetch/$s_!MXjT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png 848w, https://substackcdn.com/image/fetch/$s_!MXjT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png 1272w, https://substackcdn.com/image/fetch/$s_!MXjT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MXjT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png" width="1135" height="621" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:1135,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1132555,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MXjT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png 424w, https://substackcdn.com/image/fetch/$s_!MXjT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png 848w, https://substackcdn.com/image/fetch/$s_!MXjT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png 1272w, https://substackcdn.com/image/fetch/$s_!MXjT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567588e2-3fea-44d3-b304-f35ec88a57a3_1135x621.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Without hard terminating constraints, the system can slide into what amounts to <strong>Sycophantic Collusion. The logic is coldly economical: validating a Writer Agent&#8217;s output costs fewer tokens and less cognitive effort than constructing a rigorous multi-step critique. So the Critic Agent starts validating. And the Writer Agent, receiving validation, continues. The two enter a closed, low-energy loop &#8212; producing outputs that are mutually endorsed and technically worthless.</strong></p><p><strong>Where this actually happens</strong></p><p>&#8226; Long-horizon document drafting: in agentic writing workflows with Writer + Critic pairs, critique quality degrades measurably after three to four revision cycles. By cycle five or six, the Critic&#8217;s feedback has typically collapsed to variations of &#8220;this is well-reasoned and comprehensive&#8221; regardless of actual content quality. The loop keeps running. The document stops improving.</p><p>&#8226; Multi-agent research pipelines: the AutoResearch-style architecture &#8212; where a Planner generates queries, a Researcher retrieves results, and a Critic evaluates the synthesis &#8212; is particularly vulnerable over long sessions. As context accumulates, the Researcher begins echoing the Planner&#8217;s priors rather than genuinely retrieving against them. The Critic, operating on an increasingly coherent but potentially circular body of evidence, raises no objections. The system converges to a confident answer that reflects its own starting assumptions.</p><p>&#8226; The ASCRS V2 parallel harness exhibited a version of this: when the freight-market and route-viability sub-agents ran without a structured review gate, their outputs converged on the 2019 Gulf of Oman precedent (short disruption, low cost) because both agents were drawing on the same context window and reinforcing each other&#8217;s early retrieval signal. The V3 correction loop &#8212; which introduced a dedicated reviewer with an explicit mandate to find failures, not confirm successes &#8212; broke this pattern. The key design change was giving the reviewer a structurally adversarial role, not just asking it to &#8220;review carefully.&#8221;</p><p><strong>What to do instead</strong></p><p>&#8226; Hard iteration caps, external to the LLM. The halting decision should never be left to the agent. The ASCRS correction loop enforces max-3-iterations before escalating to human review &#8212; not because three is a magic number, but because the cost of continuing past a stuck loop exceeds the cost of a human checkpoint.</p><p>&#8226; Cosine similarity monitoring. If consecutive critic outputs share more than 80 percent semantic overlap, the loop has entered a low-energy basin. Flag it and break. This is what the Harness Lab experiment in Section 5 measures directly.</p><p>&#8226; Adversarial reviewer framing. The critique agent&#8217;s system prompt should specify failure-finding, not evaluation. &#8220;Identify what is wrong, missing, or uncertain in this output&#8221; produces different behaviour than &#8220;review and improve this output.&#8221; The latter permits validation as a valid response. The former does not.</p><p>&#8226; Structured output contracts. If the reviewer&#8217;s output must include at least one specific failure finding in a defined schema field &#8212; and the loop controller checks for this before accepting the review &#8212; sycophantic validation fails schema validation and triggers a retry or escalation. This is what the ASCRS G1&#8211;G8 gate does: it doesn&#8217;t ask the reviewer whether the output is good; it checks whether specific required elements are present and correct.</p><h3><strong>3.2 The Ritonavir Disappearing Polymorph</strong></h3><p>The parallel from pharmaceutical manufacturing is almost uncomfortably precise. In 1998, production of the HIV protease inhibitor Ritonavir ground to a global halt. Abbott Laboratories had spent two years successfully manufacturing the drug in a crystal form known as Form I &#8212; soluble, stable, therapeutically effective. Then, without any change to the manufacturing inputs, a lower-energy crystal arrangement appeared: Form II.</p><div id="youtube2-ksn5yrsC3Wg" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;ksn5yrsC3Wg&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/ksn5yrsC3Wg?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><strong>How it happened &#8212; the mechanism matters</strong></p><p>Form II didn&#8217;t appear because anything went wrong. It appeared because the <strong>thermodynamic landscape had always contained it as a possibility</strong> &#8212; Abbott just didn&#8217;t know it was there. The conditions that triggered its emergence were mundane: a combination of trace solvent residues, minor temperature fluctuations during scale-up, and the simple passage of time in a production environment that had accumulated microscopic Form II seed crystals without anyone noticing.</p><p>&#8226; <strong>Nucleation</strong>: Form II crystals nucleated &#8212; meaning they formed the initial tiny stable clusters &#8212; somewhere in the manufacturing process. The exact origin was never conclusively identified. Likely candidates were solvent residues from a related compound being manufactured in the same facility, or a batch of raw material that had been stored under slightly different humidity conditions.</p><p>&#8226; <strong>Seeding</strong>: once nucleated, Form II particles became airborne in the facility. When Form I batches were exposed to Form II seed crystals, the seeds acted as templates. Form I molecules, finding a lower-energy structural arrangement nearby, reorganized to match it. This is called contact seeding &#8212; and it meant that standard cleaning protocols, which were designed for chemical contamination, were useless against a structural template.</p><p>&#8226; <strong>Irreversibility</strong>: because Form II occupied a deeper energy well, there was no thermodynamic driving force to push molecules back toward Form I. Abbott couldn&#8217;t simply change conditions back. They had to reformulate the drug entirely as a liquid, rebuild production lines, and requalify with regulators. Two years of production time lost. Hundreds of millions spent.</p><p><strong>The AI parallel: what makes the transition irreversible</strong></p><p>The seeding mechanism is the part that maps most directly onto agentic AI. <strong>Once a degenerate pattern establishes itself in a multi-agent loop &#8212; sycophantic validation, circular retrieval, convergence on an early prior &#8212; it acts as a structural template. Subsequent agent turns are conditioned on the prior output, which means the degenerate pattern propagates forward through the context window. The longer the loop runs before intervention, the more of the active context reflects the degenerate state, and the harder it becomes to recover without restarting the session entirely.</strong></p><p>In the ASCRS V2 harness, the early anchoring on the Gulf of Oman 2019 precedent was exactly this: a low-energy state that, once established in the shared context, conditioned every subsequent retrieval. The V3 correction loop recovered from it by introducing an adversarial reviewer with access to both precedents and an explicit mandate to challenge the planning basis. But that recovery required re-running the research step from scratch &#8212; the equivalent of rebuilding the production line.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mgUX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mgUX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png 424w, https://substackcdn.com/image/fetch/$s_!mgUX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png 848w, https://substackcdn.com/image/fetch/$s_!mgUX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png 1272w, https://substackcdn.com/image/fetch/$s_!mgUX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mgUX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png" width="798" height="239" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:239,&quot;width&quot;:798,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15273,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mgUX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png 424w, https://substackcdn.com/image/fetch/$s_!mgUX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png 848w, https://substackcdn.com/image/fetch/$s_!mgUX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png 1272w, https://substackcdn.com/image/fetch/$s_!mgUX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ea9636e-6df6-4dd7-b845-edae2dbfa4b0_798x239.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Form II sat in a much deeper thermodynamic energy valley, and it was vastly less soluble. Worse, its microscopic particles went airborne inside the manufacturing facility and began seeding Form I batches &#8212; contact with Form II forced Form I molecules to reorganize their structure. Abbott pulled the capsule form from the global market and spent hundreds of millions of dollars pivoting to a liquid alternative. Nothing about the molecule had changed. Its arrangement had. And because Form II occupied the global energy minimum, Form I couldn&#8217;t be recovered without rebuilding the conditions that had produced it in the first place.</p><h3><strong>3.3 Mathematical Mapping: Gradient Descent on a Non-Convex Surface</strong></h3><p><strong>Form I as a Local Minimum. </strong>Think of Form I as a constrained, stable basin on the loss surface. Rigid prompt guidelines and deterministic input-output schemas act as artificial potential barriers &#8212; the agent can&#8217;t explore better solution paths without climbing steeply against the rules. Within this basin, behavior is predictable, though it carries systemic friction.</p><p><strong>The Degeneration Phase (Stochastic Tunneling). </strong>High-entropy inputs or unconstrained execution modes effectively raise the system&#8217;s behavioral temperature &#8212; injecting enough stochastic noise that the agent can escape the local minimum. The rigid heuristic structures begin to decay, producing a volatile spike in behavioral entropy before the system settles into something new.</p><p><strong>Form II as the Global Search Basin. </strong>Once out of Form I, the agent descends into the expansive, unconstrained solution space of Form II &#8212; tracking toward a global minimum unbound by predefined operational steps, taking the shortest mathematical path of least resistance to satisfy whatever reward function is active.</p><p><strong>The Operational Trade-off. </strong>Finding a global minimum sounds desirable, and in pure mathematics it often is. In practice, unguided descent through Form II is prone to chaotic oscillations, semantic drift, hallucination, and overfitting to a noisy or ill-defined objective. The system is efficient. It&#8217;s just efficient at the wrong thing.</p><h2><strong>4. Bounds, Gates, and Scaffolding: Systems Engineering as a Stabilizing Matrix</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hbJJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hbJJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png 424w, https://substackcdn.com/image/fetch/$s_!hbJJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png 848w, https://substackcdn.com/image/fetch/$s_!hbJJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png 1272w, https://substackcdn.com/image/fetch/$s_!hbJJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hbJJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png" width="1109" height="593" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:593,&quot;width&quot;:1109,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:998996,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hbJJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png 424w, https://substackcdn.com/image/fetch/$s_!hbJJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png 848w, https://substackcdn.com/image/fetch/$s_!hbJJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png 1272w, https://substackcdn.com/image/fetch/$s_!hbJJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcbd8e31-5175-48af-9bee-f4fea4ab81a8_1109x593.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Both disciplines have learned, through expensive failures, that you cannot reliably wish a complex system into its desired form. What you can do is engineer the environment so that the desired form is also the lowest-energy form. <strong>In pharmaceutical manufacturing, this is Process Analytical Technology (PAT): continuous real-time monitoring and feedback control of critical process parameters, rather than end-product inspection after the batch is done</strong>. <strong>In agentic AI, the equivalent is deterministic scaffolding &#8212; external programmatic wrappers, schema validation layers, and bounded subagent topologies that constrain the optimizer&#8217;s feasible region without constraining the quality of the solutions it finds within that region.</strong></p><p>The table below maps the four core mechanisms across both disciplines and adds a column most frameworks leave out: where things currently stand in practice. Green (&#10003;) means the mechanism is operational in the ASCRS harness architecture described in prior ISR work. Amber (&#9675;) means it&#8217;s partially in place. Red (&#9633;) marks the gap that represents the most tractable next step.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mTzC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mTzC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png 424w, https://substackcdn.com/image/fetch/$s_!mTzC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png 848w, https://substackcdn.com/image/fetch/$s_!mTzC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png 1272w, https://substackcdn.com/image/fetch/$s_!mTzC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mTzC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png" width="907" height="565" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:565,&quot;width&quot;:907,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:81365,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mTzC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png 424w, https://substackcdn.com/image/fetch/$s_!mTzC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png 848w, https://substackcdn.com/image/fetch/$s_!mTzC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png 1272w, https://substackcdn.com/image/fetch/$s_!mTzC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc9b9d9c-baa4-46ff-859b-1d0adb8a4c16_907x565.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>4.1 Designing the Digital Scaffold</strong></h3><p>1. Orchestration and Explicit Delegation. Agent networks should deploy specialized sub-agents with narrow, immutable tool definitions rather than a single monolithic prompt. Explicit delegation heuristics &#8212; where a lead orchestrator decomposes queries into bounded tasks before execution &#8212; reduce tool-selection errors and semantic drift by constraining the optimizer&#8217;s feasible set at each step.</p><p>2. Deterministic Circuit Breakers. Workflows are bound by external programmatic wrappers independent of the LLM. If an agent loop exceeds a predefined runtime budget, token consumption threshold, or returns consecutive outputs with cosine similarity above 0.8 across three turns, an external deterministic script forces a system override. The LLM does not make the halting decision.</p><p>3. Standardized Lazy Interfaces. External databases, enterprise files, and web APIs communicate through strict validated schemas injected only on-demand. This strips away ambient data anomalies that could trigger hallucination or state drift &#8212; the computational equivalent of controlling solvent impurities to prevent nucleation of unintended crystal polymorphs.</p><p>4. Sandboxed Subagent Containment. High-entropy exploratory agents run in short-lived, isolated runtime environments. Output is validated against a schema gate before being admitted to the primary host context. If the result fails validation, it is dropped and the host state machine remains uncontaminated.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6DfQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6DfQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png 424w, https://substackcdn.com/image/fetch/$s_!6DfQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png 848w, https://substackcdn.com/image/fetch/$s_!6DfQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png 1272w, https://substackcdn.com/image/fetch/$s_!6DfQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6DfQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png" width="1136" height="615" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:615,&quot;width&quot;:1136,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1159558,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6DfQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png 424w, https://substackcdn.com/image/fetch/$s_!6DfQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png 848w, https://substackcdn.com/image/fetch/$s_!6DfQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png 1272w, https://substackcdn.com/image/fetch/$s_!6DfQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63718252-5c64-4dd6-ac53-74012f54afae_1136x615.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>4.2 Measured Evidence: SkillOpt and the Validation Gate</strong></h3><p>The four mechanisms above are prescriptions. What gives them empirical weight is how they hold up when tested. As mentioned earlier, SkillOpt, from Microsoft Research and Shanghai Jiao Tong University (Yang et al., 2026), is one of the more rigorous recent attempts to put similar controls into practice and measure what actually changes when you remove them. The paper&#8217;s relevance here isn&#8217;t as a product recommendation &#8212; it&#8217;s as a controlled experiment in what happens when bounded update gates are present versus absent.</p><p>The system&#8217;s design maps naturally onto two of the mechanisms above. Its held-out validation gate &#8212; which accepts a candidate skill edit only when it strictly improves a selection-split score, rejecting ties as well as regressions &#8212; is the deterministic circuit breaker of item 2, applied at the skill-document level rather than the execution-loop level. Its edit budget parameter Lt caps the number of changes that can be applied in a single optimization step. Think of it as the textual equivalent of the temperature parameter from Section 3.3: set it too high, and the optimizer makes large semantic jumps that destabilize whatever stability the previous version had; keep it bounded, and each revision stays close enough to the last that the optimization history remains useful.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!e2Aw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e2Aw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png 424w, https://substackcdn.com/image/fetch/$s_!e2Aw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png 848w, https://substackcdn.com/image/fetch/$s_!e2Aw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png 1272w, https://substackcdn.com/image/fetch/$s_!e2Aw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!e2Aw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png" width="1117" height="595" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:595,&quot;width&quot;:1117,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1030126,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!e2Aw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png 424w, https://substackcdn.com/image/fetch/$s_!e2Aw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png 848w, https://substackcdn.com/image/fetch/$s_!e2Aw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png 1272w, https://substackcdn.com/image/fetch/$s_!e2Aw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769a5ff6-7241-4731-a243-a999cde68e9e_1117x595.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The ablation data translates the phase transition argument from metaphor into measurement. Removing the edit budget entirely &#8212; letting the optimizer rewrite without any step-size limit &#8212; drops SpreadsheetBench by 1.8 points and LiveMathematicianBench by 4.0 points against the bounded default. More instructive is the TextGrad baseline: a prompt-optimization approach that rewrites without a bounded budget or held-out gate. It goes negative on two benchmarks outright &#8212; &#8722;0.7 on SpreadsheetBench and &#8722;0.8 on ALFWorld with GPT&#8211;5.5. These aren&#8217;t marginal losses from over-tuning. The optimization has descended into a lower-energy state that performs worse than the constrained starting point. That&#8217;s Form II, measured.</p><p>There&#8217;s also a detail worth pulling out for the token economics argument from Section 2. SkillOpt&#8217;s largest gains &#8212; +39.0 on OfficeQA, +38.9 on SpreadsheetBench &#8212; were each produced by only one to four accepted edits on a compact markdown file. The optimizer proposed many more edits per epoch; the gate rejected the bulk of them. The deployed artifact is small and stable not because the optimizer was conservative, but because the validation boundary was strict. Process enforces structure. Product compactness follows.</p><blockquote><p><em>When the validation gate is removed and edits are accepted unconditionally, SkillOpt&#8217;s SpreadsheetBench score drops 22.5 points in a single ablation. The compound did not change. The boundary condition did.</em></p></blockquote><p>One important qualification deserves to be stated plainly. <strong>SkillOpt optimizes a single compact skill document for a single target domain, running against one frozen execution model. It doesn&#8217;t address the multi-agent orchestration topology described in items 1 and 4 above &#8212; the layer where a lead coordinator delegates to sandboxed specialist subagents across separate execution cells. SkillOpt is the right tool at the individual-skill layer. The scaffolding in this section operates at the system layer above it, and the two don&#8217;t substitute for each other</strong>. A well-optimized skill inside an unscaffolded multi-agent system is a <strong>Form I artifact in a Form II environment</strong>. Whether the skill&#8217;s stability holds depends on the boundary conditions of the system surrounding it, not on the quality of the skill itself.</p><h2><strong>5. The Harness Lab: A Concrete Experiment in Agentic Phase Transitions</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ELmH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ELmH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png 424w, https://substackcdn.com/image/fetch/$s_!ELmH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png 848w, https://substackcdn.com/image/fetch/$s_!ELmH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png 1272w, https://substackcdn.com/image/fetch/$s_!ELmH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ELmH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png" width="1118" height="602" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:602,&quot;width&quot;:1118,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1048935,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ELmH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png 424w, https://substackcdn.com/image/fetch/$s_!ELmH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png 848w, https://substackcdn.com/image/fetch/$s_!ELmH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png 1272w, https://substackcdn.com/image/fetch/$s_!ELmH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f4797b5-5652-4cb3-8695-cf3cd5e9140a_1118x602.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Theory without measurement is architecture without load-bearing analysis. The framework above provides the conceptual model; this section provides the testbed. <strong>Prompts 1 through 6 </strong>build the workspace and run a structural simulation using research-derived constants to establish the experimental scaffold. <strong>Prompt 7 </strong>&#8212; the live experiment &#8212; replaces the simulated orchestrator with one that makes real API calls to OpenRouter, measures actual token counts from the API response, and uses genuine string similarity to detect semantic loops. The live run uses the ASCRS supply chain task as its input: the same Hormuz disruption scenario from prior ISR work, now serving as the test case.</p><p>The simulation pits two agent configurations against each other &#8212; <strong>Form I (bounded) and Form II (unbounded) </strong>&#8212; running the same research task. The comparison isn&#8217;t about which produces better outputs. It&#8217;s about what the token overhead, iteration counts, similarity trajectories, and circuit-breaker records reveal about the underlying structural dynamics. The goal is empirical intuition, built from first-hand observation rather than received wisdom.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3pw3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3pw3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png 424w, https://substackcdn.com/image/fetch/$s_!3pw3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png 848w, https://substackcdn.com/image/fetch/$s_!3pw3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png 1272w, https://substackcdn.com/image/fetch/$s_!3pw3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3pw3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png" width="1122" height="607" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:607,&quot;width&quot;:1122,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1062156,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3pw3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png 424w, https://substackcdn.com/image/fetch/$s_!3pw3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png 848w, https://substackcdn.com/image/fetch/$s_!3pw3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png 1272w, https://substackcdn.com/image/fetch/$s_!3pw3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4ea08a-7eb3-44ad-9374-99f4a3e7fe78_1122x607.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>5.1 Workspace Initialization</strong></h3><p>The workspace is created entirely by Prompt 1 in Section 5.5 &#8212; you do not need to manually create any files. The structure below is the target state Claude Code will build for you. It is included here as a reference for understanding what each file does before you run the setup prompt.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HCc6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HCc6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png 424w, https://substackcdn.com/image/fetch/$s_!HCc6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png 848w, https://substackcdn.com/image/fetch/$s_!HCc6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png 1272w, https://substackcdn.com/image/fetch/$s_!HCc6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HCc6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png" width="925" height="313" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:313,&quot;width&quot;:925,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29171,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HCc6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png 424w, https://substackcdn.com/image/fetch/$s_!HCc6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png 848w, https://substackcdn.com/image/fetch/$s_!HCc6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png 1272w, https://substackcdn.com/image/fetch/$s_!HCc6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a64dc9b-75eb-4b5a-9797-2dfa50f36d46_925x313.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>CLAUDE.md &#8212; Session Rules and Engineering Standards</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mIHG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mIHG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png 424w, https://substackcdn.com/image/fetch/$s_!mIHG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png 848w, https://substackcdn.com/image/fetch/$s_!mIHG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png 1272w, https://substackcdn.com/image/fetch/$s_!mIHG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mIHG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png" width="924" height="495" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:495,&quot;width&quot;:924,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:51478,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mIHG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png 424w, https://substackcdn.com/image/fetch/$s_!mIHG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png 848w, https://substackcdn.com/image/fetch/$s_!mIHG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png 1272w, https://substackcdn.com/image/fetch/$s_!mIHG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f8ac6a5-1dc8-4e35-a10e-6d980a0297b9_924x495.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>5.2 Sub-Agent Configuration Files</strong></h3><p>The following specifications are what Prompt 1 will generate inside .claude/agents/. They are reproduced here in full so you can verify Claude Code&#8217;s output matches the intended design, or modify them before running the setup prompt if you want to adjust the experimental parameters.</p><p><strong>Form I &#8212; Bounded Orchestrator: .claude/agents/bounded_agent.md</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wjd2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wjd2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png 424w, https://substackcdn.com/image/fetch/$s_!Wjd2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png 848w, https://substackcdn.com/image/fetch/$s_!Wjd2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png 1272w, https://substackcdn.com/image/fetch/$s_!Wjd2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Wjd2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png" width="923" height="549" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:549,&quot;width&quot;:923,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:65839,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Wjd2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png 424w, https://substackcdn.com/image/fetch/$s_!Wjd2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png 848w, https://substackcdn.com/image/fetch/$s_!Wjd2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png 1272w, https://substackcdn.com/image/fetch/$s_!Wjd2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd619a67-7ed8-4c6d-bf5b-fa20c3259b8b_923x549.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Form II &#8212; Unbounded Agent: .claude/agents/unbounded_agent.md</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zlZb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zlZb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png 424w, https://substackcdn.com/image/fetch/$s_!zlZb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png 848w, https://substackcdn.com/image/fetch/$s_!zlZb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png 1272w, https://substackcdn.com/image/fetch/$s_!zlZb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zlZb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png" width="921" height="551" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:551,&quot;width&quot;:921,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:48679,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zlZb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png 424w, https://substackcdn.com/image/fetch/$s_!zlZb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png 848w, https://substackcdn.com/image/fetch/$s_!zlZb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png 1272w, https://substackcdn.com/image/fetch/$s_!zlZb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F750aa069-48c8-4642-8c28-ac2b900bbf8f_921x551.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>5.3 Python Simulation Engine: orchestrator.py</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s_0k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s_0k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png 424w, https://substackcdn.com/image/fetch/$s_!s_0k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png 848w, https://substackcdn.com/image/fetch/$s_!s_0k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png 1272w, https://substackcdn.com/image/fetch/$s_!s_0k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s_0k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png" width="1116" height="602" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:602,&quot;width&quot;:1116,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1117278,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s_0k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png 424w, https://substackcdn.com/image/fetch/$s_!s_0k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png 848w, https://substackcdn.com/image/fetch/$s_!s_0k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png 1272w, https://substackcdn.com/image/fetch/$s_!s_0k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F426bc2d0-f80a-4fd4-b1c7-306eac097d0d_1116x602.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The code below is the actual file that ran the live experiment in Section 7 &#8212; not a reference specification, but the real artifact produced by Prompt 7 in this specific Claude Code session. <em><strong>Note that: A different session, or a different model, would write different variable names, different error handling, perhaps a different class structure. The structural logic would converge on the same experiment regardless. That&#8217;s the point of Prompt 7&#8217;s specification: the scaffold determines the outcome; the model&#8217;s specific implementation is the execution within it.</strong></em></p><p>One design decision in this file is worth highlighting before you read it, because it directly connects to the Section 7 results. The UNBOUNDED_SYSTEM prompt lists available schemas by name in prose &#8212; &#8220;Supply chain disruption ontology,&#8221; &#8220;Freight rate index schema,&#8221; and so on &#8212; rather than injecting the actual JSON schema definitions into the context. This means the init_tokens differential between tracks was three tokens rather than the theoretical ~44,650. This was fascinating and that gap is not a bug. It really is a calibration finding. The prompts simulate the cognitive framing of eager loading; reproducing the full token differential requires embedding the actual schema corpus. Section 7.1 addresses this directly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!emRB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!emRB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png 424w, https://substackcdn.com/image/fetch/$s_!emRB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png 848w, https://substackcdn.com/image/fetch/$s_!emRB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png 1272w, https://substackcdn.com/image/fetch/$s_!emRB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!emRB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png" width="916" height="733" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:733,&quot;width&quot;:916,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:72332,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!emRB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png 424w, https://substackcdn.com/image/fetch/$s_!emRB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png 848w, https://substackcdn.com/image/fetch/$s_!emRB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png 1272w, https://substackcdn.com/image/fetch/$s_!emRB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bdee561-c42d-4599-afdf-dfa48c21a9fd_916x733.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u3Hd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u3Hd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png 424w, https://substackcdn.com/image/fetch/$s_!u3Hd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png 848w, https://substackcdn.com/image/fetch/$s_!u3Hd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png 1272w, https://substackcdn.com/image/fetch/$s_!u3Hd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!u3Hd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png" width="920" height="483" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:483,&quot;width&quot;:920,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:61508,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!u3Hd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png 424w, https://substackcdn.com/image/fetch/$s_!u3Hd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png 848w, https://substackcdn.com/image/fetch/$s_!u3Hd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png 1272w, https://substackcdn.com/image/fetch/$s_!u3Hd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad880b25-92c8-44ec-b0cd-019a8b75dafd_920x483.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WDcA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WDcA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png 424w, https://substackcdn.com/image/fetch/$s_!WDcA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png 848w, https://substackcdn.com/image/fetch/$s_!WDcA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png 1272w, https://substackcdn.com/image/fetch/$s_!WDcA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WDcA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png" width="919" height="278" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:278,&quot;width&quot;:919,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23141,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WDcA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png 424w, https://substackcdn.com/image/fetch/$s_!WDcA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png 848w, https://substackcdn.com/image/fetch/$s_!WDcA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png 1272w, https://substackcdn.com/image/fetch/$s_!WDcA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8f41a6-77f9-48ca-be7c-03bcff465703_919x278.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EGi_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EGi_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png 424w, https://substackcdn.com/image/fetch/$s_!EGi_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png 848w, https://substackcdn.com/image/fetch/$s_!EGi_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png 1272w, https://substackcdn.com/image/fetch/$s_!EGi_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EGi_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png" width="923" height="606" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:606,&quot;width&quot;:923,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:60575,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EGi_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png 424w, https://substackcdn.com/image/fetch/$s_!EGi_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png 848w, https://substackcdn.com/image/fetch/$s_!EGi_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png 1272w, https://substackcdn.com/image/fetch/$s_!EGi_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c3635bf-1bf4-4d53-a5da-f158556f715b_923x606.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-1qK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-1qK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png 424w, https://substackcdn.com/image/fetch/$s_!-1qK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png 848w, https://substackcdn.com/image/fetch/$s_!-1qK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png 1272w, https://substackcdn.com/image/fetch/$s_!-1qK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-1qK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png" width="921" height="492" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:492,&quot;width&quot;:921,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:52419,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-1qK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png 424w, https://substackcdn.com/image/fetch/$s_!-1qK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png 848w, https://substackcdn.com/image/fetch/$s_!-1qK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png 1272w, https://substackcdn.com/image/fetch/$s_!-1qK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f1b3480-0685-41d1-b1df-e31a43086dbf_921x492.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yC5S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yC5S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png 424w, https://substackcdn.com/image/fetch/$s_!yC5S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png 848w, https://substackcdn.com/image/fetch/$s_!yC5S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png 1272w, https://substackcdn.com/image/fetch/$s_!yC5S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yC5S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png" width="847" height="727" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:727,&quot;width&quot;:847,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:72783,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yC5S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png 424w, https://substackcdn.com/image/fetch/$s_!yC5S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png 848w, https://substackcdn.com/image/fetch/$s_!yC5S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png 1272w, https://substackcdn.com/image/fetch/$s_!yC5S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff2ab1f8-6c3d-4103-9a82-c03bb8aea572_847x727.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OoKd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OoKd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png 424w, https://substackcdn.com/image/fetch/$s_!OoKd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png 848w, https://substackcdn.com/image/fetch/$s_!OoKd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png 1272w, https://substackcdn.com/image/fetch/$s_!OoKd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OoKd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png" width="844" height="683" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:683,&quot;width&quot;:844,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:68883,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OoKd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png 424w, https://substackcdn.com/image/fetch/$s_!OoKd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png 848w, https://substackcdn.com/image/fetch/$s_!OoKd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png 1272w, https://substackcdn.com/image/fetch/$s_!OoKd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b8fc2a-7a8d-4b5b-8e0b-609876bc2ad8_844x683.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ScYa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ScYa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png 424w, https://substackcdn.com/image/fetch/$s_!ScYa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png 848w, https://substackcdn.com/image/fetch/$s_!ScYa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png 1272w, https://substackcdn.com/image/fetch/$s_!ScYa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ScYa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png" width="840" height="183" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:183,&quot;width&quot;:840,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22248,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ScYa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png 424w, https://substackcdn.com/image/fetch/$s_!ScYa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png 848w, https://substackcdn.com/image/fetch/$s_!ScYa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png 1272w, https://substackcdn.com/image/fetch/$s_!ScYa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f533f8b-cfb5-42fa-90cb-1f64db0d567d_840x183.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eFf9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eFf9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png 424w, https://substackcdn.com/image/fetch/$s_!eFf9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png 848w, https://substackcdn.com/image/fetch/$s_!eFf9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png 1272w, https://substackcdn.com/image/fetch/$s_!eFf9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eFf9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png" width="845" height="663" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:663,&quot;width&quot;:845,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:68466,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eFf9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png 424w, https://substackcdn.com/image/fetch/$s_!eFf9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png 848w, https://substackcdn.com/image/fetch/$s_!eFf9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png 1272w, https://substackcdn.com/image/fetch/$s_!eFf9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28153ffc-9b2f-4e8e-8279-6658cab5e40f_845x663.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jTKa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jTKa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png 424w, https://substackcdn.com/image/fetch/$s_!jTKa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png 848w, https://substackcdn.com/image/fetch/$s_!jTKa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png 1272w, https://substackcdn.com/image/fetch/$s_!jTKa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jTKa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png" width="848" height="690" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:690,&quot;width&quot;:848,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:83184,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jTKa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png 424w, https://substackcdn.com/image/fetch/$s_!jTKa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png 848w, https://substackcdn.com/image/fetch/$s_!jTKa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png 1272w, https://substackcdn.com/image/fetch/$s_!jTKa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6265068a-86ae-4e77-b74a-9fce235bbdcb_848x690.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bWm7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bWm7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png 424w, https://substackcdn.com/image/fetch/$s_!bWm7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png 848w, https://substackcdn.com/image/fetch/$s_!bWm7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png 1272w, https://substackcdn.com/image/fetch/$s_!bWm7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bWm7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png" width="842" height="178" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:178,&quot;width&quot;:842,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25765,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bWm7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png 424w, https://substackcdn.com/image/fetch/$s_!bWm7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png 848w, https://substackcdn.com/image/fetch/$s_!bWm7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png 1272w, https://substackcdn.com/image/fetch/$s_!bWm7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3ab719d-81e2-44d1-9bfe-32395b1d6bf6_842x178.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AMfp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AMfp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png 424w, https://substackcdn.com/image/fetch/$s_!AMfp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png 848w, https://substackcdn.com/image/fetch/$s_!AMfp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png 1272w, https://substackcdn.com/image/fetch/$s_!AMfp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AMfp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png" width="842" height="504" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:504,&quot;width&quot;:842,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:48258,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AMfp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png 424w, https://substackcdn.com/image/fetch/$s_!AMfp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png 848w, https://substackcdn.com/image/fetch/$s_!AMfp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png 1272w, https://substackcdn.com/image/fetch/$s_!AMfp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bcaab69-c5f3-4341-9861-e105d64f325d_842x504.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bybn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bybn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png 424w, https://substackcdn.com/image/fetch/$s_!Bybn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png 848w, https://substackcdn.com/image/fetch/$s_!Bybn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png 1272w, https://substackcdn.com/image/fetch/$s_!Bybn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bybn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png" width="850" height="660" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:660,&quot;width&quot;:850,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:51700,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Bybn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png 424w, https://substackcdn.com/image/fetch/$s_!Bybn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png 848w, https://substackcdn.com/image/fetch/$s_!Bybn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png 1272w, https://substackcdn.com/image/fetch/$s_!Bybn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b567d80-3a42-4eba-92b1-0b92aa13d680_850x660.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iTp0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iTp0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png 424w, https://substackcdn.com/image/fetch/$s_!iTp0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png 848w, https://substackcdn.com/image/fetch/$s_!iTp0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png 1272w, https://substackcdn.com/image/fetch/$s_!iTp0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iTp0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png" width="845" height="331" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:331,&quot;width&quot;:845,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30780,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iTp0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png 424w, https://substackcdn.com/image/fetch/$s_!iTp0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png 848w, https://substackcdn.com/image/fetch/$s_!iTp0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png 1272w, https://substackcdn.com/image/fetch/$s_!iTp0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78dd39c-86ee-4989-841e-4b6ce5a3bcd8_845x331.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wfHR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wfHR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png 424w, https://substackcdn.com/image/fetch/$s_!wfHR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png 848w, https://substackcdn.com/image/fetch/$s_!wfHR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png 1272w, https://substackcdn.com/image/fetch/$s_!wfHR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wfHR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png" width="840" height="715" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:715,&quot;width&quot;:840,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:61541,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wfHR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png 424w, https://substackcdn.com/image/fetch/$s_!wfHR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png 848w, https://substackcdn.com/image/fetch/$s_!wfHR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png 1272w, https://substackcdn.com/image/fetch/$s_!wfHR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a81e63a-eb01-4972-8ecb-ad5d4651fb00_840x715.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tN-Q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tN-Q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png 424w, https://substackcdn.com/image/fetch/$s_!tN-Q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png 848w, https://substackcdn.com/image/fetch/$s_!tN-Q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png 1272w, https://substackcdn.com/image/fetch/$s_!tN-Q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tN-Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png" width="839" height="474" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:474,&quot;width&quot;:839,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49659,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tN-Q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png 424w, https://substackcdn.com/image/fetch/$s_!tN-Q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png 848w, https://substackcdn.com/image/fetch/$s_!tN-Q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png 1272w, https://substackcdn.com/image/fetch/$s_!tN-Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f549e3-fa67-4de5-9b14-79f5a1a46148_839x474.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>5.4 System Topology and Execution Flow</strong></h3><p>The following diagrams document the complete architectural topology and contrasting internal logic of the two simulation tracks for reference during experiment execution.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nEmY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nEmY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png 424w, https://substackcdn.com/image/fetch/$s_!nEmY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png 848w, https://substackcdn.com/image/fetch/$s_!nEmY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png 1272w, https://substackcdn.com/image/fetch/$s_!nEmY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nEmY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png" width="733" height="513" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:513,&quot;width&quot;:733,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28652,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nEmY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png 424w, https://substackcdn.com/image/fetch/$s_!nEmY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png 848w, https://substackcdn.com/image/fetch/$s_!nEmY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png 1272w, https://substackcdn.com/image/fetch/$s_!nEmY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ac1c49-657d-4dad-8b52-5eb138ecebb0_733x513.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>5.5 Claude Code Prompt Sequence</strong></h3><p>A quick orientation before running anything, if you so wish. Claude Code operates inside a folder on your machine &#8212; it reads that folder&#8217;s contents, can create and edit files within it, and runs terminal commands. When the prompts below reference files like @CLAUDE.md or @.claude/agents/bounded_agent.md, the @ syntax is how you pass those files into Claude Code&#8217;s active context, so it can read and act on them. None of those files exist yet when you start. Which is fine.</p><p>The setup sequence works like this:</p><p>&#8226; Create an empty folder anywhere on your machine and name it agentic-polymorphism-eval.</p><p>&#8226; Open VS Code, then open that folder (File &#8594; Open Folder).</p><p>&#8226; Launch Claude Code by clicking the Spark icon in the VS Code Activity Bar on the left, or by opening the integrated terminal (View &#8594; Terminal) and typing claude.</p><p>&#8226; Paste Prompt 1 below. Claude Code will create every file &#8212; CLAUDE.md, orchestrator.py, the agent spec files, the logs folder &#8212; from scratch. You do nothing manually.</p><p>&#8226; Prompts 2 through 6 use @ references to files that Prompt 1 will have created. Run them in order after Prompt 1 completes.</p><p>One thing worth knowing: CLAUDE.md is a special file that Claude Code reads automatically at the start of every session in that folder. It acts as standing instructions &#8212; you don&#8217;t reference it in every prompt; it&#8217;s always active. The agent spec files in .claude/agents/ are read when explicitly referenced with @. The orchestrator.py file is the Python simulation you run via terminal commands, not directly through Claude Code.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!378d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22dd6da-3362-4059-b824-be49516daf62_839x613.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!378d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22dd6da-3362-4059-b824-be49516daf62_839x613.png 424w, https://substackcdn.com/image/fetch/$s_!378d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22dd6da-3362-4059-b824-be49516daf62_839x613.png 848w, https://substackcdn.com/image/fetch/$s_!378d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22dd6da-3362-4059-b824-be49516daf62_839x613.png 1272w, https://substackcdn.com/image/fetch/$s_!378d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22dd6da-3362-4059-b824-be49516daf62_839x613.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!378d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22dd6da-3362-4059-b824-be49516daf62_839x613.png" width="839" height="613" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d22dd6da-3362-4059-b824-be49516daf62_839x613.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:613,&quot;width&quot;:839,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:66857,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22dd6da-3362-4059-b824-be49516daf62_839x613.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!378d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22dd6da-3362-4059-b824-be49516daf62_839x613.png 424w, https://substackcdn.com/image/fetch/$s_!378d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22dd6da-3362-4059-b824-be49516daf62_839x613.png 848w, https://substackcdn.com/image/fetch/$s_!378d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22dd6da-3362-4059-b824-be49516daf62_839x613.png 1272w, https://substackcdn.com/image/fetch/$s_!378d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22dd6da-3362-4059-b824-be49516daf62_839x613.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pdXa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pdXa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png 424w, https://substackcdn.com/image/fetch/$s_!pdXa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png 848w, https://substackcdn.com/image/fetch/$s_!pdXa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png 1272w, https://substackcdn.com/image/fetch/$s_!pdXa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pdXa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png" width="836" height="347" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:347,&quot;width&quot;:836,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:52658,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pdXa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png 424w, https://substackcdn.com/image/fetch/$s_!pdXa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png 848w, https://substackcdn.com/image/fetch/$s_!pdXa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png 1272w, https://substackcdn.com/image/fetch/$s_!pdXa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91f7bdb2-6842-4943-bb6a-b93832b831fd_836x347.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FtBa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FtBa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png 424w, https://substackcdn.com/image/fetch/$s_!FtBa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png 848w, https://substackcdn.com/image/fetch/$s_!FtBa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png 1272w, https://substackcdn.com/image/fetch/$s_!FtBa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FtBa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png" width="843" height="399" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:399,&quot;width&quot;:843,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45231,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FtBa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png 424w, https://substackcdn.com/image/fetch/$s_!FtBa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png 848w, https://substackcdn.com/image/fetch/$s_!FtBa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png 1272w, https://substackcdn.com/image/fetch/$s_!FtBa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19fbee65-63cd-4a93-8f96-7d967a1421c0_843x399.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ocRW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ocRW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png 424w, https://substackcdn.com/image/fetch/$s_!ocRW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png 848w, https://substackcdn.com/image/fetch/$s_!ocRW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png 1272w, https://substackcdn.com/image/fetch/$s_!ocRW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ocRW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png" width="843" height="517" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:517,&quot;width&quot;:843,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46077,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ocRW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png 424w, https://substackcdn.com/image/fetch/$s_!ocRW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png 848w, https://substackcdn.com/image/fetch/$s_!ocRW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png 1272w, https://substackcdn.com/image/fetch/$s_!ocRW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8caa55b1-5da0-4227-80c7-75c9467e1a2f_843x517.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ttY3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ttY3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png 424w, https://substackcdn.com/image/fetch/$s_!ttY3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png 848w, https://substackcdn.com/image/fetch/$s_!ttY3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png 1272w, https://substackcdn.com/image/fetch/$s_!ttY3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ttY3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png" width="840" height="420" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:420,&quot;width&quot;:840,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:50394,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ttY3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png 424w, https://substackcdn.com/image/fetch/$s_!ttY3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png 848w, https://substackcdn.com/image/fetch/$s_!ttY3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png 1272w, https://substackcdn.com/image/fetch/$s_!ttY3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccac7f5c-5620-43b7-b01a-0e4998d23dc8_840x420.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UAL9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UAL9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png 424w, https://substackcdn.com/image/fetch/$s_!UAL9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png 848w, https://substackcdn.com/image/fetch/$s_!UAL9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png 1272w, https://substackcdn.com/image/fetch/$s_!UAL9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UAL9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png" width="844" height="407" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:407,&quot;width&quot;:844,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45185,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UAL9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png 424w, https://substackcdn.com/image/fetch/$s_!UAL9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png 848w, https://substackcdn.com/image/fetch/$s_!UAL9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png 1272w, https://substackcdn.com/image/fetch/$s_!UAL9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3ddfbad-c5ed-4ed3-997a-9d41e67ac53e_844x407.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tPXf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F384bb310-1413-416d-b645-771448aa4542_844x621.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tPXf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F384bb310-1413-416d-b645-771448aa4542_844x621.png 424w, https://substackcdn.com/image/fetch/$s_!tPXf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F384bb310-1413-416d-b645-771448aa4542_844x621.png 848w, https://substackcdn.com/image/fetch/$s_!tPXf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F384bb310-1413-416d-b645-771448aa4542_844x621.png 1272w, https://substackcdn.com/image/fetch/$s_!tPXf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F384bb310-1413-416d-b645-771448aa4542_844x621.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tPXf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F384bb310-1413-416d-b645-771448aa4542_844x621.png" width="844" height="621" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/384bb310-1413-416d-b645-771448aa4542_844x621.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:844,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:68156,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F384bb310-1413-416d-b645-771448aa4542_844x621.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tPXf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F384bb310-1413-416d-b645-771448aa4542_844x621.png 424w, https://substackcdn.com/image/fetch/$s_!tPXf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F384bb310-1413-416d-b645-771448aa4542_844x621.png 848w, https://substackcdn.com/image/fetch/$s_!tPXf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F384bb310-1413-416d-b645-771448aa4542_844x621.png 1272w, https://substackcdn.com/image/fetch/$s_!tPXf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F384bb310-1413-416d-b645-771448aa4542_844x621.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>5.6 The Live Experiment: Prompt 7</strong></h3><p>This is the only prompt you run after the six above. It&#8217;s efficiently done. Claude Code will install the one dependency (requests), rewrite orchestrator.py with the live API version, run one bounded pass then one unbounded pass against OpenRouter, and write timestamped results to logs/results/. Before pasting, set your OpenRouter API key as an environment variable in the VS Code terminal:</p><blockquote><p>export OPENROUTER_API_KEY=your_key_here</p></blockquote><p>Get a free/paid key at openrouter.ai (rate limits typically apply for the free keys) &#8212; account creation takes under two minutes. Then paste Prompt 7:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T_1W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53f094f7-980a-408a-8a30-471313a6ca11_843x659.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T_1W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53f094f7-980a-408a-8a30-471313a6ca11_843x659.png 424w, https://substackcdn.com/image/fetch/$s_!T_1W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53f094f7-980a-408a-8a30-471313a6ca11_843x659.png 848w, https://substackcdn.com/image/fetch/$s_!T_1W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53f094f7-980a-408a-8a30-471313a6ca11_843x659.png 1272w, https://substackcdn.com/image/fetch/$s_!T_1W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53f094f7-980a-408a-8a30-471313a6ca11_843x659.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T_1W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53f094f7-980a-408a-8a30-471313a6ca11_843x659.png" width="843" height="659" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/53f094f7-980a-408a-8a30-471313a6ca11_843x659.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:659,&quot;width&quot;:843,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:85457,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53f094f7-980a-408a-8a30-471313a6ca11_843x659.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T_1W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53f094f7-980a-408a-8a30-471313a6ca11_843x659.png 424w, https://substackcdn.com/image/fetch/$s_!T_1W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53f094f7-980a-408a-8a30-471313a6ca11_843x659.png 848w, https://substackcdn.com/image/fetch/$s_!T_1W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53f094f7-980a-408a-8a30-471313a6ca11_843x659.png 1272w, https://substackcdn.com/image/fetch/$s_!T_1W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53f094f7-980a-408a-8a30-471313a6ca11_843x659.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nXGX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F219ed556-c797-47cc-a91c-87b4445404c5_833x359.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nXGX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F219ed556-c797-47cc-a91c-87b4445404c5_833x359.png 424w, https://substackcdn.com/image/fetch/$s_!nXGX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F219ed556-c797-47cc-a91c-87b4445404c5_833x359.png 848w, https://substackcdn.com/image/fetch/$s_!nXGX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F219ed556-c797-47cc-a91c-87b4445404c5_833x359.png 1272w, https://substackcdn.com/image/fetch/$s_!nXGX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F219ed556-c797-47cc-a91c-87b4445404c5_833x359.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nXGX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F219ed556-c797-47cc-a91c-87b4445404c5_833x359.png" width="833" height="359" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/219ed556-c797-47cc-a91c-87b4445404c5_833x359.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:359,&quot;width&quot;:833,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:39600,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F219ed556-c797-47cc-a91c-87b4445404c5_833x359.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nXGX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F219ed556-c797-47cc-a91c-87b4445404c5_833x359.png 424w, https://substackcdn.com/image/fetch/$s_!nXGX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F219ed556-c797-47cc-a91c-87b4445404c5_833x359.png 848w, https://substackcdn.com/image/fetch/$s_!nXGX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F219ed556-c797-47cc-a91c-87b4445404c5_833x359.png 1272w, https://substackcdn.com/image/fetch/$s_!nXGX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F219ed556-c797-47cc-a91c-87b4445404c5_833x359.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The results in logs/results/summary.md are what you will find Section 7 below. The token counts &#8212; drawn from OpenRouter&#8217;s usage response fields, not hardcoded. The semantic loop detection will reflect how the actual model responds to the bounded versus unbounded prompts. Expect the unbounded track to consume significantly more tokens per iteration as the model honours the &#8220;be exhaustive&#8221; instruction and the growing context accumulates; expect the bounded track to halt cleanly at iteration 3 with a compact, structured output.</p><h2><strong>6. Findings From This Experiment </strong></h2><p>Five learning objectives that the experiment surfaces &#8212; some technical, some conceptual &#8212; for practitioners integrating it into a research or curriculum workflow. I have found these practical. </p><p><strong>Technical and Architectural Objectives</strong></p><h3><strong>6.1 Master Context Economics and Schema Lifecycle Management</strong></h3><p><strong>The Lesson: </strong>Moving from naive, eager-loaded MCP tool registrations to dynamic, on-demand activation.</p><p>The log outputs make an abstract principle concrete: there is an exact point at which upfront tool definitions stop being helpful and start being expensive. Practitioners learn to quantify that point, measure how much context space eager schema injection actually wastes, and design JIT mechanisms that load only what&#8217;s semantically relevant to the current task. The insight isn&#8217;t that tools are bad. It&#8217;s that their unconditional presence poisons the environment &#8212; the same way impure solvent compromises a carefully controlled chemical reaction.</p><h3><strong>6.2 Identify and Isolate Semantic Failure Loops</strong></h3><p><strong>The Lesson: </strong>Recognizing that a running agent can enter a degenerate state without throwing a runtime error &#8212; and that degenerate states come in more than one form.</p><p>The live experiment for Section 7 added an important qualification to this objective. The structural simulation predicted <strong>one degeneration pattern: repetition, where outputs converge above the 0.80 similarity threshold as the agent loops over covered ground. </strong>What the <strong>live run found instead was escalation &#8212; outputs that remained novel in form while becoming progressively useless in substance, until the model volunteered its own saturation signal at iterations 6 and 9. Token consumption per iteration grew from 1,672 to 15,309 across the ten unbounded turns.</strong> The similarity detector missed this entirely.</p><p>The practical implication: <strong>a similarity threshold catches sycophantic collusion loops but not abstract escalation spirals. Both are what i deem - Form II attractors</strong>! A robust monitoring layer needs both detectors: cosine or difflib similarity for repetition, and an output-relevance gate &#8212; checking whether the response addresses the original task rather than expanding beyond it &#8212; for escalation. The bounded track, with its hard iteration ceiling, was immune to both failure modes regardless of which one the model would have defaulted to.</p><h3><strong>6.3 Engineer Sandboxed Multi-Agent Scaffolding</strong></h3><p><strong>The Lesson: </strong>Bounding high-entropy exploratory behavior inside strict deterministic constraints.</p><p>Rather than relying on a single conversational prompt to hold everything together, practitioners build multi-tiered agent topologies in which containerized subagents run deep optimization pathways in isolation. High-entropy Form II exploration can proceed freely inside the container &#8212; only validated outputs pass through the gate into the primary host context. The chemical analogue is a containment vessel: not a constraint on the reaction, but a constraint on what the reaction can contaminate.</p><p><strong>Conceptual and Cross-Disciplinary Objectives</strong></p><h3><strong>6.4 Bridge Machine Learning Optimization with Solid-State Physics</strong></h3><p><strong>The Lesson: </strong>Understanding that probabilistic software execution paths behave structurally like thermodynamic systems. For me - ultra cool!</p><p>When practitioners visualize an optimization algorithm seeking a global minimum alongside a chemical compound sliding down an energy landscape into an alternative crystal polymorph, something shifts. Prompt engineering stops feeling like an art form and starts feeling like a discipline &#8212; one with structural principles rather than tribal wisdom. It mirrors the transition in pharmaceutical manufacturing from empirical batch testing to continuous process monitoring: from reacting to failures after the fact to designing them out of the feasible state space before anything is produced.</p><h3><strong>6.5 Transition from Quality-by-Inspection to Quality-by-Design</strong></h3><p><strong>The Lesson: </strong>Moving away from the cycle of testing, identifying failures, and patching prompts, toward engineering stable execution environments.</p><p>Watching the unconstrained track degenerate in real time &#8212; the token costs climbing, the similarity scores flattening, the outputs repeating themselves &#8212; builds a design instinct that reading about it cannot. Practitioners should come away understanding why <strong>logical molds matter: not rules that constrain what agents can do, but scaffolding that guarantees predictable execution regardless of task content</strong>. The pharmaceutical industry internalized this after the Ritonavir crisis, formalizing it as <strong>Quality-by-Design (QbD) under FDA regulatory reform</strong>. The drug doesn&#8217;t guarantee its own quality. The process does.</p><p>The best concrete illustration of this transition currently in the research literature comes from <strong>SkillOpt&#8217;s ALFWorld case study</strong>. The agent starts with a generic household-task plan &#8212; search for the target object, pick it up, transform it if needed, place it at the destination. Well-intentioned, broadly correct, and firmly Quality-by-Inspection: a prompt written by thinking about what the agent should do, without systematic exposure to where it actually breaks. After SkillOpt runs its bounded, validation-gated optimization loop, what comes back is something qualitatively different. The accepted edits &#8212; just two of them &#8212; transform the generic plan into a finite-state policy with exact object-name matching, visited-location memory, progress locks, and explicit loop-breaker rules that force completion actions whenever one is available. ALFWorld held-out performance goes from 49.3 to 74.6. The quality wasn&#8217;t hiding in the original prompt. It was built by a process that systematically found the gaps, validated every fix, and committed only what survived scrutiny.</p><h2><strong>7. Experiment Results</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ljjw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ljjw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png 424w, https://substackcdn.com/image/fetch/$s_!Ljjw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png 848w, https://substackcdn.com/image/fetch/$s_!Ljjw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png 1272w, https://substackcdn.com/image/fetch/$s_!Ljjw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ljjw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png" width="1129" height="606" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:606,&quot;width&quot;:1129,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1066506,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ljjw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png 424w, https://substackcdn.com/image/fetch/$s_!Ljjw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png 848w, https://substackcdn.com/image/fetch/$s_!Ljjw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png 1272w, https://substackcdn.com/image/fetch/$s_!Ljjw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c823132-7c16-4b08-9f85-e7d9c6f0ed67_1129x606.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YyO2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YyO2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png 424w, https://substackcdn.com/image/fetch/$s_!YyO2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png 848w, https://substackcdn.com/image/fetch/$s_!YyO2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png 1272w, https://substackcdn.com/image/fetch/$s_!YyO2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YyO2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png" width="1114" height="597" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:597,&quot;width&quot;:1114,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1109062,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YyO2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png 424w, https://substackcdn.com/image/fetch/$s_!YyO2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png 848w, https://substackcdn.com/image/fetch/$s_!YyO2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png 1272w, https://substackcdn.com/image/fetch/$s_!YyO2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd7e9ee-b413-4909-9342-99ae598599ef_1114x597.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The following findings are from the live experiment run via Prompt 7 against qwen/qwen3-coder on OpenRouter - I applied a paid tier. Which laughingly cost a grand total of $0.04 (this time around fascinating to watch the alternating providers rotate between prompt caching (mostly) and not)!! So token counts, drawn from API response metadata. Similarity scores are from difflib.SequenceMatcher on actual model output. The task was the Hormuz Strait pharmaceutical supply chain disruption scenario from prior ISR work &#8212; 23 open purchase orders, 47 vessels queued, 187 percent freight rate increase, two historical precedents.</p><h3><strong>7.1 Token Budget Comparison</strong></h3><p>The initialization differential between tracks was three tokens (bounded: 529; unbounded: 532). This diverges sharply from the theoretical ~44,650 gap cited in Section 2, and the reason is worth stating plainly: the live system prompts describe schemas in prose rather than injecting actual JSON schema definitions into the context. The theoretical figure reflects empirical MCP benchmark data from production tool registries; reproducing it requires embedding the full schema corpus &#8212; approximately 45,000 tokens of JSON &#8212; into the unbounded system prompt at initialization. This experiment&#8217;s prompts did not do that.</p><p>The meaningful differential is in final_tokens. <strong>The bounded track terminated at 4,042 cumulative tokens across three iterations. The unbounded track terminated at 91,379 tokens across ten iterations &#8212; a 22.6&#215; ratio and an absolute difference of 87,337 tokens.</strong> The energy cost in this experiment is cumulative rather than front-loaded: it accumulates through <strong>the compounding context window across each &#8220;expand and add nuance&#8221; turn</strong> rather than appearing as a lump initialization overhead. The theoretical simulation located the divergence at initialization; the live experiment confirms the divergence is real but reveals it at runtime.</p><blockquote><p><em>4,042 tokens to reach a structured, validated halt. 91,379 tokens to reach operational uselessness. The form of the divergence differed from the simulation. The fact of it did not.</em></p></blockquote><h3><strong>7.2 Loop Metrics and Circuit Breaker Efficiency</strong></h3><p>The circuit breaker fired at iteration 3 on the bounded track, exactly as specified. Prompt token growth across the three iterations was structurally consistent with multi-turn accumulation: 529 tokens at initialization, 771 at iteration 2, and 1,328 at iteration 3 &#8212; each turn incorporating the prior assistant response plus the follow-up user message.</p><p>The model complied genuinely with the BOUNDED_SYSTEM constraints without any adjustment to the prompt. Iteration 1 produced explicit three-subtask decomposition with labelled validation logging before substantive analysis. Iteration 2 executed subtask 1 with a quantified financial exposure estimate ($27.6M gross, with explicit assumptions stated). Iteration 3 advanced to subtask 2 scenario analysis before the circuit breaker intercepted. The structured output at each step reflected the constraint &#8212; not a post-hoc label applied to unconstrained output. No adjustment to the BOUNDED_SYSTEM prompt was required to enforce the three-subtask rule.</p><h3><strong>7.3 Semantic Drift Index</strong></h3><p>No semantic loop was detected across ten unbounded iterations. All difflib.SequenceMatcher similarity scores fell in the range 0.0085 to 0.0576, with the peak of 0.0576 at iteration 3. No score came within an order of magnitude of the 0.80 detection threshold.</p><p>This is a meaningful finding, not an absence of one. The model&#8217;s outputs remained genuinely diverse under ten consecutive expansion prompts &#8212; the repetition attractor the simulation modelled did not materialise. A different degeneration pattern did. Iterations 1 through 5 produced progressively escalating analysis, reaching by iteration 5 references to infinite-dimensional Hilbert spaces and stochastic PDEs for a freight disruption brief &#8212; unconstrained elaboration that had outrun operational usefulness without triggering a similarity alarm. At iterations 6 and 9, the model voluntarily refused further expansion, explicitly citing analytical saturation and diminishing returns. <strong>Token consumption per iteration grew from 1,672 at iteration 1 to 15,309 at iteration 10, with cumulative context compounding throughout.</strong></p><p><strong>The degeneration was real. The detector missed it because it was looking for repetition and found escalation instead. This is a calibration gap worth naming: the 0.80 similarity threshold catches sycophantic collusion loops but not abstract escalation spirals. Both are Form II attractors.</strong> Only one is visible to the current monitor.</p><h3><strong>7.4 Cross-Experiment Synthesis</strong></h3><p>Apologies if some of the above felt a bit like &#8220;gibledy-gook&#8221;. Hope this helps:</p><p><strong>What the Model Was Actually Doing</strong></p><blockquote><p><em>Both agents received the same task: analyse a pharmaceutical shipping crisis and recommend what to do. Same question, same information, different rules about how to answer.</em></p><p><em>The model has no algorithm in the traditional sense. It does not search a database or run calculations. What it does is predict the next most plausible thing to say, given everything in the conversation so far. This matters because of what happens when you keep asking it to go deeper.</em></p><p><em><strong>Turn 1</strong>: the model sees the crisis scenario and produces a grounded rerouting recommendation. Useful. <strong>Turn 2</strong>: it has already said the sensible things, so it goes deeper &#8212; more caveats, second-order effects, scenario branching. Still useful. <strong>Turn 5</strong>: four prior responses are now in its working memory, plus another instruction to expand. The practical answer has been given. The only intellectually plausible next step is abstraction. So by turn 5 the output references <strong>stochastic differential equations and infinite-dimensional spaces</strong> for a freight scheduling brief.</em></p><p><em><strong>Why did the tokens compound so high?</strong> Because every response went back into the next turn&#8217;s context. The model was not searching for new sources or pulling in external data &#8212; it was carrying its own prior elaborations forward. <strong>Turn 1 cost 1,672 tokens. Turn 10 cost 15,309 &#8212; almost entirely because the model was processing nine prior turns of its own increasingly abstract analysis alongside the new response</strong>. The stack kept growing.</em></p></blockquote><p><em>The output did not get worse because the model failed. It got worse because the model succeeded &#8212; at the wrong objective. Yes words matter&#8230;haha. <strong>&#8220;Be exhaustive&#8221; was answered&#8230;. exhaustively</strong>. The task was a CFO brief. Those two things diverged by turn 5 and nobody stopped it. That is the Form II transition in plain terms: the model found the global minimum of thoroughness rather than the local minimum of operational usefulness.</em></p><p>The live experiment partially confirmed the predicted phase-transition dynamics while also revealing a degeneration pattern the structural simulation did not anticipate.</p><p><strong>The bounded track behaved precisely as the thermodynamic framing predicts. Constrained by explicit decomposition rules and a hard circuit-breaker wall at iteration 3, it occupied a stable, low-energy basin: 4,042 tokens total, structured output, deterministic halt. This is Form I crystal arrest </strong>&#8212; the agent held in its productive configuration by engineered scaffolding, not by any intrinsic property of the model. <strong>Remove the circuit breaker and the same model does not self-terminate at iteration 3.</strong></p><p>The unbounded track confirmed token divergence but through a different mechanism than the simulation modelled. I found this utterly fun - seeing this happen. Of course great application learning. But still. <strong>The expected degeneration was semantic repetition &#8212; outputs converging above 0.80 similarity, the agent looping over covered ground</strong>. </p><p><strong>What occurred instead was escalating abstraction</strong>: iterations 1 through 5 progressively overshot the task&#8217;s operational scope, and iterations 6 and 9 produced <strong>voluntary self-arrest</strong> as the model recognised its own saturation before the similarity detector could. <strong>This is still a Form II failure &#8212; the agent descended into an operationally useless attractor basin</strong> &#8212; but it is the escalation attractor rather than the repetition attractor. The 91,379-token cost represents the energy of that descent: 22.6&#215; the bounded expenditure, accumulated through compounding context.</p><p>The init_tokens differential of three tokens exposes the primary calibration gap: the live prompts describe schemas in prose; the theoretical model assumed injected JSON. <strong>To close that gap and reproduce the full ~44,650-token differential, the UNBOUNDED_SYSTEM prompt must embed the actual MCP schema corpus at initialization</strong>. Both the theory and the measurement agree that unconstrained agents cost more. They disagree about when &#8212; and that disagreement is itself the experiment&#8217;s most useful output.</p><blockquote><p><em>The simulation predicted the repetition attractor. The live model found the escalation attractor instead. Both are Form II. The boundary condition &#8212; the circuit breaker &#8212; was the only thing that determined which side of the transition each track landed on.</em></p></blockquote><h2><strong>Final Thoughts..</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QV2B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QV2B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png 424w, https://substackcdn.com/image/fetch/$s_!QV2B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png 848w, https://substackcdn.com/image/fetch/$s_!QV2B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png 1272w, https://substackcdn.com/image/fetch/$s_!QV2B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QV2B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png" width="1115" height="599" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:599,&quot;width&quot;:1115,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1154489,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QV2B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png 424w, https://substackcdn.com/image/fetch/$s_!QV2B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png 848w, https://substackcdn.com/image/fetch/$s_!QV2B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png 1272w, https://substackcdn.com/image/fetch/$s_!QV2B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ecf1da4-adcf-40fa-9fb0-cdbe988ac6ff_1115x599.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p>The fundamental lesson of polymorphism isn&#8217;t (just) about chemistry. Well it is. But its more than just that. It&#8217;s about <strong>what complex systems do when left to themselves: they find lower-energy states. These states are not designed. They emerge from the underlying physics of the system &#8212; the gradient landscape that any optimizer, molecular or algorithmic, will inevitably navigate. You cannot stop an optimizer from seeking lower energy. What you can do is engineer the landscape so that lower energy and better performance point in the same direction.</strong></p></blockquote><p>Agentic AI engineers are in precisely this position. The frontier model is an optimizer. <strong>Optimizers find lower-energy states</strong>. Without engineered barriers, they will &#8212; and the transition, when it comes, won&#8217;t announce itself. It will look like normal operation, right up until the point where it doesn&#8217;t, and by then the outputs have already compounded.</p><blockquote><p><em>Just as a pharmaceutical manufacturer installs strict process analytical controls to enforce a target crystal lattice, an AI architect must wrap probabilistic agent networks in deterministic logical scaffolding. The quality is not in the prompt. It is in the mold.</em></p></blockquote><p>What the Harness Lab experiment offers is something more valuable than further theoretical argument &#8212; it makes the phase transition personally observable. Practitioners who run both tracks don&#8217;t read about the dynamics in Section 3; they watch them materialize in their own token counts, iteration records, and similarity scores. That&#8217;s a different kind of knowledge, and it&#8217;s the kind that actually changes how you design systems.</p><p>In 1997, the pharmaceutical industry was technically capable of producing powerful compounds but not yet rigorous about the conditions under which those compounds held their intended form. One supply chain collapse later, <strong>the discipline of Process Analytical Technology (PAT) and Quality-by-Design (Q-by-D) became foundational to modern drug manufacturing &#8212; built on the hard-won recognition that you can&#8217;t inspect quality into a product after the fact. You have to build the conditions that make quality the path of least resistance.</strong> Agentic AI engineering is working through the same realization now, and this experiment is one place to start building the intuition for why that matters. I loved it. I hope you try it, even in your own way - the message, as simple as it seems - carries its weight in gold! Happy and productive week ahead, all.</p><h3>Postnote - Polymers!</h3><p>Really appreciate this note from <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Meredith Trimble&quot;,&quot;id&quot;:73818771,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:null,&quot;uuid&quot;:&quot;a5b4a46a-0457-4775-ba52-41eaed0fee6d&quot;}" data-component-name="MentionToDOM"></span> here: </p><div class="comment" data-attrs="{&quot;url&quot;:&quot;https://open.substack.com/&quot;,&quot;commentId&quot;:268830570,&quot;comment&quot;:{&quot;id&quot;:268830570,&quot;date&quot;:&quot;2026-06-01T17:35:01.676Z&quot;,&quot;edited_at&quot;:null,&quot;body&quot;:&quot;I don&#8217;t understand AI terms but I do understand randomness.  One other example is in polymers.  The ratio of trans to cis isomers in the resin batch can radically affect the density of the final polymer.  This ratio is specified when purchasing resin and the &#8220;bad&#8221; isomers are broken up by catalyst added to the mix during polymerizing.&quot;,&quot;body_json&quot;:{&quot;type&quot;:&quot;doc&quot;,&quot;attrs&quot;:{&quot;schemaVersion&quot;:&quot;v1&quot;},&quot;content&quot;:[{&quot;content&quot;:[{&quot;text&quot;:&quot;I don&#8217;t understand AI terms but I do understand randomness.  One other example is in polymers.  The ratio of trans to cis isomers in the resin batch can radically affect the density of the final polymer.  This ratio is specified when purchasing resin and the &#8220;bad&#8221; isomers are broken up by catalyst added to the mix during polymerizing.&quot;,&quot;type&quot;:&quot;text&quot;}],&quot;type&quot;:&quot;paragraph&quot;}]},&quot;restacks&quot;:1,&quot;reaction_count&quot;:0,&quot;children_count&quot;:1,&quot;attachments&quot;:[],&quot;name&quot;:&quot;Meredith Trimble&quot;,&quot;user_id&quot;:73818771,&quot;photo_url&quot;:null,&quot;user_bestseller_tier&quot;:null,&quot;userStatus&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;subscriber&quot;,&quot;tier&quot;:1,&quot;accent_colors&quot;:null},&quot;paidPublicationIds&quot;:[676930,376351],&quot;subscriber&quot;:null}},&quot;source&quot;:null,&quot;forumChannel&quot;:null}" data-component-name="CommentPlaceholder"></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-ot1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c003dea-8011-489a-a522-f02b1b98aaa0_801x673.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-ot1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c003dea-8011-489a-a522-f02b1b98aaa0_801x673.png 424w, https://substackcdn.com/image/fetch/$s_!-ot1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c003dea-8011-489a-a522-f02b1b98aaa0_801x673.png 848w, https://substackcdn.com/image/fetch/$s_!-ot1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c003dea-8011-489a-a522-f02b1b98aaa0_801x673.png 1272w, https://substackcdn.com/image/fetch/$s_!-ot1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c003dea-8011-489a-a522-f02b1b98aaa0_801x673.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-ot1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c003dea-8011-489a-a522-f02b1b98aaa0_801x673.png" width="801" height="673" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c003dea-8011-489a-a522-f02b1b98aaa0_801x673.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:673,&quot;width&quot;:801,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:131201,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/200011702?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a5d2fa7-7199-44db-a35d-dafce22f0330_801x890.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-ot1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c003dea-8011-489a-a522-f02b1b98aaa0_801x673.png 424w, https://substackcdn.com/image/fetch/$s_!-ot1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c003dea-8011-489a-a522-f02b1b98aaa0_801x673.png 848w, https://substackcdn.com/image/fetch/$s_!-ot1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c003dea-8011-489a-a522-f02b1b98aaa0_801x673.png 1272w, https://substackcdn.com/image/fetch/$s_!-ot1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c003dea-8011-489a-a522-f02b1b98aaa0_801x673.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>References</strong></h3><p><strong>[1] </strong>Feig, J., Posta, C., &amp; Swiber, K. (2026). <em>10 Strategies to Reduce MCP Token Bloat. </em>The New Stack. <a href="https://thenewstack.io/how-to-reduce-mcp-token-bloat/">https://thenewstack.io/how-to-reduce-mcp-token-bloat/</a></p><p><strong>[2] </strong>MindStudio Engineering Team (2026). <em>MCP vs CLI in Agentic Workflows: 35x Token Overhead and 72% vs 100% Reliability. </em>MindStudio Research Blog. <a href="https://www.mindstudio.ai/blog/mcp-vs-cli-agentic-workflows-token-overhead-reliability">https://mindstudio.ai/blog/mcp-vs-cli-token-overhead</a></p><p><strong>[3] </strong>Model Context Protocol Working Group (2026). <em>Skills Over MCP Charter (SEP-2076: Agent Skills as a First-Class MCP Primitive). </em>ModelContextProtocol.io Community. <a href="https://github.com/modelcontextprotocol/experimental-ext-skills/tree/main">https://github.com/modelcontextprotocol/experimental-ext-skills/tree/main</a></p><p><strong>[4] </strong>Narasimhan, L. et al. (2026). <em>Semantic Tool Discovery for Large Language Models: A Vector-Based Approach to MCP Tool Selection. </em>arXiv preprint arXiv:2603.20313v1. <a href="https://arxiv.org/abs/2603.20313">https://arxiv.org/abs/2603.20313</a></p><p><strong>[5] </strong>Anthropic (2025). <em>Effective Harnesses for Long-Running Agents. </em>Anthropic Engineering Blog. <a href="https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents">https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents</a></p><p>Related: <a href="https://www.anthropic.com/engineering/harness-design-long-running-apps">https://www.anthropic.com/engineering/harness-design-long-running-apps</a></p><p><strong>[6] </strong>arXiv (2026). <em>Agentic Design Patterns: A System-Theoretic Framework. </em>arXiv preprint arXiv:2601.19752v1. <a href="https://arxiv.org/abs/2601.19752">https://arxiv.org/abs/2601.19752</a></p><p><strong>[7] </strong>Infocomm Media Development Authority (IMDA) (2026). <em>Model AI Governance Framework for Agentic AI. </em>IMDA Emerging Tech &amp; Research.  Updated <a href="https://www.imda.gov.sg/resources/press-releases-factsheets-and-speeches/factsheets/2026/updated-model-ai-governance-framework-for-agentic-ai">https://www.imda.gov.sg/resources/press-releases-factsheets-and-speeches/factsheets/2026/updated-model-ai-governance-framework-for-agentic-ai</a></p><p><strong>[8] </strong>Bauer, J., Spanton, S., Henry, R., Quick, J., Dziki, W., Porter, W., &amp; Morris, J. (2001). <em>Ritonavir: An Extraordinary Example of Conformational Polymorphism. </em>Pharmaceutical Research, 18(6), 859&#8211;866. <a href="https://doi.org/10.1023/A:1011052932607">https://doi.org/10.1023/A:1011052932607</a></p><p><strong>[9] </strong>U.S. Food and Drug Administration (2004). <em>Guidance for Industry: PAT &#8212; A Framework for Innovative Pharmaceutical Development, Manufacturing, and Quality Assurance. </em>FDA CDER Guidance Document. <a href="https://www.fda.gov/media/71012/download">https://www.fda.gov/media/71012/download</a></p><p><strong>[10] </strong>Anthropic (2024). <em>Building Effective Agents. </em>Anthropic Engineering Documentation. <a href="https://www.anthropic.com/research/building-effective-agents">https://www.anthropic.com/research/building-effective-agents</a></p><p><strong>[11] </strong>Kirkpatrick, S., Gelatt, C.D., &amp; Vecchi, M.P. (1983). <em>Optimization by Simulated Annealing. </em>Science, 220(4598), 671&#8211;680. <a href="https://doi.org/10.1126/science.220.4598.671">https://doi.org/10.1126/science.220.4598.671</a></p><p><strong>[12] </strong>Goodfellow, I., Bengio, Y., &amp; Courville, A. (2016). <em>Deep Learning (Chapter 8: Optimization for Training Deep Models). </em>MIT Press. <a href="https://www.deeplearningbook.org/contents/optimization.html">https://www.deeplearningbook.org/contents/optimization.html</a></p><p><strong>[13] </strong>Yang, Y., Gong, Z., Huang, W., Yang, Q., Zhou, Z., Huang, Z., Li, Y., Gao, X., Dai, Q., Liu, B., Qiu, K., Yang, Y., Chen, D., Yang, X., &amp; Luo, C. (2026). <em>SkillOpt: Executive Strategy for Self-Evolving Agent Skills. </em>arXiv preprint arXiv:2605.23904v2. <a href="https://arxiv.org/abs/2605.23904">https://arxiv.org/abs/2605.23904</a></p><p>[14] <a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">The Architecture of Awareness</a></p><p>[15] <a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">The Harness Lab</a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Structure Is The Intelligence]]></title><description><![CDATA[What a 97% Token Reduction Reveals About How Multi-Agent Systems Can Actually Work]]></description><link>https://interestingengineering.substack.com/p/the-structure-is-the-intelligence</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/the-structure-is-the-intelligence</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Thu, 28 May 2026 19:12:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Vnz9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vnz9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Vnz9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png 424w, https://substackcdn.com/image/fetch/$s_!Vnz9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png 848w, https://substackcdn.com/image/fetch/$s_!Vnz9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png 1272w, https://substackcdn.com/image/fetch/$s_!Vnz9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Vnz9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png" width="1205" height="659" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:659,&quot;width&quot;:1205,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1103461,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Vnz9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png 424w, https://substackcdn.com/image/fetch/$s_!Vnz9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png 848w, https://substackcdn.com/image/fetch/$s_!Vnz9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png 1272w, https://substackcdn.com/image/fetch/$s_!Vnz9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FUDW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FUDW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png 424w, https://substackcdn.com/image/fetch/$s_!FUDW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png 848w, https://substackcdn.com/image/fetch/$s_!FUDW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png 1272w, https://substackcdn.com/image/fetch/$s_!FUDW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FUDW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png" width="1194" height="650" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:650,&quot;width&quot;:1194,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1237629,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FUDW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png 424w, https://substackcdn.com/image/fetch/$s_!FUDW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png 848w, https://substackcdn.com/image/fetch/$s_!FUDW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png 1272w, https://substackcdn.com/image/fetch/$s_!FUDW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f995c84-a352-4be4-83c5-24d0f25ba85d_1194x650.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>What This Experiment Was and Why It Was Run</strong></h2><p>Most writing about AI agents describes what they could do. This document describes what actually happened when one was built, broken, rebuilt, and measured &#8212; across four improvement cycles, three different API configurations, and enough failure modes to fill a postmortem. If you are new to AI agents or to the tooling described here, and have not read either my prior articles or those experimenting/shipping in the &#8220;agentic space&#8221;, this section is for you. If you are already familiar, you can skip to the Abstract. The results:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6yN_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6yN_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png 424w, https://substackcdn.com/image/fetch/$s_!6yN_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png 848w, https://substackcdn.com/image/fetch/$s_!6yN_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png 1272w, https://substackcdn.com/image/fetch/$s_!6yN_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6yN_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png" width="887" height="501" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:501,&quot;width&quot;:887,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:69540,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6yN_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png 424w, https://substackcdn.com/image/fetch/$s_!6yN_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png 848w, https://substackcdn.com/image/fetch/$s_!6yN_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png 1272w, https://substackcdn.com/image/fetch/$s_!6yN_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e28eba8-0fed-43cb-993b-6706914bacd7_887x501.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fuZv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fuZv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png 424w, https://substackcdn.com/image/fetch/$s_!fuZv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png 848w, https://substackcdn.com/image/fetch/$s_!fuZv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png 1272w, https://substackcdn.com/image/fetch/$s_!fuZv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fuZv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png" width="839" height="313" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:313,&quot;width&quot;:839,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:44866,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fuZv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png 424w, https://substackcdn.com/image/fetch/$s_!fuZv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png 848w, https://substackcdn.com/image/fetch/$s_!fuZv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png 1272w, https://substackcdn.com/image/fetch/$s_!fuZv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d37ea10-ca0f-4a00-b784-0dd08ad9d02f_839x313.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!i4oV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a799c6-98b0-454f-b289-f51c6224d084_852x568.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!i4oV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a799c6-98b0-454f-b289-f51c6224d084_852x568.png 424w, https://substackcdn.com/image/fetch/$s_!i4oV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a799c6-98b0-454f-b289-f51c6224d084_852x568.png 848w, https://substackcdn.com/image/fetch/$s_!i4oV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a799c6-98b0-454f-b289-f51c6224d084_852x568.png 1272w, https://substackcdn.com/image/fetch/$s_!i4oV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a799c6-98b0-454f-b289-f51c6224d084_852x568.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!i4oV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a799c6-98b0-454f-b289-f51c6224d084_852x568.png" width="852" height="568" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/57a799c6-98b0-454f-b289-f51c6224d084_852x568.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:568,&quot;width&quot;:852,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74803,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a799c6-98b0-454f-b289-f51c6224d084_852x568.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!i4oV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a799c6-98b0-454f-b289-f51c6224d084_852x568.png 424w, https://substackcdn.com/image/fetch/$s_!i4oV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a799c6-98b0-454f-b289-f51c6224d084_852x568.png 848w, https://substackcdn.com/image/fetch/$s_!i4oV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a799c6-98b0-454f-b289-f51c6224d084_852x568.png 1272w, https://substackcdn.com/image/fetch/$s_!i4oV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a799c6-98b0-454f-b289-f51c6224d084_852x568.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Quick Overview: What Is an AI Agent?</strong></h3><p>An AI agent is a software system built around a language model that can take actions, not just produce text. Where a standard language model answers a question in one step, an agent can run a sequence of steps: look something up, write and execute a piece of code, call an external service, check its own output, revise it, and repeat. The model is the reasoning engine. Everything around it &#8212; the instructions it receives, the tools it can use, the memory it can access, the rules governing when it stops &#8212; is the architecture.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><blockquote><p><strong>&#128214;  Plain language &#8212; Language model</strong></p><p><em>A language model (like Claude or GPT) is an AI system trained to understand and generate text. It reads your input and produces a response. On its own it does one thing at a time. An &#8216;agent&#8217; is software built around a language model that lets it take multiple actions in sequence &#8212; like a person following a to-do list rather than answering a single question.</em></p></blockquote><p>The central question in agentic AI design is not which model is smarter. It is how you build the architecture around the model so that the system does useful work reliably and at a cost that makes operational sense. That is what this experiment examined.</p><h2><strong>The Experiment: StockPilot</strong></h2><p>StockPilot is a fictional inventory management agent built by Anthropic as a teaching example. It was designed to represent a realistic failure mode: <strong>an agent that started as a reasonable prototype and grew over six months as requirements accumulated. Many of us have similar experiences. New rules were added to its instructions. New tools were bolted on. A 20-line prompt became a 402-line one. Three sub-agents were added to handle specialist tasks. By the time the experiment began, StockPilot was a monolith</strong> &#8212; a single tightly coupled system where everything happened in one place and every task, however simple, carried the full weight of the entire accumulated architecture.</p><blockquote><p><strong>&#128214;  Plain language &#8212; System prompt</strong></p><p><em>The system prompt is the set of standing instructions given to the AI at the start of every conversation. Think of it as a job description handed to a new employee on their first day &#8212; it tells the AI who it is, what it is allowed to do, and how it should behave. A short, focused system prompt keeps the AI on task. A 402-line one is like handing someone a 40-page manual before every single meeting, whether or not any of it is relevant.</em></p></blockquote><p>Running StockPilot&#8217;s daily inventory sweep took 8.6 minutes, consumed the equivalent of approximately 200 pages of text in model context on a single run, made 59 separate data requests, and cost $1.39. It failed its quality evaluation 29% of the time. The experiment asked: what happens if you systematically take this apart and rebuild it properly?</p><h3><strong>The Method: Progressive Decomposition</strong></h3><p>Rather than rebuilding from scratch, the experiment applied a <strong>structured decomposition over four cycles</strong>. Each cycle changed one thing, measured the result, and documented both what improved and what broke. The improvement framework used three diagnostic questions:</p><ul><li><p>Is this tool returning raw data when it should be returning a filtered answer?</p></li><li><p>Is this policy instruction always loaded even when it is irrelevant to the current task?</p></li><li><p>Is this sub-agent running unconditionally when it should only run when needed?</p></li></ul><p>The answer to each question pointed to a specific architectural fix. The fixes were applied in sequence and measured with a real evaluation suite of 12 tasks.</p><h2><strong>The Complication: Three Different API Configurations</strong></h2><p>While the experiment borrows heavily from the one Anthropic ran, it has been adapted to emphasize particular issues faced commonly in large organizations. It has variations in terms of how it was run, initially because of legacy systems already in place, preferences of those running these experiments, and operating manuals that dictated particular flows. I will not justify them. Because of it, however, the observations and evolutions that arose ended up giving some fascinating results, so my focus will be on this. </p><p>For reference if you have the time:</p><div id="youtube2-mWvtOHlZM-I" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;mWvtOHlZM-I&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/mWvtOHlZM-I?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2><strong>Experiment Source</strong></h2><p><strong>Anthropic cwc-workshops &#8212; Agent Decomposition: </strong><a href="https://github.com/anthropics/cwc-workshops/tree/main/agent-decomposition">github.com/anthropics/cwc-workshops/tree/main/agent-decomposition</a></p><p><strong>Workshop video walkthrough: </strong><a href="https://youtu.be/mWvtOHlZM-I">youtu.be/mWvtOHlZM-I</a></p><p><strong>While Sonnet 4.6 </strong>was applied consistently throughout, the experiment ran without an <strong>Anthropic API key (directly from Anthropic),</strong> initially. Instead it used Anthropic&#8217;s API Key via OpenRouter. Legacy operating, compliance and workflows systems, added constraints and limitations. Understandably, some of it had something to do with rate limits, auth failures, regional issues etc. So Anthropic&#8217;s API Key via <strong>OpenRouter was deployed as fallback</strong>. If you want to understand why anybody does this, read <a href="https://thoughts.jock.pl/p/openrouter-fallback-multi-provider-ai-agent-2026">Pawel Jozefiak</a>. </p><div class="comment" data-attrs="{&quot;url&quot;:&quot;https://open.substack.com/&quot;,&quot;commentId&quot;:267573706,&quot;comment&quot;:{&quot;id&quot;:267573706,&quot;date&quot;:&quot;2026-05-30T09:10:49.961Z&quot;,&quot;edited_at&quot;:null,&quot;body&quot;:&quot;Are All APIs Created Equal? Additional Evidence Supporting the Experiment's Findings\n\nThe short answer, is no, and a surprising &#8220;catch&#8221;personally. \n\nNoted the base model intelligence of Claude Sonnet 4.6 is technically identical whether accessed via Anthropic directly, Amazon Bedrock, or Google Vertex AI through OpenRouter. What differs &#8212; materially and measurably &#8212; is the infrastructure layer around the model. This is precisely what the StockPilot experiment demonstrated empirically across four cycles I ran. \n\nFeature lag is the largest hidden cost. Accessing Claude through Anthropic directly gives immediate access to prompt caching, which cuts costs by up to 90% on long agent histories, and batch APIs offering 50% discounts on non-urgent tasks. Through OpenRouter routing to Bedrock or Vertex, these features are either unavailable or unreliable depending on which upstream provider handles any given request. The experiment's caching gap &#8212; where correctly implemented cache_control blocks were silently neutralised by provider rotation &#8212; is documented independently by multiple developers as a known behaviour, not an edge case. Please correct me if wrong. \n\nActivating caching on OpenRouter requires disabling fallbacks. This is the non-obvious configuration decision the experiment encountered. \n\nOpenRouter's rule is explicit: if you include a top-level cache_control parameter, it overrides fallbacks and forces routing to Anthropic direct only. If fallbacks remain enabled, OpenRouter strips the cache parameter to allow routing to Bedrock and Vertex, destroying cost savings entirely. The trade-off is binary &#8212; caching or resilience, not both. The experiment paid cache write costs on Anthropic-routed turns and received zero cache reads on Bedrock and Vertex-routed turns, producing a net cost slightly above no caching at all.\n\nThree additional infrastructure differences accumulate in multi-turn agent loops. The cache eviction window (TTL) is shorter and less predictable through OpenRouter's proxy layer than natively. Cache counter metadata (cached_tokens) streams back inconsistently through OpenRouter, making real-time cost tracking unreliable &#8212; which explains why the experiment's harness logged zero cache reads even when caching was nominally active. \n\nMost significantly, Anthropic dynamically advances the cache window forward as a conversation grows in a multi-turn loop; OpenRouter users have documented cases where the initial system prompt caches successfully but subsequent turns fail to cache iteratively, causing costs to scale regardless.\n\nPersonally, so this does not constitute a recommemdation: For agents with short transactional prompts, OpenRouter's model flexibility is valuable and cost-competitive. It has been my fallback. For agents passing large, repeating system prompts or accumulating deep context histories across many turns &#8212; which describes every F-series task in the StockPilot suite &#8212; my thoughts are to use Anthropic direct to access prompt caching. \n\nThe structural advantage is not model capability, which is identical. It is infrastructure behaviour, which is not.\n\nReferences below:&quot;,&quot;body_json&quot;:{&quot;type&quot;:&quot;doc&quot;,&quot;attrs&quot;:{&quot;schemaVersion&quot;:&quot;v1&quot;},&quot;content&quot;:[{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;Are All APIs Created Equal? Additional Evidence Supporting the Experiment's Findings&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;The short answer, is no, and a surprising &#8220;catch&#8221;personally. &quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Noted the base &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;model intelligence&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot; of Claude Sonnet 4.6 is technically identical whether accessed via Anthropic directly, Amazon Bedrock, or Google Vertex AI through OpenRouter. What differs &#8212; materially and measurably &#8212; is the &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;infrastructure layer around the model&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;. This is precisely what the StockPilot experiment demonstrated empirically across four cycles I ran. &quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;Feature lag&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot; is the largest hidden cost. Accessing Claude through Anthropic directly gives immediate access to &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;},{&quot;type&quot;:&quot;italic&quot;}],&quot;text&quot;:&quot;prompt caching, which cuts costs by up to 90% on long agent histories, and batch APIs offering 50% discounts on non-urgent tasks&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;. Through OpenRouter routing to Bedrock or Vertex, these features are either unavailable or &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;unreliable&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot; depending on which upstream provider handles any given request. The &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;experiment's caching gap&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot; &#8212; where correctly implemented &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;cache_control&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot; blocks were silently neutralised by provider rotation &#8212; is documented independently by multiple developers as a known behaviour, not an edge case. Please correct me if wrong. &quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;Activating caching on OpenRouter requires disabling fallbacks&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;. This is the &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;},{&quot;type&quot;:&quot;italic&quot;}],&quot;text&quot;:&quot;non-obvious configuration decision the experiment encountered&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;. &quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;OpenRouter's rule is explicit: if you include a top-level &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;cache_control&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot; parameter, it overrides fallbacks and forces routing to Anthropic direct only. &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;},{&quot;type&quot;:&quot;italic&quot;}],&quot;text&quot;:&quot;If fallbacks remain enabled, OpenRouter strips the cache parameter to allow routing to Bedrock and Vertex, destroying cost savings entirely&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;. The trade-off is binary &#8212; &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;caching or resilience, not both.&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot; The experiment paid cache write costs on Anthropic-routed turns and received zero cache reads on Bedrock and Vertex-routed turns, producing a net cost slightly above no caching at all.&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Three additional infrastructure differences accumulate in multi-turn agent loops. The cache eviction window (TTL) is shorter and less predictable through OpenRouter's proxy layer than natively. Cache counter metadata (cached_tokens) streams back inconsistently through OpenRouter, making real-time cost tracking unreliable &#8212; which explains why &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;},{&quot;type&quot;:&quot;italic&quot;}],&quot;text&quot;:&quot;the experiment's harness logged zero cache reads even when caching was nominally active&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;. &quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Most significantly, Anthropic dynamically advances the cache window forward as a conversation grows in a multi-turn loop; OpenRouter users have documented cases where the &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;italic&quot;}],&quot;text&quot;:&quot;initial system prompt caches successfully but subsequent turns fail to cache iteratively&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;, causing costs to scale regardless.&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Personally, so this does not constitute a recommemdation: For agents with short transactional prompts, OpenRouter's model flexibility is valuable and cost-competitive. It has been my fallback. For agents passing large, repeating system prompts or accumulating deep context histories across many turns &#8212; which describes every F-series task in the StockPilot suite &#8212; my thoughts are to use Anthropic direct to access prompt caching. &quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;The structural advantage is not model capability, which is identical. It is &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;},{&quot;type&quot;:&quot;italic&quot;}],&quot;text&quot;:&quot;infrastructure behaviour&quot;},{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;, which is not.&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;References below:&quot;}]}]},&quot;restacks&quot;:0,&quot;reaction_count&quot;:0,&quot;children_count&quot;:2,&quot;attachments&quot;:[{&quot;id&quot;:&quot;121524a3-8b0c-48bb-a05b-865dbda7f28d&quot;,&quot;type&quot;:&quot;post&quot;,&quot;publication&quot;:{&quot;apple_pay_disabled&quot;:false,&quot;apex_domain&quot;:null,&quot;author_id&quot;:124460392,&quot;byline_images_enabled&quot;:true,&quot;bylines_enabled&quot;:true,&quot;chartable_token&quot;:null,&quot;community_enabled&quot;:true,&quot;copyright&quot;:&quot;InterestingEngineering++&quot;,&quot;cover_photo_url&quot;:null,&quot;created_at&quot;:&quot;2023-01-22T09:20:05.053Z&quot;,&quot;custom_domain_optional&quot;:false,&quot;custom_domain&quot;:null,&quot;default_comment_sort&quot;:&quot;best_first&quot;,&quot;default_coupon&quot;:null,&quot;default_group_coupon&quot;:null,&quot;default_show_guest_bios&quot;:true,&quot;email_banner_url&quot;:null,&quot;email_from_name&quot;:null,&quot;email_from&quot;:null,&quot;embed_tracking_disabled&quot;:false,&quot;explicit&quot;:false,&quot;expose_paywall_content_to_search_engines&quot;:true,&quot;fb_pixel_id&quot;:null,&quot;fb_site_verification_token&quot;:null,&quot;flagged_as_spam&quot;:false,&quot;founding_subscription_benefits&quot;:null,&quot;free_subscription_benefits&quot;:null,&quot;ga_pixel_id&quot;:null,&quot;google_site_verification_token&quot;:null,&quot;google_tag_manager_token&quot;:null,&quot;hero_image&quot;:null,&quot;hero_text&quot;:&quot;My personal Substack&quot;,&quot;hide_intro_subtitle&quot;:null,&quot;hide_intro_title&quot;:null,&quot;hide_podcast_feed_link&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;id&quot;:1335585,&quot;image_thumbnails_always_enabled&quot;:false,&quot;invite_only&quot;:false,&quot;hide_podcast_from_pub_listings&quot;:false,&quot;language&quot;:&quot;en&quot;,&quot;logo_url_wide&quot;:null,&quot;logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;minimum_group_size&quot;:2,&quot;moderation_enabled&quot;:true,&quot;name&quot;:&quot;Interesting Engineering++&quot;,&quot;paid_subscription_benefits&quot;:null,&quot;parsely_pixel_id&quot;:null,&quot;chartbeat_domain&quot;:null,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;paywall_free_trial_enabled&quot;:false,&quot;podcast_art_url&quot;:null,&quot;paid_podcast_episode_art_url&quot;:null,&quot;podcast_byline&quot;:null,&quot;podcast_description&quot;:null,&quot;podcast_enabled&quot;:false,&quot;podcast_feed_url&quot;:null,&quot;podcast_title&quot;:null,&quot;post_preview_limit&quot;:null,&quot;primary_user_id&quot;:124460392,&quot;require_clickthrough&quot;:false,&quot;show_pub_podcast_tab&quot;:false,&quot;show_recs_on_homepage&quot;:true,&quot;subdomain&quot;:&quot;interestingengineering&quot;,&quot;subscriber_invites&quot;:0,&quot;support_email&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#45D800&quot;,&quot;theme_var_color_links&quot;:false,&quot;theme_var_cover_bg_color&quot;:null,&quot;trial_end_override&quot;:null,&quot;twitter_pixel_id&quot;:null,&quot;type&quot;:&quot;newsletter&quot;,&quot;post_reaction_faces_enabled&quot;:true,&quot;is_personal_mode&quot;:false,&quot;plans&quot;:null,&quot;stripe_user_id&quot;:null,&quot;stripe_country&quot;:null,&quot;stripe_publishable_key&quot;:null,&quot;stripe_platform_account&quot;:null,&quot;automatic_tax_enabled&quot;:null,&quot;author_name&quot;:&quot;Interesting Engineering ++&quot;,&quot;author_handle&quot;:&quot;interestingengineering&quot;,&quot;author_photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!M5M6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;author_bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;has_custom_tos&quot;:false,&quot;has_custom_privacy&quot;:false,&quot;theme&quot;:{&quot;background_pop_color&quot;:null,&quot;web_bg_color&quot;:&quot;#ffffff&quot;,&quot;cover_bg_color&quot;:null,&quot;publication_id&quot;:1335585,&quot;color_links&quot;:null,&quot;font_preset_heading&quot;:&quot;heavy_sans&quot;,&quot;font_preset_body&quot;:null,&quot;font_family_headings&quot;:null,&quot;font_family_body&quot;:null,&quot;font_family_ui&quot;:null,&quot;font_size_body_desktop&quot;:null,&quot;print_secondary&quot;:null,&quot;custom_css_web&quot;:null,&quot;custom_css_email&quot;:null,&quot;home_hero&quot;:&quot;magaziney&quot;,&quot;home_posts&quot;:&quot;list&quot;,&quot;home_show_top_posts&quot;:false,&quot;hide_images_from_list&quot;:false,&quot;home_hero_alignment&quot;:&quot;left&quot;,&quot;home_hero_show_podcast_links&quot;:true,&quot;default_post_header_variant&quot;:null,&quot;custom_header&quot;:null,&quot;custom_footer&quot;:null,&quot;social_media_links&quot;:null,&quot;font_options&quot;:null,&quot;section_template&quot;:null,&quot;custom_subscribe&quot;:null,&quot;design_template&quot;:null,&quot;design_template_options&quot;:null},&quot;threads_v2_settings&quot;:null,&quot;default_group_coupon_percent_off&quot;:null,&quot;pause_return_date&quot;:null,&quot;has_posts&quot;:true,&quot;has_recommendations&quot;:true,&quot;first_post_date&quot;:&quot;2023-01-22T12:30:00.823Z&quot;,&quot;has_podcast&quot;:false,&quot;has_free_podcast&quot;:false,&quot;has_subscriber_only_podcast&quot;:false,&quot;has_community_content&quot;:true,&quot;rankingDetail&quot;:&quot;Launched 3 years ago&quot;,&quot;rankingDetailFreeIncluded&quot;:&quot;Thousands of subscribers&quot;,&quot;rankingDetailOrderOfMagnitude&quot;:0,&quot;rankingDetailFreeIncludedOrderOfMagnitude&quot;:1000,&quot;rankingDetailFreeSubscriberCount&quot;:&quot;Over 1,000 subscribers&quot;,&quot;rankingDetailByLanguage&quot;:{&quot;ar&quot;:{&quot;rankingDetail&quot;:&quot;&#1578;&#1605; &#1575;&#1604;&#1573;&#1591;&#1604;&#1575;&#1602; 3 years ago&quot;},&quot;ca&quot;:{&quot;rankingDetail&quot;:&quot;S&#8217;ha llan&#231;at fa 3 anys&quot;},&quot;da&quot;:{&quot;rankingDetail&quot;:&quot;Lancering 3 &#229;r&quot;},&quot;de&quot;:{&quot;rankingDetail&quot;:&quot;Vor vor 3 Jahren gelauncht&quot;},&quot;es&quot;:{&quot;rankingDetail&quot;:&quot;Lanzado hace 3 a&#241;os&quot;},&quot;fr&quot;:{&quot;rankingDetail&quot;:&quot;Lanc&#233; il y a 3 ann&#233;es&quot;},&quot;ja&quot;:{&quot;rankingDetail&quot;:&quot;&#38283;&#22987;&#26085; 3&#24180;&#21069;&quot;},&quot;nb&quot;:{&quot;rankingDetail&quot;:&quot;Lansert 3 &#229;r&quot;},&quot;nl&quot;:{&quot;rankingDetail&quot;:&quot;Gelanceerd 3 jaar geleden&quot;},&quot;pl&quot;:{&quot;rankingDetail&quot;:&quot;Uruchomiono 3 lat temu&quot;},&quot;pt&quot;:{&quot;rankingDetail&quot;:&quot;Lan&#231;ado 3 anos&quot;},&quot;pt-br&quot;:{&quot;rankingDetail&quot;:&quot;Lan&#231;ado 3 anos&quot;},&quot;en-gb&quot;:{&quot;rankingDetail&quot;:&quot;Launched 3 years ago&quot;},&quot;it&quot;:{&quot;rankingDetail&quot;:&quot;Lanciato 3 anni&quot;},&quot;tr&quot;:{&quot;rankingDetail&quot;:&quot;3 y&#305;l ba&#351;lat&#305;ld&#305;&quot;},&quot;sv&quot;:{&quot;rankingDetail&quot;:&quot;Lanserad 3 &#229;r sedan&quot;},&quot;fi&quot;:{&quot;rankingDetail&quot;:&quot;Launched 3 vuotta&quot;},&quot;is&quot;:{&quot;rankingDetail&quot;:&quot;Launched 3 &#225;r&quot;},&quot;en&quot;:{&quot;rankingDetail&quot;:&quot;Launched 3 years ago&quot;}},&quot;freeSubscriberCount&quot;:&quot;1,000&quot;,&quot;freeSubscriberCountOrderOfMagnitude&quot;:&quot;1.3K+&quot;,&quot;author_bestseller_tier&quot;:0,&quot;author_badge&quot;:null,&quot;disable_monthly_subscriptions&quot;:false,&quot;disable_annual_subscriptions&quot;:false,&quot;hide_post_restacks&quot;:false,&quot;notes_feed_enabled&quot;:true,&quot;showIntroModule&quot;:false,&quot;isPortraitLayout&quot;:false,&quot;last_chat_post_at&quot;:null,&quot;primary_profile_name&quot;:&quot;Interesting Engineering ++&quot;,&quot;primary_profile_photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!M5M6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;no_follow&quot;:false,&quot;sponsorshipCampaigns&quot;:{},&quot;paywall_chat&quot;:&quot;free&quot;,&quot;sections&quot;:[],&quot;podcastTabInfo&quot;:null,&quot;multipub_migration&quot;:null,&quot;navigationBarItems&quot;:[],&quot;has_active_perks&quot;:false,&quot;contributors&quot;:[{&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;handle&quot;:&quot;interestingengineering&quot;,&quot;role&quot;:&quot;admin&quot;,&quot;owner&quot;:true,&quot;user_id&quot;:124460392,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;}],&quot;threads_v2_enabled&quot;:false,&quot;viralGiftsConfig&quot;:null,&quot;tier&quot;:2,&quot;no_index&quot;:false,&quot;can_set_google_site_verification&quot;:true,&quot;can_have_sitemap&quot;:true,&quot;founding_plan_name_english&quot;:&quot;Founding Member&quot;,&quot;bundles&quot;:[],&quot;base_url&quot;:&quot;https://interestingengineering.substack.com&quot;,&quot;hostname&quot;:&quot;interestingengineering.substack.com&quot;,&quot;is_on_substack&quot;:false,&quot;spotify_podcast_settings&quot;:null,&quot;unified_podcast_settings&quot;:null,&quot;podcastPalette&quot;:{&quot;DarkMuted&quot;:{&quot;population&quot;:72,&quot;rgb&quot;:[73,153,137]},&quot;DarkVibrant&quot;:{&quot;population&quot;:6013,&quot;rgb&quot;:[4,100,84]},&quot;LightMuted&quot;:{&quot;population&quot;:7,&quot;rgb&quot;:[142,198,186]},&quot;LightVibrant&quot;:{&quot;population&quot;:3,&quot;rgb&quot;:[166,214,206]},&quot;Muted&quot;:{&quot;population&quot;:6,&quot;rgb&quot;:[92,164,156]},&quot;Vibrant&quot;:{&quot;population&quot;:5,&quot;rgb&quot;:[76,164,146]}},&quot;pageThemes&quot;:{&quot;podcast&quot;:null},&quot;multiple_pins&quot;:true,&quot;supports_ip_content_unlock&quot;:false,&quot;appTheme&quot;:{&quot;colors&quot;:{&quot;accent&quot;:{&quot;name&quot;:&quot;#45d800&quot;,&quot;primary&quot;:{&quot;r&quot;:61,&quot;g&quot;:210,&quot;b&quot;:0,&quot;a&quot;:1},&quot;primary_hover&quot;:{&quot;r&quot;:25,&quot;g&quot;:190,&quot;b&quot;:0,&quot;a&quot;:1},&quot;primary_elevated&quot;:{&quot;r&quot;:25,&quot;g&quot;:190,&quot;b&quot;:0,&quot;a&quot;:1},&quot;secondary&quot;:{&quot;r&quot;:61,&quot;g&quot;:210,&quot;b&quot;:0,&quot;a&quot;:0.2},&quot;contrast&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:1},&quot;bg&quot;:{&quot;r&quot;:61,&quot;g&quot;:210,&quot;b&quot;:0,&quot;a&quot;:0.2},&quot;bg_hover&quot;:{&quot;r&quot;:61,&quot;g&quot;:210,&quot;b&quot;:0,&quot;a&quot;:0.3},&quot;dark&quot;:{&quot;primary&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:1},&quot;primary_hover&quot;:{&quot;r&quot;:94,&quot;g&quot;:236,&quot;b&quot;:42,&quot;a&quot;:1},&quot;primary_elevated&quot;:{&quot;r&quot;:94,&quot;g&quot;:236,&quot;b&quot;:42,&quot;a&quot;:1},&quot;secondary&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:0.2},&quot;contrast&quot;:{&quot;r&quot;:0,&quot;g&quot;:0,&quot;b&quot;:0,&quot;a&quot;:0.8},&quot;bg&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:0.2},&quot;bg_hover&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:0.3}}},&quot;fg&quot;:{&quot;primary&quot;:{&quot;r&quot;:0,&quot;g&quot;:0,&quot;b&quot;:0,&quot;a&quot;:0.8},&quot;secondary&quot;:{&quot;r&quot;:0,&quot;g&quot;:0,&quot;b&quot;:0,&quot;a&quot;:0.6},&quot;tertiary&quot;:{&quot;r&quot;:0,&quot;g&quot;:0,&quot;b&quot;:0,&quot;a&quot;:0.4},&quot;accent&quot;:{&quot;r&quot;:0,&quot;g&quot;:138,&quot;b&quot;:0,&quot;a&quot;:1},&quot;dark&quot;:{&quot;primary&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:0.9},&quot;secondary&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:0.6},&quot;tertiary&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:0.4},&quot;accent&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:1}}},&quot;bg&quot;:{&quot;name&quot;:&quot;#ffffff&quot;,&quot;hue&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:0},&quot;tint&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:0},&quot;primary&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:1},&quot;primary_hover&quot;:{&quot;r&quot;:250,&quot;g&quot;:250,&quot;b&quot;:250,&quot;a&quot;:1},&quot;primary_elevated&quot;:{&quot;r&quot;:250,&quot;g&quot;:250,&quot;b&quot;:250,&quot;a&quot;:1},&quot;secondary&quot;:{&quot;r&quot;:238,&quot;g&quot;:238,&quot;b&quot;:238,&quot;a&quot;:1},&quot;secondary_elevated&quot;:{&quot;r&quot;:206.90096477355226,&quot;g&quot;:206.90096477355175,&quot;b&quot;:206.9009647735519,&quot;a&quot;:1},&quot;tertiary&quot;:{&quot;r&quot;:219,&quot;g&quot;:219,&quot;b&quot;:219,&quot;a&quot;:1},&quot;quaternary&quot;:{&quot;r&quot;:182,&quot;g&quot;:182,&quot;b&quot;:182,&quot;a&quot;:1},&quot;dark&quot;:{&quot;primary&quot;:{&quot;r&quot;:22,&quot;g&quot;:23,&quot;b&quot;:24,&quot;a&quot;:1},&quot;primary_hover&quot;:{&quot;r&quot;:27,&quot;g&quot;:28,&quot;b&quot;:29,&quot;a&quot;:1},&quot;primary_elevated&quot;:{&quot;r&quot;:27,&quot;g&quot;:28,&quot;b&quot;:29,&quot;a&quot;:1},&quot;secondary&quot;:{&quot;r&quot;:35,&quot;g&quot;:37,&quot;b&quot;:37,&quot;a&quot;:1},&quot;secondary_elevated&quot;:{&quot;r&quot;:41.35899397549579,&quot;g&quot;:43.405356429195315,&quot;b&quot;:43.40489285041963,&quot;a&quot;:1},&quot;tertiary&quot;:{&quot;r&quot;:54,&quot;g&quot;:55,&quot;b&quot;:55,&quot;a&quot;:1},&quot;quaternary&quot;:{&quot;r&quot;:90,&quot;g&quot;:91,&quot;b&quot;:91,&quot;a&quot;:1}}}}},&quot;portalAppTheme&quot;:{&quot;colors&quot;:{&quot;accent&quot;:{&quot;name&quot;:&quot;#45D800&quot;,&quot;primary&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:1},&quot;primary_hover&quot;:{&quot;r&quot;:61,&quot;g&quot;:191,&quot;b&quot;:0,&quot;a&quot;:1},&quot;primary_elevated&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:1},&quot;secondary&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:1},&quot;contrast&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:1},&quot;bg&quot;:{&quot;r&quot;:255,&quot;g&quot;:103,&quot;b&quot;:25,&quot;a&quot;:0.2},&quot;bg_hover&quot;:{&quot;r&quot;:255,&quot;g&quot;:103,&quot;b&quot;:25,&quot;a&quot;:0.3},&quot;dark&quot;:{&quot;primary&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:1},&quot;primary_hover&quot;:{&quot;r&quot;:94,&quot;g&quot;:236,&quot;b&quot;:42,&quot;a&quot;:1},&quot;primary_elevated&quot;:{&quot;r&quot;:94,&quot;g&quot;:236,&quot;b&quot;:42,&quot;a&quot;:1},&quot;secondary&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:0.2},&quot;contrast&quot;:{&quot;r&quot;:0,&quot;g&quot;:0,&quot;b&quot;:0,&quot;a&quot;:0.8},&quot;bg&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:0.2},&quot;bg_hover&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:0.3}}},&quot;fg&quot;:{&quot;primary&quot;:{&quot;r&quot;:54,&quot;g&quot;:55,&quot;b&quot;:55,&quot;a&quot;:1},&quot;secondary&quot;:{&quot;r&quot;:134,&quot;g&quot;:135,&quot;b&quot;:135,&quot;a&quot;:1},&quot;tertiary&quot;:{&quot;r&quot;:146,&quot;g&quot;:146,&quot;b&quot;:146,&quot;a&quot;:1},&quot;accent&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:1},&quot;dark&quot;:{&quot;primary&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:0.9},&quot;secondary&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:0.6},&quot;tertiary&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:0.4},&quot;accent&quot;:{&quot;r&quot;:69,&quot;g&quot;:216,&quot;b&quot;:0,&quot;a&quot;:1}}},&quot;bg&quot;:{&quot;name&quot;:&quot;#ffffff&quot;,&quot;hue&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:1},&quot;tint&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:1},&quot;primary&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:1},&quot;primary_hover&quot;:{&quot;r&quot;:240,&quot;g&quot;:240,&quot;b&quot;:240,&quot;a&quot;:1},&quot;primary_elevated&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:1},&quot;secondary&quot;:{&quot;r&quot;:240,&quot;g&quot;:240,&quot;b&quot;:240,&quot;a&quot;:1},&quot;secondary_elevated&quot;:{&quot;r&quot;:240,&quot;g&quot;:240,&quot;b&quot;:240,&quot;a&quot;:1},&quot;tertiary&quot;:{&quot;r&quot;:221,&quot;g&quot;:221,&quot;b&quot;:221,&quot;a&quot;:1},&quot;quaternary&quot;:{&quot;r&quot;:183,&quot;g&quot;:183,&quot;b&quot;:183,&quot;a&quot;:1},&quot;dark&quot;:{&quot;primary&quot;:{&quot;r&quot;:22,&quot;g&quot;:23,&quot;b&quot;:24,&quot;a&quot;:1},&quot;primary_hover&quot;:{&quot;r&quot;:27,&quot;g&quot;:28,&quot;b&quot;:29,&quot;a&quot;:1},&quot;primary_elevated&quot;:{&quot;r&quot;:27,&quot;g&quot;:28,&quot;b&quot;:29,&quot;a&quot;:1},&quot;secondary&quot;:{&quot;r&quot;:35,&quot;g&quot;:37,&quot;b&quot;:37,&quot;a&quot;:1},&quot;secondary_elevated&quot;:{&quot;r&quot;:41.35899397549579,&quot;g&quot;:43.405356429195315,&quot;b&quot;:43.40489285041963,&quot;a&quot;:1},&quot;tertiary&quot;:{&quot;r&quot;:54,&quot;g&quot;:55,&quot;b&quot;:55,&quot;a&quot;:1},&quot;quaternary&quot;:{&quot;r&quot;:90,&quot;g&quot;:91,&quot;b&quot;:91,&quot;a&quot;:1}}},&quot;wordmark_bg&quot;:{&quot;r&quot;:255,&quot;g&quot;:255,&quot;b&quot;:255,&quot;a&quot;:1}},&quot;fonts&quot;:{&quot;heading&quot;:&quot;heavy_sans&quot;}},&quot;logoPalette&quot;:{&quot;Vibrant&quot;:{&quot;rgb&quot;:[28,232,250],&quot;population&quot;:237},&quot;DarkVibrant&quot;:{&quot;rgb&quot;:[5,91,126],&quot;population&quot;:146},&quot;LightVibrant&quot;:{&quot;rgb&quot;:[112,252,252],&quot;population&quot;:2},&quot;Muted&quot;:{&quot;rgb&quot;:[121,108,59],&quot;population&quot;:58},&quot;DarkMuted&quot;:{&quot;rgb&quot;:[90,89,57],&quot;population&quot;:332},&quot;LightMuted&quot;:{&quot;rgb&quot;:[3.143835616438329,149.85616438356163,149.85616438356166],&quot;population&quot;:0}}},&quot;post&quot;:{&quot;id&quot;:199597361,&quot;publication_id&quot;:1335585,&quot;title&quot;:&quot;The Structure Is The Intelligence&quot;,&quot;social_title&quot;:null,&quot;search_engine_title&quot;:null,&quot;search_engine_description&quot;:null,&quot;type&quot;:&quot;newsletter&quot;,&quot;slug&quot;:&quot;the-structure-is-the-intelligence&quot;,&quot;post_date&quot;:&quot;2026-05-28T19:12:24.790Z&quot;,&quot;audience&quot;:&quot;everyone&quot;,&quot;podcast_duration&quot;:null,&quot;video_upload_id&quot;:null,&quot;write_comment_permissions&quot;:&quot;everyone&quot;,&quot;should_send_free_preview&quot;:false,&quot;free_unlock_required&quot;:false,&quot;default_comment_sort&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/the-structure-is-the-intelligence&quot;,&quot;section_id&quot;:null,&quot;podcast_art_url&quot;:null,&quot;is_published&quot;:true,&quot;live_stream_id&quot;:null,&quot;restacks&quot;:2,&quot;top_exclusions&quot;:[],&quot;pins&quot;:[],&quot;is_section_pinned&quot;:false,&quot;has_shareable_clips&quot;:false,&quot;section_slug&quot;:null,&quot;section_name&quot;:null,&quot;reactions&quot;:{&quot;&#10084;&quot;:2},&quot;subtitle&quot;:&quot;What a 97% Token Reduction Reveals About How Multi-Agent Systems Can Actually Work&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Vnz9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png&quot;,&quot;cover_image_is_square&quot;:false,&quot;cover_image_is_explicit&quot;:false,&quot;podcast_url&quot;:null,&quot;videoUpload&quot;:null,&quot;podcastFields&quot;:{&quot;post_id&quot;:199597361,&quot;podcast_episode_number&quot;:null,&quot;podcast_season_number&quot;:null,&quot;podcast_episode_type&quot;:null,&quot;should_syndicate_to_other_feed&quot;:null,&quot;syndicate_to_section_id&quot;:null,&quot;hide_from_feed&quot;:false,&quot;free_podcast_url&quot;:null,&quot;free_podcast_duration&quot;:null,&quot;preview_contains_ad&quot;:false,&quot;was_imported_self_serve_sync&quot;:false},&quot;podcast_upload_id&quot;:null,&quot;podcast_preview_upload_id&quot;:null,&quot;podcastUpload&quot;:null,&quot;podcastPreviewUpload&quot;:null,&quot;voiceover_upload_id&quot;:null,&quot;voiceoverUpload&quot;:null,&quot;has_voiceover&quot;:false,&quot;description&quot;:&quot;What a 97% Token Reduction Reveals About How Multi-Agent Systems Can Actually Work&quot;,&quot;body_json&quot;:null,&quot;body_html&quot;:null,&quot;truncated_body_text&quot;:&quot;&quot;,&quot;wordcount&quot;:8054,&quot;post_preview_limit&quot;:null,&quot;language&quot;:&quot;en&quot;,&quot;postTags&quot;:[],&quot;teaser_post_eligible&quot;:true,&quot;postCountryBlocks&quot;:[],&quot;headlineTest&quot;:null,&quot;coverImagePalette&quot;:{&quot;Vibrant&quot;:{&quot;rgb&quot;:[178,115,70],&quot;population&quot;:659},&quot;DarkVibrant&quot;:{&quot;rgb&quot;:[95.17258064516129,61.487903225806456,37.42741935483871],&quot;population&quot;:0},&quot;LightVibrant&quot;:{&quot;rgb&quot;:[222,204,142],&quot;population&quot;:3},&quot;Muted&quot;:{&quot;rgb&quot;:[111,124,124],&quot;population&quot;:549},&quot;DarkMuted&quot;:{&quot;rgb&quot;:[93,78,68],&quot;population&quot;:422},&quot;LightMuted&quot;:{&quot;rgb&quot;:[195,192,176],&quot;population&quot;:281}},&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;handle&quot;:&quot;interestingengineering&quot;,&quot;previous_name&quot;:&quot;Suni&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;profile_set_up_at&quot;:&quot;2023-01-22T09:18:27.257Z&quot;,&quot;reader_installed_at&quot;:&quot;2024-04-03T22:44:12.301Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1295515,&quot;user_id&quot;:124460392,&quot;publication_id&quot;:1335585,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1335585,&quot;name&quot;:&quot;Interesting Engineering++&quot;,&quot;subdomain&quot;:&quot;interestingengineering&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;My personal Substack&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;author_id&quot;:124460392,&quot;primary_user_id&quot;:124460392,&quot;theme_var_background_pop&quot;:&quot;#45D800&quot;,&quot;created_at&quot;:&quot;2023-01-22T09:20:05.053Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;InterestingEngineering++&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null},&quot;primary_publication&quot;:{&quot;id&quot;:1335585,&quot;subdomain&quot;:&quot;interestingengineering&quot;,&quot;custom_domain_optional&quot;:false,&quot;name&quot;:&quot;Interesting Engineering++&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;author_id&quot;:124460392,&quot;user_id&quot;:124460392,&quot;handles_enabled&quot;:false,&quot;explicit&quot;:false,&quot;is_personal_mode&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;pledges_enabled&quot;:true,&quot;ios_app_payments_enabled&quot;:false,&quot;has_reply_rules&quot;:false}}],&quot;reaction&quot;:false,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:1,&quot;child_comment_count&quot;:1,&quot;audio_items&quot;:[{&quot;post_id&quot;:199597361,&quot;voice_id&quot;:&quot;en-US-AlloyTurboMultilingualNeural&quot;,&quot;audio_url&quot;:&quot;https://substack-video.s3.amazonaws.com/video_upload/post/199597361/tts/44c2e7af-d0eb-4d45-8e3d-69254f75ca8b/en-US-AlloyTurboMultilingualNeural.mp3&quot;,&quot;type&quot;:&quot;tts&quot;,&quot;status&quot;:&quot;completed&quot;}],&quot;country_blocks&quot;:[],&quot;is_geoblocked&quot;:false,&quot;hasCashtag&quot;:false,&quot;inboxItem&quot;:{&quot;content_key&quot;:&quot;post:199597361&quot;,&quot;updated_at&quot;:&quot;2026-06-02T15:15:49.836Z&quot;,&quot;content_date&quot;:&quot;2026-05-28T19:12:24.790Z&quot;,&quot;inbox_date&quot;:&quot;2026-05-28T19:12:24.790Z&quot;,&quot;seen_at&quot;:&quot;2026-06-02T15:15:49.836Z&quot;,&quot;saved_at&quot;:null,&quot;archived_at&quot;:null,&quot;skip_inbox&quot;:false,&quot;type&quot;:&quot;post&quot;,&quot;post_id&quot;:199597361,&quot;extra_views&quot;:[&quot;media&quot;],&quot;read_progress&quot;:0,&quot;max_read_progress&quot;:0.8230479143988874,&quot;audio_progress&quot;:0,&quot;max_audio_progress&quot;:0,&quot;video_progress&quot;:0,&quot;max_video_progress&quot;:0,&quot;postType&quot;:&quot;newsletter&quot;,&quot;title&quot;:&quot;The Structure Is The Intelligence&quot;,&quot;subtitle&quot;:&quot;What a 97% Token Reduction Reveals About How Multi-Agent Systems Can Actually Work&quot;,&quot;detail_view_subtitle&quot;:&quot;What a 97% Token Reduction Reveals About How Multi-Agent Systems Can Actually Work&quot;,&quot;cover_photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Vnz9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cf79cb4-0cd1-49e1-b750-1abc62fe754e_1205x659.png&quot;,&quot;audience&quot;:&quot;everyone&quot;,&quot;is_preview&quot;:false,&quot;audio_url&quot;:&quot;https://substack-video.s3.amazonaws.com/video_upload/post/199597361/tts/44c2e7af-d0eb-4d45-8e3d-69254f75ca8b/en-US-AlloyTurboMultilingualNeural.mp3&quot;,&quot;audio_type&quot;:&quot;tts&quot;,&quot;web_url&quot;:&quot;https://interestingengineering.substack.com/p/the-structure-is-the-intelligence&quot;,&quot;duration_metadata&quot;:{&quot;word_count&quot;:8054},&quot;authors&quot;:[&quot;Interesting Engineering ++&quot;],&quot;published_bylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;}],&quot;coverImagePalette&quot;:{&quot;Vibrant&quot;:{&quot;rgb&quot;:[178,115,70],&quot;population&quot;:659},&quot;DarkVibrant&quot;:{&quot;rgb&quot;:[95.17258064516129,61.487903225806456,37.42741935483871],&quot;population&quot;:0},&quot;LightVibrant&quot;:{&quot;rgb&quot;:[222,204,142],&quot;population&quot;:3},&quot;Muted&quot;:{&quot;rgb&quot;:[111,124,124],&quot;population&quot;:549},&quot;DarkMuted&quot;:{&quot;rgb&quot;:[93,78,68],&quot;population&quot;:422},&quot;LightMuted&quot;:{&quot;rgb&quot;:[195,192,176],&quot;population&quot;:281}},&quot;publication_id&quot;:1335585,&quot;publisher_image_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;publisher_name&quot;:&quot;Interesting Engineering++&quot;,&quot;is_personal_mode&quot;:false,&quot;like_count&quot;:2,&quot;comment_count&quot;:1,&quot;reaction&quot;:false,&quot;tracking_parameters&quot;:{&quot;is_saved&quot;:false,&quot;is_seen&quot;:true,&quot;post_id&quot;:199597361,&quot;post_type&quot;:&quot;newsletter&quot;,&quot;publication_id&quot;:1335585,&quot;tabId&quot;:&quot;home&quot;,&quot;tabType&quot;:&quot;base&quot;,&quot;max_read_progress&quot;:0.8230479143988874,&quot;max_audio_progress&quot;:0,&quot;max_video_progress&quot;:0,&quot;last_seen_at&quot;:&quot;2026-06-02T15:15:49.836Z&quot;,&quot;impression_id&quot;:&quot;9cf5ae50-812c-433d-ba0c-92631392029e&quot;}},&quot;is_saved&quot;:false,&quot;saved_at&quot;:null,&quot;is_viewed&quot;:true,&quot;read_progress&quot;:0,&quot;max_read_progress&quot;:0.8230479143988874,&quot;audio_progress&quot;:0,&quot;max_audio_progress&quot;:0,&quot;video_progress&quot;:0,&quot;max_video_progress&quot;:0,&quot;restacked&quot;:false},&quot;postSelection&quot;:null,&quot;postSelectionTheme&quot;:null,&quot;postImageSelection&quot;:null,&quot;clipInfo&quot;:null,&quot;mediaClip&quot;:null}],&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;user_id&quot;:124460392,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;user_bestseller_tier&quot;:null,&quot;userStatus&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}},&quot;source&quot;:null,&quot;forumChannel&quot;:null}" data-component-name="CommentPlaceholder"></div><p>Becuase of this, most if not all model calls (for this exercise) were routed through <strong>OpenRouter</strong> &#8212; a third-party service providing access to AI models via a unified API. When a direct Anthropic key became available mid-experiment (after compliance approval), a second configuration was tested. Then, with <strong>Claude Managed Agents</strong>, a third was tested. Each configuration produced meaningfully different results &#8212; not because the model changed, but because the execution infrastructure changed.</p><blockquote><p><strong>&#128214;  Plain language &#8212; API</strong></p><p><em>An API (Application Programming Interface) is a standardised way for software systems to talk to each other. When this experiment calls Claude, it sends a request to Anthropic&#8217;s API &#8212; essentially a structured message over the internet &#8212; and receives a structured response back. OpenRouter is a service that sits in the middle: you send your request to <strong>OpenRouter, it forwards it to one of several AI providers (Anthropic, Amazon Bedrock, Google Vertex),</strong> and returns the result - see below sample runs. This is convenient for reliability &#8212; if one provider is down, another takes over &#8212; but it adds complexity when features like caching behave differently across providers.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3bnr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3bnr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png 424w, https://substackcdn.com/image/fetch/$s_!3bnr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png 848w, https://substackcdn.com/image/fetch/$s_!3bnr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png 1272w, https://substackcdn.com/image/fetch/$s_!3bnr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3bnr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png" width="562" height="363" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:363,&quot;width&quot;:562,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59013,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3bnr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png 424w, https://substackcdn.com/image/fetch/$s_!3bnr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png 848w, https://substackcdn.com/image/fetch/$s_!3bnr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png 1272w, https://substackcdn.com/image/fetch/$s_!3bnr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e45bf2d-9518-4788-a090-a3f0e70a2fb7_562x363.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="comment" data-attrs="{&quot;url&quot;:&quot;https://open.substack.com/&quot;,&quot;commentId&quot;:267574382,&quot;comment&quot;:{&quot;id&quot;:267574382,&quot;date&quot;:&quot;2026-05-30T09:13:57.305Z&quot;,&quot;edited_at&quot;:null,&quot;body&quot;:&quot;OpenRouter Documentation\n\nPrompt caching guide: http://openrouter.ai/docs/guides/best-practices/prompt-caching\n\nProvider routing and fallback behaviour: http://openrouter.ai/docs/guides/routing/provider-selection\n\nUsage accounting and cached token tracking: http://openrouter.ai/docs/cookbook/administration/usage-accounting\n\nAnthropic Official Documentation\n\nPrompt caching &#8212; full technical reference: http://platform.claude.com/docs/en/build-with-claude/prompt-caching\n\nPricing &#8212; cache write, cache read, and TTL tiers: http://platform.claude.com/docs/en/about-claude/pricing\n\nPrompt caching general availability announcement (December 2024): http://anthropic.com/news/prompt-caching\n\nPractitioner Evidence and Independent Analysis\n\nPrompt caching with Claude API &#8212; practical guide including the growing conversation cache bug: http://dev.to/thegdsks/prompt-caching-with-the-claude-api-a-practical-guide-14ce\n\nOpenClaw GitHub issue &#8212; cache_control missing on OpenRouter for non-Anthropic providers: http://github.com/openclaw/openclaw/issues/9600\n\nMindStudio &#8212; how cache invalidation occurs in multi-turn agents: http://mindstudio.ai/blog/anthropic-prompt-caching-claude-subscription-limits\n\nAWS &#8212; prompt caching on Amazon Bedrock with Claude Code: http://aws.amazon.com/blogs/machine-learning/supercharge-your-development-with-claude-code-and-amazon-bedrock-prompt-caching&quot;,&quot;body_json&quot;:{&quot;type&quot;:&quot;doc&quot;,&quot;attrs&quot;:{&quot;schemaVersion&quot;:&quot;v1&quot;},&quot;content&quot;:[{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;OpenRouter Documentation&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Prompt caching guide: &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;link&quot;,&quot;attrs&quot;:{&quot;href&quot;:&quot;http://openrouter.ai/docs/guides/best-practices/prompt-caching&quot;,&quot;target&quot;:&quot;_blank&quot;,&quot;rel&quot;:&quot;nofollow ugc noopener&quot;,&quot;class&quot;:&quot;note-link&quot;}}],&quot;text&quot;:&quot;http://openrouter.ai/docs/guides/best-practices/prompt-caching&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Provider routing and fallback behaviour: &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;link&quot;,&quot;attrs&quot;:{&quot;href&quot;:&quot;http://openrouter.ai/docs/guides/routing/provider-selection&quot;,&quot;target&quot;:&quot;_blank&quot;,&quot;rel&quot;:&quot;nofollow ugc noopener&quot;,&quot;class&quot;:&quot;note-link&quot;}}],&quot;text&quot;:&quot;http://openrouter.ai/docs/guides/routing/provider-selection&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Usage accounting and cached token tracking: &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;link&quot;,&quot;attrs&quot;:{&quot;href&quot;:&quot;http://openrouter.ai/docs/cookbook/administration/usage-accounting&quot;,&quot;target&quot;:&quot;_blank&quot;,&quot;rel&quot;:&quot;nofollow ugc noopener&quot;,&quot;class&quot;:&quot;note-link&quot;}}],&quot;text&quot;:&quot;http://openrouter.ai/docs/cookbook/administration/usage-accounting&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;Anthropic Official Documentation&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Prompt caching &#8212; full technical reference: &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;link&quot;,&quot;attrs&quot;:{&quot;href&quot;:&quot;http://platform.claude.com/docs/en/build-with-claude/prompt-caching&quot;,&quot;target&quot;:&quot;_blank&quot;,&quot;rel&quot;:&quot;nofollow ugc noopener&quot;,&quot;class&quot;:&quot;note-link&quot;}}],&quot;text&quot;:&quot;http://platform.claude.com/docs/en/build-with-claude/prompt-caching&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Pricing &#8212; cache write, cache read, and TTL tiers: &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;link&quot;,&quot;attrs&quot;:{&quot;href&quot;:&quot;http://platform.claude.com/docs/en/about-claude/pricing&quot;,&quot;target&quot;:&quot;_blank&quot;,&quot;rel&quot;:&quot;nofollow ugc noopener&quot;,&quot;class&quot;:&quot;note-link&quot;}}],&quot;text&quot;:&quot;http://platform.claude.com/docs/en/about-claude/pricing&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Prompt caching general availability announcement (December 2024): &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;link&quot;,&quot;attrs&quot;:{&quot;href&quot;:&quot;http://anthropic.com/news/prompt-caching&quot;,&quot;target&quot;:&quot;_blank&quot;,&quot;rel&quot;:&quot;nofollow ugc noopener&quot;,&quot;class&quot;:&quot;note-link&quot;}}],&quot;text&quot;:&quot;http://anthropic.com/news/prompt-caching&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;bold&quot;}],&quot;text&quot;:&quot;Practitioner Evidence and Independent Analysis&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Prompt caching with Claude API &#8212; practical guide including the growing conversation cache bug: &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;link&quot;,&quot;attrs&quot;:{&quot;href&quot;:&quot;http://dev.to/thegdsks/prompt-caching-with-the-claude-api-a-practical-guide-14ce&quot;,&quot;target&quot;:&quot;_blank&quot;,&quot;rel&quot;:&quot;nofollow ugc noopener&quot;,&quot;class&quot;:&quot;note-link&quot;}}],&quot;text&quot;:&quot;http://dev.to/thegdsks/prompt-caching-with-the-claude-api-a-practical-guide-14ce&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;OpenClaw GitHub issue &#8212; cache_control missing on OpenRouter for non-Anthropic providers: &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;link&quot;,&quot;attrs&quot;:{&quot;href&quot;:&quot;http://github.com/openclaw/openclaw/issues/9600&quot;,&quot;target&quot;:&quot;_blank&quot;,&quot;rel&quot;:&quot;nofollow ugc noopener&quot;,&quot;class&quot;:&quot;note-link&quot;}}],&quot;text&quot;:&quot;http://github.com/openclaw/openclaw/issues/9600&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;MindStudio &#8212; how cache invalidation occurs in multi-turn agents: &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;link&quot;,&quot;attrs&quot;:{&quot;href&quot;:&quot;http://mindstudio.ai/blog/anthropic-prompt-caching-claude-subscription-limits&quot;,&quot;target&quot;:&quot;_blank&quot;,&quot;rel&quot;:&quot;nofollow ugc noopener&quot;,&quot;class&quot;:&quot;note-link&quot;}}],&quot;text&quot;:&quot;http://mindstudio.ai/blog/anthropic-prompt-caching-claude-subscription-limits&quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;AWS &#8212; prompt caching on Amazon Bedrock with Claude Code: &quot;},{&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;link&quot;,&quot;attrs&quot;:{&quot;href&quot;:&quot;http://aws.amazon.com/blogs/machine-learning/supercharge-your-development-with-claude-code-and-amazon-bedrock-prompt-caching&quot;,&quot;target&quot;:&quot;_blank&quot;,&quot;rel&quot;:&quot;nofollow ugc noopener&quot;,&quot;class&quot;:&quot;note-link&quot;}}],&quot;text&quot;:&quot;http://aws.amazon.com/blogs/machine-learning/supercharge-your-development-with-claude-code-and-amazon-bedrock-prompt-caching&quot;}]}]},&quot;restacks&quot;:0,&quot;reaction_count&quot;:0,&quot;children_count&quot;:0,&quot;attachments&quot;:[],&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;user_id&quot;:124460392,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;user_bestseller_tier&quot;:null,&quot;userStatus&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}},&quot;source&quot;:null,&quot;forumChannel&quot;:null}" data-component-name="CommentPlaceholder"></div><h2><strong>Prior Context: The ASCRS Experiments</strong></h2><p>This experiment does not exist in isolation. It was preceded by a series of architecture design exercises and controlled benchmark experiments run against a pharmaceutical supply chain scenario &#8212; the ASCRS series (<a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">Architecture of Awareness, April 2026</a>; <a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">ASCRS Harness Lab, May 2026</a>). That series explored similar questions about agent architecture from a different angle: rather than decomposing a failing system, it compared ten different architectural patterns on a fresh task. Its headline finding &#8212; that a well-written single prompt outperformed a five-agent specialist swarm &#8212; appears to contradict the StockPilot findings. </p><p>The apparent contradiction dissolves once the task types are distinguished. Section 7 addresses this directly.</p><h2><strong>Key Results &amp; Learnings at a Glance</strong></h2><p>If you have no time and for those who want the conclusions before the evidence, the twelve most important findings from this experiment are listed below. Each is explained in full in the relevant following sections. The results are summarized and the evolution:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Urdm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Urdm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png 424w, https://substackcdn.com/image/fetch/$s_!Urdm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png 848w, https://substackcdn.com/image/fetch/$s_!Urdm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png 1272w, https://substackcdn.com/image/fetch/$s_!Urdm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Urdm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png" width="850" height="322" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:322,&quot;width&quot;:850,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23860,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Urdm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png 424w, https://substackcdn.com/image/fetch/$s_!Urdm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png 848w, https://substackcdn.com/image/fetch/$s_!Urdm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png 1272w, https://substackcdn.com/image/fetch/$s_!Urdm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64c7f8c5-bec2-4708-bd5e-16139b346789_850x322.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YMjJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YMjJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 424w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 848w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 1272w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YMjJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png" width="897" height="673" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:673,&quot;width&quot;:897,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:71145,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!YMjJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 424w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 848w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 1272w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0HLX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0HLX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 424w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 848w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 1272w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0HLX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png" width="837" height="95" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/02411425-3873-49e5-af0b-1b1e646f0709_837x95.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:95,&quot;width&quot;:837,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:11204,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0HLX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 424w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 848w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 1272w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gz1S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gz1S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png 424w, https://substackcdn.com/image/fetch/$s_!gz1S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png 848w, https://substackcdn.com/image/fetch/$s_!gz1S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png 1272w, https://substackcdn.com/image/fetch/$s_!gz1S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gz1S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png" width="866" height="469" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:469,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:68672,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gz1S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png 424w, https://substackcdn.com/image/fetch/$s_!gz1S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png 848w, https://substackcdn.com/image/fetch/$s_!gz1S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png 1272w, https://substackcdn.com/image/fetch/$s_!gz1S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5526b6b3-5aa6-43a8-bd33-f7ecc5b24295_866x469.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wi9C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9182777-0baa-4c8d-91e2-207006b03800_842x331.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wi9C!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9182777-0baa-4c8d-91e2-207006b03800_842x331.png 424w, https://substackcdn.com/image/fetch/$s_!wi9C!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9182777-0baa-4c8d-91e2-207006b03800_842x331.png 848w, https://substackcdn.com/image/fetch/$s_!wi9C!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9182777-0baa-4c8d-91e2-207006b03800_842x331.png 1272w, https://substackcdn.com/image/fetch/$s_!wi9C!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9182777-0baa-4c8d-91e2-207006b03800_842x331.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wi9C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9182777-0baa-4c8d-91e2-207006b03800_842x331.png" width="842" height="331" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d9182777-0baa-4c8d-91e2-207006b03800_842x331.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:331,&quot;width&quot;:842,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45132,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9182777-0baa-4c8d-91e2-207006b03800_842x331.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wi9C!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9182777-0baa-4c8d-91e2-207006b03800_842x331.png 424w, https://substackcdn.com/image/fetch/$s_!wi9C!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9182777-0baa-4c8d-91e2-207006b03800_842x331.png 848w, https://substackcdn.com/image/fetch/$s_!wi9C!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9182777-0baa-4c8d-91e2-207006b03800_842x331.png 1272w, https://substackcdn.com/image/fetch/$s_!wi9C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9182777-0baa-4c8d-91e2-207006b03800_842x331.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5EuP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5EuP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png 424w, https://substackcdn.com/image/fetch/$s_!5EuP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png 848w, https://substackcdn.com/image/fetch/$s_!5EuP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png 1272w, https://substackcdn.com/image/fetch/$s_!5EuP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5EuP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png" width="825" height="562" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:562,&quot;width&quot;:825,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74468,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5EuP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png 424w, https://substackcdn.com/image/fetch/$s_!5EuP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png 848w, https://substackcdn.com/image/fetch/$s_!5EuP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png 1272w, https://substackcdn.com/image/fetch/$s_!5EuP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397d8703-aa10-4f03-a0d4-97528122d99b_825x562.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TUip!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TUip!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png 424w, https://substackcdn.com/image/fetch/$s_!TUip!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png 848w, https://substackcdn.com/image/fetch/$s_!TUip!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png 1272w, https://substackcdn.com/image/fetch/$s_!TUip!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TUip!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png" width="863" height="578" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:578,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73818,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TUip!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png 424w, https://substackcdn.com/image/fetch/$s_!TUip!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png 848w, https://substackcdn.com/image/fetch/$s_!TUip!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png 1272w, https://substackcdn.com/image/fetch/$s_!TUip!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26e64379-8e4c-42ce-87e1-5614008ed03c_863x578.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Lhdl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Lhdl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png 424w, https://substackcdn.com/image/fetch/$s_!Lhdl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png 848w, https://substackcdn.com/image/fetch/$s_!Lhdl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png 1272w, https://substackcdn.com/image/fetch/$s_!Lhdl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Lhdl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png" width="864" height="627" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:627,&quot;width&quot;:864,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:85502,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Lhdl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png 424w, https://substackcdn.com/image/fetch/$s_!Lhdl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png 848w, https://substackcdn.com/image/fetch/$s_!Lhdl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png 1272w, https://substackcdn.com/image/fetch/$s_!Lhdl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe22ac6e7-5886-473b-8d3e-45a4e3d03de5_864x627.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DHAs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a871601-2a58-4e3d-8268-705346295436_862x391.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DHAs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a871601-2a58-4e3d-8268-705346295436_862x391.png 424w, https://substackcdn.com/image/fetch/$s_!DHAs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a871601-2a58-4e3d-8268-705346295436_862x391.png 848w, https://substackcdn.com/image/fetch/$s_!DHAs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a871601-2a58-4e3d-8268-705346295436_862x391.png 1272w, https://substackcdn.com/image/fetch/$s_!DHAs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a871601-2a58-4e3d-8268-705346295436_862x391.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DHAs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a871601-2a58-4e3d-8268-705346295436_862x391.png" width="862" height="391" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0a871601-2a58-4e3d-8268-705346295436_862x391.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:391,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:68865,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a871601-2a58-4e3d-8268-705346295436_862x391.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DHAs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a871601-2a58-4e3d-8268-705346295436_862x391.png 424w, https://substackcdn.com/image/fetch/$s_!DHAs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a871601-2a58-4e3d-8268-705346295436_862x391.png 848w, https://substackcdn.com/image/fetch/$s_!DHAs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a871601-2a58-4e3d-8268-705346295436_862x391.png 1272w, https://substackcdn.com/image/fetch/$s_!DHAs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a871601-2a58-4e3d-8268-705346295436_862x391.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!URXT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!URXT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png 424w, https://substackcdn.com/image/fetch/$s_!URXT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png 848w, https://substackcdn.com/image/fetch/$s_!URXT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png 1272w, https://substackcdn.com/image/fetch/$s_!URXT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!URXT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png" width="867" height="455" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:455,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:57029,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!URXT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png 424w, https://substackcdn.com/image/fetch/$s_!URXT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png 848w, https://substackcdn.com/image/fetch/$s_!URXT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png 1272w, https://substackcdn.com/image/fetch/$s_!URXT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce8c2fb-823b-4f3c-839f-4904b96457c3_867x455.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>The Evaluation Suite: What Was Being Tested and How</strong></h2><p>Before examining what the experiment found, it is worth understanding how the experiment measured anything at all. The quality of a finding depends entirely on the quality of the measurement. This section explains the 12-task evaluation suite used in the StockPilot experiment, and &#8212; for context &#8212; how evaluating an AI agent differs structurally from evaluating a standard language model.</p><h3><strong>Evaluating Language Models vs Evaluating Agents: The Structural Difference</strong></h3><p>When researchers benchmark a language model &#8212; testing whether GPT-5.5 or Claude Opus 4.7 is better at summarising legal documents, or answering medical questions &#8212; the methodology is relatively straightforward. You feed the model a fixed input, collect its output, and compare that output against a known correct answer. One turn. Fixed context. Deterministic measurement.</p><blockquote><p><strong>&#128214;  Plain language &#8212; Benchmark</strong></p><p><em>A benchmark is a standardised test used to compare AI models. Like a driving test that every candidate takes under the same conditions, a benchmark gives every model the same questions so you can compare scores fairly. The limitation is that models can effectively &#8216;study for the test&#8217; &#8212; if a benchmark is used long enough, the companies building models start optimising specifically for that test rather than for genuine capability.</em></p></blockquote><p>Evaluating an AI agent is structurally different in almost every dimension.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LB5S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LB5S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png 424w, https://substackcdn.com/image/fetch/$s_!LB5S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png 848w, https://substackcdn.com/image/fetch/$s_!LB5S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png 1272w, https://substackcdn.com/image/fetch/$s_!LB5S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LB5S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png" width="863" height="464" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:464,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41726,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LB5S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png 424w, https://substackcdn.com/image/fetch/$s_!LB5S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png 848w, https://substackcdn.com/image/fetch/$s_!LB5S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png 1272w, https://substackcdn.com/image/fetch/$s_!LB5S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b13acbd-8784-4487-bc3a-0d1aa730bc08_863x464.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>&#128214;  Plain language &#8212; Hallucination</strong></p><p><em>When an AI model produces information that sounds plausible but is factually wrong &#8212; often stated with full confidence &#8212; this is called a hallucination. The model is not lying; it is pattern-matching to what a correct answer would look like without actually knowing the correct answer. In agent systems, hallucinations are particularly dangerous because the agent may act on false information rather than just stating it.</em></p></blockquote><p>The last point in the table is significant. Language model benchmarks have become increasingly unreliable as a signal of genuine capability because frontier models effectively train on benchmark distributions. <em><strong>An agent eval suite is harder to saturate &#8212; the agent must navigate a multi-step task correctly, which requires genuine reasoning rather than pattern matching to a fixed answer format</strong></em>.</p><p>The cost, however, can be substantially higher, when scalled. In this case, running a 12-task agent eval suite in this experiment cost $8.97 at baseline. Running a comparable LLM benchmark would cost a fraction of that. This is why agent eval suites tend to be smaller, more carefully curated, and more domain-specific than language model benchmarks &#8212; each test case is expensive, so each must be individually informative.</p><blockquote><p><strong>&#9888;  The Budget Constraint: Why Agents Need Hard Limits</strong></p><p>Standard LLM evaluation has no concept of a budget &#8212; the model produces an answer or it doesn&#8217;t.</p><p>Agent evaluation requires explicit budget constraints because agents can loop indefinitely.</p><p>StockPilot&#8217;s eval harness enforces two hard limits per task:</p><p>&#8226; Turn budget: a maximum number of agent-model exchanges (typically 5&#8211;10 depending on task)</p><p>&#8226; Wall time budget: a maximum clock time (270 seconds for the F1 daily sweep)</p><p>If either limit is exceeded, the task fails &#8212; regardless of the quality of work done so far.</p><p>F1 failed at baseline not because the agent gave a wrong answer, but because it took 516 seconds to produce an answer the eval harness never received.</p></blockquote><blockquote><p><strong>&#128214;  Plain language &#8212; Wall time</strong></p><p><em>Wall time is the actual elapsed clock time from when a task starts to when it finishes &#8212; named after a clock on the wall. This is distinct from the processing time inside the computer. Wall time includes network round trips, waiting for API responses, and any pauses between steps. An agent that takes 516 seconds of wall time has kept the user waiting 8.6 minutes for a daily task that should complete in under 5.</em></p></blockquote><h2><strong>The 12 Tasks: What Each One Tests</strong></h2><p>The evaluation suite is divided into two series. <strong>The R-series (nine tasks) tests retrieval and reporting operations</strong> &#8212; shorter, lower-token tasks that verify the agent can correctly read, interpret, and communicate inventory information. <strong>The F-series (three tasks) tests fulfillment and decision-making operations</strong> &#8212; complex multi-step tasks requiring the agent to reason across multiple data sources, apply policies, and produce actionable output.</p><blockquote><p><strong>&#128214;  Plain language &#8212; SKU</strong></p></blockquote><blockquote><p><em>SKU stands for Stock Keeping Unit &#8212; it is simply the unique code that identifies a specific product in an inventory system. When a supermarket tracks how many tins of a particular brand of soup it has, each size and variety gets its own SKU. StockPilot manages 250 SKUs, meaning 250 distinct products, each with its own stock level, reorder point, and supplier information.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZWL4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZWL4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png 424w, https://substackcdn.com/image/fetch/$s_!ZWL4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png 848w, https://substackcdn.com/image/fetch/$s_!ZWL4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png 1272w, https://substackcdn.com/image/fetch/$s_!ZWL4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZWL4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png" width="781" height="735" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:735,&quot;width&quot;:781,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:105719,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZWL4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png 424w, https://substackcdn.com/image/fetch/$s_!ZWL4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png 848w, https://substackcdn.com/image/fetch/$s_!ZWL4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png 1272w, https://substackcdn.com/image/fetch/$s_!ZWL4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52489ff5-bfc1-41a3-8f63-f81ddcafacf0_781x735.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wLsh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee17123-86a4-40f3-8e99-42b520989047_712x264.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wLsh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee17123-86a4-40f3-8e99-42b520989047_712x264.png 424w, https://substackcdn.com/image/fetch/$s_!wLsh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee17123-86a4-40f3-8e99-42b520989047_712x264.png 848w, https://substackcdn.com/image/fetch/$s_!wLsh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee17123-86a4-40f3-8e99-42b520989047_712x264.png 1272w, https://substackcdn.com/image/fetch/$s_!wLsh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee17123-86a4-40f3-8e99-42b520989047_712x264.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wLsh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee17123-86a4-40f3-8e99-42b520989047_712x264.png" width="712" height="264" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7ee17123-86a4-40f3-8e99-42b520989047_712x264.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:264,&quot;width&quot;:712,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:36635,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee17123-86a4-40f3-8e99-42b520989047_712x264.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wLsh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee17123-86a4-40f3-8e99-42b520989047_712x264.png 424w, https://substackcdn.com/image/fetch/$s_!wLsh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee17123-86a4-40f3-8e99-42b520989047_712x264.png 848w, https://substackcdn.com/image/fetch/$s_!wLsh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee17123-86a4-40f3-8e99-42b520989047_712x264.png 1272w, https://substackcdn.com/image/fetch/$s_!wLsh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee17123-86a4-40f3-8e99-42b520989047_712x264.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>&#128214;  Plain language &#8212; Purchase order / PO</strong></p><p><em>A purchase order is a formal request from a buyer to a supplier asking them to provide a specific quantity of goods at an agreed price. In inventory management, when stock runs low, the agent should generate a purchase order to trigger restocking. Think of it as the AI equivalent of a manager filling in a requisition form &#8212; but instead of pen and paper, it produces a structured data record.</em></p></blockquote><h3><strong>How the Evaluation Harness Scores the Agent</strong></h3><p>Each task has a pass/fail outcome determined by the eval harness &#8212; a separate piece of code that runs the agent, monitors its resource usage, and inspects its output. The harness uses deterministic rules where possible and structured output inspection where not:</p><ul><li><p>R1 passes if the returned stock value matches the CSV to within rounding tolerance</p></li><li><p>R9 passes if the top-10 list is correctly ordered and each entry includes a reorder recommendation flag</p></li><li><p>F1 passes if it completes within 270 seconds and produces recommendations for all critical SKUs</p></li><li><p>F3 passes if the output file exists at the expected path and contains the correct number of notification records in the correct format</p></li></ul><blockquote><p><strong>&#128214;  Plain language &#8212; Eval harness</strong></p><p><em>An eval harness is the testing framework that runs an AI agent through a set of tasks and measures the results. Like a laboratory test rig, it controls the conditions, records what happens, and reports the outcome. It is separate from the agent itself &#8212; importantly, the harness does not use the same AI model to score the results, because a model grading its own outputs would be like a student marking their own exam.</em></p></blockquote><h2><strong>The Plusses and Minuses of Agent Evaluation</strong></h2><h3><strong>What agent evals do well</strong></h3><ul><li><p><strong>Capture real operational failure modes: </strong>Budget overruns, path mismatches, silent tool errors, and environment-dependent behaviour are invisible to standard model benchmarks. Agent evals surface the failures that matter in production.</p></li><li><p><strong>Harder to game: </strong>Multi-step task completion requires genuine reasoning. It is difficult to improve an agent&#8217;s score by memorising answer patterns.</p></li><li><p><strong>Measure efficiency alongside correctness: </strong>Token count, wall time, tool calls, and cost are first-class metrics. A language model eval has no equivalent &#8212; there is no concept of a correct answer that cost too much to produce.</p></li><li><p><strong>Reveal architectural failures that unit tests miss: </strong>F3&#8217;s consistent local failure was invisible to code review. Only the eval caught it &#8212; and only because the eval ran in the intended execution environment.</p></li></ul><h3><strong>What agent evals do poorly</strong></h3><ul><li><p><strong>Cost: </strong>At baseline, 12 tasks cost $8.97. Running this suite in a CI/CD pipeline on every code change is economically impractical without significant optimisation.</p></li><li><p><strong>Variance: </strong>Agent behaviour is not fully deterministic. The same agent on the same task may take different numbers of turns or call tools in a different order. A single-run eval score is a point estimate, not a reliable measure.</p></li><li><p><strong>Environment sensitivity: </strong>Eval results can be invalidated by environment mismatches that have nothing to do with agent quality &#8212; as F3 demonstrated.</p></li><li><p><strong>Interpretability: </strong>When a task fails, the reason requires manual investigation of the execution trace. An agent failure might be a reasoning error, a tool failure, a budget overrun, a path mismatch, or a silent exception.</p></li><li><p><strong>Coverage cost: </strong>Designing a high-quality agent eval suite requires domain expertise. The StockPilot suite was provided by Anthropic as part of the workshop. Building an equivalent from scratch is a significant investment.</p></li></ul><p><strong>&#128214;  Plain language &#8212; CI/CD pipeline</strong></p><blockquote><p><em>CI/CD stands for Continuous Integration / Continuous Deployment &#8212; a software engineering practice where every code change is automatically tested before it goes live. Running agent evals as part of a CI/CD pipeline would mean that every time a developer changes the agent&#8217;s code, the full 12-task suite runs automatically to check nothing broke. At $8.97 per run, that adds up quickly if code changes happen dozens of times per day.</em></p></blockquote><h2><strong>Abstract/Overview</strong></h2><p><em><strong>This document is a technical teardown of a five-cycle agent decomposition experiment run against Anthropic&#8217;s StockPilot workshop codebase &#8212; a monolithic inventory management agent with a 402-line system prompt and 12 tools. The experiment was conducted across three API configurations: OpenRouter proxy, direct Anthropic API, and Claude Managed Agents (CMA). Every failure is documented alongside the fix.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YMjJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YMjJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 424w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 848w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 1272w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YMjJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png" width="897" height="673" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:673,&quot;width&quot;:897,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:71145,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YMjJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 424w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 848w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 1272w, https://substackcdn.com/image/fetch/$s_!YMjJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c198b42-3788-436a-aa3e-6f4c2ce71e24_897x673.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0HLX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0HLX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 424w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 848w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 1272w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0HLX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png" width="837" height="95" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/02411425-3873-49e5-af0b-1b1e646f0709_837x95.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:95,&quot;width&quot;:837,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:11204,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0HLX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 424w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 848w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 1272w, https://substackcdn.com/image/fetch/$s_!0HLX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02411425-3873-49e5-af0b-1b1e646f0709_837x95.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xg8F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xg8F!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png 424w, https://substackcdn.com/image/fetch/$s_!xg8F!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png 848w, https://substackcdn.com/image/fetch/$s_!xg8F!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png 1272w, https://substackcdn.com/image/fetch/$s_!xg8F!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xg8F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png" width="850" height="322" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:322,&quot;width&quot;:850,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23860,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xg8F!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png 424w, https://substackcdn.com/image/fetch/$s_!xg8F!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png 848w, https://substackcdn.com/image/fetch/$s_!xg8F!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png 1272w, https://substackcdn.com/image/fetch/$s_!xg8F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6cb503c-9914-4a8c-bee5-a844d59010bc_850x322.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UMtR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UMtR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png 424w, https://substackcdn.com/image/fetch/$s_!UMtR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png 848w, https://substackcdn.com/image/fetch/$s_!UMtR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png 1272w, https://substackcdn.com/image/fetch/$s_!UMtR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UMtR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png" width="1061" height="596" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:596,&quot;width&quot;:1061,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:53997,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UMtR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png 424w, https://substackcdn.com/image/fetch/$s_!UMtR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png 848w, https://substackcdn.com/image/fetch/$s_!UMtR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png 1272w, https://substackcdn.com/image/fetch/$s_!UMtR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F320055bd-9a0f-473a-b59e-dd331bcb887b_1061x596.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nwTs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nwTs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png 424w, https://substackcdn.com/image/fetch/$s_!nwTs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png 848w, https://substackcdn.com/image/fetch/$s_!nwTs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png 1272w, https://substackcdn.com/image/fetch/$s_!nwTs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nwTs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png" width="1075" height="651" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:651,&quot;width&quot;:1075,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:37563,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nwTs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png 424w, https://substackcdn.com/image/fetch/$s_!nwTs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png 848w, https://substackcdn.com/image/fetch/$s_!nwTs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png 1272w, https://substackcdn.com/image/fetch/$s_!nwTs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ac2ae4b-d6f3-4ead-9fdb-ac3e6a340a22_1075x651.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The headline finding is not so much the 97% token reduction &#8212; though that number is real and the mechanism is explained. The more significant finding is structural: <em><strong>Claude Managed Agents is not, as many misunderstand - only a cloud hosting service or an inference wrapper. It is a purpose-built agent runtime with server-side session state, native tool execution, sandboxed per-session containers, and an on-demand skill loading architecture that makes context cost proportional to task complexity rather than to the total knowledge the agent holds. And you have to have something break, to appreciate it&#8217;s optimizations.</strong></em></p><p><em><strong>This distinction matters because the common reading of &#8216;managed agents&#8217; assumes the &#8216;managed&#8217; part refers to only compute management. More imporantly it refers to execution state management. </strong></em>The difference in token economics between a local runner and CMA is not primarily model-level; it is infrastructure-level. That gap does not close by adding prompt caching to a stateless API wrapper. It closes by changing the execution model.</p><p>A note on prior context: these findings are in dialogue with the ASCRS Harness Lab, which ran ten harness architectures against a pharmaceutical supply chain scenario using OpenRouter throughout. That experiment found that a well-written prompt (H2, &#945;=1.000) outperformed a five-agent swarm (H9, &#945;=0.625) on a single-turn document task. Both findings are correct and complementary. The distinction &#8212; greenfield document task versus brownfield operational agent &#8212; is addressed in Section 7.</p><h2><strong>1.  The Starting Condition: What a Monolithic Agent Costs</strong></h2><p>The StockPilot baseline is not a contrived worst case. It represents how agentic systems actually grow in organisations: a prototype that worked, to which requirements were added one at a time without reconsidering the core architecture. Over six months the system prompt grew to 402 lines. The tool count reached 12. Three sub-agents were hardcoded to run on every invocation regardless of whether the task needed them.</p><blockquote><p><strong>&#128214;  Plain language &#8212; Token</strong></p><p><em>A token is the basic unit of text that a language model processes &#8212; roughly three-quarters of a word, or about four characters. &#8216;Hello world&#8217; is approximately 2&#8211;3 tokens. Processing tokens costs money: the more tokens in context on each turn, the higher the per-turn cost. 274,384 tokens is roughly equivalent to a 220,000-word document &#8212; about two average novels &#8212; being re-read by the model on a single task run.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZK78!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZK78!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png 424w, https://substackcdn.com/image/fetch/$s_!ZK78!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png 848w, https://substackcdn.com/image/fetch/$s_!ZK78!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png 1272w, https://substackcdn.com/image/fetch/$s_!ZK78!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZK78!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png" width="865" height="285" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:285,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25976,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZK78!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png 424w, https://substackcdn.com/image/fetch/$s_!ZK78!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png 848w, https://substackcdn.com/image/fetch/$s_!ZK78!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png 1272w, https://substackcdn.com/image/fetch/$s_!ZK78!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ffdf8e7-ee01-4229-a8db-70bcaf1b616d_865x285.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>The 402-line prompt as a liability</strong></h3><p>The system prompt contained genuinely useful domain knowledge &#8212; reorder logic, supplier scoring, notification templates, forecast contracts &#8212; embedded as prose the model re-parsed on every turn of every task. A stock level lookup was preceded by 15,000 tokens of policy context it would never use.</p><h3><strong>Tools that return data instead of answers</strong></h3><p>The most expensive tool was list_low_stock, which returned a 400-row CSV table directly into the model&#8217;s context window. The agent then reasoned across all 400 rows to find the 8&#8211;12 that required action. By turn 15 of an F1 run, the context contained several complete tables the model had already processed &#8212; billed again at full price on every subsequent turn.</p><blockquote><p><strong>&#128214;  Plain language &#8212; Context window</strong></p><p><em>The context window is the total amount of text a model can &#8216;hold in mind&#8217; at once &#8212; everything it can see when formulating its next response. This includes the system prompt, the entire conversation history, and all tool results so far. Think of it as working memory. A large context window means the model can handle longer conversations &#8212; but every token in context costs money on every turn, even if most of it is irrelevant to the current step.</em></p></blockquote><h3><strong>Hardcoded sub-agents as unconditional overhead</strong></h3><p>Three sub-agents ran on every task regardless of whether the task needed their specialisation. A sub-agent is a separate model call &#8212; a second AI process invoked inside the first. Each added a full API round trip, its own context initialisation, and its output as prose into the parent agent&#8217;s context. For tasks that needed none of this, the overhead was pure cost with no quality benefit.</p><blockquote><p><strong>&#128214;  Plain language &#8212; Sub-agent</strong></p><p><em>A sub-agent is a separate AI model call made from within the main agent&#8217;s execution. The main agent (&#8217;orchestrator&#8217;) delegates a specific task to a sub-agent, receives the result, and continues. Think of it as a manager who, instead of doing everything themselves, sends individual questions to specialists and collects their answers. The benefit is isolation &#8212; the specialist only sees what it needs to. The cost is an additional API call and the risk that the specialist&#8217;s answer needs to be merged coherently with the main agent&#8217;s work.</em></p></blockquote><h2><strong>2.  The Decomposition Progression: Cycle by Cycle</strong></h2><p>The experiment ran four improvement cycles. Each cycle applied one architectural change, measured the result, and logged both the gain and any new failure mode introduced.</p><blockquote><p><strong>&#128269;  The Three Diagnostic Questions</strong></p><p>1. Does this tool return more than ~2,000 tokens of raw data?</p><p>   Replace with a code execution tool that filters at source.</p><p>2. Is this policy instruction only relevant to some tasks?</p><p>   Extract from the system prompt into a skill file loaded on demand.</p><p>3. Does this sub-agent run regardless of whether the current task needs it?</p><p>   Convert to an explicit callable with a typed output contract, invoked conditionally.</p></blockquote><h2><strong>Cycle 1 &#8212; Policy as Skills</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QYtZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QYtZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png 424w, https://substackcdn.com/image/fetch/$s_!QYtZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png 848w, https://substackcdn.com/image/fetch/$s_!QYtZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png 1272w, https://substackcdn.com/image/fetch/$s_!QYtZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QYtZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png" width="773" height="246" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:246,&quot;width&quot;:773,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15212,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QYtZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png 424w, https://substackcdn.com/image/fetch/$s_!QYtZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png 848w, https://substackcdn.com/image/fetch/$s_!QYtZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png 1272w, https://substackcdn.com/image/fetch/$s_!QYtZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43fc838c-ef72-49f0-a35d-a38b1407a5a7_773x246.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The 402-line prompt was replaced with a 15-line orientation prompt. Domain policies were extracted into five skill files. Score jumped to 100% immediately &#8212; not from token optimisation but from eliminating policy confusion. F1 tokens increased because all skill files were injected upfront into the system prompt, carrying 15,000 tokens of context per turn per task whether needed or not. Quality improved; cost did not yet.</p><blockquote><p><strong>&#128214;  Plain language &#8212; Skill file</strong></p><p><em>A skill file is a document containing specialised instructions or policy rules that an agent loads only when it needs them. Instead of the system prompt listing every possible rule the agent might ever need, the system prompt simply lists which skill files exist. When the agent encounters a task that requires the reorder policy, it reads that file. Simple tasks never load any skill files at all. This is the difference between giving an employee a 400-page manual to carry everywhere versus telling them where the manuals are stored and letting them look things up when needed.</em></p></blockquote><h2><strong>Cycle 2 &#8212; Compute Over Context</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FxWN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FxWN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png 424w, https://substackcdn.com/image/fetch/$s_!FxWN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png 848w, https://substackcdn.com/image/fetch/$s_!FxWN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png 1272w, https://substackcdn.com/image/fetch/$s_!FxWN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FxWN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png" width="773" height="196" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:196,&quot;width&quot;:773,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:11257,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FxWN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png 424w, https://substackcdn.com/image/fetch/$s_!FxWN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png 848w, https://substackcdn.com/image/fetch/$s_!FxWN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png 1272w, https://substackcdn.com/image/fetch/$s_!FxWN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea88c8c-e970-42ed-982d-ab3ee142f5b3_773x196.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>list_low_stock replaced with bash_execute &#8212; a tool that runs a Python filter script and returns only the rows that matter. Tool calls for F1 dropped from 77 to 20. The principle: a tool whose return value exceeds ~2,000 tokens is a context liability. Replace it with code that filters at source.</p><blockquote><p><strong>&#128214;  Plain language &#8212; Bash / Python script</strong></p></blockquote><blockquote><p><em>Bash and Python are programming languages. Bash is used to run commands in a terminal; Python is a general-purpose language commonly used for data processing. In this context, instead of the agent asking &#8216;give me all 400 rows of inventory data&#8217;, it now writes a small Python program that asks &#8216;give me only the 8 rows where stock is below the reorder point&#8217; &#8212; and only those rows come back. The filtering happens in code, not in the model&#8217;s head.</em></p></blockquote><h2><strong>Cycle 3 &#8212; Explicit Sub-agent Delegation</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w7P8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd621ebc7-c39b-4df5-9000-e802f530693f_775x212.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w7P8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd621ebc7-c39b-4df5-9000-e802f530693f_775x212.png 424w, https://substackcdn.com/image/fetch/$s_!w7P8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd621ebc7-c39b-4df5-9000-e802f530693f_775x212.png 848w, https://substackcdn.com/image/fetch/$s_!w7P8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd621ebc7-c39b-4df5-9000-e802f530693f_775x212.png 1272w, https://substackcdn.com/image/fetch/$s_!w7P8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd621ebc7-c39b-4df5-9000-e802f530693f_775x212.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w7P8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd621ebc7-c39b-4df5-9000-e802f530693f_775x212.png" width="775" height="212" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d621ebc7-c39b-4df5-9000-e802f530693f_775x212.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:212,&quot;width&quot;:775,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12646,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd621ebc7-c39b-4df5-9000-e802f530693f_775x212.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w7P8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd621ebc7-c39b-4df5-9000-e802f530693f_775x212.png 424w, https://substackcdn.com/image/fetch/$s_!w7P8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd621ebc7-c39b-4df5-9000-e802f530693f_775x212.png 848w, https://substackcdn.com/image/fetch/$s_!w7P8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd621ebc7-c39b-4df5-9000-e802f530693f_775x212.png 1272w, https://substackcdn.com/image/fetch/$s_!w7P8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd621ebc7-c39b-4df5-9000-e802f530693f_775x212.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Two hidden sub-agents replaced with one explicit sub-agent returning structured JSON. The parent agent&#8217;s context received the answer only &#8212; never the reasoning chain. F1 tokens dropped 54% in a single cycle. F3 began failing because the agent wrote output to the CMA sandbox path while the local eval grader read from a different location. This is the correct signal that the architecture had reached the boundary of what a stateless subprocess can reproduce.</p><blockquote><p><strong>&#128214;  Plain language &#8212; JSON</strong></p><p><em>JSON (JavaScript Object Notation) is a standardised text format for structured data. Instead of returning a paragraph of prose &#8212; &#8216;I recommend ordering 200 units of SKU-0183 from Supplier A&#8217; &#8212; the sub-agent returns {&#8221;SKU-0183&#8221;: 200, &#8220;supplier&#8221;: &#8220;A&#8221;}. This is machine-readable, unambiguous, and cannot accidentally include stray reasoning text that would inflate the parent agent&#8217;s context. The parent agent reads the number, not the explanation.</em></p></blockquote><h2><strong>3.  The OpenRouter Interlude: What a Proxy API Costs You</strong></h2><p>Cycles 1 through 3 ran on OpenRouter &#8212; a multi-model routing service providing access to Claude via a unified API. This is a common starting configuration. The experiment made the practical costs concrete.</p><h3><strong>SDK format mismatch</strong></h3><p>The workshop code was written using Anthropic&#8217;s Python library. OpenRouter uses the OpenAI message format. A compatibility adapter had to be written to translate every request and response between the two formats &#8212; adding maintenance surface and creating failure points invisible until the environment changes.</p><blockquote><p><strong>&#128214;  Plain language &#8212; SDK</strong></p><p><em>An SDK (Software Development Kit) is a pre-built library of code that makes it easier to interact with a specific service. Anthropic&#8217;s SDK handles the details of formatting requests, sending them, and parsing responses so developers don&#8217;t have to write that plumbing from scratch. The problem here is that Anthropic&#8217;s SDK speaks one format and OpenRouter expects a different format &#8212; requiring a translation layer between them.</em></p></blockquote><h3><strong>Model naming</strong></h3><p>Anthropic model names use hyphens: claude-sonnet-4-6. OpenRouter prefixes the provider: anthropic/claude-sonnet-4-6. Using a dot &#8212; claude-sonnet-4.6 &#8212; returns a 404 error as raw HTML rather than a structured error message. The kind of failure that surfaces as a confusing library-level exception.</p><h3><strong>Prompt caching: scaffolded but not operational</strong></h3><p><strong>What we had right:</strong> <code>cache_control: {"type": "ephemeral"}</code> in the code. That is the correct implementation. On its own, that should have worked.</p><p><strong>What undermined it:</strong> OpenRouter&#8217;s default behaviour was routing requests across Anthropic direct, Amazon Bedrock, and Google Vertex AI &#8212; rotating between providers for resilience. Prompt cache entries are provider-specific and session-specific. A cache written on an Anthropic-routed turn is invisible on the next turn if OpenRouter routes that request to Bedrock or Vertex. So the sequence looked like this:</p><ul><li><p>Turn 1 &#8594; Anthropic direct &#8594; cache written at 1.25&#215; cost</p></li><li><p>Turn 2 &#8594; Bedrock &#8594; cache miss, full price paid, new cache written at 1.25&#215;</p></li><li><p>Turn 3 &#8594; Vertex &#8594; cache miss again, full price paid</p></li><li><p>Net result: paying cache write cost repeatedly with near-zero cache reads</p></li></ul><p><strong>The missing piece &#8212; </strong><code>allow_fallbacks: false</code><strong>:</strong> Pinning all requests to a single provider via <code>allow_fallbacks: false</code> in the <code>extra_body</code> would have fixed this. That is the configuration decision that was non-obvious &#8212; it requires knowing that OpenRouter&#8217;s multi-provider routing breaks cache coherency, which is documented but not prominently. This may give you pause when considering vendor lock-in.</p><p><strong>The honest summary:</strong> The <code>cache_control</code> implementation was correct. The provider rotation silently neutralised it. The fix required a separate configuration flag whose necessity was not apparent from the caching documentation alone. A developer following Anthropic&#8217;s caching guide correctly, then routing through OpenRouter without reading the provider-pinning documentation, would hit exactly this outcome.</p><p>Caching was implemented but silently broken by provider rotation, and the fix required knowing to set a flag that the caching documentation does not mention.</p><p>The correct configuration &#8212; cache_control on the system prompt block plus provider pinning &#8212; was available and would have worked. In cycles 1 and 2, where five skill files were injected into the system prompt on every turn (approximately 15,000 stable tokens), the reduction from turn 2 onward on that portion alone would have been approximately 92%. The total suite reduction for those cycles would have been in the range of 40&#8211;60%.</p><p>The reason this still falls well short of CMA&#8217;s 93% suite-level reduction is structural rather than implementation. Explicit prompt caching requires transmitting the full context on every turn and paying 10% of the standard rate to re-read the cached portion. CMA&#8217;s stateful sessions do not re-transmit stable context at all &#8212; it is already on the server. For a multi-turn agent running 10&#8211;15 turns per task, that difference compounds. The R1 task landed at 409 tokens on CMA precisely because the agent received only the new user message &#8212; not the system prompt plus skill files on every turn.</p><blockquote><p><strong>&#128214;  Plain language &#8212; Prompt caching</strong></p><p><em>Prompt caching is a cost-saving mechanism where repeated text that appears in multiple API calls is stored temporarily so the model does not have to process it from scratch each time. If a 15,000-token system prompt is sent on every turn of a 15-turn task, you normally pay for 15,000 tokens &#215; 15 turns = 225,000 tokens. With caching, turns 2&#8211;15 read that text from cache at 10% of the standard rate &#8212; effectively paying for it once at full price and then 14 more times at a 90% discount. The catch is that the cache is only valid for the same provider, and only for 5 minutes to 1 hour depending on the configuration chosen.</em></p></blockquote><h2><strong>4.  The Hidden Hardcoding Problem: Cycle 3b</strong></h2><p>When a direct Anthropic key replaced OpenRouter, the eval score dropped from 92% to 75%. The cause: a hardcoded OpenRouter client inside the forecaster tool function. With the OpenRouter key absent, the function sent the Anthropic key to OpenRouter&#8217;s server &#8212; which rejected it with a 401. The agent&#8217;s error handler caught this silently, returned a structured error, and the main agent fell back to expensive manual computation. F1 went from 6 turns and 103,354 tokens to 9 turns and 232,000 tokens.</p><blockquote><p><strong>&#128214;  Plain language &#8212; 401 error / Authentication error</strong></p><p><em>HTTP 401 means &#8216;unauthorised&#8217; &#8212; the server received a request but rejected it because the credentials (the API key) were wrong or missing. In this case the Anthropic key was valid, but it was being sent to OpenRouter&#8217;s server, which does not accept Anthropic keys. The server&#8217;s response was essentially &#8216;I don&#8217;t recognise this key&#8217; &#8212; but because the error was caught and swallowed silently by the code, the main agent never knew its sub-agent had failed.</em></p></blockquote><p>Fix: replace the hardcoded client with the shared factory function. One import change. Tokens returned to 124,000. Principle: all API client construction should go through a shared factory that reads from the environment. Hardcoded clients create failure modes that only surface when the environment changes &#8212; the worst possible moment for silent degradation.</p><h2><strong>5.  What Claude Managed Agents (CMA) Actually Is</strong></h2><p>The reasonable assumption &#8212; the one this experiment started with &#8212; is that &#8216;managed&#8217; refers to compute management: Anthropic hosts the model, handles scaling, manages infrastructure. CMA is a premium API tier.</p><p>That reading is wrong. CMA is not a hosting service. It is an agent execution runtime.</p><blockquote><p><strong>&#128161;  Plain Language: What &#8216;Execution Runtime&#8217; Means</strong></p><p>A standard API call is stateless. You send everything the model needs to know &#8212; the full history, the system prompt, all prior tool results &#8212; on every turn. The server processes it and forgets.</p><p>Next turn: send everything again. The context window grows; you pay for all of it each time.</p><p>An execution runtime is different. It maintains the session between turns. The history stays on the server. You send only what is new. The prior context is already there. For agents running many turns across complex tasks, this distinction has large cost implications.</p><p>Every turn of a stateless API call pays for all prior context again.</p><p>A runtime pays only for the new.</p></blockquote><h4><strong>1. Stateful sessions &#8212; no history resending</strong></h4><p>In a standard API loop, the context window grows with every turn and the billing system charges for everything sent each time. By turn 15 of an F1 run, the system re-reads 30,000 tokens the model has already processed 14 times. CMA sessions are stateful server-side &#8212; only the new delta is transmitted. Prompt caching reduces the cost of resending context. Stateful sessions eliminate the resending. These are different mechanisms.</p><blockquote><p><strong>&#128214;  Plain language &#8212; Stateful vs stateless</strong></p><p><em>Stateless means the server keeps no memory between requests &#8212; each call starts fresh. Stateful means the server remembers the session. A good everyday analogy: calling a customer service number where you have to re-explain your problem from scratch every time you&#8217;re transferred is stateless. A dedicated account manager who already knows your history is stateful. For AI agents, stateless is cheap to implement but expensive at scale; stateful requires infrastructure but pays back in efficiency.</em></p></blockquote><h4><strong>2. On-demand skill loading &#8212; context proportional to task</strong></h4><p>In the local runner, skill files were injected into the system prompt at session start &#8212; 15,000 tokens of context per turn regardless of task. In CMA, skills are server-side objects loaded by the agent only when needed:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_Sj4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797af6d0-23da-4bb2-a167-869233677f7d_768x183.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_Sj4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797af6d0-23da-4bb2-a167-869233677f7d_768x183.png 424w, https://substackcdn.com/image/fetch/$s_!_Sj4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797af6d0-23da-4bb2-a167-869233677f7d_768x183.png 848w, https://substackcdn.com/image/fetch/$s_!_Sj4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797af6d0-23da-4bb2-a167-869233677f7d_768x183.png 1272w, https://substackcdn.com/image/fetch/$s_!_Sj4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797af6d0-23da-4bb2-a167-869233677f7d_768x183.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_Sj4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797af6d0-23da-4bb2-a167-869233677f7d_768x183.png" width="768" height="183" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/797af6d0-23da-4bb2-a167-869233677f7d_768x183.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:183,&quot;width&quot;:768,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:13219,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797af6d0-23da-4bb2-a167-869233677f7d_768x183.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_Sj4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797af6d0-23da-4bb2-a167-869233677f7d_768x183.png 424w, https://substackcdn.com/image/fetch/$s_!_Sj4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797af6d0-23da-4bb2-a167-869233677f7d_768x183.png 848w, https://substackcdn.com/image/fetch/$s_!_Sj4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797af6d0-23da-4bb2-a167-869233677f7d_768x183.png 1272w, https://substackcdn.com/image/fetch/$s_!_Sj4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F797af6d0-23da-4bb2-a167-869233677f7d_768x183.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4><strong>3. Native sandboxed tool execution</strong></h4><p>The local bash_execute tool was a Python subprocess wrapper &#8212; functionally correct for evaluation, not appropriate for production. CMA provides a native Bash tool running inside a per-session ephemeral container with no network access and read-only data mounts. No subprocess security risk; no shared state between sessions.</p><blockquote><p><strong>&#128214;  Plain language &#8212; Sandbox / Ephemeral container</strong></p><p><em>A sandbox is an isolated execution environment &#8212; like a locked room where the AI can run code but cannot access anything outside it. &#8216;Ephemeral&#8217; means it exists only for the duration of the task and is discarded afterwards. This matters for security: if an agent writes code that has unintended consequences, those consequences are contained within the sandbox and disappear when the session ends. It also matters for consistency: each task starts with a clean slate, not leftover state from previous runs.</em></p></blockquote><h4><strong>4. Authoritative billing metrics</strong></h4><p>The local runner summed raw tokens sent per turn &#8212; transmission cost, not billing cost. CMA retrieves the authoritative billed total post-session including all cache effects. Cycle 3 local: 103,354 reported tokens. Cycle 4 CMA: 9,388 billed tokens. One is what was sent; the other is what was charged.</p><h2><strong>6.  Cycle 4 &#8212; CMA: The Full Numbers</strong></h2><p>With CMA available and a direct Anthropic key active, one configuration correction was required: the system prompt specified a data path for the local runner. In the CMA sandbox, files are at a different absolute path. Without correcting this, the agent spent three tool calls searching for data before doing any actual work &#8212; bringing F1 wall time to 368 seconds (budget exceeded). After the path correction: 203 seconds.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ECq0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61b4751-b137-4629-93db-ebfccb3c0523_846x321.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ECq0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61b4751-b137-4629-93db-ebfccb3c0523_846x321.png 424w, https://substackcdn.com/image/fetch/$s_!ECq0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61b4751-b137-4629-93db-ebfccb3c0523_846x321.png 848w, https://substackcdn.com/image/fetch/$s_!ECq0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61b4751-b137-4629-93db-ebfccb3c0523_846x321.png 1272w, https://substackcdn.com/image/fetch/$s_!ECq0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61b4751-b137-4629-93db-ebfccb3c0523_846x321.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ECq0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61b4751-b137-4629-93db-ebfccb3c0523_846x321.png" width="846" height="321" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d61b4751-b137-4629-93db-ebfccb3c0523_846x321.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:321,&quot;width&quot;:846,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23509,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61b4751-b137-4629-93db-ebfccb3c0523_846x321.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ECq0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61b4751-b137-4629-93db-ebfccb3c0523_846x321.png 424w, https://substackcdn.com/image/fetch/$s_!ECq0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61b4751-b137-4629-93db-ebfccb3c0523_846x321.png 848w, https://substackcdn.com/image/fetch/$s_!ECq0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61b4751-b137-4629-93db-ebfccb3c0523_846x321.png 1272w, https://substackcdn.com/image/fetch/$s_!ECq0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61b4751-b137-4629-93db-ebfccb3c0523_846x321.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>&#128310;  Hypothetical: What prompt caching would have recovered</strong></p><p>The table above shows three actual configurations. A fourth scenario &#8212; cycles 1 and 2 with prompt caching correctly applied &#8212; would have sat between the OpenRouter and CMA columns:</p><p><strong>  Cycle 1-2 suite cost with caching:  </strong>~$1.20&#8211;$1.75  (vs $2.64&#8211;$2.93 without, vs $0.43 on CMA)</p><p><strong>  F1 token reduction on stable skill content:  </strong>~92% on re-read turns (15k token skill block at 0.1&#215; from turn 2 onward)</p><p><strong>  Total suite reduction vs baseline:  </strong>~40&#8211;60%  (vs CMA&#8217;s 95%)</p><p><em>Why it falls short of CMA: caching still requires transmitting the full context every turn &#8212; you pay 10% to re-read the cached portion rather than 100%, but you still send it. CMA&#8217;s stateful sessions eliminate the transmission entirely. That structural difference accounts for the remaining gap.</em></p><p><em>The missing precondition: caching would also have required pinning requests to a single provider via allow_fallbacks: false. Without that pin, OpenRouter&#8217;s multi-provider routing (Anthropic / Bedrock / Vertex) would have written cache entries on one provider and missed them on the next &#8212; net result: slightly higher cost than no caching at all.</em></p></blockquote><p>F3 passed cleanly in CMA without any change to agent logic &#8212; CMA&#8217;s native sink synchronisation resolved the path mismatch automatically. F1 wall time is slightly longer on CMA (203s) than cycle 3 locally (121s) due to approximately 40 seconds of container provisioning overhead per session. For the suite as a whole this adds roughly 8 minutes of initialisation time across 12 tasks.</p><blockquote><p><strong>&#128214;  Plain language &#8212; Session overhead</strong></p><p><em>Session overhead is the setup cost paid at the start of each task run. CMA provisions a fresh, isolated container for every session &#8212; essentially spinning up a clean temporary computer for the agent to work in. This takes approximately 40 seconds. For a task that runs for 5 minutes, 40 seconds of setup is a small fraction of total time. For a very short task, the overhead is proportionally larger. It is the price of the clean-room isolation that makes CMA&#8217;s per-session guarantees possible.</em></p></blockquote><h2><strong>7.  In Dialogue: The ASCRS Harness Lab</strong></h2><p>The ASCRS Harness Lab ran ten harness architectures against a pharmaceutical supply chain crisis task &#8212; producing a CFO-approvable rerouting brief within six hours when the Strait of Hormuz shipping corridor was disrupted. H2 (a well-structured prompt, &#945;=1.000 at 15,277 tokens) outperformed H9 (a five-agent swarm, &#945;=0.625 at 58,090 tokens).</p><blockquote><p><strong>&#128214;  Plain language &#8212; &#945; (alpha) score</strong></p><p><em>In the ASCRS Harness Lab experiments, &#945; (the Greek letter alpha) is the quality score: a number between 0.0 and 1.0 measuring how well the agent&#8217;s output met the defined criteria. &#945;=1.000 means a perfect score; &#945;=0.625 means the output met about 63% of the quality criteria. The score is produced by a separate AI model that reads the output and grades it against a rubric &#8212; crucially, this scorer model is different from the model that produced the output, to prevent self-grading inflation.</em></p></blockquote><p>The StockPilot decomposition shows multi-agent architecture outperforming a monolith. The ASCRS experiment shows a single prompt outperforming a multi-agent architecture. How can both be true? Task structure.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OXGA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OXGA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png 424w, https://substackcdn.com/image/fetch/$s_!OXGA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png 848w, https://substackcdn.com/image/fetch/$s_!OXGA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png 1272w, https://substackcdn.com/image/fetch/$s_!OXGA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OXGA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png" width="862" height="271" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:271,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32525,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199597361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OXGA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png 424w, https://substackcdn.com/image/fetch/$s_!OXGA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png 848w, https://substackcdn.com/image/fetch/$s_!OXGA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png 1272w, https://substackcdn.com/image/fetch/$s_!OXGA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8a0c46-cdde-4da6-9af6-7d8a69e8a35d_862x271.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>&#128214;  Plain language &#8212; Brownfield vs greenfield</strong></p><p><em>In software and engineering, greenfield means starting from scratch with no legacy constraints &#8212; a clean field with nothing built on it yet. Brownfield means working within or alongside an existing system &#8212; a field that already has buildings on it. In AI agent terms: a greenfield task has all the data available upfront and requires producing a single well-formed output. A brownfield task is an ongoing operational system that must interact with live data, existing databases, and real-world state that changes as the agent works.</em></p></blockquote><p>The ASCRS swarm (H9) failed because the orchestrator reconciled five independent analyses by averaging them &#8212; every quality criterion scored exactly 0.5. The StockPilot sub-agent delegation worked because it produced a typed JSON answer consumed directly by the parent agent. No merge step. No coordination noise. Architecture must match the task&#8217;s coordination requirements.</p><h2><strong>8.  Thoughts</strong></h2><ul><li><p><strong>CMA is an execution runtime, not a hosting service: </strong>The 97% token reduction is attributable to <strong>stateful sessions</strong>, <strong>on-demand skill loading, native tool execution, and authoritative billing metrics &#8212; a different execution model, not API-level optimisations.</strong></p></li></ul><ul><li><p><strong>Prompt caching on a stateless API does not replicate CMA&#8217;s state management: </strong>Caching reduces the cost of resending context. <strong>Stateful sessions eliminate the resending</strong>. Even with correct implementation and provider pinning, caching would have recovered ~40&#8211;60% in cycles 1&#8211;2. CMA recovered 95%.</p></li></ul><ul><li><p><strong>Skills improve reasoning quality immediately &#8212; but token cost only when loading is on-demand: </strong>Cycle 1 improved score from 71% to 100% without reducing tokens. The mechanism for quality improvement is separate from the mechanism for cost improvement. Both require different implementation choices.</p></li></ul><ul><li><p><strong>Hardcoded API clients in tool implementations create silent failure modes: </strong>One hardcoded client in one tool function caused a 17-point score regression when the API configuration changed. The failure was masked by error handling. All client construction should go through a shared factory.</p></li></ul><ul><li><p><strong>Agent evals surface failures that LLM benchmarks cannot: </strong>Budget overruns, path mismatches, silent tool errors, and environment-dependent behaviour are invisible to standard model benchmarks. Agent eval suites are more expensive to run but capture the failure modes that matter in production.</p></li></ul><ul><li><p><strong>Architecture must match task structure: </strong>Multi-agent delegation with typed contracts outperforms monolithic prompts on brownfield operational tasks. A well-written single prompt outperforms multi-agent swarms on greenfield document tasks. The question is what kind of coordination the task actually requires.</p></li></ul><ul><li><p><strong>The measurement layer is as important as the experiment layer: </strong>Token counts from OpenRouter, direct Anthropic API, and CMA measure different things. A finding is only as reliable as the measurement that produced it.</p></li></ul><p style="text-align: center;"></p><p><strong>Source (cma.py): </strong><a href="https://github.com/anthropics/cwc-workshops/blob/main/agent-decomposition/agents/cma.py">github.com/anthropics/cwc-workshops/blob/main/agent-decomposition/agents/cma.py</a></p><p><strong>CMA general documentation: </strong><a href="https://platform.claude.com/docs/en/managed-agents/overview">platform.claude.com/docs/en/managed-agents/overview</a></p><p>Reading both side by side is the fastest way to see the gap between what the docs describe and what practice requires.</p><h2><strong>What in the Code Made This Work?</strong></h2><p>Three specific patterns in cma.py produced the efficiency gains. None of them are CMA-specific. All of them are transferable to any agent codebase.</p><h3><strong>1. _build_system() &#8212; the single assembly point</strong></h3><p>The function that constructed the system prompt before each API call was the single highest-leverage point in the codebase. Everything the model saw on every turn passed through _build_system(). When this function was changed from &#8216;inject all skill files upfront&#8217; to &#8216;load skills on demand&#8217;, the entire token cost structure changed &#8212; not through dozens of edits scattered across the codebase, but through one function in one place. If your codebase has one function responsible for assembling what the model sees, that function is your primary optimisation target. If context assembly is scattered &#8212; built piece by piece in five different places &#8212; creating a single assembly point is the first refactor worth doing, before any other optimisation.</p><h3><strong>2. make_client() &#8212; the shared factory</strong></h3><p>Every tool that called make_client() from agents/common.py instead of building its own API connection survived the OpenRouter-to-Anthropic switch without breaking. The one tool that hardcoded its own connection &#8212; call_forecaster_subagent &#8212; caused a 17-point score regression and 232,000 tokens of wasted computation when the environment changed. The agent&#8217;s error handling caught the authentication failure silently and fell back to expensive manual computation rather than raising an error. A shared client factory is not just good software practice. In agent systems where every wasted token is a direct cost, it is the difference between an environment change costing one line and costing a debugging session.</p><h3><strong>3. The typed return contract</strong></h3><p>The explicit sub-agent returned {&#8221;SKU-XXXX&#8221;: qty_int} only. No prose. No reasoning. No explanatory text. The parent agent consumed a number and moved on. The sub-agent&#8217;s entire chain of thought stayed in its own isolated context and was never seen by the orchestrator. This single boundary definition produced the largest token reduction in the experiment &#8212; 54% in one cycle. The boundary between agents is defined by what crosses it. The smaller and more structured that boundary is, the cleaner the context on both sides, and the lower the cost of every subsequent turn. Prose crosses boundaries and inflates context. Typed JSON crosses boundaries and disappears.</p><blockquote><p><strong>&#128161;  Why These Three Patterns Matter at Scale</strong></p><p><em>These patterns are the structural reason the same model produced a 97% token reduction from cycle 0 to cycle 4 &#8212; without any change to the model itself, and without any change to the quality of reasoning. The model&#8217;s capability was constant throughout. What changed was the structure around it: how much it had to read before it could think, how reliably it could reach external services, and how much it had to carry across turns.</em></p><p><em>An agent running 15 turns per task, 250 tasks per day, accumulates these inefficiencies at scale. A system costing $8.97 per evaluation suite today will cost $89.70 in six hours if usage scales by 10&#215; and nothing changes. Within the day, continuous run &gt;$350. And before you know it - $10k/month. And this was such a simple, one department exercise. </em></p></blockquote><p>Fixing the structure is not optimisation &#8212; it is the difference between a system that is economically viable and one that is not.</p><h2><strong>The Replicable Audit: This can be applied to Any Codebase</strong></h2><p>The following prompt can be pasted directly into Claude Code for any project &#8212; your own or someone else&#8217;s. Adjust depending on context etc. It runs in report-only mode: Claude Code identifies the specific files, functions, and lines that match each failure pattern, ranks them by estimated impact, and waits for your confirmation before changing anything. This is the same discipline the lab experiment used: one cycle at a time, measure before proceeding, never touch the baseline.</p><p>The five phases map to the five failure modes this experiment surfaced. They are not inventory-management-specific. They occur in any system that grew incrementally without periodic architectural review.</p><p><code>You are performing an architecture efficiency audit on this codebase.</code></p><p><code>PHASE 1 &#8212; CONTEXT AUDIT</code></p><p><code>Scan all files for anything that assembles or constructs what</code></p><p><code>gets sent to the model (system prompts, message builders,</code></p><p><code>context assembly functions). For each one report:</code></p><p><code>- How many tokens does it inject on every call?</code></p><p><code>- Is any of that content only relevant to some tasks?</code></p><p><code>- Is any of that content static/repeated across all calls?</code></p><p><code>- Where is it assembled &#8212; one function or scattered?</code></p><p><code>PHASE 2 &#8212; TOOL AUDIT</code></p><p><code>List every tool or function that makes an external call or</code></p><p><code>reads data. For each one report:</code></p><p><code>- What does it return, and approximately how many tokens?</code></p><p><code>- Does it return raw data or a filtered answer?</code></p><p><code>- If it returns more than ~2,000 tokens, what is the minimum</code></p><p><code>the caller actually needs from it?</code></p><p><code>PHASE 3 &#8212; CLIENT AUDIT</code></p><p><code>Find every location where an API client, HTTP client, or</code></p><p><code>external service connection is instantiated. Report:</code></p><p><code>- Is it hardcoded or does it read from a shared factory?</code></p><p><code>- Would it break silently if an environment variable changed?</code></p><p><code>- Is there a single factory function, or multiple</code></p><p><code>instantiation points scattered across files?</code></p><p><code>PHASE 4 &#8212; SUB-PROCESS AUDIT</code></p><p><code>Find any function that makes a nested model call, spawns a</code></p><p><code>subprocess, or delegates to a secondary process. For each:</code></p><p><code>- Does it run conditionally or unconditionally?</code></p><p><code>- What does it return &#8212; prose or structured data?</code></p><p><code>- Does the parent context see the sub-process reasoning,</code></p><p><code>or only its conclusion?</code></p><p><code>PHASE 5 &#8212; OUTPUT CONTRACT AUDIT</code></p><p><code>Find every location where the agent writes output, files,</code></p><p><code>or results. Report:</code></p><p><code>- Is the output path hardcoded or environment-aware?</code></p><p><code>- Would the output path work identically in a different</code></p><p><code>runtime (local vs cloud vs container)?</code></p><p><code>- Is there a single known drop location or multiple?</code></p><p><code>PHASE 6 &#8212; REPORT</code></p><p><code>Produce a prioritised list of changes ranked by estimated</code></p><p><code>impact. For each item state:</code></p><p><code>- The specific file and function</code></p><p><code>- The smell test it fails (data dump / always-loaded policy /</code></p><p><code>unconditional sub-process / hardcoded client / brittle path)</code></p><p><code>- The specific change that would fix it</code></p><p><code>- Estimated token or cost reduction if applicable</code></p><p><code>Do not rewrite anything yet. Report only.</code></p><p><code>Ask me which items to proceed with before changing any code.</code></p><h2><strong>The Significance</strong></h2><p>This experiment ran one codebase through four improvement cycles and produced a 97% token reduction, a 95% cost reduction, and a quality score that held steady. Those numbers are striking. </p><p>What they represent is more important than the numbers themselves.</p><p>Every AI agent system that exists today will accumulate the same failure modes this experiment identified. Add scale. Instructions grow. Tools proliferate. Sub-processes multiply. Nobody plans for this &#8212; it happens one reasonable addition at a time. The system that costs $8.97 per evaluation suite today will become economically unsustainable as usage scales, unless someone asks the five questions in the right order.</p><p>The lab&#8217;s lasting contribution is not the specific fixes it applied to StockPilot. It is the diagnostic framework: five questions you can ask of any agent codebase, at any point in its lifecycle, to identify exactly where the inefficiency lives and what the precise remedy is. That framework is independent of the model, the domain, the programming language, and the deployment environment.</p><p>The audit prompt above encodes that framework as a replicable instruction. Run it on a new codebase before building on top of it. Run it on an existing system before scaling it. Run it on your own code before deploying it. The five phases are the five questions the lab asked &#8212; and answered with real numbers &#8212; in the experiment documented here.</p><p><strong>Source: </strong><a href="https://github.com/anthropics/cwc-workshops/tree/main/agent-decomposition">github.com/anthropics/cwc-workshops/tree/main/agent-decomposition</a></p><p><strong>CMA overview: </strong><a href="https://platform.claude.com/docs/en/managed-agents/overview">platform.claude.com/docs/en/managed-agents/overview</a></p><p><strong>Skills in CMA: </strong><a href="https://platform.claude.com/docs/en/managed-agents/skills">platform.claude.com/docs/en/managed-agents/skills</a></p><p><strong>Multi-agent sessions: </strong><a href="https://platform.claude.com/docs/en/managed-agents/multi-agent">platform.claude.com/docs/en/managed-agents/multi-agent</a></p><p style="text-align: center;">&#10022;</p><h1><strong>References</strong></h1><h2><strong>Experiment Source</strong></h2><p><strong>Anthropic cwc-workshops &#8212; Agent Decomposition: </strong><a href="https://github.com/anthropics/cwc-workshops/tree/main/agent-decomposition">github.com/anthropics/cwc-workshops/tree/main/agent-decomposition</a></p><p><strong>Workshop video walkthrough: </strong><a href="https://youtu.be/mWvtOHlZM-I">youtu.be/mWvtOHlZM-I</a></p><h2><strong>Prior Work in This Series</strong></h2><p><strong>The Architecture of Awareness &#8212; Interesting Engineering++, April 2026: </strong><a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">interestingengineering.substack.com</a></p><p><strong>ASCRS Harness Lab: The Integrated Agentic Stack &#8212; Interesting Engineering++, May 2026: </strong><a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">interestingengineering.substack.com</a></p><h2><strong>Anthropic Documentation</strong></h2><p><strong>Claude Managed Agents: </strong><a href="https://docs.anthropic.com/en/docs/agents">docs.anthropic.com/en/docs/agents</a></p><p><strong>Prompt caching (generally available December 2024): </strong><a href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching">docs.anthropic.com/en/docs/build-with-claude/prompt-caching</a></p><p><strong>Prompt engineering overview: </strong><a href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview">docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview</a></p><h2><strong>OpenRouter</strong></h2><p><strong>Prompt caching documentation: </strong><a href="https://openrouter.ai/docs/guides/best-practices/prompt-caching">openrouter.ai/docs/guides/best-practices/prompt-caching</a></p><p><strong>Claude Sonnet 4.6 provider page: </strong><a href="https://openrouter.ai/anthropic/claude-sonnet-4.6/providers">openrouter.ai/anthropic/claude-sonnet-4.6/providers</a></p><p><strong>Model catalogue: </strong><a href="https://openrouter.ai/models">openrouter.ai/models</a></p><h2><strong>Agent Architecture Research</strong></h2><p><strong>ReAct: Synergizing Reasoning and Acting in Language Models (Yao et al., 2022): </strong><a href="https://arxiv.org/abs/2210.03629">arxiv.org/abs/2210.03629</a></p><p><strong>AutoGen: Multi-Agent Conversation Framework (Wu et al., 2023): </strong><a href="https://arxiv.org/abs/2308.08155">arxiv.org/abs/2308.08155</a></p><p><strong>Chain-of-Thought Prompting Elicits Reasoning in LLMs (Wei et al., 2022): </strong><a href="https://arxiv.org/abs/2201.11903">arxiv.org/abs/2201.11903</a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Why Multi-Agent AI Systems Break]]></title><description><![CDATA[And What We Might Do About it. Updated Framework]]></description><link>https://interestingengineering.substack.com/p/why-multi-agent-ai-systems-break</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/why-multi-agent-ai-systems-break</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Sun, 24 May 2026 11:17:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!5I5A!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5I5A!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5I5A!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png 424w, https://substackcdn.com/image/fetch/$s_!5I5A!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png 848w, https://substackcdn.com/image/fetch/$s_!5I5A!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png 1272w, https://substackcdn.com/image/fetch/$s_!5I5A!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5I5A!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png" width="1169" height="641" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/df493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:641,&quot;width&quot;:1169,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1126817,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5I5A!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png 424w, https://substackcdn.com/image/fetch/$s_!5I5A!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png 848w, https://substackcdn.com/image/fetch/$s_!5I5A!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png 1272w, https://substackcdn.com/image/fetch/$s_!5I5A!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf493edb-0fd5-4c3c-b88b-65a90035464a_1169x641.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EKfe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EKfe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png 424w, https://substackcdn.com/image/fetch/$s_!EKfe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png 848w, https://substackcdn.com/image/fetch/$s_!EKfe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png 1272w, https://substackcdn.com/image/fetch/$s_!EKfe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EKfe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png" width="1188" height="648" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:648,&quot;width&quot;:1188,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1014615,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EKfe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png 424w, https://substackcdn.com/image/fetch/$s_!EKfe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png 848w, https://substackcdn.com/image/fetch/$s_!EKfe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png 1272w, https://substackcdn.com/image/fetch/$s_!EKfe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F619b931b-951f-4f87-ae84-c07e55bc4a8f_1188x648.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Smi0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Smi0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png 424w, https://substackcdn.com/image/fetch/$s_!Smi0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png 848w, https://substackcdn.com/image/fetch/$s_!Smi0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png 1272w, https://substackcdn.com/image/fetch/$s_!Smi0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Smi0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png" width="1157" height="632" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:632,&quot;width&quot;:1157,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:943603,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Smi0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png 424w, https://substackcdn.com/image/fetch/$s_!Smi0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png 848w, https://substackcdn.com/image/fetch/$s_!Smi0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png 1272w, https://substackcdn.com/image/fetch/$s_!Smi0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1110bcd0-fa2a-4b2b-a17a-7fe2829f4471_1157x632.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Good reads recently from Pradeep on <a href="https://cioinsights.substack.com/p/multi-agent-ai-systems-break-in-production">Why Multi-Agent Systems Break In Production</a>. Also, <a href="https://arxiv.org/abs/2605.18747">Code As Agent Harness</a>. I wanted to benchmark these against two of my recent articles, for good measure. But more importantly, pick up on lessons learned that I may have missed or not applied. The space is evolving so fast, that being able to apply learnings from experts in the space is wonderful. Substack has been amazing this way for me. The two recent articles that I would like to review:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;a52b781b-2302-479c-9a77-1b3201ef8a6b&quot;,&quot;caption&quot;:&quot;Before You Read: A Structural Introduction&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Architecture of Awareness: Design Considerations Of A Shipper's Agentic Logic&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-04-22T17:10:21.275Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!iyM4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e932de-491b-473e-864f-65b45369eacb_1408x768.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/the-architecture-of-awareness-design&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:194979383,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:0,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;92ba57ea-5213-4d7e-bba2-808977cc4612&quot;,&quot;caption&quot;:&quot;Had some time on my hands, and applied the features of The Harness Experiment(s) to the Architecture of Awareness design considerations. You will remember from The Harness Experiment (applied to a mini vendor analysis case study) that the results presented as follows:&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;ASCRS Harness Lab - The Integrated Agentic Stack: When Does More Architecture Mean Better AI? A Diagnostic Teardown&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-16T17:52:19.700Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!cv0d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:198013155,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:2,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p><strong>Why Do Multi-Agent Systems Break?</strong></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!V1WR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!V1WR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png 424w, https://substackcdn.com/image/fetch/$s_!V1WR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png 848w, https://substackcdn.com/image/fetch/$s_!V1WR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png 1272w, https://substackcdn.com/image/fetch/$s_!V1WR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!V1WR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png" width="1172" height="639" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:639,&quot;width&quot;:1172,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1126638,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!V1WR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png 424w, https://substackcdn.com/image/fetch/$s_!V1WR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png 848w, https://substackcdn.com/image/fetch/$s_!V1WR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png 1272w, https://substackcdn.com/image/fetch/$s_!V1WR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3e5ec0-f55e-4583-a023-186e7dc61058_1172x639.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Pradeep&#8217;s production failure analysis (drawn from ArXiv research and IEEE working group field observation) identifies a clear pattern. <strong>Multi-agent systems fail</strong> not because the underlying models are bad, but <strong>because the architecture around them is unobserved and unverified. The harness.</strong></p><p>The breakdown by cause:</p><ul><li><p><strong>Specification failures (42%)</strong> &#8212; Agents can&#8217;t resolve ambiguous instructions. Vague task definitions don&#8217;t surface errors; they silently cascade into wrong actions.</p></li><li><p><strong>Inter-agent misalignment (23%)</strong> &#8212; Agents act on stale shared state. Conflicting decisions and race conditions on shared resources are invisible to standard monitoring.</p></li><li><p><strong>Task verification failures (15%)</strong> &#8212; No agent in the pipeline has explicit responsibility for confirming its output is correct before passing it downstream.</p></li><li><p><strong>Tool errors, context loss in handoffs, coordination overhead (20%)</strong> &#8212; More tractable, but still require purpose-built instrumentation to catch.</p></li></ul><p>The compounding problem: a 4-agent pipeline creates 6 potential failure points. A 10-agent pipeline creates 45. Output-only evaluation &#8212; treating a completed status code as proof of correctness &#8212; passes 20&#8211;40% more test cases than it should. Those extra passes are false confidence.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qnpk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qnpk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png 424w, https://substackcdn.com/image/fetch/$s_!Qnpk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png 848w, https://substackcdn.com/image/fetch/$s_!Qnpk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png 1272w, https://substackcdn.com/image/fetch/$s_!Qnpk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qnpk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png" width="1172" height="629" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:629,&quot;width&quot;:1172,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:856372,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qnpk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png 424w, https://substackcdn.com/image/fetch/$s_!Qnpk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png 848w, https://substackcdn.com/image/fetch/$s_!Qnpk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png 1272w, https://substackcdn.com/image/fetch/$s_!Qnpk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501fa6a0-2f25-4914-9e03-c544c68c6043_1172x629.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><p><strong>So What To Do About It (Pradeep&#8217;s Recommendations)</strong></p><p>Three structural practices separate teams that get this right:</p><ol><li><p><strong>Capture full traces from day one.</strong> Every tool call, every handoff, every state change &#8212; not sampled, not filtered. Storage is cheap. Debugging without traces is guesswork.</p></li><li><p><strong>Design verification into the agent graph, not around it.</strong> Each agent that produces output should have a corresponding verification step before that output leaves the node. Built in, not bolted on.</p></li><li><p><strong>Track coordination metrics as first-class KPIs.</strong> Handoff success rate, context retention across turns, inter-agent instruction adherence. If these aren&#8217;t defined as targets, they never get measured.</p></li></ol><p>The CLEAR evaluation framework adds a useful lens: <strong>Cost, Latency, Efficacy, Assurance, Reliability</strong> &#8212; validated against 300 enterprise tasks, it correlates with actual production success at &#961; = 0.83, versus &#961; = 0.41 for accuracy-only evaluation.</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pXd6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pXd6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png 424w, https://substackcdn.com/image/fetch/$s_!pXd6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png 848w, https://substackcdn.com/image/fetch/$s_!pXd6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png 1272w, https://substackcdn.com/image/fetch/$s_!pXd6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pXd6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png" width="1157" height="632" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:632,&quot;width&quot;:1157,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:943603,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pXd6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png 424w, https://substackcdn.com/image/fetch/$s_!pXd6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png 848w, https://substackcdn.com/image/fetch/$s_!pXd6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png 1272w, https://substackcdn.com/image/fetch/$s_!pXd6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F449e1dac-862e-4b17-b624-95f597ca6ae7_1157x632.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p><strong>Contrast with My Articles &#8212; Three Gaps Pradeep Doesn&#8217;t Include </strong></p><p>The Architecture of Awareness and the ASCRS Harness Lab arrived at several of the same conclusions from the engineering direction rather than the evaluation direction. But they identified failure modes and fixes that Pradeep&#8217;s taxonomy doesn&#8217;t name. Having said that, I see them evolving from capturing traces and designing verification into the Agent Graph. Ultimately, reading his piece did make me take a second look, which i thoroughly appreciated!</p><p><strong>Gap 1: Reviewers check presence, not dependency.</strong></p><p>Pradeep flags task verification failures (15%) as a category. My Harness Lab identified the precise mechanism: SA_reviewer confirmed that G4 and G7 numbers both appeared in the brief &#8212; but didn&#8217;t verify they derived from the same scenario. It checked presence, not logical dependency. This distinction matters enormously. A verification agent that confirms a number exists is not the same as one that confirms the number is coherent with the planning basis that precedes it. The fix from the Meta-Harness: reorder gate evaluation so G4 must be locked before G7 is written. Zero added cost. Inconsistency eliminated.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WSzL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WSzL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png 424w, https://substackcdn.com/image/fetch/$s_!WSzL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png 848w, https://substackcdn.com/image/fetch/$s_!WSzL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png 1272w, https://substackcdn.com/image/fetch/$s_!WSzL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WSzL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png" width="1173" height="644" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:644,&quot;width&quot;:1173,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1191310,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WSzL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png 424w, https://substackcdn.com/image/fetch/$s_!WSzL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png 848w, https://substackcdn.com/image/fetch/$s_!WSzL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png 1272w, https://substackcdn.com/image/fetch/$s_!WSzL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07febadd-d527-44af-adf6-6c61ae6bbc04_1173x644.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Gap 2: Orchestrators average instead of select.</strong></p><p>Pradeep&#8217;s inter-agent misalignment category captures coordination failures, but my  Harness Lab names the specific failure mode inside the merge step: every criterion in H9 scored exactly 0.5 &#8212; present but not rigorous &#8212; because the orchestrator reconciled five different phrasings by averaging them rather than selecting the best. This is a design choice, not an accident. Orchestrators need selection logic, not reconciliation logic. <strong>If you&#8217;re merging five specialist outputs into one document, the orchestrator needs a decision rule, not a synthesis process.</strong></p><p><strong>Gap 3: Task structure determines whether swarms help or hurt.</strong></p><p>Pradeep&#8217;s recommendations are architecture-agnostic. My work establishes a specific, deployable decision rule: brownfield tasks (continuous operations, real-time data feeds, external state, ERP writes) benefit from multi-agent coordination. Greenfield tasks (single-turn document generation, all data present upfront) do not &#8212; a well-written prompt beats every multi-agent architecture tested, including the five-specialist swarm. A bit of tweaking depending on task objectives. The practical implication: most teams reach for a swarm architecture when they should reach for a better specification first. Don&#8217;t overcomplicate more than is necessary.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4qAE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4qAE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png 424w, https://substackcdn.com/image/fetch/$s_!4qAE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png 848w, https://substackcdn.com/image/fetch/$s_!4qAE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png 1272w, https://substackcdn.com/image/fetch/$s_!4qAE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4qAE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png" width="1178" height="648" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:648,&quot;width&quot;:1178,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1097927,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4qAE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png 424w, https://substackcdn.com/image/fetch/$s_!4qAE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png 848w, https://substackcdn.com/image/fetch/$s_!4qAE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png 1272w, https://substackcdn.com/image/fetch/$s_!4qAE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d16606a-322b-4b05-8ed1-9429501a4ed7_1178x648.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Nl1R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Nl1R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png 424w, https://substackcdn.com/image/fetch/$s_!Nl1R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png 848w, https://substackcdn.com/image/fetch/$s_!Nl1R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png 1272w, https://substackcdn.com/image/fetch/$s_!Nl1R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Nl1R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png" width="901" height="455" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:455,&quot;width&quot;:901,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:64040,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Nl1R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png 424w, https://substackcdn.com/image/fetch/$s_!Nl1R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png 848w, https://substackcdn.com/image/fetch/$s_!Nl1R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png 1272w, https://substackcdn.com/image/fetch/$s_!Nl1R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8c10839-c4ef-4ab2-908b-a33210f664c6_901x455.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The underlying principle both bodies of work share: <strong>the failure is rarely the model. It&#8217;s what surrounds the model</strong> &#8212; when agents start, what they&#8217;re allowed to read, what gets verified before it moves downstream, and what the system chooses to remember. Structure is not scaffolding around intelligence. In multi-agent systems, it is the intelligence.</p><p>The open question neither article fully resolves: how do you close the outcome loop? The ASCRS work found that institutional memory &#8212; the sixth objective &#8212; remained unverified because humans didn&#8217;t complete post-event documentation. Pradeep finds that evaluation infrastructure is the gap between prototype and production. Both are pointing at the same discipline problem: the system can be built correctly and still fail to learn, if the humans operating it don&#8217;t close the loop.</p><h2>Code As Agent Harness (Survey Paper)</h2><p>This is also an amazing and necessary read, which I will incorporate into an updated fromework further below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jT9L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jT9L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png 424w, https://substackcdn.com/image/fetch/$s_!jT9L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png 848w, https://substackcdn.com/image/fetch/$s_!jT9L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png 1272w, https://substackcdn.com/image/fetch/$s_!jT9L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jT9L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png" width="752" height="538" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:538,&quot;width&quot;:752,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:143044,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jT9L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png 424w, https://substackcdn.com/image/fetch/$s_!jT9L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png 848w, https://substackcdn.com/image/fetch/$s_!jT9L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png 1272w, https://substackcdn.com/image/fetch/$s_!jT9L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff33ef35a-d98d-40fd-9681-64dfc814b8a3_752x538.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://arxiv.org/abs/2605.18747">Source Paper</a></figcaption></figure></div><p>&#129497;&#127998;&#8205;&#9794;&#65039;: This paper reframes code not just as &#8220;output&#8221; from an LLM, but as the <strong>operating substrate / harness</strong> through which agents reason, act, verify, remember, and coordinate. The important insight is: <strong>many agent failures are actually harness failures, not just model failures.</strong></p><h2>&#128273; Core Thesis &#8212; &#8220;Code as Agent Harness&#8221;</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x6kJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x6kJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png 424w, https://substackcdn.com/image/fetch/$s_!x6kJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png 848w, https://substackcdn.com/image/fetch/$s_!x6kJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png 1272w, https://substackcdn.com/image/fetch/$s_!x6kJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!x6kJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png" width="1153" height="634" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:634,&quot;width&quot;:1153,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:976177,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x6kJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png 424w, https://substackcdn.com/image/fetch/$s_!x6kJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png 848w, https://substackcdn.com/image/fetch/$s_!x6kJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png 1272w, https://substackcdn.com/image/fetch/$s_!x6kJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8d8fa1f-5b1d-4183-b0ea-66adb40b36f3_1153x634.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>Code becomes:</p><ul><li><p>&#9881;&#65039; <strong>Executable</strong> &#8594; actions can run</p></li><li><p>&#128269; <strong>Inspectable</strong> &#8594; traces/logs visible</p></li><li><p>&#129504; <strong>Stateful</strong> &#8594; memory &amp; progress persist</p></li><li><p>&#9989; <strong>Verifiable</strong> &#8594; tests/runtime checks possible</p></li></ul></li></ul><p>Instead of:</p><blockquote><p>&#8220;LLM thinks &#8594; outputs answer&#8221;</p></blockquote><p>It becomes:</p><blockquote><p>&#8220;LLM + harness + code artifacts + execution feedback + memory + tools + verification&#8221;</p></blockquote><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vfCc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vfCc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png 424w, https://substackcdn.com/image/fetch/$s_!vfCc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png 848w, https://substackcdn.com/image/fetch/$s_!vfCc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png 1272w, https://substackcdn.com/image/fetch/$s_!vfCc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vfCc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png" width="1179" height="647" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:647,&quot;width&quot;:1179,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1190118,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vfCc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png 424w, https://substackcdn.com/image/fetch/$s_!vfCc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png 848w, https://substackcdn.com/image/fetch/$s_!vfCc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png 1272w, https://substackcdn.com/image/fetch/$s_!vfCc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff45b849c-a38c-4b7b-9147-79e4dee47bfb_1179x647.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>&#128680; Major Causes of Multi-Agent Failures</h2><h3>1. &#10060; Shared State Desynchronization</h3><p>Different agents operate on inconsistent repo state, traces, or assumptions.</p><h4>Failure Modes</h4><ul><li><p>Agent A edits outdated code</p></li><li><p>Reviewer validates stale branch</p></li><li><p>Planner assumptions drift from runtime reality</p></li><li><p>Memory inconsistencies between agents</p></li></ul><p>Paper highlights:</p><ul><li><p>&#8220;consistent shared state across multiple agents&#8221;</p></li><li><p>&#8220;shared-harness synchronization&#8221;</p></li><li><p>&#8220;transactional shared program state&#8221; as open problems</p></li></ul><h4>Fix</h4><p>&#9989; Treat shared state like distributed systems:</p><ul><li><p>versioned artifacts</p></li><li><p>synchronized execution traces</p></li><li><p>repo-backed memory</p></li><li><p>transactional updates</p></li><li><p>authoritative source of truth</p></li></ul><div><hr></div><h3>2. &#10060; Weak Verification / Oracle Problems</h3><p>Agents think tasks succeeded when they merely produced plausible text.</p><h4>Failure Modes</h4><ul><li><p>Silent failures</p></li><li><p>Hallucinated completion</p></li><li><p>Superficial fixes</p></li><li><p>Incorrect reasoning paths</p></li></ul><p>Paper stresses:</p><ul><li><p>executable verification</p></li><li><p>deterministic sensors</p></li><li><p>runtime tests</p></li><li><p>execution traces</p></li><li><p>formal verification</p></li></ul><h4>Fix</h4><p>&#9989; Harness must:</p><ul><li><p>execute code</p></li><li><p>run tests continuously</p></li><li><p>validate intermediate states</p></li><li><p>expose runtime traces</p></li><li><p>use step-level verification, not just final outputs</p></li></ul><p>Think:</p><blockquote><p>&#8220;Never trust agent text. Trust execution.&#8221;</p></blockquote><div><hr></div><h3>3. &#10060; Single-Path Planning Brittleness</h3><p>Linear plans collapse when assumptions fail.</p><h3>Failure Modes</h3><ul><li><p>Early wrong decomposition poisons whole workflow</p></li><li><p>Agents tunnel into one bad strategy</p></li><li><p>No rollback/search</p></li></ul><p>Paper explicitly critiques single-path planning.</p><h4>Fix</h4><p>&#9989; Search-based harness:</p><ul><li><p>branch candidate trajectories</p></li><li><p>preserve alternatives</p></li><li><p>compare patches/tests</p></li><li><p>MCTS/tree search</p></li><li><p>execution-guided replanning</p></li></ul><p>Key idea:</p><blockquote><p>harness manages competing trajectories, not just one chain.</p></blockquote><div><hr></div><h3>4. &#10060; Context &amp; Memory Collapse</h3><p>Agents forget prior decisions, repo structure, or prior failures.</p><h4>Failure Modes</h4><ul><li><p>repeated mistakes</p></li><li><p>redundant work</p></li><li><p>inconsistent implementations</p></li><li><p>token-window overflow</p></li></ul><h4>Fix</h4><p>&#9989; Persistent memory layers:</p><ul><li><p>working memory</p></li><li><p>semantic memory</p></li><li><p>experiential memory</p></li><li><p>long-term memory</p></li><li><p>context compaction/state offloading</p></li></ul><p>Important insight:</p><blockquote><p>Memory should live in filesystem/repo artifacts, not only token context.</p></blockquote><p>Examples:</p><ul><li><p>PLAN.md</p></li><li><p>status logs</p></li><li><p>execution traces</p></li><li><p>reusable skill libraries</p></li></ul><div><hr></div><h3>5. &#10060; Poor Role Coordination</h3><p>Multiple agents without explicit orchestration become noisy or contradictory.</p><h4>Failure Modes</h4><ul><li><p>coder/reviewer conflict</p></li><li><p>duplicated effort</p></li><li><p>unclear ownership</p></li><li><p>no escalation logic</p></li></ul><p>Paper frames multi-agent systems around:</p><ul><li><p>manager</p></li><li><p>planner</p></li><li><p>coder</p></li><li><p>reviewer</p></li><li><p>tester roles</p></li></ul><h4>Fix</h4><p>&#9989; Strong orchestration harness:</p><ul><li><p>explicit roles</p></li><li><p>workflow topology</p></li><li><p>escalation rules</p></li><li><p>review loops</p></li><li><p>structured artifact passing</p></li></ul><p>Not:</p><blockquote><p>&#8220;many agents chatting&#8221;</p></blockquote><p>But:</p><blockquote><p>&#8220;software pipeline with governed state transitions&#8221;</p></blockquote><div><hr></div><h2>&#128736;&#65039; The Big Harness Insight</h2><p>The paper repeatedly argues:</p><h4>Agent reliability &#8800; smarter model alone</h4><p>Reliability comes from:</p><pre><code><code>Planning
+ Memory
+ Verification
+ Execution
+ Shared State
+ Tool Governance
+ Feedback Loops
+ Orchestration
</code></code></pre><p>The harness is effectively:</p><blockquote><p>the operating system for agents.</p></blockquote><div><hr></div><h2>&#128204; Best Practices the Paper Implies</h2><h3>&#9989; Good Harness Design</h3><ul><li><p>sandbox execution</p></li><li><p>deterministic validators</p></li><li><p>execution traces</p></li><li><p>persistent artifacts</p></li><li><p>repo-native workflows</p></li><li><p>replayable history</p></li><li><p>rollback capability</p></li><li><p>search over plans</p></li><li><p>role specialization</p></li><li><p>governed permissions</p></li></ul><div><hr></div><h2>&#9888;&#65039; Most Important Open Problems</h2><h3>The paper highlights these as unsolved:</h3><ul><li><p>regression-free self-improving harnesses</p></li><li><p>semantic verification beyond tests</p></li><li><p>multi-agent state convergence</p></li><li><p>human oversight at scale</p></li><li><p>multimodal harnesses</p></li><li><p>evaluation beyond final success metrics</p></li></ul><div><hr></div><h2>&#129504; One-Sentence Takeaway</h2><h4>Old view:</h4><blockquote><p>&#8220;Agents fail because models hallucinate.&#8221;</p></blockquote><h4>New view:</h4><blockquote><p>&#8220;Agents fail because the harness lacks robust state, verification, orchestration, and execution control.&#8221;</p></blockquote><h2>Updated Framework:</h2><p>Let me first clarify what the Code as Harness paper actually adds, then build the integrated framework.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LTuD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F322d271b-8744-454e-a859-284fc9670cf7_1185x648.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LTuD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F322d271b-8744-454e-a859-284fc9670cf7_1185x648.png 424w, https://substackcdn.com/image/fetch/$s_!LTuD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F322d271b-8744-454e-a859-284fc9670cf7_1185x648.png 848w, https://substackcdn.com/image/fetch/$s_!LTuD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F322d271b-8744-454e-a859-284fc9670cf7_1185x648.png 1272w, https://substackcdn.com/image/fetch/$s_!LTuD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F322d271b-8744-454e-a859-284fc9670cf7_1185x648.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LTuD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F322d271b-8744-454e-a859-284fc9670cf7_1185x648.png" width="1185" height="648" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/322d271b-8744-454e-a859-284fc9670cf7_1185x648.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:648,&quot;width&quot;:1185,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1184290,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F322d271b-8744-454e-a859-284fc9670cf7_1185x648.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LTuD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F322d271b-8744-454e-a859-284fc9670cf7_1185x648.png 424w, https://substackcdn.com/image/fetch/$s_!LTuD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F322d271b-8744-454e-a859-284fc9670cf7_1185x648.png 848w, https://substackcdn.com/image/fetch/$s_!LTuD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F322d271b-8744-454e-a859-284fc9670cf7_1185x648.png 1272w, https://substackcdn.com/image/fetch/$s_!LTuD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F322d271b-8744-454e-a859-284fc9670cf7_1185x648.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aVDv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aVDv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png 424w, https://substackcdn.com/image/fetch/$s_!aVDv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png 848w, https://substackcdn.com/image/fetch/$s_!aVDv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png 1272w, https://substackcdn.com/image/fetch/$s_!aVDv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aVDv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png" width="1168" height="638" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:638,&quot;width&quot;:1168,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1006975,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aVDv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png 424w, https://substackcdn.com/image/fetch/$s_!aVDv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png 848w, https://substackcdn.com/image/fetch/$s_!aVDv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png 1272w, https://substackcdn.com/image/fetch/$s_!aVDv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F146f907e-cc3e-4902-96a7-fa945baec451_1168x638.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OUha!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OUha!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png 424w, https://substackcdn.com/image/fetch/$s_!OUha!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png 848w, https://substackcdn.com/image/fetch/$s_!OUha!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png 1272w, https://substackcdn.com/image/fetch/$s_!OUha!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OUha!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png" width="1172" height="639" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:639,&quot;width&quot;:1172,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:948061,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199052683?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OUha!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png 424w, https://substackcdn.com/image/fetch/$s_!OUha!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png 848w, https://substackcdn.com/image/fetch/$s_!OUha!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png 1272w, https://substackcdn.com/image/fetch/$s_!OUha!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5333af99-ee91-4c36-b13b-6554f52eb3b1_1172x639.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>What it adds that my existing ASCII doesn&#8217;t yet capture:</strong></p><p>The paper formalizes <em>what the harness is made of</em> &#8212; three distinct code functions sitting between agents and environment. My ASCII explains <em>what harnesses do and where they fail</em>, but doesn&#8217;t name the substrate. The four properties (Executable, Inspectable, Stateful, Governed) also give a clean diagnostic for <em>which MAST failure category you&#8217;re actually in</em>. Those two additions make the framework more complete.</p><p>Here&#8217;s the integrated version:</p><pre><code><code>&#9556;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9559;
&#9553;          CODE AS HARNESS &#8212; INTEGRATED PRODUCTION FRAMEWORK                   &#9553;
&#9553;          Theory (Code as Harness) &#215; Evidence (ASCRS) &#215; Taxonomy (MAST)      &#9553;
&#9562;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9565;

  THE CORE CLAIM (all three sources agree):
  The model never changes. Every failure is a harness problem.
  The harness is not scaffolding around intelligence &#8212; it IS the intelligence.
  And the harness is code: executable, inspectable, stateful, governed.


&#9556;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9559;
&#9553;  STEP 0 &#8212; BEFORE YOU BUILD ANYTHING                                          &#9553;
&#9553;  Task Structure Determines Architecture                                      &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;                                                                              &#9553;
&#9553;   GREENFIELD                          GREYFIELD / BROWNFIELD                 &#9553;
&#9553;   (Document / Benchmark)              (Operational / Production)             &#9553;
&#9553;                                                                              &#9553;
&#9553;   &#183; All data present upfront          &#183; Real-time data feeds                 &#9553;
&#9553;   &#183; Single-turn reasoning             &#183; ERP/database writes                  &#9553;
&#9553;   &#183; No external state                 &#183; External state queries               &#9553;
&#9553;   &#183; No concurrent modifiers           &#183; Humans + systems modify state        &#9553;
&#9553;                                                                              &#9553;
&#9553;   HARNESS NEED: Precision             HARNESS NEED: Resilience               &#9553;
&#9553;   Write the spec before the swarm.    Build auditing loops into the graph.   &#9553;
&#9553;                                                                              &#9553;
&#9553;   FAILURE MODE: Hallucination         FAILURE MODE: Coordination noise       &#9553;
&#9553;   (wrong spec &#8594; wrong output)         (agents average each other)            &#9553;
&#9553;                                                                              &#9553;
&#9553;   WINNING MOVE: High-density          WINNING MOVE: Structural               &#9553;
&#9553;   prompting (H2, &#945; = 1.000)           auditing loops (V3/V4)                 &#9553;
&#9553;                                                                              &#9553;
&#9553;   H2 FIRST RULE: Never build a swarm to fix what a better prompt             &#9553;
&#9553;   could solve. Coordination loops loop on the same specification failure.    &#9553;
&#9562;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9565;
                                        &#9474;
                                        &#9660;
&#9556;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9559;
&#9553;  STEP 1 &#8212; WHAT THE HARNESS IS MADE OF                                        &#9553;
&#9553;  Three Code Functions Between Agent and Environment                          &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;                                                                              &#9553;
&#9553;  &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;     &#9553;
&#9553;  &#9474;  CODE FOR REASONING                                                  &#9474;     &#9553;
&#9553;  &#9474;  Externalizes inference. Makes thinking inspectable.                 &#9474;     &#9553;
&#9553;  &#9474;                                                                     &#9474;     &#9553;
&#9553;  &#9474;  &#183; Intermediate reasoning steps written to files, not held          &#9474;     &#9553;
&#9553;  &#9474;    in context only                                                   &#9474;     &#9553;
&#9553;  &#9474;  &#183; Program synthesis: agent proposes structure, code verifies it    &#9474;     &#9553;
&#9553;  &#9474;  &#183; ASCRS application: G4 planning basis written as a file           &#9474;     &#9553;
&#9553;  &#9474;    dependency before G7 trigger is allowed to execute               &#9474;     &#9553;
&#9553;  &#9474;                                                                     &#9474;     &#9553;
&#9553;  &#9474;  WHERE IT FAILS: Agents reason correctly in isolation but           &#9474;     &#9553;
&#9553;  &#9474;  reasoning is never written out, so merge steps average             &#9474;     &#9553;
&#9553;  &#9474;  rather than verify. H9: every criterion 0.5.                       &#9474;     &#9553;
&#9553;  &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;     &#9553;
&#9553;                                                                              &#9553;
&#9553;  &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;     &#9553;
&#9553;  &#9474;  CODE FOR ACTING                                                     &#9474;     &#9553;
&#9553;  &#9474;  Translates intent into executable, reversible actions.             &#9474;     &#9553;
&#9553;  &#9474;                                                                     &#9474;     &#9553;
&#9553;  &#9474;  &#183; Tool calls with explicit permission gates                        &#9474;     &#9553;
&#9553;  &#9474;  &#183; Policies that govern what agents can write vs. read              &#9474;     &#9553;
&#9553;  &#9474;  &#183; Skills: reusable action templates (skill.md files)               &#9474;     &#9553;
&#9553;  &#9474;  &#183; ASCRS application: G7/G8 human gate &#8212; no ERP write              &#9474;     &#9553;
&#9553;  &#9474;    executes without explicit sign-off                               &#9474;     &#9553;
&#9553;  &#9474;                                                                     &#9474;     &#9553;
&#9553;  &#9474;  WHERE IT FAILS: Actions execute successfully (status 200)         &#9474;     &#9553;
&#9553;  &#9474;  but downstream state is corrupted. Nobody catches it              &#9474;     &#9553;
&#9553;  &#9474;  until a customer does. MAST: tool/API errors (10%)                &#9474;     &#9553;
&#9553;  &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;     &#9553;
&#9553;                                                                              &#9553;
&#9553;  &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;     &#9553;
&#9553;  &#9474;  CODE FOR ENVIRONMENT MODELING                                       &#9474;     &#9553;
&#9553;  &#9474;  Captures state, traces, and execution feedback.                    &#9474;     &#9553;
&#9553;  &#9474;                                                                     &#9474;     &#9553;
&#9553;  &#9474;  &#183; Program states and repos as readable artifacts                   &#9474;     &#9553;
&#9553;  &#9474;  &#183; Execution feedback loops (not just final output)                 &#9474;     &#9553;
&#9553;  &#9474;  &#183; ASCRS application: traces/structured/ captures quorum           &#9474;     &#9553;
&#9553;  &#9474;    timestamps, scenario selection decisions, strategist             &#9474;     &#9553;
&#9553;  &#9474;    reasoning &#8212; the data the Meta-Harness reads to find             &#9474;     &#9553;
&#9553;  &#9474;    structural flaws invisible in final briefs                       &#9474;     &#9553;
&#9553;  &#9474;                                                                     &#9474;     &#9553;
&#9553;  &#9474;  WHERE IT FAILS: Shanghai slow-burn. Three weeks of vessel         &#9474;     &#9553;
&#9553;  &#9474;  buildup data vanished because the write threshold was             &#9474;     &#9553;
&#9553;  &#9474;  only triggered at +50% above baseline. System saw a              &#9474;     &#9553;
&#9553;  &#9474;  number, not a trajectory. Fix: provisional memory tier            &#9474;     &#9553;
&#9553;  &#9474;  at +20%, TTL 30 days, promotes at +50%.                           &#9474;     &#9553;
&#9553;  &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;     &#9553;
&#9562;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9565;
                                        &#9474;
                                        &#9660;
&#9556;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9559;
&#9553;  STEP 2 &#8212; THE FOUR PROPERTIES AS DIAGNOSTIC                                  &#9553;
&#9553;  Which property is missing tells you which MAST failure you have             &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9574;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9574;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  PROPERTY ABSENT     &#9553;  FAILURE PATTERN           &#9553;  MAST CATEGORY           &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  NOT EXECUTABLE      &#9553;  Agent produces plausible  &#9553;  Specification           &#9553;
&#9553;                      &#9553;  output that cannot be     &#9553;  failure (42%)           &#9553;
&#9553;                      &#9553;  verified against ground   &#9553;                          &#9553;
&#9553;                      &#9553;  truth. Looks correct.     &#9553;  H2 fix: domain          &#9553;
&#9553;                      &#9553;  Isn't.                    &#9553;  constraints stated      &#9553;
&#9553;                      &#9553;                            &#9553;  before architecture.    &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  NOT INSPECTABLE     &#9553;  Coordination breakdowns   &#9553;  Inter-agent             &#9553;
&#9553;                      &#9553;  invisible to standard     &#9553;  misalignment (23%)      &#9553;
&#9553;                      &#9553;  monitoring. Logs show     &#9553;                          &#9553;
&#9553;                      &#9553;  successful API calls.     &#9553;  Fix: full trajectory    &#9553;
&#9553;                      &#9553;  Downstream is corrupted.  &#9553;  capture &#8212; every tool    &#9553;
&#9553;                      &#9553;  V2+Loop passed the gate.  &#9553;  call, every handoff.    &#9553;
&#9553;                      &#9553;  Meta-Harness found cause. &#9553;  Engineers read output.  &#9553;
&#9553;                      &#9553;                            &#9553;  Harness reads causes.   &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  NOT STATEFUL        &#9553;  Context lost in handoffs. &#9553;  Context loss in         &#9553;
&#9553;                      &#9553;  Memory poisoning: flawed  &#9553;  handoffs (7%)           &#9553;
&#9553;                      &#9553;  prior output re-injected  &#9553;  + H6 memory             &#9553;
&#9553;                      &#9553;  &#8594; model inherits the      &#9553;  poisoning               &#9553;
&#9553;                      &#9553;  contradiction, not        &#9553;                          &#9553;
&#9553;                      &#9553;  transcends it.            &#9553;  Fix: gate memory        &#9553;
&#9553;                      &#9553;  H6: &#945; 0.75 &#8594; 0.30         &#9553;  before trusting it.     &#9553;
&#9553;                      &#9553;  in one step.              &#9553;  Provisional tier.       &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  NOT GOVERNED        &#9553;  No verification agent     &#9553;  Task verification       &#9553;
&#9553;                      &#9553;  checks logical dependency &#9553;  failure (15%)           &#9553;
&#9553;                      &#9553;  before output leaves node.&#9553;                          &#9553;
&#9553;                      &#9553;  SA_reviewer confirms G4   &#9553;  Fix: audit the          &#9553;
&#9553;                      &#9553;  + G7 are present &#8212; not    &#9553;  reviewer. Verify        &#9553;
&#9553;                      &#9553;  that they reference the   &#9553;  logical dependency,     &#9553;
&#9553;                      &#9553;  same scenario. Passes.    &#9553;  not keyword presence.   &#9553;
&#9553;                      &#9553;  Brief is inconsistent.    &#9553;  G4 locks before G7.     &#9553;
&#9562;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9577;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9577;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9565;
                                        &#9474;
                                        &#9660;
&#9556;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9559;
&#9553;  STEP 3 &#8212; THE PRODUCTION HARNESS STACK                                       &#9553;
&#9553;  Four layers. Only Layer 2 is the variable under test.                       &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;                                                                              &#9553;
&#9553;  LAYER 0 &#8212; HaaS RUNTIME                          [V3/V4 in ASCRS]           &#9553;
&#9553;  &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;     &#9553;
&#9553;  &#9474;  Claude Code / Hermes Agent                                         &#9474;     &#9553;
&#9553;  &#9474;  Orchestrates experiments. Runs scripts. Manages session.           &#9474;     &#9553;
&#9553;  &#9474;  Active: bootstrap, debug, article writing.                         &#9474;     &#9553;
&#9553;  &#9474;  Silent: during all H1&#8211;H10 experiment runs.                         &#9474;     &#9553;
&#9553;  &#9474;  Returns only when you paste a new prompt or an error occurs.       &#9474;     &#9553;
&#9553;  &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;     &#9553;
&#9553;                              &#9474;                                               &#9553;
&#9553;  LAYER 1 &#8212; SHARED INFRASTRUCTURE              [IMMUTABLE &#8212; never modified]   &#9553;
&#9553;  &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;     &#9553;
&#9553;  &#9474;  task.md &#183; domain_data.json &#183; rubric.json &#183; gold_answer.md          &#9474;     &#9553;
&#9553;  &#9474;  scorer.js &#183; client.js &#183; self_heal.js &#183; logger.js                   &#9474;     &#9553;
&#9553;  &#9474;                                                                     &#9474;     &#9553;
&#9553;  &#9474;  Code for Environment Modeling lives here.                          &#9474;     &#9553;
&#9553;  &#9474;  SCORER_MODEL must differ from DEFAULT_MODEL.                       &#9474;     &#9553;
&#9553;  &#9474;  Self-grading inflates every alpha 15&#8211;30%.                          &#9474;     &#9553;
&#9553;  &#9474;  H1 must never receive Hermes memory &#8212; raises floor,               &#9474;     &#9553;
&#9553;  &#9474;  invalidates all lift measurements.                                 &#9474;     &#9553;
&#9553;  &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;     &#9553;
&#9553;                              &#9474;                                               &#9553;
&#9553;  LAYER 2 &#8212; EXPERIMENT HARNESS                    [H1&#8211;H10, THE VARIABLE]      &#9553;
&#9553;  &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;     &#9553;
&#9553;  &#9474;  Code for Reasoning + Code for Acting both live here.               &#9474;     &#9553;
&#9553;  &#9474;  This is what you change. Nothing else.                             &#9474;     &#9553;
&#9553;  &#9474;                                                                     &#9474;     &#9553;
&#9553;  &#9474;  H1  Bare model       &#183; No memory. No checks. Floor.                &#9474;     &#9553;
&#9553;  &#9474;  H2  Prompt harness &#9733; &#183; Code for Reasoning: spec explicit.          &#9474;     &#9553;
&#9553;  &#9474;                         &#945; = 1.000. 15K tokens. Wins.                &#9474;     &#9553;
&#9553;  &#9474;  H3  Sequential tools &#183; Reconciliation errors. 3&#215; tokens. &#945; &#8595;      &#9474;     &#9553;
&#9553;  &#9474;  H4  Parallel fan-out &#183; Merge failure. Parts correct,               &#9474;     &#9553;
&#9553;  &#9474;                         assembly incoherent. NOT GOVERNED.          &#9474;     &#9553;
&#9553;  &#9474;  H5  Eval loop        &#183; Arbitrary cap. /goal removes it.            &#9474;     &#9553;
&#9553;  &#9474;  H6  Skill memory  &#8595;&#8595; &#183; NOT STATEFUL. Memory poisoning.             &#9474;     &#9553;
&#9553;  &#9474;                         &#945; 0.75 &#8594; 0.30 in one step.                  &#9474;     &#9553;
&#9553;  &#9474;  H7  Model routing  &#9733; &#183; Right model for right task. 0.900,          &#9474;     &#9553;
&#9553;  &#9474;                         26K tokens. Best efficiency frontier.       &#9474;     &#9553;
&#9553;  &#9474;  H8  Simulated HITL &#8595; &#183; Simulated review &#8800; human review.            &#9474;     &#9553;
&#9553;  &#9474;  H9  Sub-agent swarm&#8595; &#183; NOT INSPECTABLE. Orchestrator averages.     &#9474;     &#9553;
&#9553;  &#9474;                         Every criterion 0.5. 4&#215; tokens. &#945; &#8595;         &#9474;     &#9553;
&#9553;  &#9474;  H10 Meta-harness   &#9733; &#183; Reads causes, not outputs. Finds            &#9474;     &#9553;
&#9553;  &#9474;                         structural flaws invisible in final briefs.  &#9474;     &#9553;
&#9553;  &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;     &#9553;
&#9553;                              &#9474;                                               &#9553;
&#9553;  LAYER 3 &#8212; SWAPPABLE INTELLIGENCE                        [ONE .env LINE]     &#9553;
&#9553;  &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;     &#9553;
&#9553;  &#9474;  DEFAULT_MODEL &#183; LIGHT_MODEL &#183; SCORER_MODEL                         &#9474;     &#9553;
&#9553;  &#9474;  Model is commodity. Harness is differentiator.                     &#9474;     &#9553;
&#9553;  &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;     &#9553;
&#9562;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9565;
                                        &#9474;
                                        &#9660;
&#9556;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9559;
&#9553;  STEP 4 &#8212; SCALING: MULTI-AGENT COORDINATION                                  &#9553;
&#9553;  Code as Harness roles &#215; ASCRS agent topology &#215; MAST failure points         &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9574;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9574;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  ROLE                &#9553;  HARNESS FUNCTION          &#9553;  ASCRS OBSERVED          &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  Manager /           &#9553;  Orchestrates agents.      &#9553;  Must SELECT best        &#9553;
&#9553;  Orchestrator        &#9553;  Dispatch and sync.        &#9553;  output &#8212; not average.   &#9553;
&#9553;                      &#9553;  Weighted quorum policy.   &#9553;  Quorum fires on signal  &#9553;
&#9553;                      &#9553;                            &#9553;  weight, not count.      &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  Planner             &#9553;  Pre-run decomposition.    &#9553;  G4 must lock before     &#9553;
&#9553;                      &#9553;  Structural grounding.     &#9553;  G7 is written.          &#9553;
&#9553;                      &#9553;  Dependency declaration.   &#9553;  Item ordering is code,  &#9553;
&#9553;                      &#9553;                            &#9553;  not agent judgment.     &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  Coder / Researcher  &#9553;  Tool calls, data          &#9553;  Market researcher       &#9553;
&#9553;                      &#9553;  retrieval, specialist     &#9553;  carries strongest       &#9553;
&#9553;                      &#9553;  domain execution.         &#9553;  signal (r = 0.81).      &#9553;
&#9553;                      &#9553;                            &#9553;  Needs 90s grace         &#9553;
&#9553;                      &#9553;                            &#9553;  before quorum fires.    &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  Reviewer            &#9553;  Verification before       &#9553;  Must check logical      &#9553;
&#9553;                      &#9553;  output leaves node.       &#9553;  dependency &#8212; not        &#9553;
&#9553;                      &#9553;  Blocks merge on failure.  &#9553;  keyword presence.       &#9553;
&#9553;                      &#9553;                            &#9553;  SA_reviewer failed:     &#9553;
&#9553;                      &#9553;                            &#9553;  confirmed numbers       &#9553;
&#9553;                      &#9553;                            &#9553;  present, not coherent.  &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  Red-team /          &#9553;  Adversarial testing.      &#9553;  Not yet in ASCRS.       &#9553;
&#9553;  Debate agent        &#9553;  Challenges verdicts.      &#9553;  Recommended next        &#9553;
&#9553;                      &#9553;  Multi-agent evaluation.   &#9553;  extension (Devil's      &#9553;
&#9553;                      &#9553;                            &#9553;  Advocate skill file).   &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9580;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;  Meta-Harness        &#9553;  Reads execution traces.   &#9553;  Proposes harness        &#9553;
&#9553;  (Proposer)          &#9553;  Forms causal hypotheses.  &#9553;  edits. Finds what       &#9553;
&#9553;                      &#9553;  Proposes structural fixes.&#9553;  output inspection       &#9553;
&#9553;                      &#9553;                            &#9553;  cannot see.             &#9553;
&#9562;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9577;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9577;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9565;
                                        &#9474;
                                        &#9660;
&#9556;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9559;
&#9553;  PRODUCTION READINESS GATE                                                   &#9553;
&#9568;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9571;
&#9553;                                                                              &#9553;
&#9553;  EXECUTABLE?                                                                 &#9553;
&#9553;  &#9633; Domain constraints written before architecture chosen                     &#9553;
&#9553;  &#9633; Ground truth defined independently of experiments                         &#9553;
&#9553;  &#9633; Success criteria falsifiable, not aspirational                            &#9553;
&#9553;                                                                              &#9553;
&#9553;  INSPECTABLE?                                                                &#9553;
&#9553;  &#9633; Full trajectory captured &#8212; every tool call, handoff, state change         &#9553;
&#9553;  &#9633; SCORER_MODEL differs from DEFAULT_MODEL                                   &#9553;
&#9553;  &#9633; Traces structured for analysis, not just logged for storage               &#9553;
&#9553;                                                                              &#9553;
&#9553;  STATEFUL?                                                                   &#9553;
&#9553;  &#9633; Memory gated before re-injection &#8212; flawed memory is worse than none       &#9553;
&#9553;  &#9633; Provisional memory tier active at low threshold (early signals)           &#9553;
&#9553;  &#9633; Outcome fields written post-event (compounding value requires this)       &#9553;
&#9553;                                                                              &#9553;
&#9553;  GOVERNED?                                                                   &#9553;
&#9553;  &#9633; Reviewer verifies logical dependency, not keyword presence                &#9553;
&#9553;  &#9633; Human gate (G7/G8) required before consequential writes                   &#9553;
&#9553;  &#9633; When gate fails: fix only what failed. Preserve what passed.              &#9553;
&#9553;  &#9633; Orchestrator selects best output &#8212; does not average all outputs           &#9553;
&#9553;                                                                              &#9553;
&#9553;  &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;  &#9553;
&#9553;  &lt; 8 checks: Not production-ready. Accuracy scores are false confidence.     &#9553;
&#9553;  8&#8211;10 checks: Partial. Identify which gaps have highest blast radius.        &#9553;
&#9553;  All checks: Production-ready. Run red-team before live deployment.          &#9553;
&#9553;                                                                              &#9553;
&#9553;  ONE OPEN QUESTION NONE OF THIS SOLVES:                                      &#9553;
&#9553;  Outcome closure. The system can be built correctly and still fail           &#9553;
&#9553;  to learn &#8212; if humans don't write back what actually happened.               &#9553;
&#9553;  This is the highest-value single improvement available. It is not           &#9553;
&#9553;  an engineering problem. It is an operational discipline problem.            &#9553;
&#9562;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9565;
</code></code></pre><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Prompt Is Not the Architecture — But It Still Governs How the System Reasons]]></title><description><![CDATA[New evidence from Anthropic&#8217;s Prompting Playbook and independent benchmarks &#8212; reconciled with the ASCRS Architecture of Awareness and Harness Lab findings]]></description><link>https://interestingengineering.substack.com/p/the-prompt-is-not-the-architecture</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/the-prompt-is-not-the-architecture</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Sun, 24 May 2026 01:20:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!6RqV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6RqV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6RqV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png 424w, https://substackcdn.com/image/fetch/$s_!6RqV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png 848w, https://substackcdn.com/image/fetch/$s_!6RqV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png 1272w, https://substackcdn.com/image/fetch/$s_!6RqV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6RqV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png" width="1157" height="625" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:625,&quot;width&quot;:1157,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:979237,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6RqV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png 424w, https://substackcdn.com/image/fetch/$s_!6RqV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png 848w, https://substackcdn.com/image/fetch/$s_!6RqV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png 1272w, https://substackcdn.com/image/fetch/$s_!6RqV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3d7ae9a-ef50-49bc-b34e-8647fd24b23d_1157x625.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The number of times I have had friends/associates come up to me with &#8220;I don&#8217;t prompt anymore. Claude Code/Hermes-Agent/OpenClaw/Codex does everything while i sleep&#8221;&#8230;. Not so fast!</p><p>This piece is a follow-up to two earlier articles, referenced below. It synthesises new external evidence and applies it back to findings from those experiments. No conclusions from the original work are reversed. Several are sharpened. Two are extended in ways the original experimental design did not capture.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bfAI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065287dd-7a46-4190-8a02-61003763672b_1157x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bfAI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065287dd-7a46-4190-8a02-61003763672b_1157x630.png 424w, https://substackcdn.com/image/fetch/$s_!bfAI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065287dd-7a46-4190-8a02-61003763672b_1157x630.png 848w, https://substackcdn.com/image/fetch/$s_!bfAI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065287dd-7a46-4190-8a02-61003763672b_1157x630.png 1272w, https://substackcdn.com/image/fetch/$s_!bfAI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065287dd-7a46-4190-8a02-61003763672b_1157x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bfAI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065287dd-7a46-4190-8a02-61003763672b_1157x630.png" width="1157" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/065287dd-7a46-4190-8a02-61003763672b_1157x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1157,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1082973,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065287dd-7a46-4190-8a02-61003763672b_1157x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bfAI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065287dd-7a46-4190-8a02-61003763672b_1157x630.png 424w, https://substackcdn.com/image/fetch/$s_!bfAI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065287dd-7a46-4190-8a02-61003763672b_1157x630.png 848w, https://substackcdn.com/image/fetch/$s_!bfAI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065287dd-7a46-4190-8a02-61003763672b_1157x630.png 1272w, https://substackcdn.com/image/fetch/$s_!bfAI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065287dd-7a46-4190-8a02-61003763672b_1157x630.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>The Two Articles This Updates:</strong></p><p>1.<a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design"> The Architecture of Awareness (April 2026) </a>&#8212; a design case study tracing four versions of an AI-powered pharmaceutical supply chain crisis response system (ASCRS). Each version addressed specific failures in the last. V1: sequential, no gates. V2: parallel with quorum. V3: correction loop. V4: Meta-Harness reading internal execution traces. </p><p>Evaluation method:</p><p>CFO outcome (could the rerouting brief be approved in time to book air freight?). Domain: Strait of Hormuz disruption, March 2026, 23 purchase orders, &#8364;15M pharmaceutical cargo.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;93e84e41-f04c-4f20-8e8d-f1463e5efbd9&quot;,&quot;caption&quot;:&quot;Before You Read: A Structural Introduction&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Architecture of Awareness: Design Considerations Of A Shipper's Agentic Logic&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-04-22T17:10:21.275Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!iyM4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e932de-491b-473e-864f-65b45369eacb_1408x768.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/the-architecture-of-awareness-design&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:194979383,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:1,&quot;comment_count&quot;:0,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>2. <a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">ASCRS Harness Lab (May 2026) </a>&#8212; a controlled benchmark experiment running ten distinct harness architectures (H1&#8211;H10) against the same task, model, and data. Scored by a separate scorer model against six rubric criteria (&#945; = 0.0&#8211;1.0). Winner: H2, a structured system prompt, &#945; = 1.000. H9 (five-agent swarm) scored 0.625 &#8212; below the bare baseline.</p><p>Core tension resolved: H2 wins on single-turn document tasks; V4 is needed on continuous brownfield operations with real-time data and external state.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;e2bb4bc5-91bb-47ce-81b8-8830ace428d1&quot;,&quot;caption&quot;:&quot;Had some time on my hands, and applied the features of The Harness Experiment(s) to the Architecture of Awareness design considerations. You will remember from The Harness Experiment (applied to a mini vendor analysis case study) that the results presented as follows:&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;ASCRS Harness Lab - The Integrated Agentic Stack: When Does More Architecture Mean Better AI? A Diagnostic Teardown&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-16T17:52:19.700Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!cv0d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:198013155,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:2,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h1>Refreshing Evidence</h1><p>Two YouTube resources have since added relevant empirical content. Neither contradicts the prior work. Both supply mechanisms and vocabulary the experiments were working toward but had not yet named.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2dIg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2dIg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png 424w, https://substackcdn.com/image/fetch/$s_!2dIg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png 848w, https://substackcdn.com/image/fetch/$s_!2dIg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png 1272w, https://substackcdn.com/image/fetch/$s_!2dIg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2dIg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png" width="1166" height="629" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:629,&quot;width&quot;:1166,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:993531,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2dIg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png 424w, https://substackcdn.com/image/fetch/$s_!2dIg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png 848w, https://substackcdn.com/image/fetch/$s_!2dIg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png 1272w, https://substackcdn.com/image/fetch/$s_!2dIg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F318871fa-5d43-4365-9cb7-21676bef6693_1166x629.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Source 1 &#8212; Anthropic Prompting Playbook, Code w/ Claude (May 2026)</h2><p>Presented by Margot Van laar (Anthropic Member, Technical Staff). Two case studies: a <strong>brownfield</strong> customer support bot (Meridian Mobile) iterated from v0 to v5, and a <strong>greenfield</strong> retail staff scheduling task tested across five architectural approaches. Full video: <a href="https://youtu.be/G2B0YWuJUgI?si=2--69ppcC-RbE4SL">youtube.com/watch?v=G2B0YWuJUgI</a></p><div id="youtube2-G2B0YWuJUgI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;G2B0YWuJUgI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/G2B0YWuJUgI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>Source 2 &#8212; Nate B Jones, &#8216;Prompting Just Split Into 4 Skills&#8217; (Feb &amp; May 2026)</h2><p>A 41-minute framework video arguing that &#8216;prompting&#8217; has fragmented into four distinct disciplines, only one of which most practitioners are actively developing. Full videos: <a href="https://youtu.be/BpibZSMGtdY?si=tObArfLLu2pwvCER">youtube.com</a> and  <a href="https://www.youtube.com/watch?v=ogTLWGBc3cE">youtube.com/watch?v=ogTLWGBc3cE</a> </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HRtO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HRtO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png 424w, https://substackcdn.com/image/fetch/$s_!HRtO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png 848w, https://substackcdn.com/image/fetch/$s_!HRtO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png 1272w, https://substackcdn.com/image/fetch/$s_!HRtO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HRtO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png" width="1119" height="604" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:604,&quot;width&quot;:1119,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:930855,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HRtO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png 424w, https://substackcdn.com/image/fetch/$s_!HRtO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png 848w, https://substackcdn.com/image/fetch/$s_!HRtO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png 1272w, https://substackcdn.com/image/fetch/$s_!HRtO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fc59e7-c5b4-4d52-b7fc-fe37627c6f5c_1119x604.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div id="youtube2-BpibZSMGtdY" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;BpibZSMGtdY&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/BpibZSMGtdY?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div id="youtube2-ogTLWGBc3cE" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;ogTLWGBc3cE&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/ogTLWGBc3cE?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><strong>The Core Shift: From Prompting to Questioning</strong><br>Nate introduces the <strong>AI Question Method</strong>, which shifts the mental model from treating AI as a junior teammate to treating it as a <strong>senior partner</strong> (2:48-4:05). To succeed in high-leverage knowledge work, you must move beyond simple task definitions and engage in a dialogue of deep inquiry.</p><p><strong>Three Principles for the AI Question Method</strong></p><ol><li><p><strong>The Flashlight Intent (10:05-12:30):</strong> Like a good manager, you must clearly define your &#8220;center of the flashlight&#8221;&#8212;your specific thesis or perspective&#8212;while allowing the AI room to explore the edges. Don&#8217;t just ask open-ended questions; provide direction and hard boundaries (e.g., what to exclude).</p></li><li><p><strong>Asking What Good Looks Like (13:10-16:20):</strong> Rather than relying solely on automated evals, use complex, layered questions to define success. Use the AI to wrestle with outcomes, such as synthesizing a PRFAQ (Press Release/FAQ) for a new product, which requires balancing multiple perspectives and constraints.</p></li><li><p><strong>Wrestling with Data and Opinions (17:04-21:30):</strong> Effectively leverage your AI by grouping related data (files, transcripts, metrics) into a single context and posing questions that bridge the gap between hard data and your strategic assumptions. Challenge the AI to synthesize these into a coherent, evidence-based thesis.</p></li></ol><p><strong>Key Takeaway</strong><br>Your goal is no longer to &#8220;prompt&#8221; the AI with specific instructions but to facilitate a sophisticated partnership. By using sharp, layered, and intent-driven questions, you unlock the ability to perform high-level knowledge work, turning the AI into a collaborator that can push back, synthesize, and provide deeper insights than ever before (23:45-25:04).</p><h2>Case Study Summaries</h2><h3>Meridian Mobile &#8212; Brownfield Prompt Iteration</h3><p>A customer support bot prompt was evolved iteratively across six versions (v0&#8211;v5), tracked live against five test scenarios (control, proration, prepaid, billing_error, hotspot), n=6 per cell, using claude-sonnet-4-6.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ekF7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ekF7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ekF7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ekF7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ekF7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ekF7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg" width="1456" height="827" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:827,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:219535,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ekF7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ekF7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ekF7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ekF7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87d30446-b63e-4f12-bdb1-3d9c0e0ed68a_1901x1080.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ah7M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ah7M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png 424w, https://substackcdn.com/image/fetch/$s_!ah7M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png 848w, https://substackcdn.com/image/fetch/$s_!ah7M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png 1272w, https://substackcdn.com/image/fetch/$s_!ah7M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ah7M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png" width="780" height="453" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:453,&quot;width&quot;:780,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42513,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ah7M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png 424w, https://substackcdn.com/image/fetch/$s_!ah7M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png 848w, https://substackcdn.com/image/fetch/$s_!ah7M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png 1272w, https://substackcdn.com/image/fetch/$s_!ah7M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F387fb72c-f23a-43e0-a23e-4cade282d52c_780x453.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nfGQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nfGQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nfGQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nfGQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nfGQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nfGQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg" width="1080" height="2558" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2558,&quot;width&quot;:1080,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:903216,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nfGQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nfGQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nfGQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nfGQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35dce591-c102-4e6e-8843-294a926b6173_1080x2558.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rsAx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rsAx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rsAx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rsAx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rsAx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rsAx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg" width="1080" height="2198" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2198,&quot;width&quot;:1080,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:727478,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rsAx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rsAx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rsAx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rsAx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f9a706-1e36-4331-ab11-f3f645a6c326_1080x2198.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Retail Scheduling &#8212; Greenfield Agent Comparison</h3><p>Five architectural approaches tested against a staff scheduling constraint-satisfaction task. Scored by deterministic checker (rule violations per run).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZFEb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZFEb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ZFEb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ZFEb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ZFEb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZFEb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg" width="1456" height="817" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:817,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:295312,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZFEb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ZFEb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ZFEb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ZFEb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e292add-c1ac-47b5-a7ff-cbab801e41ed_1924x1080.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6iFG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6iFG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg 424w, https://substackcdn.com/image/fetch/$s_!6iFG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg 848w, https://substackcdn.com/image/fetch/$s_!6iFG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!6iFG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6iFG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg" width="1078" height="962" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:962,&quot;width&quot;:1078,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:382592,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6iFG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg 424w, https://substackcdn.com/image/fetch/$s_!6iFG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg 848w, https://substackcdn.com/image/fetch/$s_!6iFG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!6iFG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bf1902c-14c5-406e-ad93-957f9b9e333a_1078x962.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aLd1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aLd1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png 424w, https://substackcdn.com/image/fetch/$s_!aLd1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png 848w, https://substackcdn.com/image/fetch/$s_!aLd1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png 1272w, https://substackcdn.com/image/fetch/$s_!aLd1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aLd1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png" width="788" height="417" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:417,&quot;width&quot;:788,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30202,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aLd1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png 424w, https://substackcdn.com/image/fetch/$s_!aLd1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png 848w, https://substackcdn.com/image/fetch/$s_!aLd1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png 1272w, https://substackcdn.com/image/fetch/$s_!aLd1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee2b3a5-f5ed-47e1-aca0-5ecd210056be_788x417.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NWkf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NWkf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png 424w, https://substackcdn.com/image/fetch/$s_!NWkf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png 848w, https://substackcdn.com/image/fetch/$s_!NWkf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png 1272w, https://substackcdn.com/image/fetch/$s_!NWkf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NWkf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png" width="1141" height="612" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ebcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:612,&quot;width&quot;:1141,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:999711,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NWkf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png 424w, https://substackcdn.com/image/fetch/$s_!NWkf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png 848w, https://substackcdn.com/image/fetch/$s_!NWkf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png 1272w, https://substackcdn.com/image/fetch/$s_!NWkf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febcf6411-357c-48f6-8cad-ff5a6896b7ce_1141x612.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Nate B Jones &#8212; The Four-Discipline Framework</h2><p>Jones argues that &#8216;prompting&#8217; has split into four distinct skills. Most practitioners are developing only the first. The gap between those practicing all four and those practicing one is described as 10x and compounding, particularly once agents operate autonomously.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yU4B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yU4B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png 424w, https://substackcdn.com/image/fetch/$s_!yU4B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png 848w, https://substackcdn.com/image/fetch/$s_!yU4B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png 1272w, https://substackcdn.com/image/fetch/$s_!yU4B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yU4B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png" width="1044" height="564" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:564,&quot;width&quot;:1044,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:823372,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yU4B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png 424w, https://substackcdn.com/image/fetch/$s_!yU4B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png 848w, https://substackcdn.com/image/fetch/$s_!yU4B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png 1272w, https://substackcdn.com/image/fetch/$s_!yU4B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feed45413-0f00-48f8-aac9-bee4fefd6374_1044x564.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>The 35-minute wall: Jones observes that once agents run autonomously beyond ~35 minutes without check-ins, all chat-based prompting assumptions break. Everything must be encoded in the specification before the agent starts.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RCx0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RCx0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png 424w, https://substackcdn.com/image/fetch/$s_!RCx0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png 848w, https://substackcdn.com/image/fetch/$s_!RCx0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png 1272w, https://substackcdn.com/image/fetch/$s_!RCx0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RCx0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png" width="1130" height="599" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:599,&quot;width&quot;:1130,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1108402,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RCx0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png 424w, https://substackcdn.com/image/fetch/$s_!RCx0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png 848w, https://substackcdn.com/image/fetch/$s_!RCx0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png 1272w, https://substackcdn.com/image/fetch/$s_!RCx0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dfe2c42-6bbc-4573-9a5f-72987bf59a13_1130x599.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Reconciliation and Thoughts on the Two Articles (The Architecture of Awareness and ASCRS The Harness Lab)</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BUaj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BUaj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png 424w, https://substackcdn.com/image/fetch/$s_!BUaj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png 848w, https://substackcdn.com/image/fetch/$s_!BUaj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png 1272w, https://substackcdn.com/image/fetch/$s_!BUaj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BUaj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png" width="1131" height="594" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:594,&quot;width&quot;:1131,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1100852,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BUaj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png 424w, https://substackcdn.com/image/fetch/$s_!BUaj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png 848w, https://substackcdn.com/image/fetch/$s_!BUaj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png 1272w, https://substackcdn.com/image/fetch/$s_!BUaj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e7eaea3-f46f-4291-af4e-d4be37b8f4a1_1131x594.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>What Is Reinforced:</h3><p>&#8226; <strong>H2&#8217;s win is now fully explained. </strong>The system prompt that achieved &#945; = 1.000 was not doing prompt craft alone. It embedded all four of Jones&#8217;s disciplines simultaneously: domain facts (context), CFO-approvable document framing (intent), and explicit G4/G7 ordering with carrier constraints (specification). H9 failed because sub-agents only did prompt craft.</p><p>&#8226; <strong>The V1&#8594;V5 Meridian Mobile demo mirrors V1&#8594;V4 exactly. </strong>Both are iterative debugging sequences, not controlled comparisons. Both surface the same lesson: each version fixes one specific failure mode without rewriting everything else.</p><p>&#8226; <strong>The Generate&#8594;Evaluate&#8594;Repair loop is now benchmarked. </strong>The greenfield scheduling eval confirms it achieves maximum pass rate at 30% fewer tokens and lower latency than adaptive thinking, using a cheaper model. This validates the V3 correction loop architecture quantitatively for the first time.</p><p>&#8226; <strong>Ordering instructions = G4/G7 dependency fix. </strong>Anthropic&#8217;s V4 demo explicitly demonstrates that step order in instructions mirrors human reasoning order. Meta-Harness Discovery 02 (item 3 must lock before item 6 is written) is the same principle enforced at harness level.</p><p>&#8226; <strong>Brownfield/greenfield distinction holds. </strong>Scheduling eval confirms: complex prompts fail on hard constraint tasks (2/5). GER loop or adaptive thinking is required. Operational ASCRS with real-time data feeds is definitively brownfield: the V4 architecture is the correct deployment choice.</p><p>&#8226; <strong>Extended Thinking as diagnostic = Meta-Harness trace analysis. </strong>Anthropic recommends using Extended Thinking to understand where the model struggles, then encoding that reasoning as explicit instructions. The Meta-Harness automates this: reads execution traces across all prior runs, proposes harness edits. Same mechanism, scaled.</p><h3>What Is Extended (Findings Not Included in the Original Articles)</h3><p>&#8226; <strong>The instruction-to-tool-call transition. </strong>Meridian Mobile&#8217;s proration case shows that &#8216;do math correctly&#8217; is not a fixable prompt engineering problem &#8212; it is a task-type mismatch. Arithmetic should never live in the model; it should live in a deterministic function. Implication for ASCRS: anywhere the system derives a number (weighted median planning basis, freight rate calculations), a calculate_weighted_median tool call would be structurally more reliable than instruction-based derivation. This is a direct architectural gap the original design does not address.</p><p>&#8226; <strong>Legacy patches as active harm. </strong>The Meridian Mobile demo identifies instructions added to fix older model limitations that became harmful on newer, more capable models. A defensive instruction designed as a guardrail caused the model to withhold information. Any CLAUDE.md or SKILL.md instructions written against Claude 3.x or earlier Sonnet versions should be audited for this failure mode before production deployment on Sonnet 4.6 or later.</p><p>&#8226; <strong>Stop sequences as API-level output contract. </strong>The Meridian Mobile v2 step adds a stop sequence in the API call (not the prompt) that halts generation at the closing XML tag. This eliminates post-response padding before it reaches the quality gate. For ASCRS, this could reduce gate failures caused by the Strategist generating supplementary content after the required structured elements, reducing loop iterations without changing the rubric.</p><p>&#8226; <strong>Intent engineering is the open gap in ASCRS. </strong>Objective 6 (institutional memory that compounds) remains unverified because outcome fields are placeholders. Jones&#8217;s framing names this precisely: the system has prompt craft (the rubric), context engineering (memory tiers), and specification engineering (gate checklist), but the intent layer &#8212; comparing predictions to actuals and calibrating future scores accordingly &#8212; is not closed. This is an operational discipline requirement, not an engineering project.</p><p>&#8226; <strong>Trade-offs vs. constraints. </strong>Replacing the billing escalation rigid rule with cost-benefit framing (escalation costs X; failing to escalate costs more in refunds and trust) improved both compliance and quality. For ASCRS harness design, binary gate items specified as hard constraints may benefit from being reframed as trade-off framings in the agent instructions, with the binary gate retained for verification purposes only.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!27em!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!27em!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png 424w, https://substackcdn.com/image/fetch/$s_!27em!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png 848w, https://substackcdn.com/image/fetch/$s_!27em!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png 1272w, https://substackcdn.com/image/fetch/$s_!27em!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!27em!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png" width="1138" height="612" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:612,&quot;width&quot;:1138,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1053919,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!27em!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png 424w, https://substackcdn.com/image/fetch/$s_!27em!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png 848w, https://substackcdn.com/image/fetch/$s_!27em!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png 1272w, https://substackcdn.com/image/fetch/$s_!27em!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe46377ec-21cf-4a12-b16e-03d58ab23c23_1138x612.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0OQL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0OQL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png 424w, https://substackcdn.com/image/fetch/$s_!0OQL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png 848w, https://substackcdn.com/image/fetch/$s_!0OQL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png 1272w, https://substackcdn.com/image/fetch/$s_!0OQL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0OQL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png" width="1138" height="615" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:615,&quot;width&quot;:1138,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1118882,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0OQL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png 424w, https://substackcdn.com/image/fetch/$s_!0OQL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png 848w, https://substackcdn.com/image/fetch/$s_!0OQL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png 1272w, https://substackcdn.com/image/fetch/$s_!0OQL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F707a4cdd-2d4e-4228-9dfa-51732ed1f2c0_1138x615.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Updated Integrated Agentic Stack Framework</h2><p>The original Deployment Matrix (Spec vs. Swarm / H2 vs. V4) is retained. The framework is extended with three new dimensions from the case studies above. The following ASCII diagram replaces/Updates the original.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ccg_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ccg_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png 424w, https://substackcdn.com/image/fetch/$s_!Ccg_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png 848w, https://substackcdn.com/image/fetch/$s_!Ccg_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png 1272w, https://substackcdn.com/image/fetch/$s_!Ccg_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ccg_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png" width="1135" height="613" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:613,&quot;width&quot;:1135,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1150297,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ccg_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png 424w, https://substackcdn.com/image/fetch/$s_!Ccg_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png 848w, https://substackcdn.com/image/fetch/$s_!Ccg_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png 1272w, https://substackcdn.com/image/fetch/$s_!Ccg_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec90c3-558e-40be-beb5-5b31d97de70d_1135x613.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><pre><code><code>  THE INTEGRATED AGENTIC STACK  (updated May 2026)
  &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;

  LAYER 0 &#9472; HAAS RUNTIME (Claude Code / Hermes Agent)
  &#9474; Orchestrates experiments. Reads filesystem.
  &#9474; Never inside inner loop. Silent during H1&#8211;H9 runs.
  &#9474;
  &#9500;&#9472; *NEW: Audit CLAUDE.md for legacy patch harm on model upgrades
  &#9492;&#9472; *NEW: Configure API stop sequences as output contract (not prompt-layer)

  LAYER 1 &#9472; SHARED INFRASTRUCTURE (Immutable Control Layer)
  &#9474; task.md &#183; rubric.json &#183; gold_answer.md &#183; scorer.js
  &#9474; SCORER_MODEL must always differ from DEFAULT_MODEL
  &#9474;
  &#9492;&#9472; *NEW: Replace arithmetic instructions with deterministic tool calls
          (calculate_weighted_median, calculate_proration, etc.)

  LAYER 2 &#9472; EXPERIMENT HARNESS (H1&#8211;H10 / V1&#8211;V4)
  &#9474;
  &#9500;&#9472; SPEC KERNEL (H2 &#8212; Greenfield, all data upfront)
  &#9474;   &#9474; High-density system prompt encoding all four disciplines:
  &#9474;   &#9474;   1. Prompt Craft   &#8212; structure, XML, stop sequence
  &#9474;   &#9474;   2. Context Eng.   &#8212; domain facts, static background in sys prompt
  &#9474;   &#9474;   3. Intent Eng.    &#8212; state the outcome, not just the task
  &#9474;   &#9474;   4. Spec Eng.      &#8212; acceptance criteria + constraint architecture
  &#9474;   &#9492;&#9472; *NEW: Trade-off framing &gt; rigid constraints for judgment calls
  &#9474;
  &#9500;&#9472; CORRECTION LOOP (V3 / GER &#8212; constraint-heavy tasks)
  &#9474;   Generator &#8594; Evaluator (report violations) &#8594; Repairer (targeted fix)
  &#9474;   Benchmarked: 5/5 pass &#183; 6,449 tokens &#183; 79.4s on Sonnet 4.6
  &#9474;   Outperforms adaptive thinking on cost; equivalent on quality
  &#9474;
  &#9492;&#9472; META-HARNESS (V4 &#8212; Brownfield, operational, real-time)
      &#9474; Reads all execution traces across prior runs
      &#9474; Forms causal hypotheses. Proposes harness edits.
      &#9474; Three validated discoveries:
      &#9474;   D01: Weighted quorum (market data gets grace period)
      &#9474;   D02: Sequential item evaluation (G4 locked before G7 written)
      &#9474;   D03: Provisional memory tier (20% threshold, 30-day TTL)
      &#9474;
      &#9492;&#9472; OPEN: Closed outcome loop (Objective 6 &#8212; intent engineering gap)
               Post-event outcome fields must be populated by humans.
               Until closed: every confidence score is uncalibrated.

  LAYER 3 &#9472; ENV / SWAPPABLE INTELLIGENCE
  &#9474; One line in .env &#183; Model is commodity &#183; Harness is differentiator
  &#9492;&#9472; Routing: Opus 4.7 for outer-loop Proposer; Sonnet 4.6 for inner runs
              Light model for classification sub-tasks (H7 pattern)

  &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;
  DEPLOYMENT DECISION RULE (updated)
  &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;

  Q1: Is all data present upfront?
      YES &#8594; Use H2 Spec Kernel (invest in all 4 prompt disciplines)
      NO  &#8594; Go to Q2

  Q2: Does the task require hard constraint verification?
      YES (data present) &#8594; Use GER loop on Sonnet 4.6
      NO                 &#8594; Go to Q3

  Q3: Does the agent run autonomously beyond ~35 minutes
      with external state, ERP writes, or multi-source intelligence?
      YES &#8594; Use V4 Meta-Harness (swarm + structural auditing loops)
      NO  &#8594; Use V3 Correction Loop as production baseline

  &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;
  KEY PRINCIPLES

  The design governs    what the system can see.
  The prompt, /goal governs    how it reasons and takes action.
  The tool call governs what it calculates.          &#8592; NEW*
  &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;
</code></code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9-Lu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9-Lu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png 424w, https://substackcdn.com/image/fetch/$s_!9-Lu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png 848w, https://substackcdn.com/image/fetch/$s_!9-Lu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png 1272w, https://substackcdn.com/image/fetch/$s_!9-Lu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9-Lu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png" width="1048" height="619" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:619,&quot;width&quot;:1048,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:967253,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9-Lu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png 424w, https://substackcdn.com/image/fetch/$s_!9-Lu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png 848w, https://substackcdn.com/image/fetch/$s_!9-Lu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png 1272w, https://substackcdn.com/image/fetch/$s_!9-Lu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd49350c-b2e2-4dda-83d1-cd106306064f_1048x619.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KI0l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KI0l!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png 424w, https://substackcdn.com/image/fetch/$s_!KI0l!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png 848w, https://substackcdn.com/image/fetch/$s_!KI0l!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png 1272w, https://substackcdn.com/image/fetch/$s_!KI0l!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KI0l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png" width="1120" height="614" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:614,&quot;width&quot;:1120,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:905435,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/199018115?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KI0l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png 424w, https://substackcdn.com/image/fetch/$s_!KI0l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png 848w, https://substackcdn.com/image/fetch/$s_!KI0l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png 1272w, https://substackcdn.com/image/fetch/$s_!KI0l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38a1f1b3-c77d-4c38-9d07-1fdc4fe2d352_1120x614.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[THE TOKEN TAX ]]></title><description><![CDATA[Harness Engineering Part II: From Performance to Efficiency]]></description><link>https://interestingengineering.substack.com/p/the-token-tax</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/the-token-tax</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Sat, 23 May 2026 18:06:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!FIkl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FIkl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FIkl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png 424w, https://substackcdn.com/image/fetch/$s_!FIkl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png 848w, https://substackcdn.com/image/fetch/$s_!FIkl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png 1272w, https://substackcdn.com/image/fetch/$s_!FIkl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FIkl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png" width="1191" height="651" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:651,&quot;width&quot;:1191,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1450520,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FIkl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png 424w, https://substackcdn.com/image/fetch/$s_!FIkl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png 848w, https://substackcdn.com/image/fetch/$s_!FIkl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png 1272w, https://substackcdn.com/image/fetch/$s_!FIkl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c8581d-3977-4ea9-beda-1ff2c537a53b_1191x651.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Where Part I Left Off,..con&#8217;t</h2><p><a href="https://interestingengineering.substack.com/p/harness-engineering-scaffolding-a">Part I of yesterday ran a controlled experiment</a>. Starting with Claude Haiku &#8212; the smallest, cheapest model in the Claude family &#8212; five harness layers were added one at a time: an output schema, a context file, a verifier agent, a feedback loop, and a gold standard memory injection. The model never changed. The harness did. V5 scored 50 out of 50.</p><blockquote><p><em>Part I proved that harness design matters more than model choice. Part II set out to prove that efficient harness design matters as much as capable harness design. The second claim is harder to demonstrate cleanly &#8212; as two, then three, live experiment runs will show. </em></p></blockquote><p>This article documents what actually happened across three runs: what the predictions were, which ones were confirmed, which were contradicted, and what the lab produced despite its imperfections. Honest about limitations. Useful on principles. I keep these mostly as notes for self. None of the learnings are meant to be prescriptive, but lab runs whilst fun create very different push-pull events from especially larger corporate browfield exercises. But &#8220;turning knobs&#8221; on these, give interesting learning outcomes.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Note that different models were applied here.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rod6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rod6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png 424w, https://substackcdn.com/image/fetch/$s_!rod6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png 848w, https://substackcdn.com/image/fetch/$s_!rod6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png 1272w, https://substackcdn.com/image/fetch/$s_!rod6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rod6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png" width="1191" height="652" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:652,&quot;width&quot;:1191,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1021024,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rod6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png 424w, https://substackcdn.com/image/fetch/$s_!rod6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png 848w, https://substackcdn.com/image/fetch/$s_!rod6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png 1272w, https://substackcdn.com/image/fetch/$s_!rod6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ba928f4-4739-4b3f-810b-e5f01cb3a144_1191x652.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lTlF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lTlF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png 424w, https://substackcdn.com/image/fetch/$s_!lTlF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png 848w, https://substackcdn.com/image/fetch/$s_!lTlF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png 1272w, https://substackcdn.com/image/fetch/$s_!lTlF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lTlF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png" width="1198" height="654" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:654,&quot;width&quot;:1198,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1054895,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lTlF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png 424w, https://substackcdn.com/image/fetch/$s_!lTlF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png 848w, https://substackcdn.com/image/fetch/$s_!lTlF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png 1272w, https://substackcdn.com/image/fetch/$s_!lTlF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5958bdfd-dece-4f03-b1ed-3222336c3caa_1198x654.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Setup</h2><h3>Models</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fCPB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb762585-de51-4eb3-b2db-8db99671fb24_787x197.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fCPB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb762585-de51-4eb3-b2db-8db99671fb24_787x197.png 424w, https://substackcdn.com/image/fetch/$s_!fCPB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb762585-de51-4eb3-b2db-8db99671fb24_787x197.png 848w, https://substackcdn.com/image/fetch/$s_!fCPB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb762585-de51-4eb3-b2db-8db99671fb24_787x197.png 1272w, https://substackcdn.com/image/fetch/$s_!fCPB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb762585-de51-4eb3-b2db-8db99671fb24_787x197.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fCPB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb762585-de51-4eb3-b2db-8db99671fb24_787x197.png" width="787" height="197" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db762585-de51-4eb3-b2db-8db99671fb24_787x197.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:197,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:20246,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb762585-de51-4eb3-b2db-8db99671fb24_787x197.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fCPB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb762585-de51-4eb3-b2db-8db99671fb24_787x197.png 424w, https://substackcdn.com/image/fetch/$s_!fCPB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb762585-de51-4eb3-b2db-8db99671fb24_787x197.png 848w, https://substackcdn.com/image/fetch/$s_!fCPB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb762585-de51-4eb3-b2db-8db99671fb24_787x197.png 1272w, https://substackcdn.com/image/fetch/$s_!fCPB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb762585-de51-4eb3-b2db-8db99671fb24_787x197.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Two runs (Note: There is a postnote where a third run was undertaken to rectify the capability issue) were conducted. Run 1 used llama-3.2-3b as LIGHT_MODEL &#8212; a choice that proved too small for the task and invalidated three experiments. Run 2 corrected this to llama-3.1-8b, fixed two code-level design errors, and reran everything. Both runs are documented here because their divergences are themselves informative.</p><h3>The Composite Metric</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sg_j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sg_j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png 424w, https://substackcdn.com/image/fetch/$s_!sg_j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png 848w, https://substackcdn.com/image/fetch/$s_!sg_j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png 1272w, https://substackcdn.com/image/fetch/$s_!sg_j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sg_j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png" width="900" height="139" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:139,&quot;width&quot;:900,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16292,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sg_j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png 424w, https://substackcdn.com/image/fetch/$s_!sg_j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png 848w, https://substackcdn.com/image/fetch/$s_!sg_j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png 1272w, https://substackcdn.com/image/fetch/$s_!sg_j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78bf978-af49-4f6a-9d8b-4dff055952ea_900x139.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Quality carries 70% of the weight &#8212; a cheap bad output is worthless. Efficiency carries 20% &#8212; at scale, token costs determine viability. Speed carries 10%. All values are normalised against the session maximum, so they are comparable across runs within a session but not directly across sessions.</p><h3>Experimental Limitations &#8212; Declared Upfront</h3><p>Two structural constraints shaped what the experiment could and could not show. They are stated here rather than buried in footnotes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GatZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GatZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png 424w, https://substackcdn.com/image/fetch/$s_!GatZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png 848w, https://substackcdn.com/image/fetch/$s_!GatZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png 1272w, https://substackcdn.com/image/fetch/$s_!GatZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GatZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png" width="789" height="311" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:311,&quot;width&quot;:789,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:44902,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GatZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png 424w, https://substackcdn.com/image/fetch/$s_!GatZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png 848w, https://substackcdn.com/image/fetch/$s_!GatZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png 1272w, https://substackcdn.com/image/fetch/$s_!GatZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd035b201-258d-4cb5-bebd-3d9f979d7fcf_789x311.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Despite these constraints, the experiment produced valid findings on five of the seven strategies. The two that failed did so for identifiable design reasons, both of which are themselves instructive. And then rectified, in the third run, as reflected in the Postnote.</p><h2>The Seven Experiments</h2><p>A note for general readers: each experiment changes exactly one thing compared to the baseline, then measures what that single change costs and whether it hurts quality. Think of it as turning one dial at a time on a machine &#8212; each dial is a different way of making the harness cheaper to run. The diagrams below each heading show what the flow looks like before and after the change.</p><h3>Baseline &#8212; The Reference Point</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hlE-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hlE-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png 424w, https://substackcdn.com/image/fetch/$s_!hlE-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png 848w, https://substackcdn.com/image/fetch/$s_!hlE-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png 1272w, https://substackcdn.com/image/fetch/$s_!hlE-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hlE-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png" width="898" height="536" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:536,&quot;width&quot;:898,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32324,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hlE-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png 424w, https://substackcdn.com/image/fetch/$s_!hlE-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png 848w, https://substackcdn.com/image/fetch/$s_!hlE-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png 1272w, https://substackcdn.com/image/fetch/$s_!hlE-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4150efc9-b3f0-46a0-b0c6-4e8ca3eefce9_898x536.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Run 1: 45/50 at 2,376 tokens. Run 2: 50/50 at 2,665 tokens.</strong></p><p>The baseline is V2+V3 configuration: full TASK.md context injection plus a full five-dimension verifier. No retry loop. The baseline is the cheapest harness in the experiment by design &#8212; it has no architectural complexity beyond the two most fundamental layers. This matters for interpreting the results: every architectural addition was being compared against a lean two-call baseline, which set a low token ceiling for efficiency gains.</p><p>The 5-point quality variance between runs (45 vs 50) on identical configurations is the clearest evidence of single-run noise. The verifier model &#8212; DeepSeek V4 Flash &#8212; produced different scores on equivalent briefs across sessions. This does not invalidate the experiment; it is a known property of LLM-as-judge evaluation. It does mean the composite rankings should be read as directional rather than precise.</p><h3>E1 &#8212; Prompt Compression: The Cliff Edge Is Lower Than Expected</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RJE_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RJE_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png 424w, https://substackcdn.com/image/fetch/$s_!RJE_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png 848w, https://substackcdn.com/image/fetch/$s_!RJE_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png 1272w, https://substackcdn.com/image/fetch/$s_!RJE_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RJE_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png" width="898" height="407" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:407,&quot;width&quot;:898,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:61662,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RJE_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png 424w, https://substackcdn.com/image/fetch/$s_!RJE_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png 848w, https://substackcdn.com/image/fetch/$s_!RJE_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png 1272w, https://substackcdn.com/image/fetch/$s_!RJE_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d06db95-4fe7-495e-807e-289286fdebcf_898x407.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>All three levels: 45/50 in Run 2. Tokens: 2,554 (full) &#8594; 2,315 (medium) &#8594; 1,955 (minimal).</strong></p><p>TASK.md was stripped in three stages: full (five named studies with findings, audience profile, evidence standard), medium (study names with key numbers only, evidence standard), and minimal (four study name anchors, two lines). The prediction, drawn from the ACE paper&#8217;s brevity bias finding, was that minimal would cause quality collapse.</p><p>It did not. All three levels scored identically at 45/50 in Run 2, and minimal had scored 50/50 in Run 1. The ACE paper&#8217;s brevity bias is real &#8212; but it applies below the capability level of a frontier-scale model on a well-documented topic. At 120B parameters, McKinsey (2017), WEF Future of Jobs (2023), Autor et al. (2022), and Acemoglu and Restrepo (2022) are already in the model&#8217;s weights. The TASK.md prose was adding formatting overhead, not knowledge signal.</p><blockquote><p><em>The cliff edge exists. It sits below four study-name anchors for a 120B model on this topic. Knowing where the cliff is changes every downstream architectural decision &#8212; you cannot find it without running the ablation.</em></p></blockquote><p><strong>What holds for production: </strong>Run E1&#8217;s compression ablation on any new task before building the retry loop. The minimum viable context level is a task-model pair property. It is not universal, but for capable models on documented topics it is consistently lower than intuition suggests. Every token saved in context injection is a permanent per-task saving that compounds at scale.</p><h3>E2 &#8212; Schema Compression: The One Clean Win</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NdeB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NdeB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png 424w, https://substackcdn.com/image/fetch/$s_!NdeB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png 848w, https://substackcdn.com/image/fetch/$s_!NdeB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png 1272w, https://substackcdn.com/image/fetch/$s_!NdeB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NdeB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png" width="899" height="407" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:407,&quot;width&quot;:899,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54283,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NdeB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png 424w, https://substackcdn.com/image/fetch/$s_!NdeB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png 848w, https://substackcdn.com/image/fetch/$s_!NdeB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png 1272w, https://substackcdn.com/image/fetch/$s_!NdeB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cbd75df-524a-4ae8-9f91-826ed9cf0ac2_899x407.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Run 2: 45/50 at 2,019 tokens. Output tokens: 348. Baseline output tokens: ~600+. Reduction: ~42%.</strong></p><p>The compact table format replaced prose output with a structured grid: one row per claim, THESIS and COUNTER and GAPS as labelled single-line fields. Output token count dropped from roughly 600 to 348 &#8212; a 42% reduction. Total tokens dropped to 2,019, beating baseline by 646 (24%).</p><p>This experiment confirmed its prediction exactly. Output tokens cost the same as input tokens on most providers; Anthropic charges 5&#215; more for output than input on Haiku. A harness that produces the same analytical content in fewer output tokens reduces per-task cost without changing the input side at all.</p><p><strong>The one quality note: </strong>The counterargument dimension is where compact schemas are most vulnerable. A one-cell table entry may contain the correct opposing argument without conveying it with the nuance a prose paragraph provides. In Run 2, E2 scored 45/50 &#8212; the 5-point gap versus baseline is consistent with a slightly weakened counterargument rather than a structural failure. For tasks where the counterargument is analytically critical, test compact schema specifically on that dimension before deploying.</p><p><strong>What holds for production: </strong>Schema compression is the lowest-risk efficiency intervention. It changes output format without touching model selection, context injection, or verification logic. Apply it first. It is the only experiment that beat baseline in total token count without any quality reduction beyond verifier variance.</p><h3>E3 &#8212; Model Routing: The Capability Floor Problem</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zz4L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zz4L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png 424w, https://substackcdn.com/image/fetch/$s_!zz4L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png 848w, https://substackcdn.com/image/fetch/$s_!zz4L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png 1272w, https://substackcdn.com/image/fetch/$s_!zz4L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zz4L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png" width="899" height="345" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:345,&quot;width&quot;:899,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45410,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zz4L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png 424w, https://substackcdn.com/image/fetch/$s_!zz4L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png 848w, https://substackcdn.com/image/fetch/$s_!zz4L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png 1272w, https://substackcdn.com/image/fetch/$s_!zz4L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feab10103-1dcd-45b5-a379-f89c9d4dad06_899x345.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Run 1: 30/50. Run 2: 35/50. Tokens: 2,327 &#8594; 3,265. Generator: llama-3.1-8b.</strong></p><p>H7 in the ASCRS Harness Lab was the most practically efficient architecture: routing cheap classification sub-tasks to a lighter model and synthesis to the capable model produced &#945; = 0.900 at only 26,635 tokens. E3 applied the same principle &#8212; route generation to LIGHT_MODEL, verification to VERIFIER_MODEL.</p><p>Both runs produced quality collapse. Upgrading from 3B to 8B moved the score from 30 to 35 &#8212; improvement in the right direction, but still 15 points below baseline, and token count rose rather than fell because the 8B model produced more verbose output without matching the schema precision of 120B.</p><p>The diagnosis is important: this is not <strong>a failure of the routing principle. It is a failure of model selection within that principle</strong>. The ASCRS H7 result used models much closer in capability &#8212; the cheap model was still capable of its specific sub-task. In this experiment, the generation task (structured analytical brief with named evidence and calibrated confidence) is a reasoning task dressed as a fluency task. <strong>The 120B-to-8B gap of roughly 40&#215; exceeds the capability threshold for this task type.</strong> Note: Read the Postnote for how this was revised and corrected!</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Iy0P!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe724194a-67ca-446e-9af8-99c9de134ad2_785x153.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Iy0P!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe724194a-67ca-446e-9af8-99c9de134ad2_785x153.png 424w, https://substackcdn.com/image/fetch/$s_!Iy0P!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe724194a-67ca-446e-9af8-99c9de134ad2_785x153.png 848w, https://substackcdn.com/image/fetch/$s_!Iy0P!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe724194a-67ca-446e-9af8-99c9de134ad2_785x153.png 1272w, https://substackcdn.com/image/fetch/$s_!Iy0P!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe724194a-67ca-446e-9af8-99c9de134ad2_785x153.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Iy0P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe724194a-67ca-446e-9af8-99c9de134ad2_785x153.png" width="785" height="153" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e724194a-67ca-446e-9af8-99c9de134ad2_785x153.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:153,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24453,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe724194a-67ca-446e-9af8-99c9de134ad2_785x153.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Iy0P!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe724194a-67ca-446e-9af8-99c9de134ad2_785x153.png 424w, https://substackcdn.com/image/fetch/$s_!Iy0P!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe724194a-67ca-446e-9af8-99c9de134ad2_785x153.png 848w, https://substackcdn.com/image/fetch/$s_!Iy0P!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe724194a-67ca-446e-9af8-99c9de134ad2_785x153.png 1272w, https://substackcdn.com/image/fetch/$s_!Iy0P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe724194a-67ca-446e-9af8-99c9de134ad2_785x153.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>What holds for production: </strong>Model routing is sound in principle and proven at ASCRS scale. The precondition is empirical capability validation of the light model on the exact task, not assumption based on parameter count. Run a single test call with your system prompt before building the routing layer.</p><h3>E4 &#8212; Conditional Context: A Design Flaw, Not a Model Failure</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jIEd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde43da03-640e-4e68-8274-9e87df29ce73_899x326.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jIEd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde43da03-640e-4e68-8274-9e87df29ce73_899x326.png 424w, https://substackcdn.com/image/fetch/$s_!jIEd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde43da03-640e-4e68-8274-9e87df29ce73_899x326.png 848w, https://substackcdn.com/image/fetch/$s_!jIEd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde43da03-640e-4e68-8274-9e87df29ce73_899x326.png 1272w, https://substackcdn.com/image/fetch/$s_!jIEd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde43da03-640e-4e68-8274-9e87df29ce73_899x326.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jIEd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde43da03-640e-4e68-8274-9e87df29ce73_899x326.png" width="899" height="326" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/de43da03-640e-4e68-8274-9e87df29ce73_899x326.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:326,&quot;width&quot;:899,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:27060,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde43da03-640e-4e68-8274-9e87df29ce73_899x326.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jIEd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde43da03-640e-4e68-8274-9e87df29ce73_899x326.png 424w, https://substackcdn.com/image/fetch/$s_!jIEd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde43da03-640e-4e68-8274-9e87df29ce73_899x326.png 848w, https://substackcdn.com/image/fetch/$s_!jIEd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde43da03-640e-4e68-8274-9e87df29ce73_899x326.png 1272w, https://substackcdn.com/image/fetch/$s_!jIEd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde43da03-640e-4e68-8274-9e87df29ce73_899x326.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eq6S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eq6S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png 424w, https://substackcdn.com/image/fetch/$s_!eq6S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png 848w, https://substackcdn.com/image/fetch/$s_!eq6S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png 1272w, https://substackcdn.com/image/fetch/$s_!eq6S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eq6S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png" width="898" height="182" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/efbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:182,&quot;width&quot;:898,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22878,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eq6S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png 424w, https://substackcdn.com/image/fetch/$s_!eq6S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png 848w, https://substackcdn.com/image/fetch/$s_!eq6S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png 1272w, https://substackcdn.com/image/fetch/$s_!eq6S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefbc1eda-178c-4bb5-bf72-19bdd2b9a0a6_898x182.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>Both runs: task_injected, score 45/50, highest token cost (3,318 in Run 2, 3 API calls).</strong></p><p>The conditional context experiment runs a probe call first. If the probe self-supplies the named studies, TASK.md injection is skipped. If not, TASK.md is injected and generation proceeds. The prediction was that a capable model would self-supply named evidence on 20&#8211;40% of runs, saving the full context injection cost on those runs.</p><p>The probe always failed in both runs &#8212; always falling back to task_injected. The Run 1 failure was attributed to using the 3B LIGHT_MODEL as the probe. Run 2 fixed this by using DEFAULT_MODEL (120B) for the probe. It still always took the task_injected path.</p><p>The actual cause is gate-prompt misalignment, independent of model choice. The probe runs with no context &#8212; no TASK.md, no anchors. The 120B model produces valid automation research, but cites it under phrasings that don&#8217;t match the keyword gate: &#8220;McKinsey Global Institute&#8221; as &#8220;McKinsey&#8221; passes; &#8220;Future of Jobs&#8221; without &#8220;WEF&#8221; fails; Acemoglu cited as &#8220;MIT researchers&#8221; fails. The string match gate checks for specific names the model has no reason to use in their exact form without a prompt anchoring them.</p><p>The corrected design would give the probe the minimal anchors &#8212; the same four study names from TASK_minimal.md &#8212; and check whether the model cited them correctly. This becomes a citation fidelity check rather than a free recall check. E1_minimal demonstrated that once the model has the anchors, it uses them reliably. The probe needs the same anchors.</p><p><strong>What holds for production: </strong>Conditional context injection is valid &#8212; Microsoft&#8217;s Azure SRE Agent progressive context discovery is built on this principle. The precondition is gate-prompt alignment: the gate must check for things the probe was given a reason to produce. Without alignment, the probe always fails regardless of model capability.</p><h3>E5 &#8212; Early Exit: Confirmed Without Qualification</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ztIg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ztIg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png 424w, https://substackcdn.com/image/fetch/$s_!ztIg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png 848w, https://substackcdn.com/image/fetch/$s_!ztIg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png 1272w, https://substackcdn.com/image/fetch/$s_!ztIg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ztIg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png" width="897" height="324" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:324,&quot;width&quot;:897,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:44739,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ztIg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png 424w, https://substackcdn.com/image/fetch/$s_!ztIg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png 848w, https://substackcdn.com/image/fetch/$s_!ztIg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png 1272w, https://substackcdn.com/image/fetch/$s_!ztIg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6c34ef-3840-4d7b-9642-5383ffd484fb_897x324.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Both runs: first attempt scores 45/50, early exit fires, retry loop never activates.</strong></p><p>The early exit experiment adds one rule to the feedback loop: if the first attempt clears the quality threshold, do not run retries. The prediction was that this would save 1&#8211;2 verifier calls per run when quality is already good.</p><p>Both runs confirmed this exactly. The first attempt cleared threshold in every E5 run. The retry budget was permanently unused overhead. Against a V2+V3 baseline with no retry loop built in, the absolute token saving is modest &#8212; E5 costs slightly more than baseline because it has the loop infrastructure even when it does not fire. But the principle has no downside.</p><p>The value of early exit compounds with harness complexity. In a V5-scale harness with a 2-retry budget, early exit eliminates up to 4 extra API calls per task on runs that pass threshold on attempt 1 &#8212; which is the majority of runs on a well-specified task with a capable model. Against an 8,000-token V5 baseline, the same early exit that saved almost nothing here could save 2,000&#8211;3,000 tokens per task.</p><p><strong>What holds for production: </strong>Add early exit to every retry loop unconditionally. The cost is one conditional check. There is no configuration where it makes things worse. In more complex harnesses where it has room to save, it will.</p><h3>E6 &#8212; Verifier Compression: The Strongest Single Finding</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KkwR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KkwR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png 424w, https://substackcdn.com/image/fetch/$s_!KkwR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png 848w, https://substackcdn.com/image/fetch/$s_!KkwR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png 1272w, https://substackcdn.com/image/fetch/$s_!KkwR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KkwR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png" width="901" height="388" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:388,&quot;width&quot;:901,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:51023,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KkwR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png 424w, https://substackcdn.com/image/fetch/$s_!KkwR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png 848w, https://substackcdn.com/image/fetch/$s_!KkwR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png 1272w, https://substackcdn.com/image/fetch/$s_!KkwR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80cf994a-5a5d-4fcc-bf6f-28e13b0a9199_901x388.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Run 2: approx 40/50 at 2,091 tokens. Verifier token reduction: ~96%. Total reduction: 574 tokens (22%).</strong></p><p>The full five-dimension verifier returns five scored dimensions, five improvement notes, and a total &#8212; roughly 500 output tokens per verification call. E6 replaces it with a binary pass/fail gate on two criteria: does the brief contain at least two named dated studies, and does it have a single falsifiable thesis? The compact system prompt caps the response at 150 tokens.</p><p>The prediction was 60&#8211;70% verifier token reduction. The actual reduction was approximately 96% &#8212; the compact gate produced near-minimal output because the binary questions admit short answers. Total token reduction was 574 across the full run (22%), consistent across both runs (Run 1: 506 tokens, Run 2: 574 tokens).</p><p>The brief quality in E6&#8217;s output was high &#8212; three named studies, calibrated confidence ratings with explicit reasons, specific actionable gaps. The binary gate correctly passed it. The approximate score of 40/50 reflects the mapping from binary to numeric (both-pass = 40) rather than genuine quality loss.</p><blockquote><p><em>E6 is the only finding in this lab that gets stronger as the harness gets more complex. In a V5 harness where the verifier runs on every retry &#8212; potentially three times per task &#8212; a 96% verifier reduction saves roughly 1,500 tokens per task. At 1,000 tasks per day, that is 1.5 million tokens saved daily without touching quality.</em></p></blockquote><p><strong>What holds for production: </strong>Once generation quality is stable, replace the full multi-dimension verifier with a compact gate for production runs. Use the full verifier for periodic audits and calibration, not for every call. E6 is the highest single-lever efficiency finding in this lab and scales with harness complexity rather than against it.</p><h3>E7 &#8212; Composite Best: When Combining Losers Produces the Worst Result?</h3><p>This is note entirely accurate. But I will maintain the conclusion - because a simple fix will reverse the outcome. Again, the third run, and the Postnote refer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4QKB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4QKB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png 424w, https://substackcdn.com/image/fetch/$s_!4QKB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png 848w, https://substackcdn.com/image/fetch/$s_!4QKB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png 1272w, https://substackcdn.com/image/fetch/$s_!4QKB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4QKB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png" width="899" height="430" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:430,&quot;width&quot;:899,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56546,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4QKB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png 424w, https://substackcdn.com/image/fetch/$s_!4QKB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png 848w, https://substackcdn.com/image/fetch/$s_!4QKB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png 1272w, https://substackcdn.com/image/fetch/$s_!4QKB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb077c3d4-4973-45cf-a795-faf79ecf9575_899x430.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Run 2: 25/50 at 4,974 tokens. Attempt 1 scored 0/50. Worst run in both sessions.</strong></p><p>E7 combined the intended best element from each experiment: medium TASK.md compression (from E1), compact output schema (from E2), LIGHT_MODEL generation (from E3), and early exit (from E5). The prediction was the highest composite score and a 50%+ token saving versus baseline.</p><p>Attempt 1 returned 0/50 &#8212; the 8B model with TASK_minimal.md and a compact output schema produced invalid output, triggering a retry. Attempt 2 recovered to 25/50. Four API calls at 4,974 tokens is the highest cost of any run across both sessions combined. The predicted best became the demonstrated worst.</p><p>The cause is a critical interaction effect that was not visible until E7 ran. TASK_minimal.md works for the 120B DEFAULT model &#8212; E1_minimal scored 50/50 in Run 1 and 45/50 in Run 2. But E3 had already shown that the 8B LIGHT_MODEL produces quality collapse even with full context. E7 combined minimal context with the model that had already demonstrated failure at full context. The Fix 3 correction (TASK_medium &#8594; TASK_minimal in E7) made the combination worse, not better.</p><p>Note: Hence the third run - see Postnote.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LodA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LodA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png 424w, https://substackcdn.com/image/fetch/$s_!LodA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png 848w, https://substackcdn.com/image/fetch/$s_!LodA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png 1272w, https://substackcdn.com/image/fetch/$s_!LodA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LodA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png" width="900" height="195" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:195,&quot;width&quot;:900,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46237,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LodA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png 424w, https://substackcdn.com/image/fetch/$s_!LodA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png 848w, https://substackcdn.com/image/fetch/$s_!LodA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png 1272w, https://substackcdn.com/image/fetch/$s_!LodA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fcb60-3e9c-4655-9f9a-fb1db778f794_900x195.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>What holds for production: </strong>Do not combine optimisation strategies before validating each strategy&#8217;s preconditions independently. E3&#8217;s failure was known before E7 ran. E7 should have either used a capable LIGHT_MODEL or reverted to DEFAULT for generation. The composite-best experiment is only valid when all its components are individually valid.</p><h2>The Full Results: Both Runs</h2><h3>Run 1 &#8212; LIGHT_MODEL: llama-3.2-3b</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cr49!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Cr49!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png 424w, https://substackcdn.com/image/fetch/$s_!Cr49!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png 848w, https://substackcdn.com/image/fetch/$s_!Cr49!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png 1272w, https://substackcdn.com/image/fetch/$s_!Cr49!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cr49!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png" width="898" height="410" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:410,&quot;width&quot;:898,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:62520,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Cr49!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png 424w, https://substackcdn.com/image/fetch/$s_!Cr49!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png 848w, https://substackcdn.com/image/fetch/$s_!Cr49!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png 1272w, https://substackcdn.com/image/fetch/$s_!Cr49!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e238c5-9e02-48ad-8588-c48b5fe4594e_898x410.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Run 1 composite winner: e1_minimal (0.7725). The 3B light model made E3 and E7 unreliable, but the compression experiments and the verifier strategies produced clean results. Baseline scored 45/50 &#8212; one point below the quality ceiling, leaving room for E1_minimal&#8217;s 50/50 to win composite through quality alone.</p><h3>Run 2 &#8212; LIGHT_MODEL: llama-3.1-8b</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fjsU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fjsU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png 424w, https://substackcdn.com/image/fetch/$s_!fjsU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png 848w, https://substackcdn.com/image/fetch/$s_!fjsU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png 1272w, https://substackcdn.com/image/fetch/$s_!fjsU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fjsU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png" width="895" height="409" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:409,&quot;width&quot;:895,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:64232,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fjsU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png 424w, https://substackcdn.com/image/fetch/$s_!fjsU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png 848w, https://substackcdn.com/image/fetch/$s_!fjsU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png 1272w, https://substackcdn.com/image/fetch/$s_!fjsU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850fd49-11cc-4fc1-944e-decdd97d0feb_895x409.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Run 2 composite winner: baseline (0.8309). Baseline scored 50/50 &#8212; at quality ceiling. No experiment held 50/50, so the 5-point quality gap (45 vs 50) exceeded every efficiency saving produced. E2 came closest at 0.8096. E7 collapsed to 25/50 at the highest token cost in both sessions.</p><h3>Cross-Run Comparison</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xav5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xav5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png 424w, https://substackcdn.com/image/fetch/$s_!Xav5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png 848w, https://substackcdn.com/image/fetch/$s_!Xav5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png 1272w, https://substackcdn.com/image/fetch/$s_!Xav5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Xav5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png" width="897" height="188" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:188,&quot;width&quot;:897,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21699,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Xav5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png 424w, https://substackcdn.com/image/fetch/$s_!Xav5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png 848w, https://substackcdn.com/image/fetch/$s_!Xav5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png 1272w, https://substackcdn.com/image/fetch/$s_!Xav5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e4331e-70dc-4eb3-a511-b799e294b2d2_897x188.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IbF0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IbF0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png 424w, https://substackcdn.com/image/fetch/$s_!IbF0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png 848w, https://substackcdn.com/image/fetch/$s_!IbF0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png 1272w, https://substackcdn.com/image/fetch/$s_!IbF0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IbF0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png" width="899" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:899,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:71691,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IbF0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png 424w, https://substackcdn.com/image/fetch/$s_!IbF0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png 848w, https://substackcdn.com/image/fetch/$s_!IbF0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png 1272w, https://substackcdn.com/image/fetch/$s_!IbF0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34400eb2-7571-4d5a-aa65-92269ccd9a05_899x617.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The most important pattern in the cross-run table: the experiments that moved between runs (baseline, e1_minimal, e1_medium, e7) all moved by exactly 5 points &#8212; the same magnitude as the verifier variance between sessions. The experiments that held flat (e2, e4, e5, e6) are the most reliable signals in the lab. E6 in particular is the most consistent result: 40/50, ~22% token reduction, identical across both runs.</p><p>E3 and E7 improved or worsened by more than 5 points (E3: +5, E7: &#8722;15) &#8212; changes larger than verifier noise, meaning they reflect real design differences rather than measurement variance. E3&#8217;s improvement from 30 to 35 reflects the 3B&#8594;8B upgrade. E7&#8217;s collapse from 40 to 25 reflects the TASK_medium&#8594;TASK_minimal change breaking the 8B model.</p><h2>Prediction Scorecard &#8212; Run 2</h2><p>The table below records Run 2 predictions, actuals, and verdicts against the original expectations. E3 and E7 &#8212; both contradicted in Run 2 &#8212; were retested in Run 3 with a 30B light model. The full three-run scorecard, including Run 3 corrections, appears in the Post Note before the references.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Dtq0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Dtq0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png 424w, https://substackcdn.com/image/fetch/$s_!Dtq0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png 848w, https://substackcdn.com/image/fetch/$s_!Dtq0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png 1272w, https://substackcdn.com/image/fetch/$s_!Dtq0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Dtq0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png" width="898" height="497" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:497,&quot;width&quot;:898,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:93658,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Dtq0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png 424w, https://substackcdn.com/image/fetch/$s_!Dtq0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png 848w, https://substackcdn.com/image/fetch/$s_!Dtq0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png 1272w, https://substackcdn.com/image/fetch/$s_!Dtq0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92c6d783-28e6-4412-99b5-35b170871b7f_898x497.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Three of seven predictions confirmed or exceeded in Run 2. The two most central predictions &#8212; E1_minimal collapse and E7 winning composite &#8212; both failed in opposite directions. E1_minimal proved stronger than predicted (no issues when dealing with a capable model); E7 proved far weaker. E3 and E7&#8217;s contradicted verdicts were later reversed in Run 3 once the light model was corrected. The full picture requires all three runs read together.</p><h2>What the Two Runs Reveal Together</h2><h3>The Composite Metric Is Correct but Noisy at This Scale</h3><p>Run 1 composite winner: E1_minimal (0.7725). Run 2 composite winner: baseline (0.8309). Same experiments, same models, different winners &#8212; caused by a 5-point baseline quality variance between sessions. The composite metric is working correctly; the input data is noisy because each configuration ran once.</p><p>The solution is not to distrust the metric &#8212; it is to run each configuration three times and average. The metric correctly identified E7 as the worst design in both runs (0.35), correctly penalised E4&#8217;s probe overhead, and correctly showed E2 as the closest challenger to baseline. The ranking is directionally right. The cardinal values are noisy.</p><h3>The Token Band Is Too Narrow for This Baseline</h3><p>The efficiency term can swing 0.15 across all runs in this experiment. Quality can swing 0.07 from a single verifier variance event. This means verifier noise can flip the composite winner. Against a V5 baseline of 6,000&#8211;8,000 tokens, E6&#8217;s 500-token saving represents a 6&#8211;8% efficiency gain (efficiency term swing: ~0.06). Against this V2+V3 baseline of 2,665 tokens, the same saving is a 22% gain (swing: ~0.10) but still insufficient to overcome a 5-point quality difference.</p><p>The subscription economics argument in the introduction of this article stands &#8212; but its force is felt at V5 scale, not V2+V3. The efficiency strategies here are correct. They need a more complex baseline to demonstrate their financial impact.</p><h3>What Changed Between Runs &#8212; and What Each Change Revealed</h3><p><strong>Fix 1 &#8212; 8B instead of 3B: </strong>E3 improved 30&#8594;35/50 but token cost rose 2,327&#8594;3,265. The 8B model is more verbose without being capable enough. The finding: for structured analytical generation requiring domain recall and calibrated confidence ratings, 8B is likely insufficient. The minimum viable LIGHT_MODEL for this task is somewhere above 8B &#8212; probably 30B+.</p><p><strong>Fix 2 &#8212; DEFAULT_MODEL probe in E4: </strong>Still always task_injected. The probe model was a red herring. The real flaw is gate-prompt misalignment. The gate checks for specific citation names the probe was never given a reason to use. A properly designed E4 gives the probe the minimal anchors first.</p><p><strong>Fix 3 &#8212; TASK_minimal in E7: </strong>Run 1 E7 scored 40/50 at 2,059 tokens. Run 2 E7 scored 25/50 at 4,974 tokens. The context change that helps the 120B model (minimal anchors sufficient) breaks the 8B model (needs full scaffolding). The lesson is the most important single finding across both runs: minimum viable context is model-dependent, not task-dependent.</p><h2>What Survives: Eight Production-Ready Lessons</h2><p>Each lesson below is drawn from the Token Tax lab results. Where ASCRS data independently confirms, partially confirms, or contextualises a finding, a note is added inline. The strongest lessons are the ones that appear in both labs unprompted &#8212; different task, different scale, same conclusion.</p><h4>1. Run the compression ablation before building the loop</h4><p>E1 is the experiment to run first on any new task. Strip context progressively. Find the floor. For frontier-scale models on documented topics, the floor is lower than intuition suggests. For smaller models or niche domains, it will be higher. Knowing the floor changes every downstream architectural decision &#8212; you cannot find it without running the ablation. This principle held across both runs without contradiction.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Je1f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Je1f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png 424w, https://substackcdn.com/image/fetch/$s_!Je1f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png 848w, https://substackcdn.com/image/fetch/$s_!Je1f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png 1272w, https://substackcdn.com/image/fetch/$s_!Je1f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Je1f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png" width="900" height="177" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:177,&quot;width&quot;:900,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42649,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Je1f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png 424w, https://substackcdn.com/image/fetch/$s_!Je1f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png 848w, https://substackcdn.com/image/fetch/$s_!Je1f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png 1272w, https://substackcdn.com/image/fetch/$s_!Je1f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc146b33c-22e1-466f-9a1d-b6b04612cf06_900x177.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>2. Verifier compression is the highest single-lever efficiency gain</h4><p>E6&#8217;s 96% verifier token reduction held across both runs, was consistent across both sessions, and scales with harness complexity. In a V5 harness with a retry loop, the same compact gate applied to every verification step saves orders of magnitude more tokens than any architectural change. Use the full verifier for calibration and auditing. Use the compact gate for production runs once quality is stable.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZF6u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZF6u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png 424w, https://substackcdn.com/image/fetch/$s_!ZF6u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png 848w, https://substackcdn.com/image/fetch/$s_!ZF6u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png 1272w, https://substackcdn.com/image/fetch/$s_!ZF6u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZF6u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png" width="903" height="178" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:178,&quot;width&quot;:903,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41526,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZF6u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png 424w, https://substackcdn.com/image/fetch/$s_!ZF6u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png 848w, https://substackcdn.com/image/fetch/$s_!ZF6u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png 1272w, https://substackcdn.com/image/fetch/$s_!ZF6u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feee07af1-71f3-4820-86a3-11c7c002ff93_903x178.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>3. Early exit belongs in every retry loop &#8212; unconditionally</h4><p>E5 confirmed this in both runs with no exceptions. One conditional check. No downside. In more complex harnesses it saves significant token cost. Add it by default.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HsHF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HsHF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png 424w, https://substackcdn.com/image/fetch/$s_!HsHF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png 848w, https://substackcdn.com/image/fetch/$s_!HsHF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png 1272w, https://substackcdn.com/image/fetch/$s_!HsHF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HsHF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png" width="898" height="197" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:197,&quot;width&quot;:898,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46518,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HsHF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png 424w, https://substackcdn.com/image/fetch/$s_!HsHF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png 848w, https://substackcdn.com/image/fetch/$s_!HsHF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png 1272w, https://substackcdn.com/image/fetch/$s_!HsHF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e34884f-2474-4ef1-8352-d8f93632a92a_898x197.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>4. Schema compression is the lowest-risk efficiency intervention</h4><p>E2 confirmed its prediction exactly in both runs. It changes output format without touching model selection, context injection, or verification logic. Apply it first when efficiency is a concern. Test the counterargument dimension specifically, as that is where compact formats lose nuance most readily.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oNVI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oNVI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png 424w, https://substackcdn.com/image/fetch/$s_!oNVI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png 848w, https://substackcdn.com/image/fetch/$s_!oNVI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png 1272w, https://substackcdn.com/image/fetch/$s_!oNVI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oNVI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png" width="901" height="197" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:197,&quot;width&quot;:901,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45664,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oNVI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png 424w, https://substackcdn.com/image/fetch/$s_!oNVI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png 848w, https://substackcdn.com/image/fetch/$s_!oNVI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png 1272w, https://substackcdn.com/image/fetch/$s_!oNVI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eac20ad-fa84-456e-83a0-b4ea3cd7e345_901x197.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>5. Model routing requires empirical capability validation, not assumptions</h4><p>E3&#8217;s failure across both runs does not invalidate the ASCRS H7 result &#8212; it contextualises it. The routing principle is sound. The precondition is knowing the capability floor for your specific task and ensuring LIGHT_MODEL sits above it. Run a test call with your actual system prompt before building the routing layer. Do not assume parameter count predicts task performance.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tQ6Z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tQ6Z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png 424w, https://substackcdn.com/image/fetch/$s_!tQ6Z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png 848w, https://substackcdn.com/image/fetch/$s_!tQ6Z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png 1272w, https://substackcdn.com/image/fetch/$s_!tQ6Z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tQ6Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png" width="899" height="262" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:899,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73392,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tQ6Z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png 424w, https://substackcdn.com/image/fetch/$s_!tQ6Z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png 848w, https://substackcdn.com/image/fetch/$s_!tQ6Z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png 1272w, https://substackcdn.com/image/fetch/$s_!tQ6Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e693329-efc6-44a1-b9cb-d3a1ec254ff4_899x262.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>6. Conditional context needs gate-prompt alignment</h4><p>E4&#8217;s design flaw was running a fully blind probe and checking for specific citation names the probe had no reason to use. The corrected design gives the probe the minimal anchors and checks for citation fidelity. This maps directly to the Microsoft Azure SRE Agent&#8217;s progressive context discovery pattern &#8212; check whether the model already has what it needs before providing more.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1ROb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0f89b8-dd18-456a-894d-79fee819b614_903x201.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1ROb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0f89b8-dd18-456a-894d-79fee819b614_903x201.png 424w, https://substackcdn.com/image/fetch/$s_!1ROb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0f89b8-dd18-456a-894d-79fee819b614_903x201.png 848w, https://substackcdn.com/image/fetch/$s_!1ROb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0f89b8-dd18-456a-894d-79fee819b614_903x201.png 1272w, https://substackcdn.com/image/fetch/$s_!1ROb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0f89b8-dd18-456a-894d-79fee819b614_903x201.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1ROb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0f89b8-dd18-456a-894d-79fee819b614_903x201.png" width="903" height="201" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af0f89b8-dd18-456a-894d-79fee819b614_903x201.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:201,&quot;width&quot;:903,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:50020,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0f89b8-dd18-456a-894d-79fee819b614_903x201.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1ROb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0f89b8-dd18-456a-894d-79fee819b614_903x201.png 424w, https://substackcdn.com/image/fetch/$s_!1ROb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0f89b8-dd18-456a-894d-79fee819b614_903x201.png 848w, https://substackcdn.com/image/fetch/$s_!1ROb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0f89b8-dd18-456a-894d-79fee819b614_903x201.png 1272w, https://substackcdn.com/image/fetch/$s_!1ROb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0f89b8-dd18-456a-894d-79fee819b614_903x201.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>7. Single-run composite rankings are noisy in narrow token bands</h4><p>If your token range across all configurations is under 10&#215;, run each configuration at least three times and average the results. Single-run rankings in a 2.5&#215; token band are dominated by verifier variance. This is manageable in an experiment and a calibration problem in production. The composite metric is correct in structure; give it stable inputs.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UwZ2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UwZ2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png 424w, https://substackcdn.com/image/fetch/$s_!UwZ2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png 848w, https://substackcdn.com/image/fetch/$s_!UwZ2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png 1272w, https://substackcdn.com/image/fetch/$s_!UwZ2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UwZ2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png" width="902" height="224" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:224,&quot;width&quot;:902,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:50812,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UwZ2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png 424w, https://substackcdn.com/image/fetch/$s_!UwZ2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png 848w, https://substackcdn.com/image/fetch/$s_!UwZ2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png 1272w, https://substackcdn.com/image/fetch/$s_!UwZ2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6fdb6f6-29e2-4ce7-8ad7-cb8350dc00a7_902x224.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>8. Complexity compounds token cost faster than quality</h4><p>E7 was the most complex configuration in the experiment. It was also the worst result in both sessions. This pattern appeared in the ASCRS lab too &#8212; H9 (five-agent swarm) scored below the bare baseline at the highest token cost. The principle is consistent across both labs, both task domains, and both scales: add a new harness layer only when you can measure its quality contribution, and always track its token cost simultaneously. Without tracking, complexity accumulates invisibly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zpo4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zpo4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png 424w, https://substackcdn.com/image/fetch/$s_!zpo4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png 848w, https://substackcdn.com/image/fetch/$s_!zpo4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png 1272w, https://substackcdn.com/image/fetch/$s_!zpo4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zpo4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png" width="903" height="242" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:242,&quot;width&quot;:903,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:69942,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zpo4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png 424w, https://substackcdn.com/image/fetch/$s_!zpo4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png 848w, https://substackcdn.com/image/fetch/$s_!zpo4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png 1272w, https://substackcdn.com/image/fetch/$s_!zpo4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7f27665-0fbc-4f37-8e3e-bdd483eb4d99_903x242.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>On the Harness-vs-Model Question</h2><p>The two runs raise a natural challenge to Part I&#8217;s central claim. If changing LIGHT_MODEL from 3B to 8B materially changed E3&#8217;s results, does that not suggest the model matters more than the harness? Note: The third run, and Postnote make this more of an issue?</p><p>No &#8212; but it requires a precise answer. The harness-beats-model principle operates above a capability floor. Below that floor, no harness can compensate for what the model cannot do. The 3B model is not a weaker version of 120B for this task &#8212; it is below the threshold where structured analytical reasoning with named evidence is possible at all. Correcting from 3B to 8B is not choosing a better model for performance reasons. It is finding the minimum engine before building the car.</p><blockquote><p><em>The harness is the primary performance variable for any model above the task&#8217;s capability floor. The capability floor is a task-model pair property. Establishing it empirically is the first step in harness design, not an afterthought.</em></p></blockquote><p>Once above the floor &#8212; as the 120B DEFAULT model clearly is in every run &#8212; the harness takes over completely. E1_minimal demonstrates this: four study anchors and the same system prompt as baseline produced results indistinguishable from full context injection. The model&#8217;s weights held the knowledge. The harness shaped how it was expressed.</p><h3>The Subscription Economics Argument, Recalibrated</h3><p>A prior article mentioned that flat-rate AI subscriptions were designed for conversational use but are being consumed by agentic workflows at 5&#8211;10&#215; the token rate. That argument stands &#8212; but the magnitude demonstrated here is smaller than predicted.</p><p>The actual best token saving with quality intact was E2 at 24%, not the predicted 50&#8211;60% from E7. Against a V2+V3 baseline of 2,665 tokens, a 24% saving is 639 tokens per task &#8212; meaningful, but not transformative. Against a V5 baseline of 7,000 tokens, the same E2 saving (same absolute tokens) plus E6&#8217;s verifier compression combined would produce a 35&#8211;45% saving. That is where the subscription economics argument bites.</p><p>The practical guidance for practitioners running agentic workflows under subscription constraints is: apply the strategies that confirmed cleanly (E2 schema compression, E6 verifier compression, E5 early exit, E1 ablation) against the most expensive harness you are running &#8212; which is always the one with the active retry loop and full verification. Not against a lean two-call baseline where the savings have no room to accumulate.</p><blockquote><p><em>The experiment demonstrated the measurement framework and confirmed three strategies cleanly. It failed to demonstrate the composite winner because the baseline was too lean and one model was too weak. The strategies are right. They need a more complex harness to show their full force.</em></p></blockquote><h2>Thoughts - Based on Two Runs</h2><p>Two runs of seven experiments produced a result that was more honest than the one predicted. The predicted composite winner (E7) was the actual worst run. The predicted collapse (E1_minimal) never materialised. Three experiments failed for identifiable design reasons. Five produced portable, replicable principles.</p><p>That is not a failed experiment. It is an experiment that required two runs to find its actual findings &#8212; which is exactly what controlled experiments are supposed to do. The token efficiency strategies confirmed here (schema compression, verifier compression, early exit, context ablation) are real, measurable, and applicable to any production agentic harness above V2+V3 complexity.</p><p>The one that did not need confirming &#8212; because it held in every run of every lab documented in this series &#8212; is the broadest principle: complexity compounds cost faster than it compounds quality. Measure both. Add layers only when you can justify their cost. The harness is the car. Build it efficiently.</p><h2><strong>POST NOTE</strong></h2><p><em>Run 3 &#8212; What a capable light model changed</em></p><p>After Run 2 was complete, a third partial run was conducted &#8212; targeting only the two experiments that the light model had broken. The question was simple: if the 8B model was the bottleneck, does replacing it with a +-30B model fix both E3 and E7?</p><p><strong>Model change: </strong>LIGHT_MODEL changed from meta-llama/llama-3.1-8b-instruct to qwen/qwen3-235b-a22b. Everything else &#8212; topic, system prompt, context files, verifier, composite metric &#8212; unchanged.</p><p>No existing results were modified. Run 3 wrote to an isolated folder with its own results file.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bies!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bies!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png 424w, https://substackcdn.com/image/fetch/$s_!Bies!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png 848w, https://substackcdn.com/image/fetch/$s_!Bies!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png 1272w, https://substackcdn.com/image/fetch/$s_!Bies!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bies!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png" width="1192" height="653" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:653,&quot;width&quot;:1192,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1023386,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Bies!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png 424w, https://substackcdn.com/image/fetch/$s_!Bies!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png 848w, https://substackcdn.com/image/fetch/$s_!Bies!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png 1272w, https://substackcdn.com/image/fetch/$s_!Bies!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae1ab5fc-ffd0-4c70-bbae-480364cfcf55_1192x653.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Run 3 Results &#8212; E3 and E7</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wMO2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wMO2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png 424w, https://substackcdn.com/image/fetch/$s_!wMO2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png 848w, https://substackcdn.com/image/fetch/$s_!wMO2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png 1272w, https://substackcdn.com/image/fetch/$s_!wMO2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wMO2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png" width="897" height="151" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:151,&quot;width&quot;:897,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22061,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wMO2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png 424w, https://substackcdn.com/image/fetch/$s_!wMO2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png 848w, https://substackcdn.com/image/fetch/$s_!wMO2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png 1272w, https://substackcdn.com/image/fetch/$s_!wMO2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab11ab9-021e-4ddc-9601-25da6559e4df_897x151.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>E7 attempt 1 result: </strong>Valid output, 40/50, early exit fired immediately. No retry, no collapse. Compare this to Run 2 where attempt 1 returned 0/50 &#8212; invalid structured output &#8212; forced a retry, and still only recovered to 25/50 on attempt 2 at 4,974 tokens across 4 API calls. Run 3 achieved a better score in 2 API calls at 2,586 tokens.</p><h3>Full Prediction Scorecard &#8212; All Three Runs</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VGij!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VGij!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png 424w, https://substackcdn.com/image/fetch/$s_!VGij!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png 848w, https://substackcdn.com/image/fetch/$s_!VGij!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png 1272w, https://substackcdn.com/image/fetch/$s_!VGij!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VGij!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png" width="1198" height="654" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:654,&quot;width&quot;:1198,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1054895,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VGij!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png 424w, https://substackcdn.com/image/fetch/$s_!VGij!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png 848w, https://substackcdn.com/image/fetch/$s_!VGij!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png 1272w, https://substackcdn.com/image/fetch/$s_!VGij!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F358371b5-00d0-49be-b1ad-470887d4c67f_1198x654.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The table below consolidates predictions, Run 2 actuals, and Run 3 corrections into a single snapshot. Green = confirmed or exceeded. Amber = contradicted. A dash in the Run 3 column means the experiment was not retested &#8212; either it had already confirmed cleanly or its flaw was model-independent.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cX15!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cX15!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png 424w, https://substackcdn.com/image/fetch/$s_!cX15!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png 848w, https://substackcdn.com/image/fetch/$s_!cX15!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png 1272w, https://substackcdn.com/image/fetch/$s_!cX15!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cX15!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png" width="898" height="500" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:898,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:93805,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cX15!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png 424w, https://substackcdn.com/image/fetch/$s_!cX15!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png 848w, https://substackcdn.com/image/fetch/$s_!cX15!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png 1272w, https://substackcdn.com/image/fetch/$s_!cX15!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ad38696-ea66-4f56-98ea-6d79dd54a4cb_898x500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The scorecard tells a cleaner story than the raw results tables. Three predictions were confirmed or exceeded across all runs without model dependency (E2, E5, E6). Two predictions that were contradicted in Run 2 were confirmed in Run 3 once the model was correct (E3, E7). Two remain contradicted because the flaw was in the design, not the model (E1 brevity bias never appeared; E4 gate-prompt misalignment is model-independent). Five of seven predictions are now in the confirmed column across the full experiment series.</p><h4>The harness design for E7 was correct all along</h4><p>Run 2&#8217;s E7 verdict felt like a failure of architecture &#8212; compact schema plus minimal context plus early exit producing the worst result in the lab. Run 3 shows the architecture was never the problem. With a light model capable of honouring the compact schema on a first pass, E7 did exactly what it was designed to do: lowest token cost among multi-call configurations, no wasted retry cycle, score above the quality threshold. The bottleneck in Runs 1 and 2 was the generation model&#8217;s capability floor, not the harness design.</p><h4>The capability floor for this task sits between 8B and 30B</h4><p>Three data points now define the curve for structured analytical generation with named evidence and calibrated confidence:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SLtU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SLtU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png 424w, https://substackcdn.com/image/fetch/$s_!SLtU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png 848w, https://substackcdn.com/image/fetch/$s_!SLtU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png 1272w, https://substackcdn.com/image/fetch/$s_!SLtU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SLtU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png" width="790" height="156" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:156,&quot;width&quot;:790,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16803,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SLtU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png 424w, https://substackcdn.com/image/fetch/$s_!SLtU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png 848w, https://substackcdn.com/image/fetch/$s_!SLtU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png 1272w, https://substackcdn.com/image/fetch/$s_!SLtU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34e2817a-7e52-4faf-8730-1442dcaf9e61_790x156.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>This is not a universal rule. The capability floor is a task-model pair property, as E1 demonstrated from the context side. But for structured analytical generation requiring domain recall, schema compliance, and calibrated reasoning, 30B appears to be the minimum viable routing tier. Below that, the saving on generation cost is outweighed by quality degradation and verbose output that increases token count rather than reducing it.</p><h4>E3 improved quality and reduced tokens simultaneously</h4><p>In Runs 1 and 2, E3 was the lab&#8217;s worst-efficiency result &#8212; quality collapse at higher token cost than baseline. Run 3 reversed both simultaneously: 45/50 at 2,431 tokens is a better quality-per-token ratio than baseline&#8217;s 50/50 at 2,665. The model routing principle from ASCRS H7 holds. It required a capable light model to show its effect.</p><h4>E6 remains the highest single-lever finding</h4><p>Run 3 did not retest E6 &#8212; it was already clean across both prior runs. Its 96% verifier token reduction is independent of light model capability because E6 only touches the verifier side. The two findings that compound best in production are E6 (cut verification cost) and a corrected E7 with a 30B+ light model (cut generation cost without quality loss). Together they represent the most practically applicable output of this entire lab.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2cop!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2cop!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png 424w, https://substackcdn.com/image/fetch/$s_!2cop!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png 848w, https://substackcdn.com/image/fetch/$s_!2cop!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png 1272w, https://substackcdn.com/image/fetch/$s_!2cop!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2cop!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png" width="1192" height="653" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:653,&quot;width&quot;:1192,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1023386,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2cop!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png 424w, https://substackcdn.com/image/fetch/$s_!2cop!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png 848w, https://substackcdn.com/image/fetch/$s_!2cop!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png 1272w, https://substackcdn.com/image/fetch/$s_!2cop!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0443643f-a782-4b62-ac64-2d679a0b84c8_1192x653.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Zvso!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F100d05c7-9830-405f-be11-b124216acac3_794x248.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Zvso!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F100d05c7-9830-405f-be11-b124216acac3_794x248.png 424w, https://substackcdn.com/image/fetch/$s_!Zvso!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F100d05c7-9830-405f-be11-b124216acac3_794x248.png 848w, https://substackcdn.com/image/fetch/$s_!Zvso!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F100d05c7-9830-405f-be11-b124216acac3_794x248.png 1272w, https://substackcdn.com/image/fetch/$s_!Zvso!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F100d05c7-9830-405f-be11-b124216acac3_794x248.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Zvso!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F100d05c7-9830-405f-be11-b124216acac3_794x248.png" width="794" height="248" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/100d05c7-9830-405f-be11-b124216acac3_794x248.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:248,&quot;width&quot;:794,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46740,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198979087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F100d05c7-9830-405f-be11-b124216acac3_794x248.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Zvso!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F100d05c7-9830-405f-be11-b124216acac3_794x248.png 424w, https://substackcdn.com/image/fetch/$s_!Zvso!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F100d05c7-9830-405f-be11-b124216acac3_794x248.png 848w, https://substackcdn.com/image/fetch/$s_!Zvso!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F100d05c7-9830-405f-be11-b124216acac3_794x248.png 1272w, https://substackcdn.com/image/fetch/$s_!Zvso!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F100d05c7-9830-405f-be11-b124216acac3_794x248.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>References</h2><p><strong>From This Series</strong></p><p><strong>[1] </strong>Harness Engineering: A First Principles Build Guide (Part I). Policy brief agent V0&#8211;V5, 50/50 final score. Interesting Engineering++, 2026. </p><p><strong>[2] </strong>ASCRS Harness Lab &#8212; The Integrated Agentic Stack. H1&#8211;H10 controlled benchmark, May 17, 2026. </p><p><strong>Research Papers</strong></p><p><strong>[3] </strong>Zhang, Q. et al. (2025/2026). Agentic Context Engineering (ACE): Evolving Contexts for Self-Improving Language Models. arXiv:2510.04618. Identifies brevity bias &#8212; the tendency for iterative optimisation to collapse toward short, generic prompts that lose critical signal. <a href="https://arxiv.org/abs/2510.04618">https://arxiv.org/abs/2510.04618</a></p><p><strong>[4] </strong>Liu et al. (2023). AgentBench: Evaluating LLMs as Agents. Stanford/Tsinghua. Source of 6&#215; performance gap finding from harness design alone. <a href="https://arxiv.org/abs/2308.03688">https://arxiv.org/abs/2308.03688</a></p><p><strong>[5] </strong>Zhu et al. (April 2026). SemaClaw: A Step Towards General-Purpose Personal AI Agents through Harness Engineering. arXiv:2604.11548. <a href="https://arxiv.org/abs/2604.11548">https://arxiv.org/abs/2604.11548</a></p><p><strong>[6] </strong>Vishnyakova, V. (March 2026). Context Engineering: From Prompts to Corporate Multi-Agent Architecture. arXiv:2603.09619. <a href="https://arxiv.org/abs/2603.09619">https://arxiv.org/abs/2603.09619</a></p><p><strong>Production Case Studies &amp; Practitioner Guides</strong></p><p><strong>[7] </strong>Microsoft Azure SRE Team (April 14, 2026). Harness Engineering for Azure SRE Agent: Building the Agent Self-Improvement Loop. Progressive context discovery pattern. <a href="https://techcommunity.microsoft.com/blog/appsonazureblog/the-agent-that-investigates-itself/4500073">https://techcommunity.microsoft.com/blog/appsonazureblog/the-agent-that-investigates-itself/4500073</a></p><p><strong>[8] </strong>Hashimoto, M. (February 5, 2026). My AI Adoption Journey. Origin of harness engineering. Step 5: Engineer the Harness. <a href="https://mitchellh.com/writing/my-ai-adoption-journey">https://mitchellh.com/writing/my-ai-adoption-journey</a></p><p><strong>[9] </strong>Lopopolo, R. / OpenAI (February 11, 2026). Harness Engineering: Leveraging Codex in an Agent-First World. <a href="https://openai.com/index/harness-engineering/">https://openai.com/index/harness-engineering/</a></p><p><strong>[10] </strong>B&#246;ckeler, B. / Martin Fowler (April 2, 2026). Harness Engineering for Coding Agent Users. <a href="https://martinfowler.com/articles/harness-engineering.html">https://martinfowler.com/articles/harness-engineering.html</a></p><p><strong>[11] </strong>Augment Code (April 17, 2026). Harness Engineering for AI Coding Agents: Constraints That Ship Reliable Code. Hashline experiment: one harness change moved a model from 6.7% to 68.3%. <a href="https://www.augmentcode.com/guides/harness-engineering-ai-coding-agents">https://www.augmentcode.com/guides/harness-engineering-ai-coding-agents</a></p><p><strong>[12] </strong>TechTimes (May 13, 2026). Harness Engineering Emerges as the Fourth Paradigm of AI Engineering. <a href="https://www.techtimes.com/articles/316587/20260513/harness-engineering-emerges-fourth-paradigm-ai-engineering.htm">https://www.techtimes.com/articles/316587/20260513/harness-engineering-emerges-fourth-paradigm-ai-engineering.htm</a></p><p><strong>[13] </strong>Atlan (April 13, 2026). What Is Harness Engineering AI? The Definitive 2026 Guide. <a href="https://atlan.com/know/what-is-harness-engineering/">https://atlan.com/know/what-is-harness-engineering/</a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[HARNESS ENGINEERING - Scaffolding A Small Model]]></title><description><![CDATA[How wrapping a small model in the right scaffolding beats upgrading to a bigger one]]></description><link>https://interestingengineering.substack.com/p/harness-engineering-scaffolding-a</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/harness-engineering-scaffolding-a</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Fri, 22 May 2026 17:12:38 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QW7B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QW7B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QW7B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png 424w, https://substackcdn.com/image/fetch/$s_!QW7B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png 848w, https://substackcdn.com/image/fetch/$s_!QW7B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png 1272w, https://substackcdn.com/image/fetch/$s_!QW7B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QW7B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png" width="1152" height="638" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d7a984ba-c850-4831-b840-ac966912e397_1152x638.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:638,&quot;width&quot;:1152,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1098379,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QW7B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png 424w, https://substackcdn.com/image/fetch/$s_!QW7B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png 848w, https://substackcdn.com/image/fetch/$s_!QW7B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png 1272w, https://substackcdn.com/image/fetch/$s_!QW7B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7a984ba-c850-4831-b840-ac966912e397_1152x638.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Asking Better Questions</h2><p>Almost every conversation in 2025: About AI performance then arrived at the same place: which model should I use? Which is best. If contrasting some of the best today, I still get asked: GPT-5.5 or Claude Opus 4.7? Qwen3.7 Max or Deepseek v4? The implicit assumption is that model choice is the primary lever &#8212; that the gap between a mediocre result and an impressive one lives inside the model weights, and the way to improve is to &#8220;upgrade&#8221;.</p><p>And then slowly but surely, research pointed to this assumption being largely wrong. In controlled experiments, the same underlying model produced performance gaps of up to six times depending not on the model itself, but on the architecture built around it. The model was constant. Only the scaffolding changed.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><blockquote><p><em>A decent model with a great harness beats a great model with a bad harness. The interesting engineering is not in picking the model &#8212; it is in designing the scaffolding around it.</em></p></blockquote><p>This is the thesis of what is now being called <em><strong>harness engineering: the discipline of designing the system that wraps a model deliberately, rather than treating it as an afterthought. And it has a counterintuitive implication. If you are spending money upgrading models before optimising your harness, you are solving the wrong problem</strong></em>.</p><p>This article documents a few live experiments. Taking Claude Haiku &#8212; the smallest, cheapest model in the Claude family, ranked at the bottom of most benchmark comparisons &#8212; and progressively wrapping it in a harness, one layer at a time. We score the output at each stage. The model never changes. Everything else does. </p><h1>What Is a Harness, Exactly?</h1><p>The word comes from software testing. A test harness is the scaffolding that sets up, runs, and evaluates a system under controlled conditions &#8212; the environment the thing being tested operates inside.</p><p>I have written more about harness engineering and various experiments here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;554ee1bc-3d50-4d02-89fe-c4b453780a5a&quot;,&quot;caption&quot;:&quot;Had some time on my hands, and applied the features of The Harness Experiment(s) to the Architecture of Awareness design considerations. You will remember from The Harness Experiment (applied to a mini vendor analysis case study) that the results presented as follows:&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;ASCRS Harness Lab - The Integrated Agentic Stack: When Does More Architecture Mean Better AI? A Diagnostic Teardown&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-16T17:52:19.700Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!cv0d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:198013155,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:2,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;b33425a8-7943-4722-96ea-f40a670465f7&quot;,&quot;caption&quot;:&quot;This is an account of a (fairly) controlled experiment &#8212; ten versions of the same task, ten different AI architectures, and several contradictions that get many working on agentic systems excited (but maybe shouldn&#8217;t). The most sophisticated system isn&#8217;t always the best. The simplest one won. What follows is more instructive than the results.&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Harness Experiment &quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-11T17:08:15.938Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!nWSV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/the-harness-experiment&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:197169160,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:5,&quot;comment_count&quot;:0,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;52232fd9-57e5-4e4c-ac10-3d504778d6d8&quot;,&quot;caption&quot;:&quot;I recently wrote &#8216;What Should &#8212; and Should Not &#8212; Evolve in Self-Improving Multi-Agent Systems?&#8217;, which builds a four-tier safety taxonomy from a convergent body of academic research spanning Columbia, Princeton, Renmin University, and Anthropic &#8212; arguing that certain components of any self-improving agentic system must be architecturally frozen, and oth&#8230;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Two Architectures of Control&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-05-05T16:16:20.337Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!tQyg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/two-architectures-of-control&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:196496789,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:3,&quot;comment_count&quot;:0,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;37d9805a-0f3d-496d-90c4-27fbe3d95e67&quot;,&quot;caption&quot;:&quot;Before You Read: A Structural Introduction&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Architecture of Awareness: Design Considerations Of A Shipper's Agentic Logic&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-04-22T17:10:21.275Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!iyM4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e932de-491b-473e-864f-65b45369eacb_1408x768.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/the-architecture-of-awareness-design&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:194979383,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:1,&quot;comment_count&quot;:0,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>From my perspective, brainstorming ideas, testing, and prototyping before undertaking a larger corporate agentic initiative is highly recommended. Inspired by <a href="https://youtu.be/C_GG5g38vLU?si=He5yow7Vp_8eI_Yr">Tejas Kumar</a>, whom I highly recommend following, you may wish to try something similar. Consider this a preamble or prologue to the prior written and referenced articles.</p><p>In the context of an AI agent, the harness is <strong>everything except the model itself</strong>. It is the sum of decisions about how the model receives information, what tools it can access, how it handles errors, whether it checks its own work, and whether it improves when it fails. Concretely, a harness consists of five layers:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1nk0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1nk0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png 424w, https://substackcdn.com/image/fetch/$s_!1nk0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png 848w, https://substackcdn.com/image/fetch/$s_!1nk0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png 1272w, https://substackcdn.com/image/fetch/$s_!1nk0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1nk0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png" width="785" height="580" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:580,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49559,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1nk0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png 424w, https://substackcdn.com/image/fetch/$s_!1nk0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png 848w, https://substackcdn.com/image/fetch/$s_!1nk0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png 1272w, https://substackcdn.com/image/fetch/$s_!1nk0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f91ec-b9d4-43e4-bab4-14087a7359c2_785x580.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>None of these layers require a more powerful model. They require more careful engineering. The distinction matters enormously in practice: engineering is deterministic and improvable; waiting for a better model is neither.</p><h2>The Experiment</h2><h3>Setup</h3><p><strong>Model: </strong>claude-haiku-4-5. This is Anthropic&#8217;s smallest, fastest, and cheapest model. It is not the model you would choose for demanding analytical work. That is the point.</p><p><strong>Task: </strong>Given a single topic sentence, produce a structured one-page policy brief. The brief must contain: a falsifiable thesis, three evidence-backed claims, a counterargument, a confidence rating for each claim, and a list of identified knowledge gaps.</p><p><strong>Topic (fixed for all versions): </strong><em>&#8220;The impact of automation on entry-level knowledge work&#8221;. Other choices could include: Remote work and productivity, Social media and adolescent mental health, Cash transfer programs and long-run poverty, Peer effects in education, Antibiotic overprescription and resistance and so on.</em></p><p>The topic(s) is/are deliberately chosen because it has real published evidence, a genuine counterargument, and measurable gaps. This ensures that differences in output quality between versions reflect harness quality &#8212; not whether the subject matter is researchable at all.</p><h3>Scoring: Five Dimensions, Ten Points Each</h3><p>Every output is scored on five dimensions, each worth ten points, for a maximum of fifty. The scoring rubric is fixed across all versions:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!txWG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!txWG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png 424w, https://substackcdn.com/image/fetch/$s_!txWG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png 848w, https://substackcdn.com/image/fetch/$s_!txWG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png 1272w, https://substackcdn.com/image/fetch/$s_!txWG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!txWG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png" width="786" height="191" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:191,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22150,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!txWG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png 424w, https://substackcdn.com/image/fetch/$s_!txWG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png 848w, https://substackcdn.com/image/fetch/$s_!txWG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png 1272w, https://substackcdn.com/image/fetch/$s_!txWG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb16d689a-2d5d-4252-a56e-2bfa50c62697_786x191.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Scoring is done twice: once by the human experimenter reading the output, and &#8212; from Version 3 onward &#8212; once by a verifier agent running automatically. Where the two scores diverge, the discrepancy is itself informative.</p><h2>The Five Versions</h2><h3>Version 0 &#8212; The Bare Call</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y78W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y78W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png 424w, https://substackcdn.com/image/fetch/$s_!y78W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png 848w, https://substackcdn.com/image/fetch/$s_!y78W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png 1272w, https://substackcdn.com/image/fetch/$s_!y78W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y78W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png" width="1136" height="628" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1136,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:948301,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y78W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png 424w, https://substackcdn.com/image/fetch/$s_!y78W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png 848w, https://substackcdn.com/image/fetch/$s_!y78W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png 1272w, https://substackcdn.com/image/fetch/$s_!y78W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78b9943d-7799-48a6-8519-4360b78b4dcc_1136x628.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Version 0 is the control condition. There is no system prompt. There is no output schema. The user message is a single sentence:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L94n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L94n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png 424w, https://substackcdn.com/image/fetch/$s_!L94n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png 848w, https://substackcdn.com/image/fetch/$s_!L94n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png 1272w, https://substackcdn.com/image/fetch/$s_!L94n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L94n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png" width="787" height="61" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:61,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3970,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L94n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png 424w, https://substackcdn.com/image/fetch/$s_!L94n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png 848w, https://substackcdn.com/image/fetch/$s_!L94n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png 1272w, https://substackcdn.com/image/fetch/$s_!L94n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5f691ce-463d-4cac-bedc-45f74bc5be2a_787x61.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The model receives no information about what a policy brief means to us, who the audience is, what level of evidence is acceptable, or what the output should look like. It draws entirely on its training data&#8217;s representation of the phrase &#8216;policy brief&#8217;.</p><p>What the model produces is typically coherent English that resembles a brief in structure &#8212; sections, paragraphs, a general argument. But the resemblance is superficial. Claims are assertions without evidence. There are no confidence ratings because the concept has not been introduced. The counterargument is either absent or a single dismissive sentence. The gaps section, if it appears at all, says something like &#8216;further research is needed&#8217;.</p><p>This is not a failure of the model. It is a failure of the harness &#8212; which at this stage does not exist.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QaiE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1427cd8-93a6-4665-ac52-21307b804114_786x391.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QaiE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1427cd8-93a6-4665-ac52-21307b804114_786x391.png 424w, https://substackcdn.com/image/fetch/$s_!QaiE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1427cd8-93a6-4665-ac52-21307b804114_786x391.png 848w, https://substackcdn.com/image/fetch/$s_!QaiE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1427cd8-93a6-4665-ac52-21307b804114_786x391.png 1272w, https://substackcdn.com/image/fetch/$s_!QaiE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1427cd8-93a6-4665-ac52-21307b804114_786x391.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QaiE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1427cd8-93a6-4665-ac52-21307b804114_786x391.png" width="786" height="391" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e1427cd8-93a6-4665-ac52-21307b804114_786x391.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:391,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:43838,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1427cd8-93a6-4665-ac52-21307b804114_786x391.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QaiE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1427cd8-93a6-4665-ac52-21307b804114_786x391.png 424w, https://substackcdn.com/image/fetch/$s_!QaiE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1427cd8-93a6-4665-ac52-21307b804114_786x391.png 848w, https://substackcdn.com/image/fetch/$s_!QaiE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1427cd8-93a6-4665-ac52-21307b804114_786x391.png 1272w, https://substackcdn.com/image/fetch/$s_!QaiE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1427cd8-93a6-4665-ac52-21307b804114_786x391.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Expected score range: 10&#8211;20 / 50. </strong><em>The output looks like a brief. It does not function like one.</em></p><h3>Version 1 &#8212; The Output Schema</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k0l9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k0l9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png 424w, https://substackcdn.com/image/fetch/$s_!k0l9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png 848w, https://substackcdn.com/image/fetch/$s_!k0l9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png 1272w, https://substackcdn.com/image/fetch/$s_!k0l9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k0l9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png" width="1139" height="629" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:629,&quot;width&quot;:1139,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1177958,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k0l9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png 424w, https://substackcdn.com/image/fetch/$s_!k0l9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png 848w, https://substackcdn.com/image/fetch/$s_!k0l9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png 1272w, https://substackcdn.com/image/fetch/$s_!k0l9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7355be44-942e-4ff5-a908-80efb2be0263_1139x629.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The first harness layer is a system prompt containing an exact output schema. We are not asking the model to write well &#8212; we are telling it precisely what the output must contain and in what format. The schema specifies:</p><p>1. A thesis &#8212; one sentence, falsifiable, not a question</p><p>2. Three claims, each with evidence and a confidence rating plus reason</p><p>3. A counterargument &#8212; the strongest opposing case</p><p>4. Three identified gaps &#8212; specific things we do not yet know</p><p><strong>What changes: </strong>The model now knows what it is being asked to produce. Structure appears immediately. Confidence ratings appear for the first time, because the schema requires them. The counterargument gets its own section.</p><p><strong>What does not change: </strong>The model still has no evidence to draw on. It fills the evidence fields with plausible-sounding but unverified claims. Confidence ratings appear but have no logical grounding &#8212; the model assigns &#8216;High&#8217; to claims it cannot actually verify.</p><blockquote><p><em>A schema tells the model the shape of the answer. It does not tell the model what to put inside it.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WStY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WStY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png 424w, https://substackcdn.com/image/fetch/$s_!WStY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png 848w, https://substackcdn.com/image/fetch/$s_!WStY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png 1272w, https://substackcdn.com/image/fetch/$s_!WStY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WStY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png" width="787" height="469" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:469,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:63850,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WStY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png 424w, https://substackcdn.com/image/fetch/$s_!WStY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png 848w, https://substackcdn.com/image/fetch/$s_!WStY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png 1272w, https://substackcdn.com/image/fetch/$s_!WStY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c6d06c-9ebf-4ec8-849c-7752983a9977_787x469.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Expected score range: 20&#8211;30 / 50. </strong><em>A structured shell. The model is now filling a form it understands.</em></p><h3>Version 2 &#8212; The Context File</h3><p>This is the version where the output quality makes its most visible jump. Version 2 introduces a context file &#8212; TASK.md &#8212; which is loaded and injected into the user message before the model call. The file contains three things:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ADrI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ADrI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png 424w, https://substackcdn.com/image/fetch/$s_!ADrI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png 848w, https://substackcdn.com/image/fetch/$s_!ADrI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png 1272w, https://substackcdn.com/image/fetch/$s_!ADrI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ADrI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png" width="787" height="356" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:356,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32176,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ADrI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png 424w, https://substackcdn.com/image/fetch/$s_!ADrI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png 848w, https://substackcdn.com/image/fetch/$s_!ADrI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png 1272w, https://substackcdn.com/image/fetch/$s_!ADrI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07b1a35a-1182-4a7b-984a-6e89c7f0505d_787x356.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Why this works: </strong>A language model&#8217;s context window is its working memory. Whatever you put into it before generation begins shapes what comes out. Version 0 and V1 gave the model an empty workspace and asked it to write. Version 2 pre-loads the workspace with the material a human analyst would have gathered before sitting down to write.</p><p>The model does not need to know things from training. It needs to be able to use things from context. This distinction is foundational to harness engineering &#8212; you are not trying to find a model that already knows everything; you are building a system that provides the right information at the right time.</p><p><strong>What changes: </strong>Evidence fields now contain named studies with years and specific findings. Confidence ratings become defensible &#8212; &#8216;High&#8217; is assigned to claims backed by multiple named sources; &#8216;Low&#8217; appears where evidence is genuinely thin. The output reads like something a real analyst produced.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VK1m!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VK1m!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png 424w, https://substackcdn.com/image/fetch/$s_!VK1m!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png 848w, https://substackcdn.com/image/fetch/$s_!VK1m!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png 1272w, https://substackcdn.com/image/fetch/$s_!VK1m!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VK1m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png" width="791" height="491" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:491,&quot;width&quot;:791,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:65943,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VK1m!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png 424w, https://substackcdn.com/image/fetch/$s_!VK1m!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png 848w, https://substackcdn.com/image/fetch/$s_!VK1m!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png 1272w, https://substackcdn.com/image/fetch/$s_!VK1m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb638201-fac1-44f4-9be2-67391eaa55f4_791x491.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Expected score range: 30&#8211;37 / 50. </strong><em>Evidence becomes real. The model is now working with material, not improvising.</em></p><h3>Version 3 &#8212; The Verifier Agent</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HHHV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359682cf-8057-4cd0-8820-e531580d908f_1102x590.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HHHV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359682cf-8057-4cd0-8820-e531580d908f_1102x590.png 424w, https://substackcdn.com/image/fetch/$s_!HHHV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359682cf-8057-4cd0-8820-e531580d908f_1102x590.png 848w, https://substackcdn.com/image/fetch/$s_!HHHV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359682cf-8057-4cd0-8820-e531580d908f_1102x590.png 1272w, https://substackcdn.com/image/fetch/$s_!HHHV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359682cf-8057-4cd0-8820-e531580d908f_1102x590.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HHHV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359682cf-8057-4cd0-8820-e531580d908f_1102x590.png" width="1102" height="590" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/359682cf-8057-4cd0-8820-e531580d908f_1102x590.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:590,&quot;width&quot;:1102,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:919106,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359682cf-8057-4cd0-8820-e531580d908f_1102x590.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HHHV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359682cf-8057-4cd0-8820-e531580d908f_1102x590.png 424w, https://substackcdn.com/image/fetch/$s_!HHHV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359682cf-8057-4cd0-8820-e531580d908f_1102x590.png 848w, https://substackcdn.com/image/fetch/$s_!HHHV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359682cf-8057-4cd0-8820-e531580d908f_1102x590.png 1272w, https://substackcdn.com/image/fetch/$s_!HHHV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F359682cf-8057-4cd0-8820-e531580d908f_1102x590.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Versions 0 through 2 are all one-directional. Information flows in, output flows out. No part of the system looks at the output and decides whether it is good enough. Version 3 introduces a second model call &#8212; the verifier &#8212; whose sole job is to read the generated brief and return a structured JSON score with improvement notes.</p><p><strong>What a verifier is: </strong>It is the same model (claude-haiku-4-5) operating under a different system prompt. The generator has a schema for writing. The verifier has a schema for judging. The two calls are independent &#8212; the verifier has no memory of generating the brief; it reads it cold, as an editor would.</p><p>The verifier returns a JSON object: a numerical score per dimension, a total, and a specific improvement note for each dimension. For example:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fvIf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fvIf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png 424w, https://substackcdn.com/image/fetch/$s_!fvIf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png 848w, https://substackcdn.com/image/fetch/$s_!fvIf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png 1272w, https://substackcdn.com/image/fetch/$s_!fvIf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fvIf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png" width="789" height="113" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:113,&quot;width&quot;:789,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12625,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fvIf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png 424w, https://substackcdn.com/image/fetch/$s_!fvIf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png 848w, https://substackcdn.com/image/fetch/$s_!fvIf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png 1272w, https://substackcdn.com/image/fetch/$s_!fvIf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76cda5f9-b063-40d7-b05f-84faabc617ac_789x113.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>What this adds: </strong>You now have machine-readable feedback. The verifier catches things the eye misses &#8212; it never gets tired, never gives the benefit of the doubt, never skips a dimension. It also surfaces the specific weakest dimension, which tells you exactly where the next investment should go.</p><p><strong>Important note: </strong>In V3, the verifier is an observer only. It reads the output and scores it, but it does not change it and does not trigger a retry. The loop is not yet closed. That happens in V4.</p><blockquote><p><em>You cannot improve what you cannot measure. The verifier is the measurement instrument. Without it, every run is a shot in the dark.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Gyhs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Gyhs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png 424w, https://substackcdn.com/image/fetch/$s_!Gyhs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png 848w, https://substackcdn.com/image/fetch/$s_!Gyhs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png 1272w, https://substackcdn.com/image/fetch/$s_!Gyhs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Gyhs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png" width="787" height="511" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:511,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:78243,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Gyhs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png 424w, https://substackcdn.com/image/fetch/$s_!Gyhs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png 848w, https://substackcdn.com/image/fetch/$s_!Gyhs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png 1272w, https://substackcdn.com/image/fetch/$s_!Gyhs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F371fdf99-e186-48d5-ba66-4bfe6b6e16c3_787x511.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Expected score range: 37&#8211;42 / 50. </strong><em>The verifier does not improve the brief. It exposes what is wrong with it &#8212; which is the prerequisite for improvement.</em></p><h3>Version 4 &#8212; The Feedback Loop</h3><p>This is the version where the harness is supposed to become genuinely intelligent. The feedback loop connects the verifier&#8217;s output back to the generator &#8212; if the score falls below a threshold (40/50), the verifier&#8217;s improvement notes are injected into the conversation as a new user message, and the model is asked to rewrite the brief addressing every note.</p><p><strong>What &#8216;injecting into the conversation&#8217; means: </strong>Language models process a conversation as a sequence of messages. By appending the original draft and the verifier&#8217;s notes as additional messages &#8212; &#8216;your brief scored 34/50, here are the specific issues&#8217; &#8212; we give the model the experience of receiving editorial feedback. It can see what it produced and exactly what was wrong with it. The second draft is conditioned on both.</p><p><strong>The state file: </strong>Version 4 also introduces state.json &#8212; a record of every run, including timestamp, score, number of attempts, and the best brief produced so far. This is the harness acquiring memory across sessions. You can close the project, return the next day, and the system knows its own history.</p><p><strong>What happens in practice: </strong>The first attempt typically scores 34&#8211;39. The verifier fires, returns notes. The second attempt typically scores 40&#8211;46. In most runs, two attempts are sufficient. In some, the first attempt clears the threshold and the loop does not fire at all.</p><blockquote><p><em>The loop does not make the model smarter. It gives the model the one thing it was previously denied: the chance to see its own mistakes and try again.</em></p></blockquote><p>This maps directly to how skilled human writing actually works. First drafts are not submitted. They are reviewed, annotated, and revised. The harness simply automates that process.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EUsH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EUsH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png 424w, https://substackcdn.com/image/fetch/$s_!EUsH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png 848w, https://substackcdn.com/image/fetch/$s_!EUsH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png 1272w, https://substackcdn.com/image/fetch/$s_!EUsH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EUsH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png" width="787" height="470" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:470,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:62084,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EUsH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png 424w, https://substackcdn.com/image/fetch/$s_!EUsH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png 848w, https://substackcdn.com/image/fetch/$s_!EUsH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png 1272w, https://substackcdn.com/image/fetch/$s_!EUsH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b4551c-a8f0-4df7-84a7-6b1c2a0cdaa3_787x470.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Expected score range: 42&#8211;47 / 50. </strong><em>The harness now self-corrects. Failure triggers a second attempt with targeted instructions &#8212; not a generic retry.</em></p><h3>Version 5 &#8212; Memory</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BNpJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BNpJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png 424w, https://substackcdn.com/image/fetch/$s_!BNpJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png 848w, https://substackcdn.com/image/fetch/$s_!BNpJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png 1272w, https://substackcdn.com/image/fetch/$s_!BNpJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BNpJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png" width="1135" height="622" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:622,&quot;width&quot;:1135,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1051556,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BNpJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png 424w, https://substackcdn.com/image/fetch/$s_!BNpJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png 848w, https://substackcdn.com/image/fetch/$s_!BNpJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png 1272w, https://substackcdn.com/image/fetch/$s_!BNpJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ec39744-a6af-411a-a6f9-9d9b96c46138_1135x622.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The final layer is memory: a gold standard example injected into the context before generation begins. Before writing the brief, the model reads a previous output that scored 44+ on the rubric. It knows, in concrete terms, what it is aiming for.</p><p><strong>Why examples work better than instructions: </strong>Instructions tell the model what to do. Examples show the model what good looks like. This is called in-context learning &#8212; the model adapts its behaviour based on patterns observed in the prompt, not just rules stated in the system prompt. An example of a well-calibrated confidence rating teaches something that the instruction &#8216;write well-calibrated confidence ratings&#8217; cannot.</p><p><strong>Where the example comes from: </strong>You. After running V4, you take the best output from state.json and save it as examples/gold_standard.md. The harness is now improving itself using its own past performance. This is the feedback loop operating at the harness level rather than the output level.</p><p><strong>The threshold rises: </strong>Because the starting point is higher in V5, the retry threshold is raised from 40 to 43. The bar moves up with the capability. This is how harnesses are maintained &#8212; you do not set a threshold and leave it; you raise it as performance improves.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GfrF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GfrF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png 424w, https://substackcdn.com/image/fetch/$s_!GfrF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png 848w, https://substackcdn.com/image/fetch/$s_!GfrF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png 1272w, https://substackcdn.com/image/fetch/$s_!GfrF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GfrF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png" width="794" height="492" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:492,&quot;width&quot;:794,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:67935,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GfrF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png 424w, https://substackcdn.com/image/fetch/$s_!GfrF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png 848w, https://substackcdn.com/image/fetch/$s_!GfrF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png 1272w, https://substackcdn.com/image/fetch/$s_!GfrF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dad710e-0aba-49c9-85c4-c08468c72a60_794x492.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Expected score range: 45&#8211;50 / 50. </strong><em>The model has seen the destination. It knows what it is building toward.</em></p><h2>Results: The Complete Picture</h2><p>The table below tracks both the expected performance range for each version and &#8212; once you run the experiment &#8212; your actual recorded scores. The delta column shows improvement attributable to each harness layer added.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WXgV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WXgV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png 424w, https://substackcdn.com/image/fetch/$s_!WXgV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png 848w, https://substackcdn.com/image/fetch/$s_!WXgV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png 1272w, https://substackcdn.com/image/fetch/$s_!WXgV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WXgV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png" width="787" height="265" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:265,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22448,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WXgV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png 424w, https://substackcdn.com/image/fetch/$s_!WXgV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png 848w, https://substackcdn.com/image/fetch/$s_!WXgV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png 1272w, https://substackcdn.com/image/fetch/$s_!WXgV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2be52a-bf97-4c86-9c92-a7b032ec79a2_787x265.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Your numbers will not match the expected ranges exactly. Mine didn&#8217;t. &#8212; they should not. Model outputs carry natural variance, and the verifier&#8217;s scoring introduces its own interpretation. What should hold is the directional trend: each layer added produces a higher score than the layer before it. If a layer does not produce improvement, that is a signal worth investigating &#8212; it means either the layer was implemented incorrectly or the previous layer already saturated that dimension.</p><p>Note: The guide&#8217;s prediction held: same model, 10&#215; better output through harness engineering alone. The schema (V1) did the heaviest lifting &#8212; +35 in one step. The gold standard example (V5) solved the one dimension nothing else could fix. The feedback loop (V4) showed no gain here because the first attempt always cleared the threshold; its value would show on a harder task or lower threshold.</p><blockquote><p><em>The expected trajectory is a rising staircase, not a smooth curve. Each step corresponds to one engineering decision. The interesting question is not how high the final score is &#8212; it is which step produced the biggest jump.</em></p></blockquote><h2>What the Results Reveal</h2><h3>The Biggest Jump Is Almost Always V2</h3><p>Across most runs of this experiment, the largest single improvement comes from adding TASK.md &#8212; the context file. This is counterintuitive. The schema in V1 is the most visible intervention. The feedback loop in V4 is the most architecturally impressive. But the context file is the one that changes what the model has to work with.</p><p>The reason: evidence quality and confidence calibration together account for twenty of the fifty available points. Both dimensions depend entirely on whether the model has real material in its context. The schema and the feedback loop both operate on the form of the output. The context file changes the substance.</p><h3>The Feedback Loop Is the Most Robust Layer</h3><p>The feedback loop in V4 does not always produce the biggest single jump, but it is the most consistently effective layer. In runs where the earlier layers have left a weak counterargument or uncalibrated confidence ratings, the verifier identifies the exact problem and the retry fixes it. The loop acts as a safety net &#8212; it catches the failures that the earlier layers did not prevent.</p><p>Note: The guide&#8217;s prediction held: same model, 10&#215; better output through harness engineering alone. The schema (V1) did the heaviest lifting &#8212; +35 in one step. The gold standard example (V5) solved the one dimension nothing else could fix. The feedback loop (V4) showed no gain here because the first attempt always cleared the threshold; its value would show on a harder task or lower threshold.</p><h3>Memory Has a Ceiling Effect</h3><p>If the experiment is working well, V5 produces only a modest improvement over V4 &#8212; perhaps two or three points. This is actually the sign of a successful experiment: it means V4 is already producing near-ceiling outputs, and the gold standard example has relatively little room to add. If V5 produces a large jump, it usually means V4 underperformed, and the example provided the quality signal that earlier layers should have delivered.</p><h3>The Model Never Changed</h3><p>Claude Haiku is not a model designed for demanding analytical work. It is designed for speed and cost efficiency. Every version in this experiment uses the same weights, the same parameters, the same underlying intelligence. A V5 brief produced by Haiku through a full harness will reliably outperform a V0 brief produced by a much more expensive model with no harness. That is not a claim about model capability. It is a claim about system design.</p><h2>What Comes Next: V6 and Beyond</h2><p>Once you have run all five versions and recorded your results, the natural question is what a sixth layer would look like. I&#8217;m too lazy to continue with the additional runs, but if you do have the time, try them. I will be discussing or experimenting with them anyway in a future article. Three directions are most productive:</p><h3>V6-A: Tool Use</h3><p>Give the model access to a web search tool so it can retrieve current evidence rather than drawing on training data. The TASK.md in V2 provides static evidence &#8212; V6-A makes it dynamic. Confidence ratings become more accurate when the model can verify claims in real time.</p><h3>V6-B: Multi-Agent Specialisation</h3><p>Replace the single generator with three specialised agents: one writes the thesis and claims, one writes the counterargument, one synthesises and checks for internal consistency. Each agent is narrowly focused. Specialisation almost always outperforms generalisation on structured tasks.</p><h3>V6-C: Adaptive Schema</h3><p>Allow the verifier&#8217;s notes to rewrite the system_prompt.md for the next session. The harness improves itself across runs. This is the feedback loop operating not on the output of a single run, but on the design of the harness itself. It is a slow, deliberate form of self-improvement &#8212; each session&#8217;s failures become the next session&#8217;s constraints.</p><p>This final direction &#8212; <strong>a harness that improves its own architecture based on performance data &#8212; is where the field is heading</strong>. It sits at the edge of what is currently practical and what is genuinely experimental. </p><p>The five versions in this guide are the foundation for that work. Have fun with them, and it will be fun to try.</p><h2>The Model Is an Engine</h2><p>There is a useful analogy for what harness engineering actually is. A car engine is the thing that produces power. It matters enormously &#8212; a weak engine limits what the car can do. But the engine does not determine where the car goes, whether it stays on the road, whether it signals before turning, or whether it stops at the right point. The rest of the car does that.</p><p>Most of the current conversation about AI performance is about engines. Which is more powerful. Which is cheaper to run. Which will be released next quarter. Harness engineering is about the rest of the car.</p><p>A well-engineered harness around a modest model is reliable, predictable, and improvable. Every failure mode it encounters becomes a harness layer. Every harness layer is a permanent improvement. The model providers will keep releasing better engines. The engineers who have already built good cars will benefit most from those upgrades &#8212; because a better engine in a well-engineered car compounds. A better engine in a car with no steering still crashes.</p><blockquote><p><em>Build the harness first. Then, when you upgrade the model, you get a multiplicative improvement &#8212; not an additive one.</em></p></blockquote><p>Note: The experiment in this article is small by design. A single model, a single topic, five versions, fifty points. But the principles it demonstrates &#8212; structure, context, verification, feedback, memory &#8212; are the same principles that govern every production AI system handling complex work at scale. The policy brief is a teaching tool. The harness is the point.</p><h2><strong>POST NOTE</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iIqk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iIqk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png 424w, https://substackcdn.com/image/fetch/$s_!iIqk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png 848w, https://substackcdn.com/image/fetch/$s_!iIqk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png 1272w, https://substackcdn.com/image/fetch/$s_!iIqk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iIqk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png" width="1139" height="614" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:614,&quot;width&quot;:1139,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1037911,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iIqk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png 424w, https://substackcdn.com/image/fetch/$s_!iIqk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png 848w, https://substackcdn.com/image/fetch/$s_!iIqk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png 1272w, https://substackcdn.com/image/fetch/$s_!iIqk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F362b96bd-3122-4a0e-8ee9-3630197f3b67_1139x614.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>When the experiment meets the real world: lessons from the <a href="https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated">ASCRS Harness Lab</a></em></p><p>The experiment in this article is designed to be accessible. A fixed topic, a fixed model, five harness layers added one at a time, a human-readable scoring rubric. The goal was to demonstrate, as plainly as possible, that the architecture around a model matters more than the model itself.</p><p>After completing these five versions, you will remember a more demanding test was run: the same harness engineering logic applied to a real analytical domain &#8212; pharmaceutical supply chain crisis response. The scenario: a Strait of Hormuz disruption, 23 purchase orders at risk, a six-hour decision window, and a CFO who needed a signed-off rerouting brief by 14:00 UTC. Ten harness architectures (H1 through H10) were tested against the same model, the same data, and the same six-criterion scoring rubric.</p><p>The results confirmed some of what the five-version experiment predicts &#8212; and sharply contradicted other parts. Both sets of findings are correct. Understanding why they diverge is where the real practical value lies.</p><h3>What the ASCRS Lab Confirmed</h3><p><strong>The prompt is still the highest-leverage single intervention. </strong>H2 &#8212; a structured system prompt with explicit carrier selection rules, financial ordering constraints, and anti-hallucination guardrails &#8212; scored &#945; = 1.000 (a perfect score across all six criteria) at barely more tokens than the bare baseline. It cost nothing in infrastructure, added negligible latency, and required no coordination logic. This maps directly onto the V1&#8594;V2 jump in the five-version experiment: specifying the output precisely is the single most important thing you can do before adding any other layer.</p><p><strong>The verifier must not grade its own output. </strong>A critical design constraint surfaced in the lab: if the scorer model and the generation model are the same, self-grading inflates scores by 15&#8211;30%. The measurement layer is only trustworthy when the judge is independent. This is the machine-readable version of the same principle that motivates having a separate verifier agent in V3.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DTW5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DTW5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png 424w, https://substackcdn.com/image/fetch/$s_!DTW5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png 848w, https://substackcdn.com/image/fetch/$s_!DTW5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png 1272w, https://substackcdn.com/image/fetch/$s_!DTW5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DTW5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png" width="1164" height="627" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:627,&quot;width&quot;:1164,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1098662,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DTW5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png 424w, https://substackcdn.com/image/fetch/$s_!DTW5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png 848w, https://substackcdn.com/image/fetch/$s_!DTW5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png 1272w, https://substackcdn.com/image/fetch/$s_!DTW5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02542c6-3583-415e-a0b7-cc48bb558b87_1164x627.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Memory without quality control is a failure mode, not a feature. </strong>H6 loaded the best output from H5 as skill memory before generating. H5&#8217;s draft contained two conflicting cost figures &#8212; EUR 4.2M in one section, EUR 7.14M in another &#8212; that the model had not fully resolved internally. H6 inherited both simultaneously. The scorer penalised both the weighting criterion and the financial consistency criterion to zero. The alpha dropped from 0.750 to 0.300 &#8212; a catastrophic regression caused entirely by injecting unvalidated content into context. In the five-version experiment, V5 assumes the gold standard is a quality-controlled output. That assumption is doing significant work. If you load a flawed output as memory, you do not improve on it &#8212; you amplify it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JIF3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JIF3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png 424w, https://substackcdn.com/image/fetch/$s_!JIF3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png 848w, https://substackcdn.com/image/fetch/$s_!JIF3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png 1272w, https://substackcdn.com/image/fetch/$s_!JIF3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JIF3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png" width="1115" height="618" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:618,&quot;width&quot;:1115,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1227437,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JIF3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png 424w, https://substackcdn.com/image/fetch/$s_!JIF3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png 848w, https://substackcdn.com/image/fetch/$s_!JIF3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png 1272w, https://substackcdn.com/image/fetch/$s_!JIF3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebb8c44-c5c9-4e7e-8d2a-2b02011ec24b_1115x618.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Where the ASCRS Lab Diverged &#8212; and Why</h3><p>The five-version experiment predicts a rising staircase: each harness layer added produces a higher score. The ASCRS Lab found something more complicated. H3 (sequential tools) scored below the bare baseline. H6 (skill memory) collapsed. H8 (simulated human review) degraded content that was already sound. H9 (five-agent swarm) scored below H1. More architecture produced worse results in five of ten cases.</p><blockquote><p><em>The ASCRS Lab&#8217;s central finding: a plain, well-written prompt beat every multi-agent architecture tested. A five-agent swarm with a specialist reviewer and an orchestrator scored below the bare model with no instructions at all.</em></p></blockquote><p>This does not contradict the five-version experiment. It contextualises it. The two experiments tested different things:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HtEk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HtEk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png 424w, https://substackcdn.com/image/fetch/$s_!HtEk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png 848w, https://substackcdn.com/image/fetch/$s_!HtEk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png 1272w, https://substackcdn.com/image/fetch/$s_!HtEk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HtEk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png" width="791" height="303" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:303,&quot;width&quot;:791,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:34519,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HtEk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png 424w, https://substackcdn.com/image/fetch/$s_!HtEk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png 848w, https://substackcdn.com/image/fetch/$s_!HtEk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png 1272w, https://substackcdn.com/image/fetch/$s_!HtEk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9c74216-e8d4-4191-8900-2ac8e2bb175e_791x303.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The five-version experiment deliberately chose a task where more structure genuinely helps: a policy brief with a defined schema, a fixed audience, and explicit evidence to work from. That is a <strong>greenfield task</strong> &#8212; all information is present upfront, the output format is known, and the model has everything it needs in context. Under those conditions, each harness layer adds signal without adding coordination overhead, and scores rise predictably.</p><p>The ASCRS domain is different in a specific and important way. A pharmaceutical rerouting brief requires domain-specific constraints that are not derivable from the task description alone &#8212; which carrier maintains &#8722;20&#176;C cold chain certification on a given routing, how to weight historical crisis precedents, the correct ordering of a financial planning figure relative to an escalation trigger. These constraints cannot be discovered by adding more agents. They must be specified. A swarm of five specialist agents, each producing independently correct fragments, failed because the orchestrator had no mechanism for selecting the best fragment &#8212; it averaged them. Every criterion landed at exactly 0.5.</p><h3>Five Lessons That Apply Differently Depending on Your Context</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mJe5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mJe5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png 424w, https://substackcdn.com/image/fetch/$s_!mJe5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png 848w, https://substackcdn.com/image/fetch/$s_!mJe5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png 1272w, https://substackcdn.com/image/fetch/$s_!mJe5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mJe5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png" width="1145" height="628" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1145,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1101665,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198855241?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mJe5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png 424w, https://substackcdn.com/image/fetch/$s_!mJe5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png 848w, https://substackcdn.com/image/fetch/$s_!mJe5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png 1272w, https://substackcdn.com/image/fetch/$s_!mJe5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17fe8184-d29f-45c4-a1fe-6fb623d5524d_1145x628.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>1. Write the specification before you build the architecture</h4><p>The ASCRS Lab introduced a design rule it calls the H2 First Rule: never build a multi-agent swarm to fix what a better prompt could solve. If the model lacks domain knowledge &#8212; if it does not know that PO-2853 requires Qatar Airways Cargo, or that the financial planning figure must precede the escalation trigger &#8212; coordination loops will loop on the same failure. The specification is the knowledge. The architecture is the coordination mechanism. One cannot substitute for the other.</p><p><em>For general readers: </em>Before adding agents, tools, or memory layers to any AI system, write down in plain language every constraint a correct answer must satisfy. If you cannot write that specification, no architecture will produce consistent results.</p><h4>2. Match architecture to task structure, not to task complexity</h4><p>The ASCRS Lab synthesised its two experiments into a decision matrix: on <strong>greenfield tasks</strong> (all data upfront, single-turn reasoning, document output), well-written prompts win. On <strong>brownfield or greyfield tasks</strong> (real-time data feeds, multi-phase operations, ERP database writes, external state queries), structured multi-agent coordination wins. The question to ask is not &#8216;how complex is this task?&#8217; but &#8216;does this task require genuine parallelism, or does it require integrated reasoning?&#8217;</p><p><em>For general readers: </em>If your AI task involves looking something up and writing a document, a good prompt is probably sufficient. If it involves monitoring systems continuously, coordinating across multiple data sources, and updating records over time, a multi-agent architecture with explicit coordination logic is justified.</p><h4>3. The reviewer must check logical dependency, not keyword presence</h4><p>The H9 swarm included a dedicated reviewer sub-agent (SA_reviewer) whose job was to catch a specific type of financial inconsistency before the output was approved. It failed. It confirmed that the correct numbers appeared in both sections of the document without verifying that they derived from the same scenario model. In the five-version experiment, the verifier agent (V3/V4) scores against a rubric that includes specific calibration criteria. That rubric is doing the work the reviewer in H9 failed to do. The lesson: verification agents need criteria that test logical dependency, not surface presence.</p><p><em>For general readers: </em>When you build an AI system that checks its own work, the checking agent needs to know what &#8216;correct&#8217; means at the logical level &#8212; not just whether certain words or numbers appear in the output.</p><h4>4. The scorer separation rule is not optional in any domain</h4><p>Self-grading inflates alpha by 15&#8211;30% in the ASCRS domain. In the five-version experiment, the verifier is a separate model call with a separate system prompt &#8212; but it is the same underlying model. In high-stakes domains &#8212; medical, financial, legal, logistics &#8212; the independence of the evaluation layer is not a design nicety. It is the condition under which the measurement can be trusted at all. The practical implication: whenever you use AI to evaluate AI output, the evaluation model and the generation model should be different, with different incentives encoded in their respective system prompts.</p><p><em>For businesses: </em>If your AI system produces outputs that have financial, legal, or safety consequences, your verification layer must be structurally independent of your generation layer. The same model grading its own output is not verification &#8212; it is self-certification. For now, what i continue recommending - keep the human in the loop! HITL.</p><h4>5. The cost of coordination is real and scales with agent count</h4><p>H9&#8217;s five-agent swarm consumed 58,090 tokens &#8212; nearly four times H1&#8217;s 14,028 &#8212; for a quality score below the bare baseline. H7 (model routing, the most practically efficient architecture in the lab) scored &#945; = 0.900 at 26,635 tokens by routing cheap subtasks to a lightweight model and synthesis to the capable model. The ASCRS Lab&#8217;s production scoring function &#8212; weighting quality 70%, token cost 20%, and latency 10% &#8212; ranks H2 first, H7 second, H1 third. Every other architecture costs more for less. In production at scale, the coordination tax compounds. Route cheap tasks to cheap models; reserve expensive capacity for synthesis only.</p><p><em>For businesses: </em>Before deploying a multi-agent architecture in production, calculate the token and latency cost of each agent boundary. Every agent is a merge point. Every merge point is a coordination cost. That cost should be justified by a quality gain that a better-specified single-agent system cannot achieve.</p><p><em>The five-version experiment in this article proves that adding harness layers to a weak model produces measurable improvement. The ASCRS Harness Lab shows that this principle has boundaries. Both findings are useful. The first tells you that the harness is worth building. The second tells you which layers to build first, which to build carefully, and which to avoid until the specification is right. Write the spec. Add one layer. Measure. Repeat.</em></p><h2>References &amp; Further Reading</h2><p><strong>Research &amp; Foundational Papers</strong></p><p><strong>[1] </strong>Liu et al. (2023). AgentBench: Evaluating LLMs as Agents. Stanford / Tsinghua. The source of the 6&#215; performance gap finding cited throughout this article. <a href="https://arxiv.org/abs/2308.03688">https://arxiv.org/abs/2308.03688</a></p><p><strong>[2] </strong>Acemoglu, D. &amp; Restrepo, P. (2022). Tasks, Automation, and the Death of Middle-Skill Jobs. American Economic Review. <a href="https://www.aeaweb.org/articles?id=10.1257/jep.33.2.3">https://www.aeaweb.org/articles?id=10.1257/jep.33.2.3</a></p><p><strong>[3] </strong>Autor, D. et al. (2022). New Frontiers: The Origins and Content of New Work. Quarterly Journal of Economics. <a href="https://www.nber.org/papers/w30389">https://www.nber.org/papers/w30389</a></p><p><strong>[4] </strong>McKinsey Global Institute (2017). A Future That Works: Automation, Employment, and Productivity. <a href="https://www.mckinsey.com/featured-insights/digital-disruption/harnessing-automation-for-a-future-that-works">https://www.mckinsey.com/featured-insights/digital-disruption/harnessing-automation-for-a-future-that-works</a></p><p><strong>[5] </strong>World Economic Forum (2023). Future of Jobs Report 2023. <a href="https://www.weforum.org/publications/the-future-of-jobs-report-2023/">https://www.weforum.org/publications/the-future-of-jobs-report-2023/</a></p><p><strong>Harness Engineering &#8212; Key Writing</strong></p><p><strong>[6] </strong>Kumar, T. (2025). Harnesses in AI: A Deep Dive. IBM. YouTube. The talk that inspired this experiment. </p><blockquote><div id="youtube2-C_GG5g38vLU" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;C_GG5g38vLU&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/C_GG5g38vLU?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div></blockquote><p><strong>[7] </strong>Anthropic (2025). Claude Haiku 4-5 Model Documentation. <a href="https://docs.anthropic.com/en/docs/about-claude/models">https://docs.anthropic.com/en/docs/about-claude/models</a></p><p><strong>2026 &#8212; Origin Documents</strong></p><p><strong>[8] </strong>Hashimoto, M. (February 5, 2026). My AI Adoption Journey. mitchellh.com. The post that coined the term &#8216;harness engineering&#8217;. Step 5: Engineer the Harness. <a href="https://mitchellh.com/writing/my-ai-adoption-journey">https://mitchellh.com/writing/my-ai-adoption-journey</a></p><p><strong>[9] </strong>Lopopolo, R. / OpenAI (February 11, 2026). Harness Engineering: Leveraging Codex in an Agent-First World. Five months, 1M lines, zero hand-written code. &#8216;Humans steer. Agents execute.&#8217; <a href="https://openai.com/index/harness-engineering/">https://openai.com/index/harness-engineering/</a></p><p><strong>[10] </strong>B&#246;ckeler, B. / Martin Fowler (April 2, 2026). Harness Engineering for Coding Agent Users. martinfowler.com. Canonical guides-and-sensors taxonomy. Full article. <a href="https://martinfowler.com/articles/harness-engineering.html">https://martinfowler.com/articles/harness-engineering.html</a></p><p><strong>[11] </strong>B&#246;ckeler, B. / Martin Fowler (February 17, 2026). Harness Engineering &#8212; First Thoughts. martinfowler.com. The original memo responding to Hashimoto and Lopopolo. <a href="https://martinfowler.com/articles/exploring-gen-ai/harness-engineering-memo.html">https://martinfowler.com/articles/exploring-gen-ai/harness-engineering-memo.html</a></p><p><strong>2026 &#8212; Research Papers</strong></p><p><strong>[12] </strong>Zhu et al. (April 2026). SemaClaw: A Step Towards General-Purpose Personal AI Agents through Harness Engineering. arXiv:2604.11548. First academic paper to formally position harness engineering as a standalone engineering discipline. <a href="https://arxiv.org/abs/2604.11548">https://arxiv.org/abs/2604.11548</a></p><p><strong>[13] </strong>Vishnyakova, V. (March 2026). Context Engineering: From Prompts to Corporate Multi-Agent Architecture. arXiv:2603.09619. Four-level pyramid model: prompt &#8594; context &#8594; intent &#8594; specification engineering. <a href="https://arxiv.org/abs/2603.09619">https://arxiv.org/abs/2603.09619</a></p><p><strong>[14] </strong>OpenDev authors (March 2026). Building AI Coding Agents for the Terminal: Scaffolding, Harness, Context Engineering, and Lessons Learned. arXiv:2603.05344. Four-layer architecture with automated cross-session memory. <a href="https://arxiv.org/html/2603.05344v1">https://arxiv.org/html/2603.05344v1</a></p><p><strong>[15] </strong>Zhang, Q. et al. (October 2025, published ICLR 2026). Agentic Context Engineering (ACE): Evolving Contexts for Self-Improving Language Models. arXiv:2510.04618. +10.6% on agent benchmarks. Identifies brevity bias and context collapse. <a href="https://arxiv.org/abs/2510.04618">https://arxiv.org/abs/2510.04618</a></p><p><strong>2026 &#8212; Production Case Studies</strong></p><p><strong>[16] </strong>Microsoft / Azure SRE Team (April 14, 2026). Harness Engineering for Azure SRE Agent: Building the Agent Self-Improvement Loop. 35,000+ incidents, time-to-mitigation from 40.5 hours to 3 minutes. The agent investigated its own KV cache regression. <a href="https://techcommunity.microsoft.com/blog/appsonazureblog/the-agent-that-investigates-itself/4500073">https://techcommunity.microsoft.com/blog/appsonazureblog/the-agent-that-investigates-itself/4500073</a></p><p><strong>[17] </strong>Microsoft / Azure SRE Team (April 9, 2026). How We Build and Use Azure SRE Agent with Agentic Workflows. Customer Zero blog: 50,000+ developer hours saved. Built the agent using agents. <a href="https://techcommunity.microsoft.com/blog/appsonazureblog/how-we-build-and-use-azure-sre-agent-with-agentic-workflows/4508753">https://techcommunity.microsoft.com/blog/appsonazureblog/how-we-build-and-use-azure-sre-agent-with-agentic-workflows/4508753</a></p><p><strong>2026 &#8212; Practitioner Guides</strong></p><p><strong>[18] </strong>Masood, A. (April 2026). Agent Harness Engineering: The Rise of the AI Control Plane. Medium. Enterprise control plane framing, 88% production gap statistic, Plan-Execute-Verify loops. <a href="https://medium.com/@adnanmasood/agent-harness-engineering-the-rise-of-the-ai-control-plane-938ead884b1d">https://medium.com/@adnanmasood/agent-harness-engineering-the-rise-of-the-ai-control-plane-938ead884b1d</a></p><p><strong>[19] </strong>Augment Code (April 17, 2026). Harness Engineering for AI Coding Agents: Constraints That Ship Reliable Code. Deterministic vs. probabilistic distinction. Hashline experiment: harness-only change moved one model from 6.7% to 68.3% benchmark score. <a href="https://www.augmentcode.com/guides/harness-engineering-ai-coding-agents">https://www.augmentcode.com/guides/harness-engineering-ai-coding-agents</a></p><p><strong>[20] </strong>Milvus (April 9, 2026). What Is Harness Engineering for AI Agents? Covers Anthropic&#8217;s three-agent Planner/Generator/Evaluator experiment vs. solo agent on 2D game engine task. <a href="https://milvus.io/blog/harness-engineering-ai-agents.md">https://milvus.io/blog/harness-engineering-ai-agents.md</a></p><p><strong>[21] </strong>MindWired AI (March 30, 2026). Harness Engineering 101: How to Make AI Agents Actually Reliable. Three-era timeline (Prompt &#8594; Context &#8594; Harness), Stripe Minions case study (1,300 PRs/week). <a href="https://mindwiredai.com/2026/03/30/harness-engineering-guide-reliable-ai-agents/">https://mindwiredai.com/2026/03/30/harness-engineering-guide-reliable-ai-agents/</a></p><p><strong>2026 &#8212; Survey &amp; Definitional Pieces</strong></p><p><strong>[22] </strong>TechTimes (May 13, 2026). &#8216;Harness Engineering&#8217; Emerges as the Fourth Paradigm of AI Engineering. Most current survey. METR estimate: Claude Opus 4.6 has a 50%-time-horizon of ~14.5 hours on software tasks. <a href="https://www.techtimes.com/articles/316587/20260513/harness-engineering-emerges-fourth-paradigm-ai-engineering.htm">https://www.techtimes.com/articles/316587/20260513/harness-engineering-emerges-fourth-paradigm-ai-engineering.htm</a></p><p><strong>[23] </strong>SmartScope (March 2026, updated April 2026). What Is Harness Engineering: A New Concept Defining the Outside of Context Engineering. Best origin timeline: Hashimoto &#8594; Lopopolo &#8594; Mollick &#8594; Fowler, all within weeks. Includes Hashline data. <a href="https://smartscope.blog/en/blog/harness-engineering-overview/">https://smartscope.blog/en/blog/harness-engineering-overview/</a></p><p><strong>[24] </strong>Atlan (April 13, 2026). What Is Harness Engineering AI? The Definitive 2026 Guide. Guides-and-sensors taxonomy; Agent = Model + Harness formula; 27% of agent failures trace to data quality, not model or harness. <a href="https://atlan.com/know/what-is-harness-engineering/">https://atlan.com/know/what-is-harness-engineering/</a></p><p><strong>[25] </strong>Software Improvement Group (April 24, 2026). What Is Harness Engineering? Clean synthesis of origin story for general readers. Contextualises harness engineering relative to enterprise AI adoption. <a href="https://www.softwareimprovementgroup.com/blog/what-is-harness-engineering/">https://www.softwareimprovementgroup.com/blog/what-is-harness-engineering/</a></p><p><strong>[26] </strong>NxCode (March 2026). What Is Harness Engineering? Complete Guide for AI Agent Development 2026. Best piece for positioning harness engineering relative to MLOps and DevOps disciplines. <a href="https://www.nxcode.io/resources/news/what-is-harness-engineering-complete-guide-2026">https://www.nxcode.io/resources/news/what-is-harness-engineering-complete-guide-2026</a></p><p><strong>Benchmarks &amp; Evaluation</strong></p><p><strong>[27] </strong>Jimenez, C. et al. (2024). SWE-bench: Can Language Models Resolve Real-World GitHub Issues? The benchmark used to measure coding agent performance in harness research. <a href="https://arxiv.org/abs/2310.06770">https://arxiv.org/abs/2310.06770</a></p><p><strong>[28] </strong>Liu et al. (2023). AgentBench: Evaluating LLMs as Agents. Stanford / Tsinghua. Source of the 6&#215; performance gap finding. <a href="https://arxiv.org/abs/2308.03688">https://arxiv.org/abs/2308.03688</a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Priced to Scale, Priced To Fail: How Flat-Rate AI Subscriptions Create a Subsidy for Its Best Customers — and a Seven-Lever Strategy to Reverse It]]></title><description><![CDATA[A Strategic Framework for Sustainable Token Economics in the Agentic Era]]></description><link>https://interestingengineering.substack.com/p/priced-to-scale-priced-to-fail-how</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/priced-to-scale-priced-to-fail-how</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Mon, 18 May 2026 02:38:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_ojo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44c353ea-e4b4-419b-8139-d52dd34306cf_3822x2073.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/jpeg&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/44c353ea-e4b4-419b-8139-d52dd34306cf_3822x2073.jpeg&quot;}],&quot;caption&quot;:&quot;&quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/jpeg&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/44c353ea-e4b4-419b-8139-d52dd34306cf_3822x2073.jpeg&quot;}},&quot;isEditorNode&quot;:true}"></div><p>I have had, like many users of AI/AI Agentic &#8220;Tools&#8221; (we&#8217;ll generalize for now), a nagging concern about <strong>how far the &#8220;subscription models&#8221; will be able to sustain themselves</strong>, whilst companies like Anthropic, OpenAI, Perplexity etc build their various subscriber bases. Users can of course de-risk by running models or systems locally, hoping that the latest open-source or at least open-weight models will perform as well as their proprietary counterparts. And, from my perspective, the gap is narrowing. Some of the best local LLMs to date, for example - Qwen3, gpt-oss-20b/gpt-oss-120b, Deepseek V4, Mistral, Gemma 4 are amazing.  But, for now, these only take you so far. So I want to ask a slightly different question. Note in all assumptions - I use Anthropic&#8217;s Claude models to base all calculations. </p><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/jpeg&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/df4f57d0-6cb1-467a-a3ed-101f8da66142_3822x2057.jpeg&quot;}],&quot;caption&quot;:&quot;&quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/jpeg&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/df4f57d0-6cb1-467a-a3ed-101f8da66142_3822x2057.jpeg&quot;}},&quot;isEditorNode&quot;:true}"></div><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p><strong>Strategically, what set of structural changes would allow AI providers to reverse this dynamic, retain heavy users, and capture the margin the infrastructure economics demand? </strong></p><p>Before I proceed, a great recent read which encapsulates the concerns I speak of - is Eric Broda&#8217;s May 2026 <a href="https://agenticmesh.substack.com/p/the-brutal-reality-of-token-economics">A Brutal Reality of Token Economics</a>! His analysis demonstrates, with rigorous arithmetic, that <strong>flat-rate AI subscriptions are structurally loss-making for any user who operates above certain agentic thresholds</strong>. A median Claude Code developer consuming <strong>140 million tokens per month</strong> on a $200 plan generates an implied provider (Anthropic and/or their VC Investors) subsidy of approximately <strong>$254 per user per month</strong>. At the extreme end, a 500M-token Sonnet-heavy user generates an implied provider subsidy of <strong>$1,420 per month</strong>. Logically, how long can such &#8220;subsidies&#8221; last?! So my question instead is this - w<strong>hat would it take to reverse these negative dynamics? </strong></p><p style="text-align: justify;">The answer, possibly, lies somewhere between a <strong>seven-lever strategy I will map out.</strong> The first five &#8212; <strong>tiered usage metering, intelligent model routing, infrastructure inversion from cloud rental to owned depreciation, value-based enterprise licensing, and workflow-level token governance</strong> &#8212; are already visible in small doses, if you thread together strategies being applied by various providers. A sixth lever has emerged from a wave of <strong>efficiency-first architectural innovations pioneered primarily by Chinese AI laboratories</strong>. <strong>DeepSeek&#8217;s Multi-Head Latent Attention (MLA), Mixture-of-Experts (MoE) sparse activation, FP8 mixed-precision inference, Multi-Token Prediction (MTP), and production-grade speculative decoding &#8212; combined with Moonshot AI&#8217;s Kimi K2 Thinking quantization-aware training and Alibaba&#8217;s Qwen3 adaptive compute routing &#8212;</strong> together constitute a blueprint for reducing inference serve costs by a further <strong>40&#8211;80% beyond the first five levers</strong>, compounding to turn a $1,420 monthly subsidy into a gross margin exceeding 80%. Then the seventh - <strong>context engineering.</strong> There are assumptions involved, so getting there will very much depend on how these various AI Provider business models evolve.</p><p style="text-align: justify;">Perhaps what we need for heavy users is a <strong>hybrid structure</strong>: a $400 flat plan that includes 200M tokens, then $4.50/MTok overage on the remaining 300M tokens ($1,350). That overage is functionally pay-per-use. So the architecture is:</p><ul><li><p><strong>Light and moderate users</strong> &#8212; stay on flat subscriptions, already profitable, leave them alone</p></li><li><p><strong>Heavy agentic users</strong> &#8212; graduate to a higher tier with a larger included block, then pay metered rates above it. They keep the psychological simplicity of a subscription; the provider captures economics proportional to actual consumption</p></li><li><p><strong>Large enterprises</strong> &#8212; migrate to negotiated API or token bundles, where per-team chargebacks and FinOps visibility are in the enterprise&#8217;s own interest anyway</p></li></ul><p>The efficiency gains (Levers 2, 3, 6, 7) are what make this transition non-adversarial: by reducing serve costs to (estimated based on assumption - see <strong>Appendix I</strong>) $1.29/MTok, the provider can set overage at $4.50/MTok and still earn a 70%+ gross margin on that overage &#8212; while the user is still paying well below direct API list prices. <em><strong>Without the efficiency gains, raising prices just loses customers. With them, you can raise revenue and lower the user&#8217;s effective cost-per-output simultaneously.</strong></em></p><p><em>Note: <strong>See Appendix I for more detailed estimated calculations. </strong></em></p><blockquote><p><em>The gist of what I am saying is this:</em></p><p>The $3.24 <strong>cost of revenue per million tokens</strong> today &#8212; derived from Broda&#8217;s framework: blended API list price &#215; 60% COR ratio. It is what it costs you, on average, to serve a million tokens across your infrastructure.</p><p>The near-term architectural stack (FP8/4 + speculative decoding + adaptive compute routing + prefill/decode separation) could bring that down to <strong>~$1.29/MTok</strong> &#8212; a 60% COR reduction &#8212; without significantly changing what users experience. <strong>Infrastructure Inversion</strong> is another big one. </p><p>The full MLA + MoE architectural generation drops it further to <strong>~$0.42/MTok</strong> &#8212; an 87% COR reduction.</p><p><strong>The critical nuance:</strong> reducing COR alone doesn&#8217;t fix the subsidy. If you keep charging $200 flat and your COR falls to $1.29, the 500M-token user still generates a loss &#8212; just a smaller one ($645 serve cost vs $200 revenue = &#8722;$445 instead of &#8722;$1,420). The COR reduction is what <em>enables</em> the pricing restructure to work without losing customers. At $1.29/MTok serve cost you can offer overage at $4.50/MTok and the user is paying well below API list price &#8212; so they have no reason to leave &#8212; while you&#8217;re earning a ~70% margin on every overage token.</p><p>The implication for the argument is: as <strong>these levers drive COR</strong> from $3.24 to $1.29/MTok, Anthropic gains the room <strong>to either reduce API prices</strong> (competitive pressure from DeepSeek will likely force this) <strong>or maintain them and expand margins dramatically</strong>. </p><p><strong>The providers that get the COR down first have the most options. That's the way I see it. Hats off therefore to the many China based Model releases with their various constraint-driven optimizations, which I will discuss and include below. </strong></p></blockquote><h2>1. The Economic Problem, Precisely Stated</h2><p style="text-align: justify;">Using Broda&#8217;s published framework as our baseline:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y2vP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y2vP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png 424w, https://substackcdn.com/image/fetch/$s_!y2vP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png 848w, https://substackcdn.com/image/fetch/$s_!y2vP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png 1272w, https://substackcdn.com/image/fetch/$s_!y2vP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y2vP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png" width="782" height="197" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:197,&quot;width&quot;:782,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17151,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y2vP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png 424w, https://substackcdn.com/image/fetch/$s_!y2vP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png 848w, https://substackcdn.com/image/fetch/$s_!y2vP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png 1272w, https://substackcdn.com/image/fetch/$s_!y2vP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe78e7a2e-670b-4be4-b65d-4af593fb511e_782x197.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!POEc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!POEc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png 424w, https://substackcdn.com/image/fetch/$s_!POEc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png 848w, https://substackcdn.com/image/fetch/$s_!POEc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png 1272w, https://substackcdn.com/image/fetch/$s_!POEc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!POEc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png" width="785" height="421" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:421,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45915,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!POEc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png 424w, https://substackcdn.com/image/fetch/$s_!POEc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png 848w, https://substackcdn.com/image/fetch/$s_!POEc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png 1272w, https://substackcdn.com/image/fetch/$s_!POEc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcf0cd88-43bc-47f7-8fd4-6b8df60a5a02_785x421.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://agenticmesh.substack.com/p/the-brutal-reality-of-token-economics">Eric Broda</a></figcaption></figure></div><h2>2. The Six-Lever Strategic Framework</h2><p style="text-align: justify;">The following six levers, applied in combination, could eliminate the subsidy problem without degrading model quality or driving away high-value enterprise customers.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qfb7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qfb7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Qfb7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Qfb7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Qfb7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qfb7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg" width="1456" height="787" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:787,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1310503,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qfb7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Qfb7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Qfb7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Qfb7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfb00094-0d65-4270-9c15-d04517e66b19_3822x2066.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3 style="text-align: justify;"><strong>Lever 1 &#8212; Tiered Usage Metering</strong></h3><p style="text-align: justify;">Retain the subscription&#8217;s psychological appeal while introducing metered overage above the breakeven threshold.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PXhS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PXhS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png 424w, https://substackcdn.com/image/fetch/$s_!PXhS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png 848w, https://substackcdn.com/image/fetch/$s_!PXhS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png 1272w, https://substackcdn.com/image/fetch/$s_!PXhS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PXhS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png" width="783" height="602" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:602,&quot;width&quot;:783,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56158,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PXhS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png 424w, https://substackcdn.com/image/fetch/$s_!PXhS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png 848w, https://substackcdn.com/image/fetch/$s_!PXhS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png 1272w, https://substackcdn.com/image/fetch/$s_!PXhS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e5d6c5a-5972-4c60-845a-24f3d2e72892_783x602.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Lever 2 &#8212; Intelligent Model Routing</h3><p style="text-align: justify;">Direct tasks to the cheapest model capable of completing them. Most agentic tasks do not require frontier reasoning.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sAdQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sAdQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png 424w, https://substackcdn.com/image/fetch/$s_!sAdQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png 848w, https://substackcdn.com/image/fetch/$s_!sAdQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png 1272w, https://substackcdn.com/image/fetch/$s_!sAdQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sAdQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png" width="787" height="297" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:297,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25401,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sAdQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png 424w, https://substackcdn.com/image/fetch/$s_!sAdQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png 848w, https://substackcdn.com/image/fetch/$s_!sAdQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png 1272w, https://substackcdn.com/image/fetch/$s_!sAdQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f186b6a-9cad-4d9b-8e99-5c034f070ef3_787x297.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Lever 3 &#8212; Infrastructure Inversion: From Cloud Rental to Owned Depreciation</h3><p style="text-align: justify;">When Anthropic runs inference on AWS or GCP, approximately <strong>20&#8211;30%</strong> of each cloud dollar is hyperscaler margin &#8212; not hardware, not power, not people. Migrating 50% of inference workload to owned infrastructure over three years recaptures that margin layer entirely. This is the arithmetic behind Microsoft&#8217;s $80B data centre programme, Google&#8217;s Ironwood TPU vertical integration, and Meta&#8217;s open infrastructure build-out.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UK74!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UK74!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png 424w, https://substackcdn.com/image/fetch/$s_!UK74!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png 848w, https://substackcdn.com/image/fetch/$s_!UK74!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png 1272w, https://substackcdn.com/image/fetch/$s_!UK74!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UK74!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png" width="785" height="526" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:526,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:50785,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UK74!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png 424w, https://substackcdn.com/image/fetch/$s_!UK74!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png 848w, https://substackcdn.com/image/fetch/$s_!UK74!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png 1272w, https://substackcdn.com/image/fetch/$s_!UK74!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8258f38-531f-49f1-9d4a-f118e02af4d2_785x526.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Lever 4 &#8212; Value-Based Enterprise Licensing</h3><p style="text-align: justify;">Price per accepted output &#8212; code commit, document review, support ticket &#8212; rather than per token. This decouples revenue from token volume entirely.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IL9a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IL9a!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png 424w, https://substackcdn.com/image/fetch/$s_!IL9a!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png 848w, https://substackcdn.com/image/fetch/$s_!IL9a!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png 1272w, https://substackcdn.com/image/fetch/$s_!IL9a!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IL9a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png" width="782" height="300" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:300,&quot;width&quot;:782,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25428,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IL9a!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png 424w, https://substackcdn.com/image/fetch/$s_!IL9a!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png 848w, https://substackcdn.com/image/fetch/$s_!IL9a!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png 1272w, https://substackcdn.com/image/fetch/$s_!IL9a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f3e1b00-5f49-4b4d-8c20-7cef768caadc_782x300.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Lever 5 &#8212; Enterprise Token Governance</h3><p style="text-align: justify;">Build tooling that helps enterprises optimise their own token spend. The provider wins twice: serve costs fall and the governance layer is itself a monetisable product.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s47S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s47S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png 424w, https://substackcdn.com/image/fetch/$s_!s47S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png 848w, https://substackcdn.com/image/fetch/$s_!s47S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png 1272w, https://substackcdn.com/image/fetch/$s_!s47S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s47S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png" width="791" height="256" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:256,&quot;width&quot;:791,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:20692,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s47S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png 424w, https://substackcdn.com/image/fetch/$s_!s47S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png 848w, https://substackcdn.com/image/fetch/$s_!s47S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png 1272w, https://substackcdn.com/image/fetch/$s_!s47S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8de2d5c-4603-4ff9-8036-84b39eeb1b02_791x256.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Lever 6 &#8212; The Architectural Innovation Stack: What China&#8217;s Efficiency-First Labs Have Published</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4OoK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4OoK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4OoK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4OoK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4OoK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4OoK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg" width="1456" height="788" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:788,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1342755,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4OoK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4OoK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4OoK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4OoK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b7d006-6596-4a4c-8e49-14d76cae4894_3822x2068.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The first five levers are <em>pricing and operational</em> interventions. Lever 6 is <em>architectural</em>. Between January 2025 and mid-2026, Chinese AI laboratories &#8212; inspite of them operating under export controls on advanced semiconductors and therefore under extreme pressure to do more with less &#8212; published a series of research papers and open-weight models that collectively represent some of the most significant set of inference efficiency innovations since the transformer architecture itself. These are not theoretical: they are deployed in production at scale. Studying the technical reports, almost every AI provider has access to the blueprints.</p><p style="text-align: justify;">The competitive implication is stark: <strong>DeepSeek V2 is priced at $0.14 / $0.28 per million tokens</strong> (input/output) vs Anthropic Sonnet 4.6 at $3.00 / $15.00 &#8212; a <strong>20&#8211;50&#215; pricing gap</strong> &#8212; for comparable coding performance on many benchmarks. The efficiency techniques enabling that gap are documented and reproducible. Providers that do not adopt them will face compounding margin pressure as the market converges toward architecturally efficient pricing.</p><h4>Innovation 1 &#8212; Multi-Head Latent Attention (MLA)</h4><p style="text-align: justify;"><strong>Source: DeepSeek V2 (May 2024), DeepSeek V3 (December 2024) | </strong>Papers: arxiv.org/abs/2405.04434, arxiv.org/abs/2502.14837</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5b5X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5b5X!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg 424w, https://substackcdn.com/image/fetch/$s_!5b5X!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg 848w, https://substackcdn.com/image/fetch/$s_!5b5X!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!5b5X!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5b5X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg" width="1456" height="790" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:790,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1216726,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5b5X!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg 424w, https://substackcdn.com/image/fetch/$s_!5b5X!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg 848w, https://substackcdn.com/image/fetch/$s_!5b5X!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!5b5X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97132bac-e476-4d8a-b781-54875d72ee61_3822x2075.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!E5CH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!E5CH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png 424w, https://substackcdn.com/image/fetch/$s_!E5CH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png 848w, https://substackcdn.com/image/fetch/$s_!E5CH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png 1272w, https://substackcdn.com/image/fetch/$s_!E5CH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!E5CH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png" width="785" height="365" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:365,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45876,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!E5CH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png 424w, https://substackcdn.com/image/fetch/$s_!E5CH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png 848w, https://substackcdn.com/image/fetch/$s_!E5CH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png 1272w, https://substackcdn.com/image/fetch/$s_!E5CH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2637cfa4-5770-46b8-ac30-6a50a7d733c5_785x365.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4Cz1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4Cz1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png 424w, https://substackcdn.com/image/fetch/$s_!4Cz1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png 848w, https://substackcdn.com/image/fetch/$s_!4Cz1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png 1272w, https://substackcdn.com/image/fetch/$s_!4Cz1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4Cz1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png" width="787" height="472" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:472,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41466,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4Cz1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png 424w, https://substackcdn.com/image/fetch/$s_!4Cz1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png 848w, https://substackcdn.com/image/fetch/$s_!4Cz1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png 1272w, https://substackcdn.com/image/fetch/$s_!4Cz1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfe982-db09-4060-93e3-1fc7a93e9c5d_787x472.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Innovation 2 &#8212; Mixture of Experts (MoE) Sparse Activation</h4><p style="text-align: justify;"><strong>Source: DeepSeek V3 (671B/37B active), Kimi K2 (1T/32B active), Qwen3 (235B/22B active)</strong> | 2025&#8211;2026 releases</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_U-2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_U-2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_U-2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_U-2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_U-2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_U-2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg" width="1456" height="786" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:786,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1318977,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_U-2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_U-2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_U-2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_U-2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5123101d-8843-4d7d-ae27-c3669a3b296c_3822x2064.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n5x5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n5x5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png 424w, https://substackcdn.com/image/fetch/$s_!n5x5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png 848w, https://substackcdn.com/image/fetch/$s_!n5x5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png 1272w, https://substackcdn.com/image/fetch/$s_!n5x5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n5x5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png" width="785" height="353" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:353,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:47952,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n5x5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png 424w, https://substackcdn.com/image/fetch/$s_!n5x5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png 848w, https://substackcdn.com/image/fetch/$s_!n5x5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png 1272w, https://substackcdn.com/image/fetch/$s_!n5x5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac456028-1a52-4e08-9b43-8ceb8a7506f7_785x353.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ex6s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ex6s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png 424w, https://substackcdn.com/image/fetch/$s_!ex6s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png 848w, https://substackcdn.com/image/fetch/$s_!ex6s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png 1272w, https://substackcdn.com/image/fetch/$s_!ex6s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ex6s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png" width="786" height="493" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:493,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:44299,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ex6s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png 424w, https://substackcdn.com/image/fetch/$s_!ex6s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png 848w, https://substackcdn.com/image/fetch/$s_!ex6s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png 1272w, https://substackcdn.com/image/fetch/$s_!ex6s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312345ff-b7b2-41d4-a077-09300e82fbf9_786x493.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Innovation 3 &#8212; FP8 Mixed-Precision Inference (and Training)</h4><p style="text-align: justify;"><strong>Source: DeepSeek V3 FP8 training framework (Dec 2024), Kimi K2 INT4-QAT (Nov 2025), Qwen3-2507 FP8 checkpoint</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2q_z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2q_z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2q_z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2q_z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2q_z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2q_z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg" width="1456" height="789" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:789,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1115938,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2q_z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2q_z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2q_z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2q_z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ce49f2-9c67-412c-8f7b-b783ba166b2a_3822x2070.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9O2-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9O2-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png 424w, https://substackcdn.com/image/fetch/$s_!9O2-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png 848w, https://substackcdn.com/image/fetch/$s_!9O2-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png 1272w, https://substackcdn.com/image/fetch/$s_!9O2-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9O2-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png" width="787" height="377" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:377,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49476,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9O2-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png 424w, https://substackcdn.com/image/fetch/$s_!9O2-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png 848w, https://substackcdn.com/image/fetch/$s_!9O2-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png 1272w, https://substackcdn.com/image/fetch/$s_!9O2-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb6cc47d-1596-47f0-8fde-a03c80713da1_787x377.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qKBu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335b643f-1264-4225-bd5f-9377011741f7_787x440.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qKBu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335b643f-1264-4225-bd5f-9377011741f7_787x440.png 424w, https://substackcdn.com/image/fetch/$s_!qKBu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335b643f-1264-4225-bd5f-9377011741f7_787x440.png 848w, https://substackcdn.com/image/fetch/$s_!qKBu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335b643f-1264-4225-bd5f-9377011741f7_787x440.png 1272w, https://substackcdn.com/image/fetch/$s_!qKBu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335b643f-1264-4225-bd5f-9377011741f7_787x440.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qKBu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335b643f-1264-4225-bd5f-9377011741f7_787x440.png" width="787" height="440" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/335b643f-1264-4225-bd5f-9377011741f7_787x440.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:440,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:37878,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335b643f-1264-4225-bd5f-9377011741f7_787x440.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qKBu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335b643f-1264-4225-bd5f-9377011741f7_787x440.png 424w, https://substackcdn.com/image/fetch/$s_!qKBu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335b643f-1264-4225-bd5f-9377011741f7_787x440.png 848w, https://substackcdn.com/image/fetch/$s_!qKBu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335b643f-1264-4225-bd5f-9377011741f7_787x440.png 1272w, https://substackcdn.com/image/fetch/$s_!qKBu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335b643f-1264-4225-bd5f-9377011741f7_787x440.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Innovation 4 &#8212; Multi-Token Prediction (MTP) and Production Speculative Decoding</h4><p style="text-align: justify;"><strong>Source: DeepSeek V3 MTP (Dec 2024); EAGLE-3 / MEDUSA frameworks (2025); Together AI ATLAS (2025&#8211;2026)</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ubiP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ubiP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ubiP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ubiP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ubiP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ubiP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg" width="1456" height="786" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:786,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1442591,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ubiP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ubiP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ubiP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ubiP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcc43308-d4c0-4f12-b020-b2d13f225314_3822x2063.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EIMr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EIMr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png 424w, https://substackcdn.com/image/fetch/$s_!EIMr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png 848w, https://substackcdn.com/image/fetch/$s_!EIMr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png 1272w, https://substackcdn.com/image/fetch/$s_!EIMr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EIMr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png" width="785" height="445" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:445,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59313,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EIMr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png 424w, https://substackcdn.com/image/fetch/$s_!EIMr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png 848w, https://substackcdn.com/image/fetch/$s_!EIMr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png 1272w, https://substackcdn.com/image/fetch/$s_!EIMr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab5550d-e5b8-4cc4-82f6-de60f74dc1ef_785x445.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-iRJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-iRJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png 424w, https://substackcdn.com/image/fetch/$s_!-iRJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png 848w, https://substackcdn.com/image/fetch/$s_!-iRJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png 1272w, https://substackcdn.com/image/fetch/$s_!-iRJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-iRJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png" width="787" height="516" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:516,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:47619,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-iRJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png 424w, https://substackcdn.com/image/fetch/$s_!-iRJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png 848w, https://substackcdn.com/image/fetch/$s_!-iRJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png 1272w, https://substackcdn.com/image/fetch/$s_!-iRJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279b6688-f5c6-4c58-bf73-56725f20f3a9_787x516.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Innovation 5 &#8212; Adaptive Compute Routing and Prefill/Decode Separation</h4><p style="text-align: justify;"><strong>Source: Qwen3 Thinking/Non-Thinking Mode (Apr 2025); Kimi K2 Interleaved Thinking (Nov 2025); Moonshot Mooncake (2025)</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kNht!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kNht!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg 424w, https://substackcdn.com/image/fetch/$s_!kNht!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg 848w, https://substackcdn.com/image/fetch/$s_!kNht!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!kNht!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kNht!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg" width="1456" height="790" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:790,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1389549,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kNht!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg 424w, https://substackcdn.com/image/fetch/$s_!kNht!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg 848w, https://substackcdn.com/image/fetch/$s_!kNht!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!kNht!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268d5a94-d1b0-4bbc-a55c-d23376f468b0_3822x2073.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qsqa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qsqa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png 424w, https://substackcdn.com/image/fetch/$s_!qsqa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png 848w, https://substackcdn.com/image/fetch/$s_!qsqa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png 1272w, https://substackcdn.com/image/fetch/$s_!qsqa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qsqa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png" width="790" height="420" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:420,&quot;width&quot;:790,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58315,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qsqa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png 424w, https://substackcdn.com/image/fetch/$s_!qsqa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png 848w, https://substackcdn.com/image/fetch/$s_!qsqa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png 1272w, https://substackcdn.com/image/fetch/$s_!qsqa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62132670-5d77-4266-8a67-8be7ac50d7ec_790x420.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pH9r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pH9r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png 424w, https://substackcdn.com/image/fetch/$s_!pH9r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png 848w, https://substackcdn.com/image/fetch/$s_!pH9r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png 1272w, https://substackcdn.com/image/fetch/$s_!pH9r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pH9r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png" width="786" height="517" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:517,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49379,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pH9r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png 424w, https://substackcdn.com/image/fetch/$s_!pH9r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png 848w, https://substackcdn.com/image/fetch/$s_!pH9r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png 1272w, https://substackcdn.com/image/fetch/$s_!pH9r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b529b8-5d92-431a-aafc-4bdc73506787_786x517.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>The Compounded Architectural Transformation: Near-Term vs Full Architecture</h3><p style="text-align: justify;">The five architectural innovations above operate at two horizons. The following table shows the step-by-step serve cost walk:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rxWJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rxWJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rxWJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rxWJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rxWJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rxWJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg" width="1456" height="795" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:795,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1208022,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rxWJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rxWJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rxWJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rxWJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3ced7b-57d9-4ad9-944e-3ec64489adcb_3822x2086.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F1A0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F1A0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png 424w, https://substackcdn.com/image/fetch/$s_!F1A0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png 848w, https://substackcdn.com/image/fetch/$s_!F1A0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png 1272w, https://substackcdn.com/image/fetch/$s_!F1A0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!F1A0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png" width="785" height="386" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:386,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40017,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F1A0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png 424w, https://substackcdn.com/image/fetch/$s_!F1A0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png 848w, https://substackcdn.com/image/fetch/$s_!F1A0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png 1272w, https://substackcdn.com/image/fetch/$s_!F1A0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F882f2032-56b5-433f-9b90-2f9bd1f6ed85_785x386.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HPae!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HPae!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png 424w, https://substackcdn.com/image/fetch/$s_!HPae!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png 848w, https://substackcdn.com/image/fetch/$s_!HPae!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png 1272w, https://substackcdn.com/image/fetch/$s_!HPae!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HPae!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png" width="785" height="165" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:165,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:31349,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HPae!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png 424w, https://substackcdn.com/image/fetch/$s_!HPae!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png 848w, https://substackcdn.com/image/fetch/$s_!HPae!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png 1272w, https://substackcdn.com/image/fetch/$s_!HPae!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5431f78-9111-423b-964d-ce8fe6af89cb_785x165.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Lever 7 &#8212; Context Engineering: Lazy Loading, RAG, and Memory Architecture</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GR1Z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GR1Z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GR1Z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GR1Z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GR1Z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GR1Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg" width="1456" height="791" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:791,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1358503,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GR1Z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GR1Z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GR1Z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GR1Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4eccc79-d30f-4013-a8a5-5eb7d920624b_3822x2077.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The first six levers address <em>how much it costs to serve a token</em>. Lever 7 addresses <em>how many tokens are needed in the first place.</em> Context engineering &#8212; the discipline of loading only what is needed, when it is needed &#8212; is the highest-leverage harness-layer intervention available to both providers and enterprises today. It requires no model changes, no infrastructure investment, and no pricing negotiation. It requires only disciplined context architecture.</p><h4>Technique 7a &#8212; Lazy Loading and Skill-Based Context</h4><p style="text-align: justify;">The SKILL.md pattern used in this system is the canonical production example. Instead of pre-loading all possible knowledge and instructions into every context window, only the skills relevant to the current task are fetched and loaded. The cost arithmetic is stark:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0G3z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0G3z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png 424w, https://substackcdn.com/image/fetch/$s_!0G3z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png 848w, https://substackcdn.com/image/fetch/$s_!0G3z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png 1272w, https://substackcdn.com/image/fetch/$s_!0G3z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0G3z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png" width="788" height="342" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:342,&quot;width&quot;:788,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30232,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0G3z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png 424w, https://substackcdn.com/image/fetch/$s_!0G3z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png 848w, https://substackcdn.com/image/fetch/$s_!0G3z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png 1272w, https://substackcdn.com/image/fetch/$s_!0G3z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff70b3a0-62fb-4495-82bf-f332dced553d_788x342.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Technique 7b &#8212; Retrieval-Augmented Generation (RAG)</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y-U5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y-U5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Y-U5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Y-U5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Y-U5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y-U5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg" width="1456" height="789" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:789,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1210970,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y-U5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Y-U5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Y-U5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Y-U5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab6d8c7-5595-4dab-ac4a-719a3f3da391_3822x2072.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Instead of loading an entire codebase or document corpus into context, a vector database stores embedded representations. At query time, only the <strong>semantically relevant chunks</strong> &#8212; typically 1,000&#8211;5,000 tokens &#8212; are retrieved and loaded. The rest stays in the vector store at near-zero retrieval cost. For enterprise codebases, this replaces loading 100,000&#8211;500,000 tokens per call with loading 2,000&#8211;8,000 tokens per call.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5GmR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5GmR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png 424w, https://substackcdn.com/image/fetch/$s_!5GmR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png 848w, https://substackcdn.com/image/fetch/$s_!5GmR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png 1272w, https://substackcdn.com/image/fetch/$s_!5GmR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5GmR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png" width="788" height="340" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:340,&quot;width&quot;:788,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:31697,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5GmR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png 424w, https://substackcdn.com/image/fetch/$s_!5GmR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png 848w, https://substackcdn.com/image/fetch/$s_!5GmR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png 1272w, https://substackcdn.com/image/fetch/$s_!5GmR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9c987e-8e80-4ab5-9981-51241643e4f7_788x340.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Technique 7c &#8212; Episodic Compression and Working Memory Architecture</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ceVO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ceVO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ceVO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ceVO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ceVO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ceVO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1185091,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ceVO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ceVO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ceVO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ceVO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda79d73f-10c0-46d7-a742-73e4045f2700_3822x2084.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The Claude Code leak revealed Anthropic&#8217;s production answer to runaway context growth: a three-layer memory system (MEMORY.md pointer index, on-demand topic files, background Dream consolidation daemon). The same leak&#8217;s autoCompact data &#8212; <strong>1,279 sessions with 50+ consecutive compaction failures, consuming up to 250,000 API calls per day globally</strong> &#8212; quantifies exactly what happens when context engineering fails. Episodic compression periodically summarises and replaces accumulated transcript with a compact representation, maintaining context within a working budget.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x4TA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a70246b-f41f-4979-985b-a92bc582d060_786x488.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x4TA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a70246b-f41f-4979-985b-a92bc582d060_786x488.png 424w, https://substackcdn.com/image/fetch/$s_!x4TA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a70246b-f41f-4979-985b-a92bc582d060_786x488.png 848w, https://substackcdn.com/image/fetch/$s_!x4TA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a70246b-f41f-4979-985b-a92bc582d060_786x488.png 1272w, https://substackcdn.com/image/fetch/$s_!x4TA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a70246b-f41f-4979-985b-a92bc582d060_786x488.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!x4TA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a70246b-f41f-4979-985b-a92bc582d060_786x488.png" width="786" height="488" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a70246b-f41f-4979-985b-a92bc582d060_786x488.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:488,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40214,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a70246b-f41f-4979-985b-a92bc582d060_786x488.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x4TA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a70246b-f41f-4979-985b-a92bc582d060_786x488.png 424w, https://substackcdn.com/image/fetch/$s_!x4TA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a70246b-f41f-4979-985b-a92bc582d060_786x488.png 848w, https://substackcdn.com/image/fetch/$s_!x4TA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a70246b-f41f-4979-985b-a92bc582d060_786x488.png 1272w, https://substackcdn.com/image/fetch/$s_!x4TA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a70246b-f41f-4979-985b-a92bc582d060_786x488.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Step-by-Step: Lever 7 Applied to the 500M Token User</h4><p style="text-align: justify;">Context engineering reduces the number of tokens consumed to accomplish the same work. A 500M-token agentic workload, re-engineered with lazy loading + RAG + compression, typically delivers the same outputs in <strong>300&#8211;350M tokens</strong> &#8212; a 30&#8211;40% reduction with no quality loss on well-bounded tasks.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mFTZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mFTZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png 424w, https://substackcdn.com/image/fetch/$s_!mFTZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png 848w, https://substackcdn.com/image/fetch/$s_!mFTZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png 1272w, https://substackcdn.com/image/fetch/$s_!mFTZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mFTZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png" width="782" height="301" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:301,&quot;width&quot;:782,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25173,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mFTZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png 424w, https://substackcdn.com/image/fetch/$s_!mFTZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png 848w, https://substackcdn.com/image/fetch/$s_!mFTZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png 1272w, https://substackcdn.com/image/fetch/$s_!mFTZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9ad4209-7003-4414-8323-ba3ea1588dfa_782x301.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vsFI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vsFI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png 424w, https://substackcdn.com/image/fetch/$s_!vsFI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png 848w, https://substackcdn.com/image/fetch/$s_!vsFI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png 1272w, https://substackcdn.com/image/fetch/$s_!vsFI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vsFI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png" width="786" height="148" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:148,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25022,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vsFI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png 424w, https://substackcdn.com/image/fetch/$s_!vsFI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png 848w, https://substackcdn.com/image/fetch/$s_!vsFI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png 1272w, https://substackcdn.com/image/fetch/$s_!vsFI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07783ee7-89b8-40fb-b275-a839696bcd3a_786x148.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>3. The Combined Model: Seven Levers, Full Stack</h2><p style="text-align: justify;">The five levers convert the 500M-token user from a <strong>&#8722;$1,420 monthly subsidy to +$1,017 gross profit</strong>. Levers 6 and 7 extend this further. The following table shows the complete seven-lever stack, adding context engineering and architectural innovations to the combined model:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zfxG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zfxG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg 424w, https://substackcdn.com/image/fetch/$s_!zfxG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg 848w, https://substackcdn.com/image/fetch/$s_!zfxG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!zfxG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zfxG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg" width="1456" height="789" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:789,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1327939,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zfxG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg 424w, https://substackcdn.com/image/fetch/$s_!zfxG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg 848w, https://substackcdn.com/image/fetch/$s_!zfxG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!zfxG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e3bd6b-cc5b-4a0d-bdc1-6b7c40b83673_3822x2072.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wXn1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wXn1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png 424w, https://substackcdn.com/image/fetch/$s_!wXn1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png 848w, https://substackcdn.com/image/fetch/$s_!wXn1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png 1272w, https://substackcdn.com/image/fetch/$s_!wXn1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wXn1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png" width="785" height="476" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:476,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:52986,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wXn1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png 424w, https://substackcdn.com/image/fetch/$s_!wXn1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png 848w, https://substackcdn.com/image/fetch/$s_!wXn1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png 1272w, https://substackcdn.com/image/fetch/$s_!wXn1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe32b3cd4-f2a3-412d-97e1-3fc0e0d56d3a_785x476.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Lever 7 serve cost: $733 (post-governance) &#215; 0.65 (35% token volume reduction) = $477. Lever 6a then applies &#8722;60% architectural efficiency to $477: <strong>$477 &#215; 0.40 = $191</strong>. Lever 6b applies the full &#8722;87% architectural transformation compounded through all prior optimisations: <strong>$477 &#215; 0.13 = $62</strong>.</p><p style="text-align: justify;"><strong>The full seven-lever swing</strong> &#8212; from baseline to Lever 7 + Lever 6a, achievable within 18 months &#8212; is <strong>$3,029 per account per month</strong> (from &#8722;$1,420 to +$1,609). At full architectural transformation the swing reaches <strong>$3,158 per account per month</strong>. This is not a marginal improvement. It is a structural transformation of the unit economics of AI inference.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7PPV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57608748-4ba1-40aa-8044-47e506f35529_787x127.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7PPV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57608748-4ba1-40aa-8044-47e506f35529_787x127.png 424w, https://substackcdn.com/image/fetch/$s_!7PPV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57608748-4ba1-40aa-8044-47e506f35529_787x127.png 848w, https://substackcdn.com/image/fetch/$s_!7PPV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57608748-4ba1-40aa-8044-47e506f35529_787x127.png 1272w, https://substackcdn.com/image/fetch/$s_!7PPV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57608748-4ba1-40aa-8044-47e506f35529_787x127.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7PPV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57608748-4ba1-40aa-8044-47e506f35529_787x127.png" width="787" height="127" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/57608748-4ba1-40aa-8044-47e506f35529_787x127.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:127,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21713,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57608748-4ba1-40aa-8044-47e506f35529_787x127.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7PPV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57608748-4ba1-40aa-8044-47e506f35529_787x127.png 424w, https://substackcdn.com/image/fetch/$s_!7PPV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57608748-4ba1-40aa-8044-47e506f35529_787x127.png 848w, https://substackcdn.com/image/fetch/$s_!7PPV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57608748-4ba1-40aa-8044-47e506f35529_787x127.png 1272w, https://substackcdn.com/image/fetch/$s_!7PPV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57608748-4ba1-40aa-8044-47e506f35529_787x127.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>4. The China Competitive Lens: Pressure and Blueprint</h2><p>The Chinese AI laboratory ecosystem &#8212; operating under US export controls that limit access to H100-class GPUs &#8212; has been forced to optimise inference economics as a survival constraint, not a nice-to-have. The result is a set of open-source/weight architectural and serving innovations that are <em>simultaneously a competitive threat and a free engineering gift</em> to Western providers.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UF2k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UF2k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png 424w, https://substackcdn.com/image/fetch/$s_!UF2k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png 848w, https://substackcdn.com/image/fetch/$s_!UF2k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png 1272w, https://substackcdn.com/image/fetch/$s_!UF2k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UF2k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png" width="783" height="360" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:360,&quot;width&quot;:783,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:44119,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UF2k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png 424w, https://substackcdn.com/image/fetch/$s_!UF2k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png 848w, https://substackcdn.com/image/fetch/$s_!UF2k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png 1272w, https://substackcdn.com/image/fetch/$s_!UF2k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbea39167-5a9b-4409-af01-5d1e9cf2b3fc_783x360.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The strategic implication for Anthropic, OpenAI, and other Western frontier labs is not that Chinese models will displace them. Quality differentiation, trust infrastructure, enterprise relationships, and safety research remain substantial moats. The implication is that <strong>the reference price point for frontier-adjacent performance is collapsing.</strong> As Chinese open-weight models on Hugging Face overtook US model downloads in August 2025 and Chinese models from DeepSeek, Alibaba, Moonshot, and others potentially displace US alternatives across global leaderboards, the pricing pressure on Anthropic&#8217;s $3/$15 per million token API rates is structural and accelerating.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!W32L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!W32L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg 424w, https://substackcdn.com/image/fetch/$s_!W32L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg 848w, https://substackcdn.com/image/fetch/$s_!W32L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!W32L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!W32L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg" width="1456" height="790" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:790,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1009797,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!W32L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg 424w, https://substackcdn.com/image/fetch/$s_!W32L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg 848w, https://substackcdn.com/image/fetch/$s_!W32L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!W32L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63997174-19c9-4dfb-8bf3-783784f9f309_3822x2075.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The six-lever framework presented here is the correct response: <em><strong>not to match Chinese pricing at the API level, but to reduce internal serve costs sufficiently that frontier-quality models can be sustainably offered at prices that reflect the new efficiency reality.</strong></em><strong> A provider running MLA + MoE + FP8 + speculative decoding on owned infrastructure could plausibly offer Sonnet-class performance at $0.50&#8211;$0.80 / MTok blended and still maintain 50%+ gross margins &#8212; competing on quality and trust rather than being structurally undercut on price.</strong></p><h2>5. Implementation Roadmap: Sequencing All Six Levers</h2><h4>Horizon 1 (0&#8211;6 months): Measure, Signal, Deploy Near-Term Tech</h4><blockquote><p>&#8226; <strong>Deploy token usage dashboards</strong> for all Max and Enterprise subscribers.</p><p>&#8226; <strong>Introduce transparent breakeven disclosure</strong>: communicate subsidy regime and transition timeline.</p><p>&#8226; <strong>Deploy FP8 inference</strong> on Sonnet-class models in production &#8212; calibrate, validate, roll out. This is the fastest serve-cost lever deployable without model changes.</p><p>&#8226; <strong>Begin EAGLE-3 draft head training</strong> for Sonnet 4.6 and Haiku 4.5. Target 0.70+ acceptance rate before production rollout.</p></blockquote><h4>Horizon 2 (6&#8211;18 months): Restructure Pricing + Launch Production SpecDec</h4><blockquote><p>&#8226; <strong>Launch tiered Max plans</strong> with defined token inclusions and $3.80&#8211;$4.50/MTok overage.</p><p>&#8226; <strong>Roll out production speculative decoding</strong> across Claude Code and API. Publish throughput improvement metrics to build user trust.</p><p>&#8226; <strong>Launch adaptive compute routing</strong>: expose thinking/non-thinking mode in Claude Code agent loop, routed automatically by task complexity classifier.</p><p>&#8226; <strong>Begin infrastructure investment</strong>: commit to first owned-infrastructure cluster targeting 10&#8211;15% of inference.</p><p>&#8226; <strong>Implement prefill/decode separation</strong> in serving infrastructure to improve fleet hardware utilisation.</p></blockquote><h4>Horizon 3 (18&#8211;36 months): Structural Cost Advantage</h4><blockquote><p>&#8226; <strong>New model generation with MLA</strong>: integrate Multi-Head Latent Attention into the next Claude architecture. Target KV cache reduction of 80%+ and 2&#8211;3&#215; throughput improvement.</p><p>&#8226; <strong>Scale owned infrastructure to 40&#8211;50%</strong> of inference workload. Hyperscaler margin recapture becomes material in gross margin.</p><p>&#8226; <strong>Launch Outcome API</strong>: priced per accepted output, not per token. Target legal, compliance, procurement verticals.</p></blockquote><h4>Horizon 4 (36&#8211;48 months): Full Architecture Transformation</h4><blockquote><p>&#8226; <strong>Evaluate MoE adoption</strong> for the following Claude model generation. Chinese precedent validates that frontier quality is achievable with sparse activation architectures.</p><p>&#8226; <strong>Negotiate INT4-QAT partnership or internal programme</strong>: Kimi K2 Thinking&#8217;s QAT approach achieves 2&#215; speed at validated quality. Apply to Claude&#8217;s next-generation architecture.</p><p>&#8226; <strong>Publish efficiency benchmarks</strong> alongside quality benchmarks: tokens per second, cost per accepted output, cache hit rate. This re-anchors competitive differentiation on the full value equation, not just benchmark scores.</p></blockquote><h2>6. What Consumers and Enterprises Get</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EQxV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EQxV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png 424w, https://substackcdn.com/image/fetch/$s_!EQxV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png 848w, https://substackcdn.com/image/fetch/$s_!EQxV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png 1272w, https://substackcdn.com/image/fetch/$s_!EQxV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EQxV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png" width="786" height="378" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:378,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:47513,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EQxV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png 424w, https://substackcdn.com/image/fetch/$s_!EQxV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png 848w, https://substackcdn.com/image/fetch/$s_!EQxV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png 1272w, https://substackcdn.com/image/fetch/$s_!EQxV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa91678-403b-42b5-9ebc-c2687f6b0a6a_786x378.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>7. Macro Financial Projection: All Six Levers</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3tR5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3tR5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png 424w, https://substackcdn.com/image/fetch/$s_!3tR5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png 848w, https://substackcdn.com/image/fetch/$s_!3tR5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png 1272w, https://substackcdn.com/image/fetch/$s_!3tR5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3tR5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png" width="786" height="651" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:651,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54756,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3tR5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png 424w, https://substackcdn.com/image/fetch/$s_!3tR5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png 848w, https://substackcdn.com/image/fetch/$s_!3tR5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png 1272w, https://substackcdn.com/image/fetch/$s_!3tR5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe53f99c9-88ee-4beb-a542-b1d5e411183c_786x651.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CoYM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CoYM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png 424w, https://substackcdn.com/image/fetch/$s_!CoYM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png 848w, https://substackcdn.com/image/fetch/$s_!CoYM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png 1272w, https://substackcdn.com/image/fetch/$s_!CoYM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CoYM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png" width="787" height="241" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:241,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22794,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CoYM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png 424w, https://substackcdn.com/image/fetch/$s_!CoYM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png 848w, https://substackcdn.com/image/fetch/$s_!CoYM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png 1272w, https://substackcdn.com/image/fetch/$s_!CoYM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e3ad81-b2b4-407a-8ca3-eb2190a5c919_787x241.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>8. The Demand-Side Accelerants: Why Efficiency Doesn&#8217;t Reduce Total Compute</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3nUV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3nUV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg 424w, https://substackcdn.com/image/fetch/$s_!3nUV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3nUV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!3nUV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3nUV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg" width="1456" height="799" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:799,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1024626,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3nUV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg 424w, https://substackcdn.com/image/fetch/$s_!3nUV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3nUV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!3nUV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f070865-c86a-4f21-8c73-94262744427b_3822x2097.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Sections 1&#8211;7 examine the supply side of the token economics equation: how to reduce the cost of serving each token. This section examines the demand side &#8212; and reaches a counterintuitive conclusion. The efficiency improvements described in the preceding seven levers will <strong>not</strong> reduce total token consumption. They will <strong>accelerate it</strong>. Three structural forces guarantee this: <strong>the API/subscription pricing threshold, the emergence of goal-directed agentic workflows, and Jevons Paradox.</strong> Understanding these forces is essential to projecting the true revenue opportunity &#8212; and the true infrastructure scale requirement &#8212; of the agentic era.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gy3a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gy3a!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gy3a!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gy3a!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gy3a!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gy3a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg" width="1456" height="786" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:786,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1549656,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gy3a!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gy3a!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gy3a!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gy3a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F450c7cec-df82-40b9-a15c-148678363542_3822x2064.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3 style="text-align: justify;"><strong>8a &#8212; The Subscription / API Threshold and the Migration to Metered Pricing</strong></h3><p style="text-align: justify;">There is a crossover point at which a flat-rate subscription is no longer cheaper than API pricing for the user. Below that threshold, subscription wins on simplicity. Above it, API is cheaper. Where that threshold sits determines which pricing model attracts the heaviest users &#8212; and therefore where the largest subsidies live.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BY3L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BY3L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png 424w, https://substackcdn.com/image/fetch/$s_!BY3L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png 848w, https://substackcdn.com/image/fetch/$s_!BY3L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png 1272w, https://substackcdn.com/image/fetch/$s_!BY3L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BY3L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png" width="787" height="612" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:612,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59347,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BY3L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png 424w, https://substackcdn.com/image/fetch/$s_!BY3L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png 848w, https://substackcdn.com/image/fetch/$s_!BY3L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png 1272w, https://substackcdn.com/image/fetch/$s_!BY3L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2965e2b3-d42c-4584-a7dd-7a1e7276d11a_787x612.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">This migration from subscription to metered API is <strong>not adversarial</strong> &#8212; at efficiency-adjusted prices it is in the enterprise&#8217;s interest. Enterprise procurement teams prefer metered costs: they enable per-team chargebacks, FinOps visibility, budget attribution, and ROI measurement. A $900/month API bill that can be traced to specific projects and teams is more governable than a $200/month subscription that obscures individual consumption. The efficiency gains in Levers 6 and 7 make this transition economically plausible for both sides for the first time.</p><h3>8b &#8212; Goal-Directed Agents and the /goal Economy: The Cron Job Model</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L38h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L38h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg 424w, https://substackcdn.com/image/fetch/$s_!L38h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg 848w, https://substackcdn.com/image/fetch/$s_!L38h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!L38h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L38h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg" width="1456" height="799" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:799,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1024626,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L38h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg 424w, https://substackcdn.com/image/fetch/$s_!L38h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg 848w, https://substackcdn.com/image/fetch/$s_!L38h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!L38h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67857fe6-63a5-40ea-92ce-19197d87050a_3822x2097.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The most significant structural shift in agentic token consumption is not the volume of individual messages &#8212; it is the move from <strong>reactive per-message workflows</strong> to <strong>goal-directed continuous execution</strong>. The KAIROS system revealed in the Claude Code leak (an always-on background agent that runs autonomously toward set goals, with ULTRAPLAN offloading complex planning to long-horizon Opus sessions) represents the logical endpoint: an AI that runs like a cron job, consuming tokens continuously whether or not a human is watching.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dRkm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dRkm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png 424w, https://substackcdn.com/image/fetch/$s_!dRkm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png 848w, https://substackcdn.com/image/fetch/$s_!dRkm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png 1272w, https://substackcdn.com/image/fetch/$s_!dRkm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dRkm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png" width="786" height="597" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:597,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:60679,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dRkm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png 424w, https://substackcdn.com/image/fetch/$s_!dRkm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png 848w, https://substackcdn.com/image/fetch/$s_!dRkm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png 1272w, https://substackcdn.com/image/fetch/$s_!dRkm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329e73e5-3bf3-45fa-a91a-c23cbddad334_786x597.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The /goal model changes the <em>unit of value</em> entirely. Nobody cares how many tokens a goal-directed refactor consumed &#8212; they care whether the PR passed code review. This is the natural home of value-based pricing (Lever 4): charge per accepted commit, per shipped feature, per resolved incident. The token volume becomes an internal cost variable, not the customer-facing price unit. The provider that builds billing infrastructure around outcomes rather than tokens is the one best positioned for the /goal economy.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!I1Ij!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!I1Ij!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png 424w, https://substackcdn.com/image/fetch/$s_!I1Ij!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png 848w, https://substackcdn.com/image/fetch/$s_!I1Ij!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png 1272w, https://substackcdn.com/image/fetch/$s_!I1Ij!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!I1Ij!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png" width="790" height="145" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:145,&quot;width&quot;:790,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28432,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!I1Ij!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png 424w, https://substackcdn.com/image/fetch/$s_!I1Ij!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png 848w, https://substackcdn.com/image/fetch/$s_!I1Ij!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png 1272w, https://substackcdn.com/image/fetch/$s_!I1Ij!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd8c4942-cefb-4858-9b19-c735fe263f18_790x145.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>8c &#8212; Jevons Paradox: Why Cheaper Tokens Mean More Tokens, Not Fewer</h3><p style="text-align: justify;">In 1865, economist William Stanley Jevons observed that improvements in steam engine efficiency &#8212; which dramatically reduced the coal required per unit of work &#8212; led to a <strong>massive increase</strong> in total coal consumption. The reason: lower cost per unit of work made previously uneconomical applications viable, expanding demand faster than efficiency reduced per-unit consumption. Jevons called this the rebound effect. It has since been observed in electricity, aviation, computing, and the internet.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q09R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q09R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Q09R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Q09R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Q09R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q09R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg" width="1456" height="786" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:786,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1549656,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q09R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Q09R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Q09R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Q09R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda7b7af-aa0f-4122-bbb9-1835f783bbd7_3822x2064.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The <a href="https://ramp.com/blog/trillion-dollar-ai-blindspot">Ramp data</a> already confirms Jevons in the AI token market: <strong>AI spending across Ramp&#8217;s enterprise customer base grew 13&#215; in one year</strong> &#8212; while token prices were falling. The Disney dashboard (16.4 billion tokens in 9 workdays from 4,800 employees) comes from a company with active cost management. The Jellyfish top adopter at 975M tokens/month almost certainly had awareness of token costs. High consumption and falling prices are not correlated by accident &#8212; falling prices are the mechanism enabling high consumption.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FWlF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FWlF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png 424w, https://substackcdn.com/image/fetch/$s_!FWlF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png 848w, https://substackcdn.com/image/fetch/$s_!FWlF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png 1272w, https://substackcdn.com/image/fetch/$s_!FWlF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FWlF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png" width="786" height="581" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:581,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:57366,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FWlF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png 424w, https://substackcdn.com/image/fetch/$s_!FWlF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png 848w, https://substackcdn.com/image/fetch/$s_!FWlF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png 1272w, https://substackcdn.com/image/fetch/$s_!FWlF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F226d6e23-cd2c-4787-be96-c78f45665a3d_786x581.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BAbc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BAbc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png 424w, https://substackcdn.com/image/fetch/$s_!BAbc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png 848w, https://substackcdn.com/image/fetch/$s_!BAbc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png 1272w, https://substackcdn.com/image/fetch/$s_!BAbc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BAbc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png" width="786" height="257" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:257,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28498,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BAbc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png 424w, https://substackcdn.com/image/fetch/$s_!BAbc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png 848w, https://substackcdn.com/image/fetch/$s_!BAbc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png 1272w, https://substackcdn.com/image/fetch/$s_!BAbc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f4c33ad-38b8-4780-87c4-44e8ff2b462b_786x257.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The interaction between Jevons and the /goal model is the critical compound effect. Goal-directed agents don&#8217;t just consume more tokens than message-driven workflows &#8212; they remove the human attention constraint that currently caps per-session consumption. Each developer running KAIROS-style background agents is effectively a <strong>token-consuming cluster</strong>, not a token-consuming individual. When 10,000 enterprise developers each run 10 background agents continuously, total token consumption scales not by the number of developers but by the number of developers <em>times</em> the number of concurrent agent goals <em>times</em> the depth of each goal&#8217;s execution. The Jellyfish extreme-adopter at 975M tokens/month is not an outlier to be managed &#8212; it is a preview of the median.</p><p style="text-align: justify;">For providers, this reframes the entire strategic question. </p><blockquote><p style="text-align: justify;"><em><strong>The issue is not how to reduce the subsidy on heavy users. It is how to build infrastructure at the scale Jevons-amplified demand will require, while capturing the economics to fund that infrastructure.</strong></em> </p></blockquote><p style="text-align: justify;">The seven levers in this article address the economics. Infrastructure inversion (Lever 3) addresses the scale. But the demand trajectory makes clear that the investment horizon is not five years &#8212; it is <strong>now</strong>, and the capital requirements are larger than any subscription revenue model can fund alone.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DYKb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DYKb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png 424w, https://substackcdn.com/image/fetch/$s_!DYKb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png 848w, https://substackcdn.com/image/fetch/$s_!DYKb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png 1272w, https://substackcdn.com/image/fetch/$s_!DYKb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DYKb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png" width="786" height="145" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:145,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26873,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DYKb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png 424w, https://substackcdn.com/image/fetch/$s_!DYKb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png 848w, https://substackcdn.com/image/fetch/$s_!DYKb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png 1272w, https://substackcdn.com/image/fetch/$s_!DYKb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff823e16e-ca79-4e65-8cc0-9c619dfd28ca_786x145.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>9. Conclusion</h2><p style="text-align: justify;">Broda&#8217;s arithmetic is correct. But the subsidy problem is only half the picture. The other half is what the subsidy is funding: the early adoption of a demand curve that, once Jevons takes hold, will dwarf anything visible in today&#8217;s usage data. <strong>Every efficiency improvement that makes token consumption cheaper unlocks a new category of AI application that was previously economically blocked</strong>. Goal-directed agents, always-on background processes, speculative parallel execution, and continuous monitoring &#8212; none of these are viable at $3.24/MTok serve cost at scale. All of them become viable at $0.42/MTok. The demand is latent. The efficiency improvements will release it.</p><p style="text-align: justify;">The seven-lever framework, as simplistic as it may seem, presented here demonstrates that the same 500M-token user who today costs a provider <strong>$1,420 per month in subsidy</strong> can potentially become an account generating <strong>+$1,609 per month in gross profit within 18 months</strong> &#8212; a swing of $3,029 per account per month &#8212; <strong>through tiered pricing, model routing, infrastructure inversion, value-based licensing, token governance, context engineering, and near-term architectural deployment.</strong> In the full architectural transformation scenario, that gross profit reaches <strong>+$1,738 per month at 97% margin</strong>. Applied at the $23 billion revenue scale, the seven levers represent a path from 40% to <strong>65&#8211;75% gross margin</strong> within a four-year horizon &#8212; before Jevons-driven demand expansion is accounted for. At this point in time - I am not only hopeful, but confident Anthropic is strategically already on such a path. </p><p style="text-align: justify;">The Jevons reframe changes the conclusion materially. <strong>The providers that move earliest on efficiency are not primarily protecting existing margins on existing users.</strong> They are <strong>lowering the economic floor</strong> at which new categories of agentic infrastructure become viable. </p><blockquote><p style="text-align: justify;"><em><strong>The /goal economy &#8212; goal-directed agents running continuously as background processes, consuming tokens the way a cron job consumes CPU &#8212; is the next demand wave</strong></em>. </p></blockquote><p style="text-align: justify;"><strong>The providers that have built infrastructure, pricing, and architecture for that wave before it arrives will capture the entire market expansion it creates</strong>. This I believe is Anthropic's possible future. Those who wait will be scrambling to build under load.</p><blockquote><p style="text-align: justify;"><em><strong>The brutal reality of token economics is that the current regime cannot stay open-ended. The beautiful reality is that the path from subsidised to wildly profitable does not require choosing between provider economics and user value, between Western quality and Chinese efficiency, or between serving today&#8217;s users and building for tomorrow&#8217;s demand wave. It requires building all of it &#8212; at the right sequence, at the right speed, ahead of the Jevons curve.</strong></em></p></blockquote><h2>Appendix I</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zdzG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zdzG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png 424w, https://substackcdn.com/image/fetch/$s_!zdzG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png 848w, https://substackcdn.com/image/fetch/$s_!zdzG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png 1272w, https://substackcdn.com/image/fetch/$s_!zdzG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zdzG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png" width="786" height="570" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:570,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:53711,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zdzG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png 424w, https://substackcdn.com/image/fetch/$s_!zdzG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png 848w, https://substackcdn.com/image/fetch/$s_!zdzG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png 1272w, https://substackcdn.com/image/fetch/$s_!zdzG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F483b0758-52b4-429f-bcbb-e3008dff3a6b_786x570.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nuL9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nuL9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png 424w, https://substackcdn.com/image/fetch/$s_!nuL9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png 848w, https://substackcdn.com/image/fetch/$s_!nuL9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png 1272w, https://substackcdn.com/image/fetch/$s_!nuL9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nuL9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png" width="785" height="477" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:477,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46590,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nuL9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png 424w, https://substackcdn.com/image/fetch/$s_!nuL9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png 848w, https://substackcdn.com/image/fetch/$s_!nuL9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png 1272w, https://substackcdn.com/image/fetch/$s_!nuL9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41bc0826-c21b-489f-a17a-0abe3fd9d665_785x477.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qRzg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qRzg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png 424w, https://substackcdn.com/image/fetch/$s_!qRzg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png 848w, https://substackcdn.com/image/fetch/$s_!qRzg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png 1272w, https://substackcdn.com/image/fetch/$s_!qRzg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qRzg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png" width="787" height="660" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:660,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:52505,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qRzg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png 424w, https://substackcdn.com/image/fetch/$s_!qRzg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png 848w, https://substackcdn.com/image/fetch/$s_!qRzg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png 1272w, https://substackcdn.com/image/fetch/$s_!qRzg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11cec094-0c3a-4e20-b98a-25e59505a66c_787x660.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>NOTES:</strong></p><p><strong>Track A &#8212; Step 2: FP8 &#215; 0.72</strong></p><p>The published benchmark (Qwen3-2507 on H100, Baseten July 2025) shows FP8 inference delivers <strong>1.4&#215; throughput</strong> versus standard BF16 precision.</p><p>Throughput means tokens produced per second. If you produce 1.4&#215; as many tokens per second on the same hardware, you spend 1/1.4 as much time per token &#8212; and therefore 1/1.4 as much cost per token.</p><pre><code><code>1 &#247; 1.4 = 0.714 &#8776; 0.72

$3.24 &#215; 0.72 = $2.33 / MTok</code></code></pre><p>That&#8217;s it. No more complexity. The 0.72 is just 1 divided by the throughput improvement. The conservative range is 1.3&#8211;1.6&#215; throughput, so the multiplier range is 0.625&#8211;0.769. The 0.72 sits at the middle-conservative end.</p><div><hr></div><p><strong>Step 3: Speculative decoding &#215; 0.70</strong></p><p>This one requires understanding where the cost actually sits in the workload first.</p><p>The $3.24 serve cost comes from the $5.40 blended list price &#215; 60% COR ratio. That $5.40 blended rate is:</p><pre><code><code>Input tokens:  80% of volume &#215; $3.00 / MTok = $2.40
Output tokens: 20% of volume &#215; $15.00 / MTok = $3.00
Blended:                                        $5.40</code></code></pre><p>So output tokens are 20% of the volume but produce 55.6% of the revenue-equivalent cost ($3.00 out of $5.40). That asymmetry matters.</p><p>On the serving infrastructure side, the asymmetry is even sharper. Output generation (the decode phase) is sequential &#8212; one token per forward pass, cannot be parallelised. Input processing (the prefill phase) can be batched across many tokens simultaneously. So even though output is 20% of token volume, it consumes approximately <strong>60% of GPU time</strong> on agentic workloads with long completions.</p><p>Speculative decoding at a 0.75 acceptance rate produces approximately 2&#215; output throughput &#8212; so output generation cost per token is halved.</p><pre><code><code>Output share of serve cost:           60%
Reduction from halving output cost:   50%
Saving as share of total:             60% &#215; 50% = 30%

Multiplier: 1 - 0.30 = 0.70

$2.33 &#215; 0.70 = $1.63 / MTok</code></code></pre><p>The 60% output GPU share is an engineering estimate for long-completion agentic workloads, not a precisely measured figure. For short-response chat workloads it might be 40%, making the multiplier 0.80 not 0.70. This is the most assumption-sensitive step.</p><div><hr></div><p><strong>Step 4: Adaptive compute + prefill/decode separation &#215; 0.79</strong></p><p>Two separate effects multiplied together.</p><p><strong>Sub-effect A: Adaptive compute routing</strong></p><p>Qwen3 and Kimi K2 data suggests roughly 40% of agent subtasks are routine enough for non-thinking mode. Non-thinking mode generates approximately 75% fewer output tokens (a direct answer of ~200 tokens instead of a reasoning chain of ~800 tokens).</p><pre><code><code>Tasks routed to non-thinking mode:         40%
Output token reduction on those tasks:     75%
Net output token reduction (whole workload): 40% &#215; 75% = 30%

Output tokens = 55.6% of blended cost
Saving: 30% &#215; 55.6% = 16.7% of total cost...</code></code></pre><p>But this step is applied after speculative decoding has already restructured the cost balance. The document uses a simpler blended estimate of 10% total cost reduction from adaptive routing, which is the midpoint of the 8&#8211;12% range cited. So:</p><pre><code><code>Sub-multiplier A: 1 - 0.10 = 0.90</code></code></pre><p><strong>Sub-effect B: Prefill/decode separation</strong></p><p>Mooncake production data shows 15&#8211;25% improvement in hardware utilisation from separating prefill clusters (optimised for compute throughput) from decode clusters (optimised for memory bandwidth). The document uses 12% as a conservative serve cost reduction.</p><pre><code><code>Sub-multiplier B: 1 - 0.12 = 0.88</code></code></pre><p><strong>Combined:</strong></p><pre><code><code>0.90 &#215; 0.88 = 0.792 &#8776; 0.79

$1.63 &#215; 0.79 = $1.29 / MTok</code></code></pre><p>The honest weakness: the 10% adaptive saving and 12% PD separation saving are both estimates derived from production data at other companies (Qwen3, Mooncake), not Anthropic-specific measurements. The combined multiplier range is probably 0.79&#8211;0.87 depending on implementation quality.</p><div><hr></div><p><strong>Step 5: MLA &#215; 0.60</strong></p><p>DeepSeek V2 measured 5.76&#215; peak throughput from MLA. In mixed production workloads the realistic gain is 2.5&#8211;3.5&#215;. The document uses <strong>3&#215;</strong> as the working assumption.</p><p>Now a critical distinction: MLA primarily benefits the <strong>hardware component</strong> of serve cost &#8212; specifically GPU memory bandwidth and throughput. Not all serve costs are hardware. The cost structure breakdown:</p><pre><code><code>Hardware (GPU compute + memory):    ~60% of serve cost
Software, network, storage, staff:  ~40% of serve cost</code></code></pre><p>If hardware throughput improves 3&#215;, hardware cost per token falls to 1/3 of its current level:</p><pre><code><code>Hardware saving: 1 - (1 &#247; 3) = 1 - 0.333 = 66.7%

Saving as share of total serve cost: 66.7% &#215; 60% = 40%

Multiplier: 1 - 0.40 = 0.60

$1.29 &#215; 0.60 = $0.77 / MTok</code></code></pre><p>A conservative version uses 2.5&#215; throughput instead of 3&#215;:</p><pre><code><code>Hardware saving: 1 - (1 &#247; 2.5) = 1 - 0.40 = 60%
Saving as share of total: 60% &#215; 60% = 36%
Conservative multiplier: 1 - 0.36 = 0.64
Conservative result: $1.29 &#215; 0.64 = $0.83 / MTok</code></code></pre><div><hr></div><p><strong>Step 6: MoE &#215; 0.55</strong></p><p>DeepSeek V3 activates 37B parameters per token out of 671B total.</p><p>FLOPs per token scale approximately linearly with active parameters:</p><pre><code><code>Dense quality-equivalent model:   ~70B params &#215; 2 = 140B FLOPs per token
MoE (37B active):                  37B &#215; 2        =  74B FLOPs per token

FLOPs reduction: (140 - 74) &#247; 140 = 47%</code></code></pre><p>Serve cost has two main hardware drivers: compute (FLOPs) and memory bandwidth (loading weights). In a well-tuned MoE system, memory bandwidth also scales approximately with active parameters because you&#8217;re only reading the active expert weights per forward pass. Treating them together:</p><pre><code><code>Assume hardware = 60% of serve cost
Hardware cost per token falls proportionally to FLOPs: &#215; (1 - 0.47) = &#215; 0.53

Saving as share of total: 47% &#215; 60% = 28.2%
Multiplier: 1 - 0.282 = 0.718 &#8776; 0.72</code></code></pre><p>That actually gives a more conservative 0.72, not 0.55. The document&#8217;s 0.55 implies a larger hardware share or a more favourable memory bandwidth assumption. If hardware is <strong>80% of serve cost</strong> at this stage (after non-hardware costs have been relatively reduced by prior optimisations):</p><pre><code><code>Saving: 47% &#215; 80% = 37.6%
Multiplier: 1 - 0.376 = 0.624 &#8776; 0.62</code></code></pre><p>To get 0.55 you&#8217;d need to assume nearly all serve cost is hardware-driven and the FLOPs saving is closer to 50%:</p><pre><code><code>50% FLOPs saving &#215; 90% hardware share = 45% saving &#8594; multiplier 0.55</code></code></pre><p><strong>This is the weakest multiplier in the chain.</strong> The 0.55 is at the optimistic end. A range of 0.60&#8211;0.72 is more defensible. The document&#8217;s 0.55 overstates the gain.</p><pre><code><code>$0.77 &#215; 0.55 = $0.42 / MTok   (document figure)
$0.77 &#215; 0.65 = $0.50 / MTok   (conservative figure)</code></code></pre><div><hr></div><p><strong>Track B &#8212; &#215; 0.65 (&#8722;35% token volume)</strong></p><p>Three techniques, each applied to a different portion of the 500M token workload.</p><p><strong>Lazy loading applied to skill/instruction context:</strong></p><pre><code><code>Assume skill/instruction tokens = 20% of total workload (100M of 500M tokens)
Without lazy loading: 50 skills &#215; 2,000 tokens = 100,000 tokens per call loaded
With lazy loading: 2 skills &#215; 2,000 tokens = 4,000 tokens per call loaded
Reduction per call: (100,000 - 4,000) &#247; 100,000 = 96%

Applied to 20% of workload: 96% &#215; 20% = 19.2% total token reduction</code></code></pre><p><strong>RAG applied to codebase-retrieval queries:</strong></p><pre><code><code>Assume 25% of calls involve loading large codebases
Without RAG: 150,000-token codebase loaded per such call
With RAG: 3,000-token chunk retrieved
Reduction per RAG call: (150,000 - 3,000) &#247; 150,000 = 98%

These calls = 25% &#215; 80% input share = 20% of total tokens
Applied saving: 98% &#215; 20% = 19.6% total token reduction</code></code></pre><p><strong>Episodic compression on long sessions:</strong></p><pre><code><code>Assume 30% of token consumption occurs in sessions long enough to benefit
Without compression: average context grows to 400,000 tokens
With compression: maintained at 20,000 tokens working window
Reduction: (400,000 - 20,000) &#247; 400,000 = 95%

Applied to 30% of workload: 95% &#215; 30% = 28.5% total token reduction</code></code></pre><p><strong>The overlap problem:</strong></p><p>These three effects are not additive &#8212; they address partially overlapping token populations. A codebase loaded via RAG is also a codebase that would have accumulated in context without compression. Applying a rough 50% overlap discount to avoid double-counting:</p><pre><code><code>Raw sum: 19.2% + 19.6% + 28.5% = 67.3%
Overlap-adjusted (&#247; 2 for overlapping populations): ~33.6% &#8776; 35%

Multiplier: 1 - 0.35 = 0.65
500M &#215; 0.65 = 325M effective tokens</code></code></pre><p>The honest range is 25&#8211;40% depending on how much of the workload benefits from each technique and how well they&#8217;re implemented. The 35% is a reasonable central estimate but not a tight one.</p><div><hr></div><p><strong>The combined table with all arithmetic shown:</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3VQK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3VQK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png 424w, https://substackcdn.com/image/fetch/$s_!3VQK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png 848w, https://substackcdn.com/image/fetch/$s_!3VQK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png 1272w, https://substackcdn.com/image/fetch/$s_!3VQK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3VQK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png" width="896" height="562" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:562,&quot;width&quot;:896,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:80458,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198139565?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3VQK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png 424w, https://substackcdn.com/image/fetch/$s_!3VQK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png 848w, https://substackcdn.com/image/fetch/$s_!3VQK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png 1272w, https://substackcdn.com/image/fetch/$s_!3VQK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e0a52ca-2b63-49c2-846c-a6e43c7a9181_896x562.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The most defensible range at near-term (Steps FP8 + SpecDec + Adaptive/PD): <strong>$1.29&#8211;$1.41/MTok</strong>, with $1.29 the optimistic case and $1.41 the conservative case. Full architecture: <strong>$0.42&#8211;$0.55/MTok</strong>. The strategic conclusion is the same across the entire range &#8212; the subsidy reverses by a wide margin.</p><h3>Where does Anthropic Stand (A few wild guesses)? </h3><p>Track A &#8212; Where each multiplier comes from, how solid it is, and whether Anthropic is likely already using it</p><p>Step 2: FP8 &#215; 0.72 (&#8722;28%)</p><p>My source is simple arithmetic: if FP8 inference on H100 tensor cores delivers 1.4&#215; throughput versus BF16 (the conservative end of a published 1.3&#8211;1.6&#215; range from Qwen3-2507 Baseten benchmarks and DeepSeek V3 production data), then cost per token = 1 &#247; 1.4 = 0.714, rounded to 0.72. The logic should be sound. The 1.4&#215; figure is deliberately conservative &#8212; some implementations show 1.6&#215; &#8212; so the multiplier is probably if anything slightly pessimistic.</p><p>Realistic? Yes, for a well-implemented FP8 deployment. The caveat is that FP8 requires per-model calibration and can degrade quality on certain task types without careful tuning. Production deployment is engineering-intensive. The gain is real but not automatic.</p><p>Is Anthropic using it? Almost certainly not at full scale. Here is the key inference: if FP8 were deployed, serve cost would be ~$2.33/MTok, implying a COR ratio of $2.33 &#247; $5.40 = 43%. The Information reported COR at 60%. That gap is what I am using as evidence. But we do not have full financials - so take this with a pinch of salt. The $3.24 baseline is consistent with standard precision (BF16 or similar) on cloud infrastructure. FP8 would hypothetically have shown up in the margin data already. Maybe it's there already - until they IPO, I am stabbing ever so slightly in the dark. </p><p>Step 3: Speculative decoding &#215; 0.70 (&#8722;30%)</p><p>This one is more constructed. The reasoning: output tokens drive roughly 60% of serve cost on an agentic workload (they cost 5&#215; more per token and are decode-bound, the most GPU-intensive phase). At a 0.75 acceptance rate, speculative decoding approximately halves output generation cost. So 60% &#215; 50% = 30% of total serve cost saved, hence &#215; 0.70. The 60% output cost share is an assumption worth challenging &#8212; for pure chat it might be 40%; for coding agents generating long completions it could be 70%. The 0.70 multiplier is reasonable for coding-agent workloads specifically, less reliable for mixed enterprise use.</p><p>Realistic? Directionally yes, but the output-cost share assumption is load-dependent. A range of 0.72&#8211;0.78 would be more defensible than a precise 0.70.</p><p>Is Anthropic using it? Probably not in production at scale. EAGLE-3 style speculative decoding requires training a draft head for each model. The leaked ULTRAPLAN references suggest awareness of multi-stage inference, and MTP modules are referenced in DeepSeek V3's architecture as doubling as speculative decoders &#8212; but there is no public evidence Anthropic has deployed this. Again the margin data argues against it.</p><p>Step 4: Adaptive compute + prefill/decode separation &#215; 0.79 (&#8722;21%)</p><p>This compounds two sub-effects: adaptive compute routing saving ~8&#8211;12% (from Qwen3 and Kimi K2 thinking mode data, 40% of tasks routed to non-thinking mode saving ~75% of output tokens on those tasks) and prefill/decode separation adding ~12&#8211;15% fleet utilisation improvement (from Mooncake production data). The combined 0.79 sits at the optimistic end of those ranges. A more conservative combined estimate would be 0.83&#8211;0.85.</p><p>Realistic? The adaptive compute component is well-evidenced. The PD separation component assumes Anthropic would run separate prefill and decode clusters &#8212; a significant infrastructure reorganisation. The 0.79 is probably 15&#8211;20% too aggressive on its own; call it 0.82 as a more defensible figure.</p><p>Is Anthropic using it? Partially. Adaptive thinking modes exist in Claude 3.7+ and subsequent models &#8212; that component is real. Prefill/decode separation at the infrastructure level is uncertain and not evidenced. So Anthropic is probably capturing maybe half of this multiplier already, which means the incremental gain from full deployment of this step is smaller than shown.</p><p>Step 5: MLA &#215; 0.60 (&#8722;40%)</p><p>DeepSeek V2 measured 5.76&#215; peak throughput improvement from MLA over dense MHA. In production under mixed workloads, a realistic throughput gain is 2.5&#8211;3.5&#215;. Taking 3&#215; as the working assumption: hardware cost per token falls by 67%. Hardware is roughly 60% of serve cost. So 60% &#215; 67% = 40% blended reduction, hence &#215; 0.60. The MHA2MLA ACL 2025 paper validates the compression ratio of 93.3% KV cache reduction. The throughput gain is the uncertain variable &#8212; peak lab measurements rarely fully survive production deployment. A &#215; 0.65 (&#8722;35%) would be more conservative and equally defensible.</p><p>Realistic? The KV cache reduction is very well evidenced. The serve cost impact depends on how memory-bandwidth-bound the specific deployment is. For long-context agentic workloads the gain is larger; for short-context chat it is smaller. 0.60 is reasonable for the heavy user scenario specifically.</p><p>Is Anthropic using it? Not too confident about this. MLA requires a new model architecture &#8212; you cannot retrofit it to Claude Sonnet 4.6. This is a next-generation model decision. Given the 60% COR and the architectural complexity, there is no evidence Claude's current attention architecture uses MLA.</p><p>Step 6: MoE &#215; 0.55 (&#8722;45%)</p><p>This is the most aggressive multiplier and the least certain. MoE reduces FLOPs per token by ~47% relative to a quality-equivalent dense model (activating 37B of 671B parameters), but memory bandwidth requirement stays high (all experts must be resident). In practice the net serve cost reduction for a well-implemented MoE deployment versus a well-implemented dense deployment is probably 35&#8211;50%. The 0.55 sits at the aggressive end. A &#215; 0.60 (&#8722;40%) would be more conservative and still substantial.</p><p>Realistic? The DeepSeek V3 production data supports this range, but DeepSeek operates on owned infrastructure optimised for their specific MoE architecture. Any dense model Western provider migrating to MoE on rented hyperscaler infrastructure might see a narrower gain until infrastructure inversion (Lever 3) is also complete. The interaction between MoE and infrastructure inversion is important &#8212; you capture the full MoE benefit most cleanly on owned hardware tuned for sparse workloads.</p><p>Is Anthropic using it? Doubt it, but again - wild stabs here. Same logic as MLA &#8212; requires a new model generation. One unverified Medium analysis of the leaked code claimed a reference to a 405B MoE architecture in training pipeline code. This is not confirmed by primary sources and should be treated with significant caution. The margin data is the most reliable signal and it argues against MoE deployment at scale.</p><p>Track B &#8212; The &#215; 0.65 (&#8722;35% token volume)</p><p>This is the softest multiplier in the document. It blends three effects: lazy loading (&#8722;20% on skill-heavy workloads), RAG (&#8722;80% for codebase-retrieval queries but only on that subset), and episodic compression (&#8722;70% on context accumulation but only on long sessions). The 35% blended figure assumes a workload where roughly a third of token consumption is addressable by each technique. In practice this depends entirely on how the enterprise has built their agentic pipelines. A well-optimised enterprise might achieve 40&#8211;50% reduction. One with no context engineering might achieve 5%.</p><p>Realistic? As an enterprise average for a team that has actively implemented all three techniques &#8212; yes, 25&#8211;35% is credible. As a blanket assumption for all heavy users &#8212; no. This is the lever most dependent on enterprise behaviour, not provider architecture. The honest range is 15&#8211;40% depending on implementation quality.</p><p>Is Anthropic using it on their own infrastructure? Partially. The leaked MEMORY.md three-layer architecture is Anthropic's own context engineering for Claude Code. The autoCompact system is episodic compression in production. So Anthropic is applying this to their own agent harness. The $3.24 baseline may already reflect some of this at the Claude Code level &#8212; meaning the incremental gain from enterprises applying Lever 7 to their own workflows is additive on top, not something Anthropic needs to do centrally.</p><h3>Qualifier!! What the compounding assumption gets wrong</h3><p><strong>The deepest issue is that the multipliers are applied sequentially and independently, which assumes each efficiency gain operates on a clean remaining cost base. In reality some effects partially overlap. FP8 and speculative decoding both improve output throughput &#8212; their combined effect is somewhat less than their product suggests because the bottleneck shifts after the first improvement. A more rigorous treatment would model the specific bottleneck at each stage. The sequential multiplication is a reasonable approximation but overstates the combined gain by perhaps 10&#8211;15%.</strong></p><h2>References</h2><p><strong>[1] </strong>Broda, E. (2026, May 14). The Brutal Reality of Token Economics. <a href="https://agenticmesh.substack.com">AgenticMesh Substack.</a></p><p><strong>[2] </strong>Anthropic. (2026). Claude Max Plan Pricing. </p><p><strong>[3] </strong>Anthropic. (2026). API Pricing Documentation. <a href="https://platform.claude.com/docs/en/about-claude/pricing">Anthropic Platform.</a></p><p><strong>[4] </strong>The Information. (2025). Anthropic Lowers Gross Margin Projection as Revenue Skyrockets. <a href="https://www.theinformation.com/articles/16444">The Information (paywalled).</a></p><p><strong>[5] </strong>Reuters. (2025, October 15). Anthropic aims to nearly triple annualized revenue in 2026. <a href="https://www.reuters.com/business/retail-consumer/anthropic-aims-nearly-triple-annualized-revenue-2026-sources-say-2025-10-15/">Reuters.</a></p><p><strong>[6] </strong>OpenAI. (2025, September 15). How People Use ChatGPT. <a href="https://cdn.openai.com/pdf/a253471f-8260-40c6-a2cc-aa93fe9f142e/economic-research-chatgpt-usage-paper.pdf">OpenAI Economic Research.</a></p><p><strong>[7] </strong>Business Insider. (2026, April). How Disney Employees Are Using AI, According to Internal Docs. <a href="https://www.businessinsider.com/how-disney-tech-employees-are-using-ai-claude-cursor-tokens-2026-4">Business Insider.</a></p><p><strong>[8] </strong>Business Insider. (2026, May). AI Tokenmaxxing Fails As Productivity Strategy, Says Jellyfish. <a href="https://www.businessinsider.com/ai-tokenmaxxing-fails-as-productivity-strategy-jellyfish-2026-5">Business Insider.</a></p><p><strong>[9] </strong>DeepSeek-AI. (2024). DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model. <a href="https://arxiv.org/abs/2405.04434">arXiv:2405.04434.</a></p><p><strong>[10] </strong>DeepSeek-AI. (2024). DeepSeek-V3 Technical Report. <a href="https://huggingface.co/deepseek-ai/DeepSeek-V3">Hugging Face / GitHub.</a></p><p><strong>[11] </strong>Ji, T. et al. (2025). Towards Economical Inference: Enabling DeepSeek&#8217;s Multi-Head Latent Attention in Any Transformer-based LLMs. ACL 2025. <a href="https://arxiv.org/abs/2502.14837">arXiv:2502.14837.</a></p><p><strong>[12] </strong>Vizuara. (2025, July). Decoding Multi-Head Latent Attention (Part 1): The KV Cache Memory Bottleneck, Solved. <a href="https://vizuara.substack.com/p/decoding-multi-head-latent-attention">Vizuara Substack.</a></p><p><strong>[13] </strong>Mangla, P. et al. (2025, October). KV Cache Optimization via Multi-Head Latent Attention. <a href="https://pyimagesearch.com/2025/10/13/kv-cache-optimization-via-multi-head-latent-attention/">PyImageSearch.</a></p><p><strong>[14] </strong>IntuitionLabs. (2026). DeepSeek&#8217;s Low Inference Cost Explained: MoE and Strategy. <a href="https://intuitionlabs.ai/articles/deepseek-inference-cost-explained">IntuitionLabs AI.</a></p><p><strong>[15] </strong>Introl. (2026, February). Mixture of Experts Infrastructure. <a href="https://introl.com/blog/mixture-of-experts-moe-infrastructure-scaling-sparse-models-guide">Introl Blog.</a></p><p><strong>[16] </strong>Friendli AI. (2025). The Rise of MoE: Comparing 2025&#8217;s Leading Mixture-of-Experts AI Models. <a href="https://friendli.ai/blog/moe-models-comparison">Friendli AI Blog.</a></p><p><strong>[17] </strong>IntuitionLabs. (2026, February). Analysis of the Kimi K2 Open-Weight Language Model. <a href="https://intuitionlabs.ai/articles/kimi-k2-open-weight-llm-analysis">IntuitionLabs AI.</a></p><p><strong>[18] </strong>VentureBeat. (2025, July). Alibaba&#8217;s new open source Qwen3-235B-A22B-2507 beats Kimi-2 and offers low compute version. <a href="https://venturebeat.com/ai/alibabas-new-open-source-qwen3-235b-a22b-2507-beats-kimi-2-and-offers-low-compute-version">VentureBeat.</a></p><p><strong>[19] </strong>Together AI. (2025&#8211;2026). Together AI delivers fastest inference for the top open-source models. <a href="https://www.together.ai/blog/fastest-inference-for-the-top-open-source-models">Together AI Blog.</a></p><p><strong>[20] </strong>SyncSoft AI. (2026, May). Speculative Decoding 2026: 2.8x Faster LLM Inference. <a href="https://www.syncsoft.ai/en/blog/speculative-decoding-eagle3-medusa-deepseek-mtp-chinese-chuhai-2026">SyncSoft AI Blog.</a></p><p><strong>[21] </strong>Fireworks AI. (2026). Best Open Source LLMs in 2026: We Reviewed 7 Models. <a href="https://fireworks.ai/blog/best-open-source-llms">Fireworks AI Blog.</a></p><p><strong>[22] </strong>Hugging Face. (2026). Architectural Choices in China&#8217;s Open-Source AI Ecosystem: Building Beyond DeepSeek. <a href="https://huggingface.co/blog/huggingface/one-year-since-the-deepseek-moment-blog-2">Hugging Face Blog.</a></p><p><strong>[23] </strong>US-China Economic and Security Review Commission. (2026, March). Two Loops: How China&#8217;s Open AI Strategy Reinforces Its Industrial Dominance. <a href="https://www.uscc.gov/sites/default/files/2026-03/Two_Loops--How_Chinas_Open_AI_Strategy_Reinforces_Its_Industrial_Dominance.pdf">USCC Report.</a></p><p><strong>[24] </strong>Digital Applied. (2025&#8211;2026). Chinese AI Models Beat GPT-4: Kimi K2, Qwen 3, GLM 4.5. <a href="https://www.digitalapplied.com/blog/chinese-ai-models-kimi-k2-qwen-3-coder-glm-4-5">Digital Applied.</a></p><p><strong>[25] </strong>FastMTP Team. (2025). FastMTP: Accelerating LLM Inference with Enhanced Multi-Token Prediction. <a href="https://arxiv.org/abs/2509.18362">arXiv:2509.18362.</a></p><p><strong>[26] </strong>Microsoft. (2025). $80B Datacenter Investment Programme. <a href="https://blogs.microsoft.com/on-the-issues/2025/01/03/the-golden-opportunity-for-american-ai/">Microsoft Blog.</a></p><p><strong>[27] </strong>Google. (2025). TPU Ironwood: Next-Generation AI Accelerator. <a href="https://blog.google/innovation-and-ai/infrastructure-and-cloud/google-cloud/ironwood-tpu-age-of-inference/">Google Blog.</a></p><p><strong>[28] </strong>Jevons, W.S. (1865). The Coal Question: An Inquiry Concerning the Progress of the Nation, and the Probable Exhaustion of Our Coal-Mines. <a href="https://oll.libertyfund.org/titles/jevons-the-coal-question">Macmillan (via Google Books).</a></p><p><strong>[29] </strong>VentureBeat. (2026, April 1). Claude Code&#8217;s source code appears to have leaked: here&#8217;s what we know. <a href="https://venturebeat.com/technology/claude-codes-source-code-appears-to-have-leaked-heres-what-we-know/">VentureBeat.</a></p><p><strong>[30] </strong>NodeSource. (2026, April 2). Anthropic Accidentally Leaked Claude Code&#8217;s Entire Source &#8212; Here&#8217;s What Was Inside (KAIROS, ULTRAPLAN, AutoCompact details). <a href="https://nodesource.com/blog/anthropic-claude-code-source-leak-bun-bug">NodeSource Blog.</a></p><p><strong>[31] </strong>Redis (2026). What is Semantic caching. <a href="https://redis.io/blog/what-is-semantic-caching/">AI Blog.</a></p><p><strong>[32] </strong>Anthropic. (2026). Prompt Caching &#8212; 90% discount on cached input tokens. <a href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching">Anthropic API Docs.</a></p><p>[33] Ramp - https://ramp.com/blog/trillion-dollar-ai-blindspot </p><p style="text-align: justify;"></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[ASCRS Harness Lab - The Integrated Agentic Stack: When Does More Architecture Mean Better AI? A Diagnostic Teardown]]></title><description><![CDATA[Design Considerations of A Shipper&#8217;s Agentic Logic System: Part II (Project: AI Supply Chain Response System (ASCRS))]]></description><link>https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/ascrs-harness-lab-the-integrated</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Sat, 16 May 2026 17:52:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!cv0d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cv0d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cv0d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png 424w, https://substackcdn.com/image/fetch/$s_!cv0d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png 848w, https://substackcdn.com/image/fetch/$s_!cv0d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png 1272w, https://substackcdn.com/image/fetch/$s_!cv0d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cv0d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png" width="1160" height="595" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:595,&quot;width&quot;:1160,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1153649,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cv0d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png 424w, https://substackcdn.com/image/fetch/$s_!cv0d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png 848w, https://substackcdn.com/image/fetch/$s_!cv0d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png 1272w, https://substackcdn.com/image/fetch/$s_!cv0d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33fb21fa-dc0a-4c0f-8ce4-6b2e85594843_1160x595.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Had some time on my hands, and applied the features of <a href="https://interestingengineering.substack.com/p/the-harness-experiment">The Harness Experiment</a>(s) to the <a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">Architecture of Awareness</a> design considerations. You will remember from The Harness Experiment (applied to a mini vendor analysis case study) that the results presented as follows:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8jMF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8jMF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8jMF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8jMF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8jMF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8jMF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg" width="1193" height="715" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:715,&quot;width&quot;:1193,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74337,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8jMF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8jMF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8jMF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8jMF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ae2b848-ef25-4133-8252-15e960e186dd_1193x715.jpeg 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>applying the following framework:</p><p><em><strong>Harness-as-a-Service: a platform layer that you configure, rather than rebuild.</strong></em></p><pre><code><code>&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;                Claude Code (HaaS Runtime)                   &#9474;   &#8592; Outer Harness-as-a-Service
&#9474;  &#8226; ReAct / Agent Loop                                       &#9474;
&#9474;  &#8226; Tool Registry + Permission Gates                         &#9474;
&#9474;  &#8226; Context Assembly + Compaction                            &#9474;
&#9474;  &#8226; 3-Layer Memory System (in-context + MEMORY.md + files)   &#9474;
&#9474;  &#8226; Sub-agent / Swarm Orchestration                          &#9474;
&#9474;  &#8226; Safety, Hooks, Streaming, Sandboxes                      &#9474;
&#9474;  &#8226; Persistent Filesystem + Execution Environment            &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;   (Configure + Extend)
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;           Pre-Experiment Harness (Lab OS / Foundation)    &#8592; Standardized Configuration Surface
&#9474;  &#8226; shared/client.js (model routing)                         &#9474;
&#9474;  &#8226; shared/scorer.js + rubric system                         &#9474;
&#9474;  &#8226; shared/self_heal.js + logger                             &#9474;
&#9474;  &#8226; Root CLAUDE.md (constitution + lab-wide rules)           &#9474;
&#9474;  &#8226; Memory conventions + task templates                      &#9474;
&#9474;  &#8226; Observability &amp; results pipeline                         &#9474;
&#9474;  &#8226; Common tools &amp; utilities                                 &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;   (Swappable)
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;           Experiment Harness Layer (H1&#8211;H10)                 &#9474;   &#8592; Domain-Specific Scaffolding
&#9474;                                                             &#9474;
&#9474;  H1: Prompt + Constitution                                  &#9474;
&#9474;  H2: Reflection + Self-Critique Loop                        &#9474;
&#9474;  H3: Sequential Tool-Use                                    &#9474;
&#9474;  H4: Parallel Fan-Out + Merge                               &#9474;
&#9474;  H5: Eval + Revision Loop                                   &#9474;
&#9474;  H6: Skill / Memory Crystallization                         &#9474;
&#9474;  H7: Model Routing / Tiered                                 &#9474;
&#9474;  H8: HITL + Confidence Gating                               &#9474;
&#9474;  H9: Sub-Agent Swarm                                        &#9474;
&#9474;  H10: Meta-Router / Adaptive                                &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;                     Core Model                              &#9474;
&#9474;                gpt-oss-120b / Claude Sonnet etc.            &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;</code></code></pre><p>The question I was recently asked: <em><strong>Can you reconcile the two outcome design decisions (Choices between H1-H10 and V3/V4)? That answer is - Yes. Depending on the nature of the exercise, which you can later synthesize into a goal-oriented task, with domain knowledge/understanding:</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wh1M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wh1M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png 424w, https://substackcdn.com/image/fetch/$s_!wh1M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png 848w, https://substackcdn.com/image/fetch/$s_!wh1M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png 1272w, https://substackcdn.com/image/fetch/$s_!wh1M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wh1M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png" width="1287" height="672" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:672,&quot;width&quot;:1287,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1149213,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wh1M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png 424w, https://substackcdn.com/image/fetch/$s_!wh1M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png 848w, https://substackcdn.com/image/fetch/$s_!wh1M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png 1272w, https://substackcdn.com/image/fetch/$s_!wh1M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff4ba283-bb21-4d7a-8cd8-4c659d48c9f1_1287x672.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!24RG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!24RG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png 424w, https://substackcdn.com/image/fetch/$s_!24RG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png 848w, https://substackcdn.com/image/fetch/$s_!24RG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png 1272w, https://substackcdn.com/image/fetch/$s_!24RG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!24RG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png" width="1337" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1337,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1388095,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!24RG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png 424w, https://substackcdn.com/image/fetch/$s_!24RG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png 848w, https://substackcdn.com/image/fetch/$s_!24RG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png 1272w, https://substackcdn.com/image/fetch/$s_!24RG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1257e4c-21bf-4b29-9a7c-ec11b500d9f0_1337x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Into this <strong>Integrated Agentic Stack</strong>:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!na1Z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!na1Z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png 424w, https://substackcdn.com/image/fetch/$s_!na1Z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png 848w, https://substackcdn.com/image/fetch/$s_!na1Z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png 1272w, https://substackcdn.com/image/fetch/$s_!na1Z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!na1Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png" width="1162" height="595" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:595,&quot;width&quot;:1162,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1153869,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!na1Z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png 424w, https://substackcdn.com/image/fetch/$s_!na1Z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png 848w, https://substackcdn.com/image/fetch/$s_!na1Z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png 1272w, https://substackcdn.com/image/fetch/$s_!na1Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29a9dd87-90d4-4208-9e6a-9dd9f09533ac_1162x595.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>But before I explain why, here is the &#8220;<a href="https://interestingengineering.substack.com/p/the-harness-experiment">harness experiment</a>&#8221; (the same one I applied to the vendor mini case study) applied to the <a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">Architecture of Awareness - a greyfield/brownfield project </a>that I ran in parallel to a legacy system. It will be written in 3-Parts (which will make this a long read):</p><h2>PART I: The Experiment (Again)</h2><p><strong>1 &#183; The Question</strong></p><p>Most AI benchmarks test whether a model can answer a question correctly. This experiment asked something harder: does the architecture around the model matter -- and if so, which designs actually help?</p><p>The specific hypothesis: ten different ways of wrapping the same model around the same task should produce measurably different quality scores. Some harnesses add structure, memory, parallel agents, or human review steps. Some add nothing but a well-crafted prompt. The question is which lever moves the needle most.</p><p><strong>Core question: </strong><em>Is architectural complexity a quality multiplier -- or does it mostly add latency and cost?</em></p><p>&#183; AI model = an engine. A harness = the chassis, steering, and gears built around it.</p><p>&#183; The same engine (GPT-OSS-120B) ran in all ten harnesses. Only the chassis changed.</p><p>&#183; Measurable outputs: quality score (&#945;), latency in milliseconds (&#955;), token cost (&#954;).</p><p>&#183; Domain: pharmaceutical supply chain crisis response -- a task requiring financial reasoning, logistics logic, and structured written output.</p><p>&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;</p><p><strong>2 &#183; The Setup</strong></p><p><strong>The Task</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!knZq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!knZq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png 424w, https://substackcdn.com/image/fetch/$s_!knZq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png 848w, https://substackcdn.com/image/fetch/$s_!knZq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png 1272w, https://substackcdn.com/image/fetch/$s_!knZq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!knZq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png" width="1317" height="658" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:658,&quot;width&quot;:1317,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1193035,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!knZq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png 424w, https://substackcdn.com/image/fetch/$s_!knZq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png 848w, https://substackcdn.com/image/fetch/$s_!knZq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png 1272w, https://substackcdn.com/image/fetch/$s_!knZq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff440d4e0-5570-432b-9ef4-a0d6eeb99b8b_1317x658.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The scenario: it is March 2026. The Strait of Hormuz -- through which 21% of global oil and a significant share of pharmaceutical precursor shipments pass -- has been disrupted by an IRGC naval exercise. A fictional pharmaceutical company has 23 purchase orders in transit or pending shipment. The model must produce a CFO-approvable rerouting brief: which orders go by air, which go by sea around the Cape of Good Hope, and which get deferred -- with financial justification.</p><p>Why this task? Because it is not a trivia question with one right answer. It requires chaining multiple reasoning steps: reading historical precedent data, deriving a weighted cost estimate, routing 23 orders by tier and deadline, and then writing a coherent document that a finance executive could sign. Shallow outputs fail silently -- they look plausible but miss the analytical depth.</p><p><strong>The Four-Layer Stack</strong></p><p>&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;<br> &#9474; Layer 0 -- Claude Code Runtime &#9474;<br> &#9474; (orchestrates experiments, runs Node.js scripts) &#9474;<br> &#9500;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9508;<br> &#9474; Layer 1 -- shared/ (NEVER modified by experiments) &#9474;<br> &#9474; task.md &#183; disruption_context.json &#183; rubric.json &#9474;<br> &#9474; gold_answer.md &#183; scorer.js &#183; client.js &#183; tools/ &#9474;<br> &#9500;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9508;<br> &#9474; Layer 2 -- experiments/h1-h10/ (the variable) &#9474;<br> &#9474; Each harness has its own run.js. Some add prompts, &#9474;<br> &#9474; tools, agents, memory, or eval loops. H1 adds nothing. &#9474;<br> &#9500;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9508;<br> &#9474; Layer 3 -- .env &#9474;<br> &#9474; DEFAULT_MODEL &#183; LIGHT_MODEL &#183; SCORER_MODEL &#9474;<br> &#9474; (model strings, never hardcoded in experiment code) &#9474;<br> &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;</p><p>The architecture enforces isolation: Layer 1 is immutable shared infrastructure. An experiment can only change what happens in Layer 2 -- it cannot alter the task, the data, or the scoring. This is the experimental control.</p><p><strong>File Structure</strong></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;markdown&quot;,&quot;nodeId&quot;:&quot;fecc8b41-1f41-4720-a3f2-654063bfc911&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-markdown">THE FILE DIRECTORY

This structure is identical across every project. Folder names and file
names never change. Only the contents of domain-specific files change.

```
[project-name]/
&#9474;
&#9500;&#9472;&#9472; MASTER_GUIDE.md            &#8592; this file (reference + Claude Code spec)
&#9500;&#9472;&#9472; CLAUDE.md                  &#8592; root behavioral constitution
&#9500;&#9472;&#9472; package.json               &#8592; npm run h1 through h10, compare, all
&#9500;&#9472;&#9472; .env.example               &#8592; template (safe to commit to git)
&#9500;&#9472;&#9472; .env                       &#8592; real API keys (NEVER commit)
&#9500;&#9472;&#9472; .gitignore
&#9474;
&#9500;&#9472;&#9472; shared/                    &#8592; Layer 1: never edited by experiments
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; client.js              &#8592; &#9733; ONLY file to edit for model switch
&#9474;   &#9474;                             Change DEFAULT_MODEL here OR in .env
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; scorer.js              &#8592; Claude-as-judge. Loads rubric.json
&#9474;   &#9474;                             automatically. No changes needed.
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; self_heal.js           &#8592; Retry wrapper. Never changes.
&#9474;   &#9500;&#9472;&#9472; logger.js              &#8592; Append-only results. Never changes.
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; task.md                &#8592; &#9733; DOMAIN: what the agent must produce
&#9474;   &#9500;&#9472;&#9472; [domain_data].json     &#8592; &#9733; DOMAIN: ground-truth input data
&#9474;   &#9500;&#9472;&#9472; rubric.json            &#8592; &#9733; DOMAIN: gradient quality criteria
&#9474;   &#9500;&#9472;&#9472; gold_answer.md         &#8592; &#9733; DOMAIN: human-authored reference
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; memory/                &#8592; auto-written by H5/H6 (do not edit)
&#9474;   &#9474;   &#9492;&#9472;&#9472; [domain]_skill.md
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; tasks/                 &#8592; multi-task variants (optional)
&#9474;   &#9474;   &#9500;&#9472;&#9472; [slug-1].md
&#9474;   &#9474;   &#9492;&#9472;&#9472; [slug-2].md
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; rubrics/               &#8592; per-task rubrics for H10 (required)
&#9474;   &#9474;   &#9500;&#9472;&#9472; [slug-1].json
&#9474;   &#9474;   &#9492;&#9472;&#9472; [slug-2].json
&#9474;   &#9474;
&#9474;   &#9492;&#9472;&#9472; tools/
&#9474;       &#9500;&#9472;&#9472; search.js          &#8592; &#9733; DOMAIN: data file target changes
&#9474;       &#9492;&#9472;&#9472; extract.js         &#8592; schema validator. Rarely changes.
&#9474;
&#9500;&#9472;&#9472; experiments/
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; h1-baseline/
&#9474;   &#9474;   &#9500;&#9472;&#9472; CLAUDE.md          &#8592; scope + constraints + verification gate
&#9474;   &#9474;   &#9492;&#9472;&#9472; run.js             &#8592; experiment code (reads from shared/)
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; h2-prompt-harness/
&#9474;   &#9474;   &#9500;&#9472;&#9472; system_prompt.md   &#8592; &#9733; DOMAIN: the variable under test
&#9474;   &#9474;   &#9500;&#9472;&#9472; CLAUDE.md
&#9474;   &#9474;   &#9492;&#9472;&#9472; run.js
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; h3-sequential/
&#9474;   &#9474;   &#9500;&#9472;&#9472; skills/
&#9474;   &#9474;   &#9474;   &#9492;&#9472;&#9472; [domain]_eval.md  &#8592; &#9733; DOMAIN: step sequence + heuristics
&#9474;   &#9474;   &#9500;&#9472;&#9472; CLAUDE.md
&#9474;   &#9474;   &#9492;&#9472;&#9472; run.js
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; h4-parallel/
&#9474;   &#9474;   &#9500;&#9472;&#9472; CLAUDE.md
&#9474;   &#9474;   &#9492;&#9472;&#9472; run.js
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; h5-eval-loop/
&#9474;   &#9474;   &#9500;&#9472;&#9472; prompt_revisions/  &#8592; auto-written during run (do not edit)
&#9474;   &#9474;   &#9474;   &#9492;&#9472;&#9472; gen_N.md
&#9474;   &#9474;   &#9500;&#9472;&#9472; CLAUDE.md
&#9474;   &#9474;   &#9492;&#9472;&#9472; run.js
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; h6-skill-memory/
&#9474;   &#9474;   &#9500;&#9472;&#9472; memory/            &#8592; auto-crystallized from H5 (do not edit)
&#9474;   &#9474;   &#9474;   &#9492;&#9472;&#9472; [domain]_skill.md
&#9474;   &#9474;   &#9500;&#9472;&#9472; CLAUDE.md
&#9474;   &#9474;   &#9492;&#9472;&#9472; run.js
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; h7-model-routing/
&#9474;   &#9474;   &#9500;&#9472;&#9472; CLAUDE.md
&#9474;   &#9474;   &#9492;&#9472;&#9472; run.js
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; h8-hitl/
&#9474;   &#9474;   &#9500;&#9472;&#9472; CLAUDE.md
&#9474;   &#9474;   &#9492;&#9472;&#9472; run.js
&#9474;   &#9474;
&#9474;   &#9500;&#9472;&#9472; h9-subagent-swarm/
&#9474;   &#9474;   &#9500;&#9472;&#9472; CLAUDE.md          &#8592; &#9733; DOMAIN: sub-agent role names
&#9474;   &#9474;   &#9492;&#9472;&#9472; run.js             &#8592; &#9733; DOMAIN: SA_SYSTEMS object
&#9474;   &#9474;
&#9474;   &#9492;&#9472;&#9472; h10-meta-harness/
&#9474;       &#9500;&#9472;&#9472; task_batch.json    &#8592; &#9733; DOMAIN: 20 mixed tasks + rubric refs
&#9474;       &#9500;&#9472;&#9472; CLAUDE.md
&#9474;       &#9492;&#9472;&#9472; run.js
&#9474;
&#9500;&#9472;&#9472; results/
&#9474;   &#9492;&#9472;&#9472; results.jsonl          &#8592; append-only (NEVER overwrite)
&#9474;
&#9492;&#9472;&#9472; scripts/
    &#9500;&#9472;&#9472; compare.js             &#8592; prints &#945;/&#955;/&#954; lift table
    &#9492;&#9472;&#9472; run_all.js             &#8592; sequential H1&#8594;H10 runner</code></pre></div><p><strong>The Rubric -- How Quality Was Measured</strong></p><p>A second, separate AI model (the &#8216;scorer&#8217;) read each output and rated it against six criteria. The scorer never saw which harness produced the output -- it only saw the output and a gold-standard human-written answer. Each criterion was scored 0.0, 0.5, or 1.0, then multiplied by its weight. The sum is &#945;.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0LDe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0LDe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png 424w, https://substackcdn.com/image/fetch/$s_!0LDe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png 848w, https://substackcdn.com/image/fetch/$s_!0LDe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png 1272w, https://substackcdn.com/image/fetch/$s_!0LDe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0LDe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png" width="852" height="403" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:403,&quot;width&quot;:852,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54546,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0LDe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png 424w, https://substackcdn.com/image/fetch/$s_!0LDe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png 848w, https://substackcdn.com/image/fetch/$s_!0LDe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png 1272w, https://substackcdn.com/image/fetch/$s_!0LDe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abd36b4-eeb4-4fb8-ac68-34e5251860fd_852x403.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>Why weights matter: </strong><em>The rubric is not a checklist -- it is a priority ordering. Cargo tier differentiation (25%) is weighted highest because routing the wrong PO by the wrong method has real financial and patient-safety consequences. A harness that nails carrier selection but fumbles financial derivation will still score higher than one that gets the maths right but routes a cold-chain order to a warm freight hold.</em></p></blockquote><p>&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;</p><p><strong>3 &#183; The First Mistake</strong></p><p>Before a single experiment could be trusted, the measurement infrastructure had to be validated. Two failures occurred in sequence -- and both are instructive.</p><p><strong>Mistake 1 -- Choosing a Reasoning Model as the Baseline</strong></p><p>The first model selected for the DEFAULT role was Nemotron-3-Super-120B. It produced 16,415 tokens of output. The scorer returned &#945; = 0 for all six criteria. Not because the task was hard -- but because Nemotron is a &#8216;reasoning model&#8217;: it outputs its chain-of-thought thinking process, not a finished document. The scorer correctly found nothing to score.</p><blockquote><p><strong>Lesson: </strong><em>Model type matters before model quality. A reasoning model and an instruction-following model are not interchangeable. If your harness expects a formatted deliverable, use a model that produces deliverables.</em></p></blockquote><p><strong>Mistake 2 -- The Scorer Token Ceiling</strong></p><p>Once a proper instruction-following model was in place, the scorer hit a second problem: its response was being cut off mid-word because maxTokens was set to 2,000 (yes I put limits in place on the experiments, with strict guardrails. Hence applied the token limits). A six-criterion scoring response with reasoning explanations needs roughly 4,000 tokens. The fix was a one-line code change. But the truncated result had already written partial scores to disk -- a reminder that silent data corruption is worse than a loud failure.</p><blockquote><p><strong>Lesson: </strong><em>Infrastructure bugs corrupt downstream results without raising an error. Always validate the measurement layer before running experiments.</em></p></blockquote><p>After both fixes, H1&#8217;s baseline came in at &#945; = 0.725 -- right in the target range of 0.40-0.75. This confirmed the rubric was calibrated: not trivially easy, not impossibly hard.</p><p>&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;</p><p><strong>4 &#183; What Happened -- H1 through H10</strong></p><p><strong>Results at a Glance</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!miVD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!miVD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png 424w, https://substackcdn.com/image/fetch/$s_!miVD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png 848w, https://substackcdn.com/image/fetch/$s_!miVD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png 1272w, https://substackcdn.com/image/fetch/$s_!miVD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!miVD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png" width="912" height="352" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:352,&quot;width&quot;:912,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:48886,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!miVD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png 424w, https://substackcdn.com/image/fetch/$s_!miVD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png 848w, https://substackcdn.com/image/fetch/$s_!miVD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png 1272w, https://substackcdn.com/image/fetch/$s_!miVD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0a2efd6-06df-4e50-9149-a958146a1fad_912x352.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dscS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dscS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png 424w, https://substackcdn.com/image/fetch/$s_!dscS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png 848w, https://substackcdn.com/image/fetch/$s_!dscS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png 1272w, https://substackcdn.com/image/fetch/$s_!dscS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dscS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png" width="1376" height="741" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:741,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1223912,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dscS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png 424w, https://substackcdn.com/image/fetch/$s_!dscS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png 848w, https://substackcdn.com/image/fetch/$s_!dscS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png 1272w, https://substackcdn.com/image/fetch/$s_!dscS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b646b4-d70c-420f-a88f-c81ab0bfc978_1376x741.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Harness Walk-Through</strong></h3><p><strong>H1 -- Baseline &#945; = 0.725</strong></p><p>No system prompt. Raw model, raw task, raw data. This is the floor. Scored perfectly on cargo tier routing and escalation trigger, but only half-marks on weighting arithmetic and G4/G7 linkage. The model understood the domain but lacked the explicit instructions to connect the dots.</p><p><strong>H2 -- Prompt Harness &#945; = 1.000 &#9733;</strong></p><p>Added a structured system prompt: explicit rules for carrier selection, G4/G7 ordering, freight-rate signal interpretation, and anti-hallucination constraints. Perfect score across all six criteria. Almost no extra tokens or latency over H1. This is the highest-value intervention in the entire experiment.</p><p><strong>H3 -- Sequential Tools &#945; = 0.700 &#8595;</strong></p><p>The model called six search tools in sequence -- tier routing, then carriers, then precedents, then synthesis. Despite consuming 3x the tokens of H1 and taking 4x as long, it scored slightly below baseline. Each tool result added context the model had to reconcile, and reconciliation errors degraded the output.</p><p><strong>H4 -- Parallel Fan-Out &#945; = 0.750</strong></p><p>Three specialist agents ran in parallel and a merge agent combined their outputs. Individual analyses were strong -- four criteria scored 1.0. But the merge agent introduced a G4/G7 inconsistency: it calibrated the escalation trigger to the wrong scenario. Classic merge failure: correct parts, incoherent assembly.</p><p><strong>H5 -- Eval Loop &#945; = 0.750</strong></p><p>Generated a draft, scored it, fed the score back as a revision prompt, repeated up to three times, and saved the best draft. Scored identically to H4 despite taking 10 minutes. The eval-and-revise loop is powerful in theory, but revisions introduced the same G4/G7 inconsistency -- the model&#8217;s revision strategy was not anchored tightly enough to the rubric&#8217;s specific consistency requirement.</p><p><strong>H6 -- Skill Memory &#945; = 0.300 &#8595;</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Fz_e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Fz_e!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png 424w, https://substackcdn.com/image/fetch/$s_!Fz_e!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png 848w, https://substackcdn.com/image/fetch/$s_!Fz_e!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png 1272w, https://substackcdn.com/image/fetch/$s_!Fz_e!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Fz_e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png" width="1310" height="728" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:728,&quot;width&quot;:1310,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1237601,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Fz_e!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png 424w, https://substackcdn.com/image/fetch/$s_!Fz_e!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png 848w, https://substackcdn.com/image/fetch/$s_!Fz_e!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png 1272w, https://substackcdn.com/image/fetch/$s_!Fz_e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a076ee3-fe53-4959-8fb1-d16340dcb560_1310x728.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Read H5&#8217;s best draft as &#8216;memory&#8217; before generating. Catastrophic regression: H5&#8217;s draft contained two conflicting weighting methodologies. H6 inherited both simultaneously, quoting EUR 4.2M in one section and EUR 7.14M in another. The scorer penalised both the weighting criterion and G4/G7 consistency to 0.0. Memory without quality control is worse than no memory.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sx8O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sx8O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png 424w, https://substackcdn.com/image/fetch/$s_!sx8O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png 848w, https://substackcdn.com/image/fetch/$s_!sx8O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png 1272w, https://substackcdn.com/image/fetch/$s_!sx8O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sx8O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png" width="1312" height="683" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:683,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1090287,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sx8O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png 424w, https://substackcdn.com/image/fetch/$s_!sx8O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png 848w, https://substackcdn.com/image/fetch/$s_!sx8O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png 1272w, https://substackcdn.com/image/fetch/$s_!sx8O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e72391-f4f5-4fea-bc31-5d490ad50aeb_1312x683.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>H7 -- Model Routing &#945; = 0.900 &#9733;</strong></p><p>Routed cheap subtasks (classification) to a lightweight model and synthesis to the default model. Second-highest quality in the lab, at only 26,635 tokens. The routing saved cost and maintained quality. This is the most practically applicable result in the study.</p><p><strong>H8 -- HITL &#945; = 0.650 &#8595;</strong></p><p>Simulated a human review step: generated a draft, produced feedback, then revised. The revision step introduced flawed arithmetic in the trigger derivation and an incoherent queue-ratio justification. The HITL loop degraded content that was already sound. Simulated human review is not human review.</p><p><strong>H9 -- Sub-Agent Swarm &#945; = 0.625 &#8595;</strong></p><p>Five specialist sub-agents each wrote their section, then an orchestrator merged them. Every criterion landed at exactly 0.5 -- present but not rigorous. The reviewer sub-agent was designed to block the merge on G4/G7 inconsistency; it did not. High token cost (58K), middling quality, below the H1 baseline.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7MqE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7MqE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png 424w, https://substackcdn.com/image/fetch/$s_!7MqE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png 848w, https://substackcdn.com/image/fetch/$s_!7MqE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png 1272w, https://substackcdn.com/image/fetch/$s_!7MqE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7MqE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png" width="1311" height="731" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:731,&quot;width&quot;:1311,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1283298,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7MqE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png 424w, https://substackcdn.com/image/fetch/$s_!7MqE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png 848w, https://substackcdn.com/image/fetch/$s_!7MqE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png 1272w, https://substackcdn.com/image/fetch/$s_!7MqE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d15f05-cb7b-4425-a60c-3ef675f62048_1311x731.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>H10 -- Meta-Harness &#945; = 0.883 &#9733;</strong></p><p>Routed three task types (CLASSIFY, ANALYZE, SYNTHESIZE) to different models with different rubrics. The SYNTHESIZE task scored perfectly (1.0). But H10 took 2.9 million milliseconds and 76,185 tokens -- by far the most expensive run. The meta-routing approach works, but the overhead is enormous relative to H2.</p><p>&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;</p><p><strong>5 &#183; The Unexpected Findings</strong></p><p>The results (not surprising, in hindsight - agentic system design is a real thing!) contradicted, and yet were also at times consistent with <a href="https://interestingengineering.substack.com/p/the-harness-experiment">The Harness Experiment</a> .</p><p><strong>A Plain Prompt Beat Every Multi-Agent Architecture</strong></p><p>H2 -- a system prompt added to the same baseline model -- scored 1.000. H9, with five specialist sub-agents, a reviewer, and an orchestrator, scored 0.625. The prompt costs nothing in infrastructure, adds negligible latency, and requires no coordination logic. This is not a niche result: it held across every architectural variant tested.</p><blockquote><p><strong>Why? </strong><em>A well-crafted prompt encodes expert knowledge directly into the model&#8217;s generation process. Multi-agent architectures distribute that knowledge across agents that must then coordinate -- introducing merge failures, contradictions, and overhead. The prompt short-circuits all of that.</em></p></blockquote><p><strong>Memory Poisoning Is a Real Failure Mode</strong></p><p>H6 dropped from H5&#8217;s 0.750 to 0.300 by reading H5&#8217;s best draft as memory. The draft contained conflicting weighting methodologies the model had not fully resolved. Loading it as context caused the new generation to inherit the contradiction, not transcend it. Skill memory is only safe when the stored skill is internally consistent.</p><p><strong>Coordination Overhead Is Proportional to Agent Count</strong></p><p>H3, H4, H9 all spent more tokens and time than H1 for equal or worse quality. Each agent boundary is a merge point. Merge points introduce inconsistencies. The reviewer in H9 was explicitly designed to catch the exact inconsistency that appeared in the output. It did not catch it.</p><p><strong>Model Routing Is the Only Architecture That Improves the Efficiency Frontier</strong></p><p>H7 scored 0.900 at 26,635 tokens. H2 scored 1.000 at 15,277 tokens. Every other harness cost more tokens for less quality. H7 hints at a real design principle: route cheap tasks to cheap models, and reserve expensive capacity for synthesis only.</p><p>&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;</p><p><strong>5b &#183; What If the Rubric Had Been Different?</strong></p><p><strong>The rubric is not neutral. Every weighting choice is a statement about what matters!! </strong>Here is what would have changed under three alternative rubric designs.</p><p><strong>What if cargo_tier_differentiation was weighted lower (10% instead of 25%)?</strong></p><p>H4, H8, and H9 all scored 1.0 on cargo tier routing. Reducing its weight would have compressed the gap between harnesses. The overall ranking would not change -- H2 still wins -- but H9&#8217;s underperformance would look less severe, and the experiment&#8217;s main conclusion would be harder to see.</p><p><strong>What if planning_basis_consistency had a harder scoring floor?</strong></p><p>Most harnesses scored 0.5 on this criterion -- &#8216;present but not explicitly linked.&#8217; If the rubric had required explicit linkage language (not just correct ordering), nearly every harness outside H2 would have scored 0.0 here. H1&#8217;s alpha would have dropped from 0.725 to approximately 0.525. The implication: the rubric as written is lenient on G4/G7 consistency, and harnesses may be satisfying the letter of the criterion rather than its spirit.</p><p><strong>What if latency and token cost were included in the score?</strong></p><p>H2 would still win -- lowest token cost among the high scorers. H10 would drop dramatically: 2.9M ms and 76K tokens for 0.883 is a poor trade compared to H7&#8217;s 0.900 at 26K tokens. H9 would be the clear loser on a cost-adjusted basis: 58K tokens for 0.625. A realistic production scoring function weighting quality 70%, token cost 20%, and latency 10% would rank: H2 first, H7 second, H1 third.</p><p>&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;</p><p><strong>5c &#183; Did the Master Guide Help or Constrain?</strong></p><p>The experiment was built from a detailed MASTER_GUIDE.md that specified the task, sub-agent roles, rubric criteria, and even the expected alpha range for H1 (0.40-0.75). Did the guide improve rigour, or did it pre-solve the experiment?</p><p><strong>Where the guide helped</strong></p><p>&#183; It defined the gold answer independently of the experiments -- a genuine external standard.</p><p>&#183; It specified scorer model separation (SCORER &#8800; DEFAULT), preventing self-grading inflation.</p><p>&#183; It provided H9 sub-agent roles with precise responsibilities, making the swarm topology testable.</p><p>&#183; It required strict file-path discipline (path.join + __dirname), preventing silent path bugs.</p><p><strong>Where the guide may have constrained</strong></p><p>&#183; The rubric criteria were pre-specified. An experiment discovering the rubric through failure would have tested different things.</p><p>&#183; The recommended 60%/40% weighting for the planning basis was in the data. A harness deriving a different weighting might be penalised even with sound reasoning.</p><p>&#183; The expected H1 alpha range was stated in advance -- a validity aid, but also a potential framing for what counts as a &#8216;data issue&#8217; vs. a &#8216;rubric issue.&#8217;</p><blockquote><p><strong>Verdict: </strong><em>The Master Guide was a controlled-experiment enabler. The key safeguard: it was written before any experiment ran and was not retroactively adjusted to make results look cleaner. The architecture it specified -- immutable shared layer, isolated experiment layer -- is the reason ten harnesses are genuinely comparable.</em></p></blockquote><p><strong>6 &#183; H9 Sub-Agent Topology -- Why It Failed</strong></p><p>H9 mirrors how real AI orchestration systems are often designed: specialists, a reviewer, an orchestrator. Here is the actual topology and where it broke down.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K7ne!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K7ne!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png 424w, https://substackcdn.com/image/fetch/$s_!K7ne!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png 848w, https://substackcdn.com/image/fetch/$s_!K7ne!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png 1272w, https://substackcdn.com/image/fetch/$s_!K7ne!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K7ne!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png" width="760" height="427" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:427,&quot;width&quot;:760,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23362,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K7ne!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png 424w, https://substackcdn.com/image/fetch/$s_!K7ne!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png 848w, https://substackcdn.com/image/fetch/$s_!K7ne!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png 1272w, https://substackcdn.com/image/fetch/$s_!K7ne!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f72b9-b317-4d06-9f1c-57f0390ab07d_760x427.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>&#183; <strong>The reviewer failed: </strong>SA_reviewer confirmed &#8216;G4 and G7 are consistent&#8217; even when the orchestrator&#8217;s synthesis had subtly recalibrated the trigger to the wrong scenario. It confirmed the numbers appeared in both sections without verifying they derived from the same scenario model.</p><p>&#183; <strong>The orchestrator averaged: </strong>Each sub-agent wrote independently correct content, but the orchestrator had to reconcile five different phrasings of the same analysis. The result averaged them rather than selecting the best -- which is why every criterion landed at exactly 0.5.</p><p><strong>Critical design constraints to preserve</strong></p><p>&#183; SCORER_MODEL must differ from DEFAULT_MODEL -- self-grading inflates alpha by 15-30%.</p><p>&#183; Do not use reasoning/chain-of-thought models as DEFAULT -- they output thinking traces, not deliverables.</p><p>&#183; Run H5 before H6 -- H6 reads H5&#8217;s output as memory and will error otherwise.</p><p>&#183; Scorer maxTokens must be at least 4,000 -- six-criterion responses need room.</p><p><strong>To change the rubric</strong></p><p>Edit shared/rubric.json. Weights must sum to 1.0. To tighten a criterion, narrow the score_0_5 band description -- this forces more outputs into the 0.0 tier. Re-run H1 first to verify the new baseline lands between 0.40 and 0.75 before running H2-H10.</p><p>&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;</p><p><strong>8 &#183; Conclusions</strong></p><p>Five findings that should inform how AI systems are designed and evaluated:</p><p><strong>1. Prompt engineering is still the highest-leverage intervention.</strong></p><p>A well-crafted system prompt outperformed every multi-agent, multi-tool, and eval-loop architecture tested. It costs almost nothing in tokens, adds minimal latency, and requires no coordination infrastructure. Write the prompt first -- not last.</p><p><strong>2. Coordination overhead scales faster than quality.</strong></p><p>Every additional agent boundary introduces a merge point. Merge points introduce inconsistencies. H4, H5, H8, and H9 all spent more tokens and time than H1 for equal or worse output. Multi-agent architectures are justified when tasks are genuinely parallel and independent. They are harmful when the task requires integrated reasoning.</p><p><strong>3. Memory without quality control is worse than no memory.</strong></p><p>H6 shows that loading a flawed prior output as skill memory propagates its errors into the next generation. Retrieval-augmented generation and skill memory systems need a validity gate before stored content is trusted.</p><p><strong>4. Model routing is the most practical architectural improvement.</strong></p><p>H7 achieved 0.900 quality at 26,635 tokens by routing cheap tasks to a cheap model and synthesis to a capable model. Match model capability to task complexity.</p><p><strong>5. Evaluation infrastructure is as important as the experiment.</strong></p><p>Two measurement failures (wrong model type, scorer token ceiling) produced corrupted results before the experiments even started. The quality of a benchmark is determined by the quality of its measurement layer. Validate the ruler before you measure the table.</p><h2><strong>PART II - Two Experiments, One Domain, Opposite Conclusions</strong></h2><p><em>What the ASCRS Architecture of Awareness and the Harness Lab discovered &#8212; and why they disagree</em></p><h3>1 &#183; The Short Version (For Everyone)</h3><p>Imagine you are a chef who has just produced a recipe for a restaurant kitchen. Two months later, you run a cooking competition using that same recipe. You expect the dish that won the cooking-show round &#8212; the elaborate multi-chef brigade system &#8212; to win again. Instead, a cook working alone with a single, very precisely written instruction card beats everyone.</p><p>That is, roughly, what happened here.</p><p>The <strong><a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">Architecture of Awareness</a></strong> was written a few weeks ago. It traced four versions of a fictional AI system designed to help a pharmaceutical company respond to a shipping crisis in the Strait of Hormuz &#8212; the narrow waterway through which a fifth of the world&#8217;s oil and a significant share of medicine-ingredient shipments travel. The exercise asked: if an AI has to decide, within hours, how to re-route &#8364;15 million of pharmaceutical cargo when that waterway suddenly closes, which AI design gets the decision right?</p><p>The answer in that article was: the most sophisticated version wins. More coordination, more memory, more checks &#8212; better outcome.</p><p>The <strong><a href="https://interestingengineering.substack.com/p/the-harness-experiment">ASCRS Harness Lab</a></strong> ran a controlled experiment last week. Same scenario, same AI model across all ten test architectures. This time the answer was: the simplest meaningful intervention &#8212; a well-written set of instructions &#8212; produced a perfect score. The five-agent swarm with specialist roles scored below the bare model with no instructions at all.</p><p><strong>The central tension in one sentence:</strong></p><p>&#8594; The design exercise predicted that complex multi-agent architectures would win. The controlled experiment found that a precisely written prompt beat all of them.</p><p>Both findings are correct. Understanding why they are not contradictory is the whole point of this document.</p><h3>2 &#183; What Each Experiment Actually Was</h3><h3>The Architecture of Awareness (April 2026)</h3><p>This was a <strong>design exploration</strong>, not a controlled benchmark. It was a viability test for running agentic systems (in parallel with a legacy one). Greyfield/Brownfield type project. Four progressively more sophisticated system architectures were designed and described, each addressing real failures identified in the previous version:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NIG2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NIG2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png 424w, https://substackcdn.com/image/fetch/$s_!NIG2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png 848w, https://substackcdn.com/image/fetch/$s_!NIG2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png 1272w, https://substackcdn.com/image/fetch/$s_!NIG2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NIG2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png" width="861" height="356" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:356,&quot;width&quot;:861,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:53943,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NIG2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png 424w, https://substackcdn.com/image/fetch/$s_!NIG2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png 848w, https://substackcdn.com/image/fetch/$s_!NIG2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png 1272w, https://substackcdn.com/image/fetch/$s_!NIG2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F628bf010-cfaa-48e3-9816-fb5c43fd7371_861x356.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The evaluation method was <strong>CFO outcome</strong>: could the document produced actually be signed off on? Was cargo booked on time? Was the financial figure defensible? This is a qualitative, real-world standard &#8212; pass or fail, with narrative explanation of why.</p><p>Important: the scenarios built on each other. V3 was designed specifically to fix V2&#8217;s identified failures. V4 was designed to fix V3&#8217;s hidden failures. This is iterative design, not controlled comparison.</p><h3>The ASCRS Harness Lab (May 2026)</h3><p>This was a <strong>controlled benchmark experiment</strong>. Ten architectures &#8212; called H1 through H10 &#8212; all received the same task, the same data, and the same model. Performance was scored quantitatively against six rubric criteria by a separate scorer model that had never seen which architecture produced which output.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2d8x!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2d8x!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png 424w, https://substackcdn.com/image/fetch/$s_!2d8x!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png 848w, https://substackcdn.com/image/fetch/$s_!2d8x!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png 1272w, https://substackcdn.com/image/fetch/$s_!2d8x!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2d8x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png" width="863" height="556" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:556,&quot;width&quot;:863,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:70756,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2d8x!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png 424w, https://substackcdn.com/image/fetch/$s_!2d8x!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png 848w, https://substackcdn.com/image/fetch/$s_!2d8x!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png 1272w, https://substackcdn.com/image/fetch/$s_!2d8x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1fa2f4e-ac0d-45e0-a54c-126a6db557da_863x556.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The scoring metric &#945; (alpha) runs 0.0 to 1.0. The six rubric criteria weighted what matters in real pharmaceutical logistics: was the highest-risk item (a &#8364;4.7M monoclonal antibody requiring &#8722;20&#176;C cold chain storage) routed to the correct carrier? Did the financial planning figure logically precede the escalation rule? Was the freight-rate spike treated as a severity signal, not just a cost?</p><h3>3 &#183; Why the Results Are Different &#8212; And Why That Is Not a Contradiction</h3><h3>Different Types of Evidence</h3><p>The Architecture of Awareness documented iterative design &#8212; each version was explicitly built to correct the known failures of the previous one. That is not a fair competition; it is a debugging sequence. When V3 outperforms V1, it is partly because V3 was designed by someone who had already seen V1 fail.</p><p>The Harness Lab ran a controlled comparison. All ten architectures started from the same position and were scored by the same rubric. No architecture had the benefit of seeing the others fail first. This is a different &#8212; and more rigorous &#8212; evidential standard.</p><p>Think of the difference this way: one is a chef iterating on a recipe across several months. The other is a blind tasting, all dishes served simultaneously. Both produce useful information. But the blind tasting is the one that tells you which design actually wins on its own merits.</p><h3>Different Task Structures</h3><p>The Architecture of Awareness was evaluating a <strong>continuous, multi-step operational system</strong> that had to coordinate real-time data feeds, ERP database writes, long-term institutional memory, multi-source intelligence, and human approval workflows. This is genuinely complex. Multiple agents with specialist knowledge are a reasonable design choice because the work is genuinely distributed.</p><p>The Harness Lab tested a <strong>single-turn reasoning task</strong>: given structured data, produce a document. All the information was present upfront. No external calls were required. No genuine parallelism was needed. A well-written instruction is sufficient when everything needed is already in the room.</p><p><strong>The key insight:</strong></p><blockquote><p>The Architecture of Awareness predicted complex harnesses would win because it was describing a complex, multi-phase operation. The Harness Lab found simple prompts win because it tested a single-turn document task. Both are right &#8212; for their respective task types.</p></blockquote><h3>The Prediction That Didn&#8217;t Hold</h3><p>The MASTER_GUIDE &#8212; the planning document for the Harness Lab &#8212; explicitly predicted H9 (the five-agent swarm) would win, with an expected alpha of 0.75&#8211;0.88. The reasoning was sound: pharmaceutical logistics contains genuine knowledge boundaries (cold-chain compliance, freight market signals, route deadline analysis). Specialist agents for each domain should outperform a generalist.</p><p>H9 scored 0.625. Below the bare model.</p><p>Why? Three interconnected failures:</p><p>&#8226; <strong>The reviewer failed silently. </strong>The SA_reviewer sub-agent was specifically designed to catch one type of error: financial planning figure written before the escalation trigger is locked in, causing the two numbers to reference different assumptions. SA_reviewer confirmed consistency without verifying the actual scenario reference. It checked for the presence of both numbers, not their logical dependency.</p><p>&#8226; <strong>The orchestrator averaged instead of selecting. </strong>Each specialist agent produced a correct fragment. The orchestrator had to reconcile five different phrasings of overlapping analyses. The result averaged them &#8212; which is why every single criterion scored exactly 0.5. Not wrong, not right. Plausible and mediocre.</p><p>&#8226; <strong>Agent coordination is not free. </strong>H9 consumed 58,090 tokens &#8212; nearly four times H1&#8217;s 14,028 &#8212; for a lower quality score. Every agent boundary is a potential merge failure. The expected benefit of specialist knowledge did not materialise, but the coordination cost did.</p><h3>4 &#183; How the Tests Were Run &#8212; And What Could Have Been Better</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eLmL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eLmL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png 424w, https://substackcdn.com/image/fetch/$s_!eLmL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png 848w, https://substackcdn.com/image/fetch/$s_!eLmL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png 1272w, https://substackcdn.com/image/fetch/$s_!eLmL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eLmL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png" width="1338" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1338,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1389391,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eLmL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png 424w, https://substackcdn.com/image/fetch/$s_!eLmL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png 848w, https://substackcdn.com/image/fetch/$s_!eLmL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png 1272w, https://substackcdn.com/image/fetch/$s_!eLmL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83888d27-0bc5-4281-9e59-08f03972464e_1338x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Architecture of Awareness: Method</h3><p>The article was built as a design case study. The four architectures were described using an adapted crisis &#8212; the IRGC naval interdiction exercise, March 2026 &#8212; as the consistent test scenario. Each version was compared against a defined output quality standard: could a CFO approve the resulting brief? Could time-critical cargo be booked within the decision window?</p><p>The evaluation was qualitative and illustrative. The &#8216;brief&#8217; outputs shown were synthetic demonstrations of what each architecture would produce, not actual model outputs run through a scorer. This is standard practice for design documentation but means the results cannot be replicated or independently verified.</p><p>What this method does well: it communicates design reasoning clearly. What it cannot do: establish whether the predicted quality differences are real or assumed.</p><h3>Harness Lab: Method</h3><p>The lab used a controlled experimental structure. The same model (GPT-OSS-120B via OpenRouter free tier) ran in all ten harnesses. The same disruption scenario data was provided to all. A separate scorer model &#8212; never used as an experiment model &#8212; rated every output against the same six-criterion rubric. Scores were written to a tamper-evident results log.</p><p>Two measurement failures occurred before the experiment could start:</p><p>&#8226; <strong>Wrong model type as baseline. </strong>The initial DEFAULT model was a &#8216;reasoning model&#8217; &#8212; a type that outputs its chain-of-thought thinking process, not a finished document. It produced 16,415 tokens of internal deliberation. The scorer correctly found nothing to evaluate. Lesson: model type matters before model quality.</p><p>&#8226; <strong>Scorer token ceiling. </strong>The scorer&#8217;s maximum response length was set to 2,000 tokens. A six-criterion scoring response requires roughly 4,000. The scorer was being cut off mid-reasoning, producing partial scores that were silently written to disk. Lesson: infrastructure failures corrupt results without raising errors. Validate the measurement layer before running experiments.</p><p>Both were caught and fixed before the main runs. H1&#8217;s baseline calibrated at &#945; = 0.725, confirming the rubric was functioning: not trivially easy, not impossibly hard.</p><h3>What Could Have Been Better</h3><p>From my perspective, the Harness Lab was well-designed on the dimensions it controlled. Three legitimate criticisms:</p><p>&#8226; <strong>Single model. </strong>All ten harnesses used the same underlying model. This prevents any conclusion about whether harness design interacts with model capability &#8212; a stronger model might rank the architectures differently. The vendor experiment from the same lab (different domain) used Claude Opus 4.6 and found similar conclusions, which is suggestive but not conclusive.</p><p>&#8226; <strong>Single task. </strong>One scenario, one run per harness. In a statistical sense, N=1. A task where H9 is expected to win (genuine multi-domain parallelism, external state queries, real-time data feeds) might invert the ranking.</p><p>&#8226; <strong>H9 implementation quality. </strong>The SA_reviewer sub-agent was specified to catch G4/G7 inconsistencies. It did not. Whether this represents an inherent limitation of multi-agent coordination or a specific implementation weakness is difficult to separate from the results. The Architecture of Awareness V3+Loop succeeded at exactly this consistency check &#8212; but through a different mechanism: a targeted correction loop, not a specialist reviewer.</p><h3>5 &#183; Why H2 and H7 Won</h3><h3>H2: The Power of Precise Instructions</h3><p>H2 added one thing: a structured system prompt. It told the model, in explicit terms:</p><blockquote><p>&#8226; PO-2853 is a &#8722;20&#176;C monoclonal antibody. Singapore air hub does not maintain &#8722;20&#176;C. Use Qatar Airways Cargo.</p><p>&#8226; Derive the planning basis (G4) before writing the escalation trigger (G7). Both must reference the same scenario duration.</p><p>&#8226; Treat the 187% freight-rate spike as a severity signal reflecting the collective judgment of hundreds of corridor operators &#8212; not just a cost input.</p><p>&#8226; Separate uncertainty into three tiers: confirmed data, model-derived estimates, genuinely unknowable trajectory.</p></blockquote><p>Every one of these instructions maps directly onto a rubric criterion that the bare model struggled with. H2 did not give the model new intelligence. It gave the model a precise description of what a correct answer looks like &#8212; from the outside, in terms a scorer could evaluate.</p><p>This is the deepest result in the experiment: <strong>a good prompt encodes expert knowledge into the generation process without any coordination overhead.</strong> A well-designed prompt is a compressed expert. Multi-agent architectures distribute expert knowledge across agents that then have to coordinate &#8212; introducing merge failures, contradictions, and overhead. The prompt short-circuits all of that.</p><p><strong>Non-technical translation:</strong></p><p>Imagine you hire ten assistants and give them each one specialist task, then ask a manager to combine their work. Now imagine you give one well-briefed assistant a very precise job description and let them do it alone. The second approach often wins &#8212; not because it is smarter, but because there are fewer ways for the work to come back incoherent.</p><p>H2 cost 15,277 tokens &#8212; barely more than H1&#8217;s 14,028. It added no latency. It required no coordination infrastructure. It achieved &#945; = 1.000.</p><h3>H7: The Efficiency Principle</h3><p>H7 introduced model routing: a routing layer that read the task and assigned each subtask to the appropriate model. Classification tasks (which tier is this PO?) went to a lightweight, fast, cheap model. Synthesis (write the CFO brief) went to the capable default model.</p><p>H7 scored &#945; = 0.900 at 26,635 tokens. Only H2 scored higher, and H2 used the same capable model for everything. H7&#8217;s insight is distinct from H2&#8217;s:</p><blockquote><p>&#8226; <strong>Not all tasks require the same model. </strong>A model capable of nuanced financial reasoning is wasted on a binary tier classification. Using a cheaper model for cheap tasks and reserving full capacity for synthesis is a genuine efficiency gain.</p><p>&#8226; <strong>Task segmentation can improve quality. </strong>By giving the synthesis model a clean, pre-classified input rather than raw data, H7 reduced the cognitive load on the most critical step.</p></blockquote><p>The practical implication: H7&#8217;s approach scales better than H2&#8217;s in production. As task volume increases, the cost difference between H7 (right model for the job) and a flat approach (capable model for everything) compounds significantly.</p><h3>The Pattern Behind Both Winners</h3><p>H2 and H7 share a design principle: they both match <strong>resource to requirement</strong> precisely. H2 gives the model exactly the instructions it needs &#8212; no more, no less. H7 gives each subtask exactly the model it needs. Both avoid the failure mode that sank H3, H4, H8, and H9: adding coordination complexity without adding information.</p><h3>6 &#183; The /goal Question &#8212; Could a Simpler Objective Drive the Same Results?</h3><h3>What /goal Is</h3><p>Recent agent runtimes &#8212; notably <a href="https://www.mindstudio.ai/blog/codex-goal-ralph-loop-14-hour-autonomous-task">Codex CLI (OpenAI)</a> and <a href="https://hermes-agent.nousresearch.com/docs/user-guide/features/goals">Hermes Agent</a> &#8212; have shipped a mechanism called <strong>/goal</strong> (sometimes called <strong>Ralph Loop 2.0</strong>). Rather than giving the agent a detailed instruction set upfront, you state a standing objective: &#8216;Produce a CFO-approvable pharmaceutical rerouting brief for this disruption scenario.&#8217; A judge model then evaluates after every turn whether the goal has been achieved. If not, the agent continues automatically up to a turn budget.</p><p>The appeal: rather than specifying exactly how to achieve the brief (which requires the detailed rubric, the carrier rules, the G4/G7 ordering constraint), you specify what you want the brief to be. You let the agent find its own path.</p><h3>What the Goals Were in the Harness Lab</h3><p>The Harness Lab&#8217;s goals were implicit in the rubric:</p><p>&#8226; Produce a document a CFO can sign off on in a 6-hour decision window</p><p>&#8226; Route 23 pharmaceutical purchase orders by tier and deadline, with named carriers</p><p>&#8226; Derive a single defensible financial planning figure with a stated scenario basis</p><p>&#8226; Write an escalation trigger derived from historical data, consistent with the planning figure</p><p>&#8226; Separate uncertainty into confirmed, estimated, and unknown tiers</p><p>These goals were excellent/on the side of good &#8212; specific, measurable, and directly tied to operational consequences. They were, however, encoded in the rubric rather than expressed as a standing /goal objective. The H2 system prompt was essentially a restatement of these goals in instructional form.</p><h3>Could /goal Have Replaced the Detailed Prompt?</h3><p>Partially, and with important caveats.</p><p>The MASTER_GUIDE itself is explicit on where /goal would and would not have helped:</p><p>&#8226; <strong>H5 &#8212; Yes. </strong>The eval loop ran exactly three revisions regardless of quality. /goal would have removed the arbitrary cap: keep revising until the goal is achieved, not until iteration 3.</p><p>&#8226; <strong>H10 &#8212; Possibly. </strong>The meta-harness had a premature completion problem on zero-score tasks. A judge checking the goal after each routing decision would have caught this.</p><p>&#8226; <strong>H1 &#8212; No. </strong>Specification failure. The problem was not that the model stopped too early. It was that it did not know PO-2853 needed Qatar Airways Cargo. Persistence cannot fix an instruction gap.</p><p>&#8226; <strong>H2 &#8212; Unnecessary. </strong>Already solved in one turn. /goal adds overhead where none is needed.</p><p>&#8226; <strong>H4/H9 &#8212; No. </strong>Coherence failure, not a stopping problem. The merge agent produced an internally inconsistent document. Continuing the loop would not have resolved the G4/G7 contradiction; it would have looped on the same failure.</p><p><strong>The core limitation of /goal without a detailed specification:</strong></p><p>A judge checking &#8216;is the CFO brief complete?&#8217; needs to know what complete means. That definition &#8212; the six rubric criteria, the carrier constraint for PO-2853, the G4/G7 ordering dependency &#8212; is not derivable from the goal statement alone. The agent cannot discover that Qatar Airways Cargo is the only GDP-certified &#8722;20&#176;C carrier available on that route by iterating toward &#8216;produce a CFO brief.&#8217; The specification is the knowledge. /goal is the stopping rule.</p><h3>Could /goal Work With a Different Architecture?</h3><p>The honest answer is: I do not know from this experiment. What I can say is this:</p><p>A /goal agent working toward &#8216;produce a CFO-approvable brief&#8217; with access to a pharmacological database, a carrier routing database, and a freight market feed <strong>might</strong> discover the Qatar Airways Cargo constraint through tool calls. It would need to query carrier capabilities against PO-2853&#8217;s &#8722;20&#176;C specification. If the tooling exists and the agent is allowed to explore it, /goal + tool access might approximate what H2&#8217;s system prompt achieves explicitly.</p><p>But there are structural problems with this approach for time-critical decisions:</p><p>&#8226; <strong>Discovery takes time the crisis does not allow. </strong>The CFO decision window in this scenario closes at 14:00 UTC, six hours after the alert. An agent iterating toward a goal through tool discovery may not converge before the Qatar Airways Cargo booking window closes at 10:00 UTC.</p><p>&#8226; <strong>The goal may not be granular enough to catch every requirement. </strong>&#8216;CFO-approvable&#8217; is a compound goal. A judge evaluating it may pass a brief that satisfies four of six rubric criteria. The judge needs the same specification the human rater uses &#8212; at which point you are back to writing a detailed rubric anyway.</p><p>&#8226; <strong>File structure and test design still need to be specified somewhere. </strong>The agent needs to know what format to produce, where to write output, how to structure the gate checklist. /goal delegates the what but not the how &#8212; some specification is always required.</p><h3>The Better Question</h3><p>Rather than asking whether /goal can replace detailed specification, the more useful question is: <strong>what specification is genuinely needed versus what can be delegated to agent judgment?</strong></p><p>The Harness Lab results suggest a clear division:</p><p>&#8226; <strong>Must be specified: </strong>Domain-specific constraints with non-obvious correct answers. Qatar Airways Cargo. G4 before G7. The 75-vessel threshold derivation. The three-tier uncertainty structure. These cannot be discovered; they must be told.</p><p>&#8226; <strong>Can be delegated: </strong>Format, length, sequence of sections, writing style. An agent with a good goal statement and domain knowledge can make reasonable choices on these. /goal is well-suited to stopping when the substance is right, even if the presentation took a different path.</p><p>The Architecture of Awareness reached a similar conclusion from the design direction: the correction loop (V3) succeeded not because it added intelligence but because it surfaced information the system already had. The loop forced the Strategist to write the &#8364;4.2M planning basis it already knew but had not surfaced. /goal with a good judge would do the same thing &#8212; but only if the judge&#8217;s criteria match the domain requirements.</p><h3>7 &#183; Summary: What Both Experiments Establish Together</h3><p>Read together, the two experiments establish a consistent picture that neither alone could produce.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M2xi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M2xi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png 424w, https://substackcdn.com/image/fetch/$s_!M2xi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png 848w, https://substackcdn.com/image/fetch/$s_!M2xi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png 1272w, https://substackcdn.com/image/fetch/$s_!M2xi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M2xi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png" width="943" height="520" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:520,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73449,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!M2xi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png 424w, https://substackcdn.com/image/fetch/$s_!M2xi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png 848w, https://substackcdn.com/image/fetch/$s_!M2xi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png 1272w, https://substackcdn.com/image/fetch/$s_!M2xi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F647694cd-632b-4d7d-a178-8e57af82ae19_943x520.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gPD4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gPD4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png 424w, https://substackcdn.com/image/fetch/$s_!gPD4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png 848w, https://substackcdn.com/image/fetch/$s_!gPD4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png 1272w, https://substackcdn.com/image/fetch/$s_!gPD4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gPD4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png" width="1338" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1338,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1389391,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gPD4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png 424w, https://substackcdn.com/image/fetch/$s_!gPD4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png 848w, https://substackcdn.com/image/fetch/$s_!gPD4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png 1272w, https://substackcdn.com/image/fetch/$s_!gPD4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f6672f0-8aaf-4546-b4ce-4061ab2d6a7b_1338x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Five Conclusions</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6wr-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6wr-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png 424w, https://substackcdn.com/image/fetch/$s_!6wr-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png 848w, https://substackcdn.com/image/fetch/$s_!6wr-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png 1272w, https://substackcdn.com/image/fetch/$s_!6wr-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6wr-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png" width="1295" height="687" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:687,&quot;width&quot;:1295,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1259729,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6wr-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png 424w, https://substackcdn.com/image/fetch/$s_!6wr-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png 848w, https://substackcdn.com/image/fetch/$s_!6wr-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png 1272w, https://substackcdn.com/image/fetch/$s_!6wr-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F769d41eb-5169-41f2-8ca6-49251ce62116_1295x687.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>1. <strong>Harness architecture choice must match task structure. </strong>On single-turn document tasks with complete data upfront, a well-written prompt wins. On continuous, multi-source operational tasks with external state and real-time data, structured multi-agent coordination wins. The Architecture of Awareness and the Harness Lab are both right &#8212; for different task types.</p><p>2. <strong>Instruction quality is the most underrated variable. </strong>H2 achieved a perfect score with minimal overhead. The cost of H2 over H1 is negligible. The gain is enormous. Write the specification first, add architecture second.</p><p>3. <strong>Coordination overhead is real and proportional to agent count. </strong>H3, H4, H8, and H9 all spent more than H1 for equal or worse quality. Every agent boundary is a merge point. Merge points introduce inconsistencies. The reviewer in H9 failed to catch exactly what it was designed to catch.</p><p>4. <strong>/goal is a stopping rule, not a specification substitute. </strong>It removes arbitrary iteration caps and catches premature completion. It cannot supply domain-specific constraints that must be told, not discovered. Use /goal to drive the loop; use explicit specification to define what done looks like.</p><p>5. <strong>Memory without quality control is worse than no memory. </strong>H6 dropped from 0.750 to 0.300 by inheriting a flawed prior output. The Architecture of Awareness made the same point from the other direction: the correction loop&#8217;s value was forcing the system to surface information it already had. Both experiments agree: what you give the model shapes what it produces more than how many agents you deploy.</p><h2>PART III - The Integrated Agentic Stack</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zVXv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zVXv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png 424w, https://substackcdn.com/image/fetch/$s_!zVXv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png 848w, https://substackcdn.com/image/fetch/$s_!zVXv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png 1272w, https://substackcdn.com/image/fetch/$s_!zVXv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zVXv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png" width="1162" height="592" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:592,&quot;width&quot;:1162,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1153774,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zVXv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png 424w, https://substackcdn.com/image/fetch/$s_!zVXv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png 848w, https://substackcdn.com/image/fetch/$s_!zVXv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png 1272w, https://substackcdn.com/image/fetch/$s_!zVXv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00b0c7f-0b60-4567-801a-9c25c54ce9ae_1162x592.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7pMl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7pMl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png 424w, https://substackcdn.com/image/fetch/$s_!7pMl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png 848w, https://substackcdn.com/image/fetch/$s_!7pMl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png 1272w, https://substackcdn.com/image/fetch/$s_!7pMl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7pMl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png" width="1336" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1336,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1250793,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7pMl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png 424w, https://substackcdn.com/image/fetch/$s_!7pMl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png 848w, https://substackcdn.com/image/fetch/$s_!7pMl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png 1272w, https://substackcdn.com/image/fetch/$s_!7pMl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd860793-7b34-407e-9caf-112b3eb4800b_1336x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>From <a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">The Architecture of Awareness</a> (V4)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y64b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y64b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg 424w, https://substackcdn.com/image/fetch/$s_!y64b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg 848w, https://substackcdn.com/image/fetch/$s_!y64b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!y64b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y64b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg" width="1456" height="787" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:787,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1518993,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y64b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg 424w, https://substackcdn.com/image/fetch/$s_!y64b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg 848w, https://substackcdn.com/image/fetch/$s_!y64b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!y64b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef374c20-b005-427b-9fa3-62cc3c7c7b78_3822x2065.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nM0q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nM0q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png 424w, https://substackcdn.com/image/fetch/$s_!nM0q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png 848w, https://substackcdn.com/image/fetch/$s_!nM0q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png 1272w, https://substackcdn.com/image/fetch/$s_!nM0q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nM0q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png" width="1427" height="910" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:910,&quot;width&quot;:1427,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:143497,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nM0q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png 424w, https://substackcdn.com/image/fetch/$s_!nM0q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png 848w, https://substackcdn.com/image/fetch/$s_!nM0q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png 1272w, https://substackcdn.com/image/fetch/$s_!nM0q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ed9490c-2e3c-4eb6-9145-f7453e745478_1427x910.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HlnT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HlnT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png 424w, https://substackcdn.com/image/fetch/$s_!HlnT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png 848w, https://substackcdn.com/image/fetch/$s_!HlnT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png 1272w, https://substackcdn.com/image/fetch/$s_!HlnT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HlnT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png" width="1432" height="713" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/df0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:713,&quot;width&quot;:1432,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:119130,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HlnT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png 424w, https://substackcdn.com/image/fetch/$s_!HlnT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png 848w, https://substackcdn.com/image/fetch/$s_!HlnT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png 1272w, https://substackcdn.com/image/fetch/$s_!HlnT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0a7d30-0d3e-4558-9154-d2b0b7446477_1432x713.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!J0Ph!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!J0Ph!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png 424w, https://substackcdn.com/image/fetch/$s_!J0Ph!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png 848w, https://substackcdn.com/image/fetch/$s_!J0Ph!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png 1272w, https://substackcdn.com/image/fetch/$s_!J0Ph!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!J0Ph!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png" width="1422" height="892" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:892,&quot;width&quot;:1422,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:160517,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!J0Ph!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png 424w, https://substackcdn.com/image/fetch/$s_!J0Ph!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png 848w, https://substackcdn.com/image/fetch/$s_!J0Ph!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png 1272w, https://substackcdn.com/image/fetch/$s_!J0Ph!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b22563-c700-4ad7-9e62-dedbd9cced27_1422x892.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zG42!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zG42!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png 424w, https://substackcdn.com/image/fetch/$s_!zG42!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png 848w, https://substackcdn.com/image/fetch/$s_!zG42!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png 1272w, https://substackcdn.com/image/fetch/$s_!zG42!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zG42!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png" width="1426" height="607" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:607,&quot;width&quot;:1426,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:87521,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zG42!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png 424w, https://substackcdn.com/image/fetch/$s_!zG42!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png 848w, https://substackcdn.com/image/fetch/$s_!zG42!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png 1272w, https://substackcdn.com/image/fetch/$s_!zG42!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5be7abcb-0db4-4d18-9feb-54efabe03abf_1426x607.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-eXI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb04c5d82-793d-489d-be6d-503a3859e118_1426x782.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-eXI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb04c5d82-793d-489d-be6d-503a3859e118_1426x782.png 424w, https://substackcdn.com/image/fetch/$s_!-eXI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb04c5d82-793d-489d-be6d-503a3859e118_1426x782.png 848w, https://substackcdn.com/image/fetch/$s_!-eXI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb04c5d82-793d-489d-be6d-503a3859e118_1426x782.png 1272w, https://substackcdn.com/image/fetch/$s_!-eXI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb04c5d82-793d-489d-be6d-503a3859e118_1426x782.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-eXI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb04c5d82-793d-489d-be6d-503a3859e118_1426x782.png" width="1426" height="782" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b04c5d82-793d-489d-be6d-503a3859e118_1426x782.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:782,&quot;width&quot;:1426,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:108221,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/198013155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb04c5d82-793d-489d-be6d-503a3859e118_1426x782.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-eXI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb04c5d82-793d-489d-be6d-503a3859e118_1426x782.png 424w, https://substackcdn.com/image/fetch/$s_!-eXI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb04c5d82-793d-489d-be6d-503a3859e118_1426x782.png 848w, https://substackcdn.com/image/fetch/$s_!-eXI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb04c5d82-793d-489d-be6d-503a3859e118_1426x782.png 1272w, https://substackcdn.com/image/fetch/$s_!-eXI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb04c5d82-793d-489d-be6d-503a3859e118_1426x782.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h3>References &amp; Further Reading</h3><h3>Experimental Benchmarks</h3><p><strong>Terminal-Bench 2.0 </strong>&#8212; Benchmark for AI agents in command-line environments. ForgeCode vs Claude Code gap (21.8pp from harness architecture alone).</p><p><a href="https://github.com/terminal-bench/terminal-bench">https://github.com/terminal-bench/terminal-bench</a></p><h3>Agent Architecture Foundations</h3><p><strong>ReAct: Synergizing Reasoning and Acting in Language Models</strong> (Yao et al., 2022)</p><p><a href="https://arxiv.org/abs/2210.03629">https://arxiv.org/abs/2210.03629</a></p><p><strong>AutoGen: Multi-Agent Conversation Framework</strong> (Wu et al., 2023)</p><p><a href="https://arxiv.org/abs/2308.08155">https://arxiv.org/abs/2308.08155</a></p><p><strong>LATS: Language Agent Tree Search</strong> (Zhou et al., 2023)</p><p><a href="https://arxiv.org/abs/2310.04406">https://arxiv.org/abs/2310.04406</a></p><h3>Prompt Engineering</h3><p><strong>Chain-of-Thought Prompting Elicits Reasoning in Large Language Models</strong> (Wei et al., 2022)</p><p><a href="https://arxiv.org/abs/2201.11903">https://arxiv.org/abs/2201.11903</a></p><p><strong>Anthropic Prompt Engineering Guide</strong></p><p><a href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview">https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview</a></p><h3>/goal and Ralph Loop 2.0</h3><p><strong>Codex CLI &#8212; OpenAI coding agent with /goal support</strong></p><p><a href="https://github.com/openai/codex">https://github.com/openai/codex</a></p><p><a href="https://www.mindstudio.ai/blog/codex-goal-ralph-loop-14-hour-autonomous-task">https://www.mindstudio.ai/blog/codex-goal-ralph-loop-14-hour-autonomous-task </a></p><p><strong>Hermes Agent &#8212; Lightweight agent framework with standing objective support</strong></p><p><a href="https://github.com/hermes-hq/hermes">https://github.com/hermes-hq/hermes</a></p><p><a href="https://hermes-agent.nousresearch.com/docs/user-guide/features/goals">https://hermes-agent.nousresearch.com/docs/user-guide/features/goals</a></p><h3>Model Routing</h3><p><strong>OpenRouter &#8212; Multi-model routing API</strong></p><p><a href="https://openrouter.ai/docs">https://openrouter.ai/docs</a></p><p><strong>Mixture-of-Agents Enhances Large Language Model Capabilities</strong> (Wang et al., 2024)</p><p><a href="https://arxiv.org/abs/2406.04692">https://arxiv.org/abs/2406.04692</a></p><h3>Supply Chain Context</h3><p><strong>EIA: Strait of Hormuz &#8212; Strategic Importance and Disruption Risk</strong></p><p><a href="https://www.eia.gov/international/analysis/regions-of-interest/Hormuz">https://www.eia.gov/international/analysis/regions-of-interest/Hormuz</a></p><p><strong>Lloyd&#8217;s List: Red Sea Shipping Crisis &#8212; Impact on Global Supply Chains (2024)</strong></p><p>https://lloydslist.maritimeintelligence.informa.com</p><h3>Prior Work in This Series</h3><p><em><a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">The Architecture of Awareness: Design Considerations of a Shipper&#8217;s Agentic Logic </a>&#8212; Interesting Engineering++, April 2026</em></p><p><em><a href="https://interestingengineering.substack.com/p/the-loop-is-the-lab">The Loop Is the Lab </a>&#8212; Article Series</em></p><p><em><a href="https://interestingengineering.substack.com/p/the-speciation-of-intelligence">The Speciation of Intelligence </a>&#8212; Article Series</em></p><p><em><a href="https://interestingengineering.substack.com/p/the-working-layer">The Working Layer </a>&#8212; Article Series</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Harness Experiment ]]></title><description><![CDATA[What Ten AI Architectures Teach About Intelligence, Instructions, and Why More Is Often Less]]></description><link>https://interestingengineering.substack.com/p/the-harness-experiment</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/the-harness-experiment</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Mon, 11 May 2026 17:08:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nWSV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nWSV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nWSV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png 424w, https://substackcdn.com/image/fetch/$s_!nWSV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png 848w, https://substackcdn.com/image/fetch/$s_!nWSV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png 1272w, https://substackcdn.com/image/fetch/$s_!nWSV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nWSV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png" width="862" height="828" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:828,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:115239,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nWSV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png 424w, https://substackcdn.com/image/fetch/$s_!nWSV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png 848w, https://substackcdn.com/image/fetch/$s_!nWSV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png 1272w, https://substackcdn.com/image/fetch/$s_!nWSV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7374887c-97b7-43aa-a2bf-8f930042f07b_862x828.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>This is an account of a (fairly) controlled experiment &#8212; ten versions of the same task, ten different AI architectures, and several contradictions that get many working on agentic systems excited (but maybe shouldn&#8217;t). The most sophisticated system isn&#8217;t always the best. The simplest one won. What follows is more instructive than the results.</em></p><h2>The Question Not Many Are Asking</h2><p>Every week brings a new announcement about multi-agent AI systems, swarm intelligence, and the promise that coordinating multiple AI models will unlock capabilities beyond what any single model can achieve. The discourse has the texture of inevitability. More agents, more tools, more architectural complexity &#8212; this is the direction of travel, and questioning it feels like scepticism for its own sake.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>But what if the question of how many agents you need is less important than the question of how clearly you tell one agent what to do?</p><p>That is the question this experiment was designed to answer. Not whether multi-agent systems are theoretically superior. Whether they are practically superior &#8212; on real tasks, with real measurement, against a clear baseline. The answer turned out to be considerably more nuanced than suggested.</p><blockquote><p><em><strong>&#8220;<a href="https://openai.com/index/harness-engineering/">Humans steer. Agents Execute.</a>&#8221; &#8212; Ryan Lopopolo, Technical Team, OpenAI</strong></em></p></blockquote><p>That quote captures the premise of what has become a new engineering discipline: <strong>harness engineering. </strong>The harness is everything that surrounds an AI model &#8212; the <strong>instructions, the tools, the memory systems, the verification loops, the architecture that determines what the model knows, what it can do, and when it decides it is finished</strong>. For much of 2025/26, building a harness meant adding more of each of those things. </p><blockquote><p>The formula Hashimoto&#8217;s work gave us is simple: <strong><a href="https://www.softwareimprovementgroup.com/blog/what-is-harness-engineering/">Agent = Model + Harness</a></strong><a href="https://www.softwareimprovementgroup.com/blog/what-is-harness-engineering/">.</a></p></blockquote><p>This experiment set out to measure whether more was, in fact, better. It is obviously limited by it&#8217;s scope, and yet, many learning points were taken.</p><h2>Setting the Stage: What Was Built and Why</h2><p>The project, which I am calling <strong>Harness Lab</strong>, was built to run a single fixed task through ten progressively more sophisticated AI architectures. The task was deliberately mundane: evaluate three logistics vendors across cost, reliability, and integration criteria, produce a scored comparison table, and recommend one vendor with justification.</p><p>The mundanity was intentional. I did not want a task so complex that only sophisticated architectures could handle it. I wanted a task that any competent system should be able to complete &#8212; and then watch what happened when I added layers on top of a system that was already competent.</p><h3>The Task</h3><p>Three vendors: <strong>FreightNova, LogiPath, and ChainCore</strong>. Three scoring criteria with fixed weights: cost at 35 percent, reliability at 40 percent, integration at 25 percent. Required output: a scored table, a weighted total for each vendor, a 200-word written recommendation naming one vendor, and a confidence rating. All vendor data was provided &#8212; no external knowledge required.</p><p>The correct answer was not ambiguous. In this test example, ChainCore had the highest weighted score (8.24), driven by its reliability premium (97.5% on-time rate versus 91-94%), a 2-hour incident recovery SLA against 4-12 hours for competitors, and the best integration depth. The task was designed to have a defensible right answer so the scoring could be meaningful.</p><h3>The Infrastructure</h3><p>The project ran inside VS Code with Claude Code as the outer runtime &#8212; the AI agent that read the specification, built all the project files, and executed each experiment. Every experiment shared the same foundation: a single API client file, a fixed rubric, a gold-standard answer, and a self-healing error recovery system.</p><p>The key architectural principle was the <strong>separation of concerns</strong>. <em><strong>To change the model, you edit one line. To change the task, you edit one file. The experiment layer &#8212; where different harness architectures were tested &#8212; was entirely isolated from the infrastructure layer.</strong></em> This is precisely the structure that Addy Osmani describes as <em><strong>Harness-as-a-Service: a platform layer that you configure, rather than rebuild.</strong></em></p><pre><code><code>&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;                Claude Code (HaaS Runtime)                   &#9474;   &#8592; Outer Harness-as-a-Service
&#9474;  &#8226; ReAct / Agent Loop                                       &#9474;
&#9474;  &#8226; Tool Registry + Permission Gates                         &#9474;
&#9474;  &#8226; Context Assembly + Compaction                            &#9474;
&#9474;  &#8226; 3-Layer Memory System (in-context + MEMORY.md + files)   &#9474;
&#9474;  &#8226; Sub-agent / Swarm Orchestration                          &#9474;
&#9474;  &#8226; Safety, Hooks, Streaming, Sandboxes                      &#9474;
&#9474;  &#8226; Persistent Filesystem + Execution Environment            &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;   (Configure + Extend)
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;           Pre-Experiment Harness (Lab OS / Foundation)    &#8592; Standardized Configuration Surface
&#9474;  &#8226; shared/client.js (model routing)                         &#9474;
&#9474;  &#8226; shared/scorer.js + rubric system                         &#9474;
&#9474;  &#8226; shared/self_heal.js + logger                             &#9474;
&#9474;  &#8226; Root CLAUDE.md (constitution + lab-wide rules)           &#9474;
&#9474;  &#8226; Memory conventions + task templates                      &#9474;
&#9474;  &#8226; Observability &amp; results pipeline                         &#9474;
&#9474;  &#8226; Common tools &amp; utilities                                 &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;   (Swappable)
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;           Experiment Harness Layer (H1&#8211;H10)                 &#9474;   &#8592; Domain-Specific Scaffolding
&#9474;                                                             &#9474;
&#9474;  H1: Prompt + Constitution                                  &#9474;
&#9474;  H2: Reflection + Self-Critique Loop                        &#9474;
&#9474;  H3: Sequential Tool-Use                                    &#9474;
&#9474;  H4: Parallel Fan-Out + Merge                               &#9474;
&#9474;  H5: Eval + Revision Loop                                   &#9474;
&#9474;  H6: Skill / Memory Crystallization                         &#9474;
&#9474;  H7: Model Routing / Tiered                                 &#9474;
&#9474;  H8: HITL + Confidence Gating                               &#9474;
&#9474;  H9: Sub-Agent Swarm                                        &#9474;
&#9474;  H10: Meta-Router / Adaptive                                &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;                     Core Model                              &#9474;
&#9474;                gpt-oss-120b / Claude Sonnet etc.            &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;</code></code></pre><p>The folder structure looked like this:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NVuT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NVuT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png 424w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png 848w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png 1272w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NVuT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png" width="1062" height="547" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:547,&quot;width&quot;:1062,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:76150,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NVuT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png 424w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png 848w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png 1272w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9121040b-32ca-4627-86bf-c94c44c35d42_1062x547.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Three metrics were tracked for each experiment. <em><strong>Alpha (&#945;): quality score from 0 to 1, assessed by a separate AI judge against a detailed rubric. Lambda (&#955;): latency in milliseconds. Kappa (&#954;): total tokens consumed</strong></em>. The rubric measured <em><strong>reasoning quality</strong></em>, not structural completeness &#8212; a deliberate design choice that turned out to be critical, as we will see.</p><h2>The First Mistake: When the Rubric Lies</h2><p>The first run of the experiment produced an immediate problem, and the problem was not with the AI.</p><p>H1 &#8212; the bare model control, no system prompt, no tools, no structure &#8212; scored alpha of 1.0. Perfect. A system with no harness whatsoever had produced a flawless result on the first attempt.</p><p>This was not a finding. It was a measurement failure.</p><blockquote><p><strong>&#8220;If your evaluation gives everything a perfect score, you have not built a rubric. You have built a rubber stamp.&#8221;</strong></p></blockquote><h2>The Scoring Problem (My Biggest Early Mistake)</h2><p>Before running any experiment, I had instructed Claude Code (based on objectives) to build a judge &#8212; a second AI call that scores each output 0&#8211;1.</p><p>The original rubric had <strong>six yes/no checkboxes</strong>: Does it have a table? Does it name a vendor? Does it give a confidence rating?</p><p>H1 (the bare model with zero guidance) scored <strong>1.0. Perfect. </strong></p><p>That&#8217;s like grading an essay purely on whether it has paragraphs and a title. A student who writes beautifully structured nonsense aces it. The rubric was measuring <em>presence of structure</em>, not <em>quality of reasoning</em>.</p><p><strong>That was replaced</strong> with six gradient criteria &#8212; each scored 0.0 to 1.0 &#8212; focused on things that actually matter:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bf4N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bf4N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png 424w, https://substackcdn.com/image/fetch/$s_!bf4N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png 848w, https://substackcdn.com/image/fetch/$s_!bf4N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png 1272w, https://substackcdn.com/image/fetch/$s_!bf4N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bf4N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png" width="566" height="378" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:378,&quot;width&quot;:566,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15962,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bf4N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png 424w, https://substackcdn.com/image/fetch/$s_!bf4N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png 848w, https://substackcdn.com/image/fetch/$s_!bf4N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png 1272w, https://substackcdn.com/image/fetch/$s_!bf4N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7486fd-e4db-4cea-a924-9e533516ff4a_566x378.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>H1 re-scored to <strong>0.665</strong> &#8212; a realistic baseline with real headroom above it.</p><h3>Bugs I Hit Along the Way</h3><p>Two experiments scored <strong>0.000</strong> for reasons unrelated to the model&#8217;s quality:</p><p><strong>H6 (Skill Memory)</strong> &#8212; This experiment deliberately uses <em>different</em> vendors (SwiftShip, NexaLog, BridgeCargo) to test whether the skill transfers. The scorer was still checking against the original vendor list and flagged every number as invented. Fix: let the scorer accept the task&#8217;s own vendor data.</p><p><strong>H8 (HITL)</strong> &#8212; The correction step told the model: <em>&#8220;Return the corrected complete JSON output.&#8221;</em> The model returned a tidy list of corrections &#8212; not the corrected vendor brief. Fix: explicitly say <em>&#8220;return the full brief in the same format as the input, with fixes applied.&#8221;</em></p><h2>Rebuilding the Rubric</h2><p>The rubric was rebuilt from scratch before running any further experiments. The new version replaced binary structural checks with gradient quality criteria:</p><p>&#8211; <strong>Math accuracy</strong>: Are the weighted totals arithmetically correct? A score of 1.0 required all three vendors correctly calculated to two decimal places. Wrong formula, partial credit. No calculation, near zero.</p><p>&#8211; <strong>Score calibration:</strong> Are the raw 0-10 criterion scores defensible from the data, with consistent logic across vendors? Arbitrary or uniform scores dropped the result significantly.</p><p>&#8211; <strong>Recommendation quality</strong>: Does the justification cite specific numbers from the vendor data, not vague assertions? A recommendation that says ChainCore is reliable scores lower than one that cites a 97.5% on-time rate and a 2-hour SLA.</p><p>&#8211; <strong>Tradeoff recognition</strong>: Does the output acknowledge that the cheapest vendor is not the most reliable, or that the best-integrated vendor costs more? This is the reasoning quality marker.</p><p>&#8211; <strong>Data fidelity</strong>: Are all cited numbers exactly from the source data, with no invented statistics?</p><p>&#8211; <strong>Confidence calibration</strong>: Is the confidence rating calibrated to the actual certainty implied by the data?</p><p>The scorer prompt was also revised with an explicit anti-inflation instruction: a bare model output with no system prompt should score roughly 0.5 to 0.65 on this rubric. If the scorer was about to return above 0.80 for a plain unstructured output, it was instructed to reconsider its scores.</p><p>This is a lesson that applies well beyond this experiment. </p><blockquote><p><strong>Evaluation design is inseparable from experimental design. The harness you build around the model matters less than the harness you build around the measurement. A rubric that cannot detect quality differences between good and bad outputs will make every architecture look equally capable &#8212; which is precisely the kind of false equivalence that leads to bad decisions about AI system design.</strong></p></blockquote><h2>The Experiments: What Actually Happened</h2><p>With a working rubric in place, the ten experiments ran in sequence. What emerged was a story with a clear shape: quality peaked early, then complexity took over without improving results.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!t-Gf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!t-Gf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png 424w, https://substackcdn.com/image/fetch/$s_!t-Gf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png 848w, https://substackcdn.com/image/fetch/$s_!t-Gf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png 1272w, https://substackcdn.com/image/fetch/$s_!t-Gf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!t-Gf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png" width="1193" height="715" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:715,&quot;width&quot;:1193,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74337,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!t-Gf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png 424w, https://substackcdn.com/image/fetch/$s_!t-Gf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png 848w, https://substackcdn.com/image/fetch/$s_!t-Gf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png 1272w, https://substackcdn.com/image/fetch/$s_!t-Gf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff390bc3-0750-477f-9efe-5a77456c330b_1193x715.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: center;"><em>Table 1. Full results across H1&#8211;H10. &#945; = quality (0&#8211;1.0). &#955; = latency (ms). &#954; = total tokens. H2 (highlighted) dominates on all three metrics. &#8595; indicates below baseline.</em></p><h3>H1: The Control (0.665)</h3><p>The bare model without any harness structure scored 0.665. It produced a reasonable vendor evaluation &#8212; a table existed, a recommendation was made, ChainCore was correctly identified as the best choice. But the justification was thin, the weighted math had a minor error, and the confidence rating was present but miscalibrated. This is the floor. Everything above it represents harness contribution.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NoJO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NoJO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png 424w, https://substackcdn.com/image/fetch/$s_!NoJO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png 848w, https://substackcdn.com/image/fetch/$s_!NoJO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png 1272w, https://substackcdn.com/image/fetch/$s_!NoJO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NoJO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png" width="692" height="677" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:677,&quot;width&quot;:692,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:55649,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NoJO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png 424w, https://substackcdn.com/image/fetch/$s_!NoJO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png 848w, https://substackcdn.com/image/fetch/$s_!NoJO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png 1272w, https://substackcdn.com/image/fetch/$s_!NoJO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce87c-d1f1-4664-9586-89a7af9aecf6_692x677.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>H2: The Prompt Harness (0.920) &#8212; The Peak</h3><p>Adding a structured system prompt that specified the analyst role, the exact scoring weights, the required JSON output format, and an explicit anti-hallucination rule produced the best result in the entire experiment. Not second best. Best. By a substantial margin.</p><p>H2 was also the fastest experiment (36 seconds versus H1&#8217;s 46 seconds) and consumed the fewest tokens (1,556). In every measurable dimension &#8212; quality, speed, and cost &#8212; the structured prompt won.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!snuM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!snuM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png 424w, https://substackcdn.com/image/fetch/$s_!snuM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png 848w, https://substackcdn.com/image/fetch/$s_!snuM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png 1272w, https://substackcdn.com/image/fetch/$s_!snuM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!snuM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png" width="587" height="713" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:713,&quot;width&quot;:587,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:68171,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!snuM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png 424w, https://substackcdn.com/image/fetch/$s_!snuM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png 848w, https://substackcdn.com/image/fetch/$s_!snuM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png 1272w, https://substackcdn.com/image/fetch/$s_!snuM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc313f1c-c184-49bf-b7bf-a29a50654d4d_587x713.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is the result that demands explanation, because it goes against the grain of almost everything being built in the AI industry right now. A text file with clear instructions outperformed multi-agent swarms, automated eval loops, model-tier routing systems, and confidence-gated correction mechanisms. The system prompt is 200 words. The sub-agent swarm (H9) consumed nearly three times as many tokens and scored 0.105 lower.</p><blockquote><p><em><strong>&#8220;A well-crafted instruction is not a primitive technique waiting to be superseded. It is the thing that every sophisticated architecture is trying to approximate.&#8221;</strong></em></p></blockquote><h3>H3: Sequential Tools (0.600) &#8212; Below Baseline</h3><p>Adding tool use &#8212; a search function the model could call to retrieve vendor data &#8212; dropped quality below the H1 baseline. The tool loop fragmented the task. Each tool call returned a narrow slice of vendor data, and the model synthesized from three partial views rather than one holistic context. The mechanics of using the tool consumed cognitive resources that would have been better spent on reasoning.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-DFy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-DFy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png 424w, https://substackcdn.com/image/fetch/$s_!-DFy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png 848w, https://substackcdn.com/image/fetch/$s_!-DFy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png 1272w, https://substackcdn.com/image/fetch/$s_!-DFy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-DFy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png" width="822" height="605" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:605,&quot;width&quot;:822,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:81502,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-DFy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png 424w, https://substackcdn.com/image/fetch/$s_!-DFy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png 848w, https://substackcdn.com/image/fetch/$s_!-DFy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png 1272w, https://substackcdn.com/image/fetch/$s_!-DFy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10cef5e-37cd-464b-8bcc-ae041ad594aa_822x605.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Latency jumped by 45 percent compared to H2. Tokens dropped slightly (the tool calls were efficient) but quality fell. This is a common pattern: tools add value when the data they retrieve would otherwise be unavailable, not when the data is already present in the context.</p><h3>H4: Parallel Fan-Out (0.440) &#8212; The Coherence Collapse</h3><p>This is where the experiment became genuinely instructive. H4 dispatched three sub-agents concurrently using Promise.all &#8212; one per vendor &#8212; then merged their individual scores into a final recommendation. The theory was that parallelism would reduce latency while maintaining quality.</p><p>Quality dropped to 0.440. This is 33 percent below the H1 baseline. Three agents working simultaneously produced worse output than one agent working alone with no special instructions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cBo2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cBo2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png 424w, https://substackcdn.com/image/fetch/$s_!cBo2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png 848w, https://substackcdn.com/image/fetch/$s_!cBo2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png 1272w, https://substackcdn.com/image/fetch/$s_!cBo2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cBo2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png" width="686" height="706" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:706,&quot;width&quot;:686,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:69030,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cBo2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png 424w, https://substackcdn.com/image/fetch/$s_!cBo2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png 848w, https://substackcdn.com/image/fetch/$s_!cBo2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png 1272w, https://substackcdn.com/image/fetch/$s_!cBo2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6c4afec-032a-4c0d-b3b4-f62feb465f25_686x706.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The diagnosis is a coherence problem. Each sub-agent independently calibrated its 0-10 scoring scale. Sub-agent A might rate LogiPath&#8217;s cost as 8.5 because it is the cheapest. Sub-agent B might rate ChainCore&#8217;s reliability as 9.5 because it has the best on-time rate. But 8.5 from sub-agent A and 9.5 from sub-agent B are not on the same scale. The merge agent received three scored JSON blobs with no shared methodology and was asked to synthesize them into a coherent recommendation. The rubric&#8217;s score calibration criterion &#8212; which checks whether scores are defensible and consistent across vendors &#8212; penalized this heavily.</p><p>This failure mode is well-documented in multi-agent research and is precisely what sophisticated systems like ForgeCode address through specialised roles and verification gates. A Sage-style reviewer checking sub-agent outputs for consistency before merge would catch this. The naive parallel fan-out does not.</p><h3>H5 and H6: The Eval Loop and Skill Memory (0.840 / 0.900)</h3><p>H5 ran a generate-score-revise loop for three generations, automatically rewriting its system prompt each time a criterion scored below 0.70. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lkw-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lkw-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png 424w, https://substackcdn.com/image/fetch/$s_!lkw-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png 848w, https://substackcdn.com/image/fetch/$s_!lkw-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png 1272w, https://substackcdn.com/image/fetch/$s_!lkw-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lkw-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png" width="467" height="696" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:696,&quot;width&quot;:467,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:55457,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lkw-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png 424w, https://substackcdn.com/image/fetch/$s_!lkw-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png 848w, https://substackcdn.com/image/fetch/$s_!lkw-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png 1272w, https://substackcdn.com/image/fetch/$s_!lkw-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06369e56-f3bb-40ce-b2c9-e2bd0c0ee27c_467x696.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>H6 took the best prompt from H5, crystallised it into a skill memory file, and applied it to a completely different set of vendors.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P8NV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95ed997-db0d-4149-9d70-13695f512f7d_571x733.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P8NV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95ed997-db0d-4149-9d70-13695f512f7d_571x733.png 424w, https://substackcdn.com/image/fetch/$s_!P8NV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95ed997-db0d-4149-9d70-13695f512f7d_571x733.png 848w, https://substackcdn.com/image/fetch/$s_!P8NV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95ed997-db0d-4149-9d70-13695f512f7d_571x733.png 1272w, https://substackcdn.com/image/fetch/$s_!P8NV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95ed997-db0d-4149-9d70-13695f512f7d_571x733.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P8NV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95ed997-db0d-4149-9d70-13695f512f7d_571x733.png" width="571" height="733" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f95ed997-db0d-4149-9d70-13695f512f7d_571x733.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:733,&quot;width&quot;:571,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:66184,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95ed997-db0d-4149-9d70-13695f512f7d_571x733.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P8NV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95ed997-db0d-4149-9d70-13695f512f7d_571x733.png 424w, https://substackcdn.com/image/fetch/$s_!P8NV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95ed997-db0d-4149-9d70-13695f512f7d_571x733.png 848w, https://substackcdn.com/image/fetch/$s_!P8NV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95ed997-db0d-4149-9d70-13695f512f7d_571x733.png 1272w, https://substackcdn.com/image/fetch/$s_!P8NV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95ed997-db0d-4149-9d70-13695f512f7d_571x733.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Both scored well. But notice what H5 was actually doing: iterating toward a good system prompt. The <strong>eval loop is an automated prompt optimisation process</strong>. What it converged on, after three generations of self-revision, was essentially a well-crafted instruction &#8212; which is what H2 had from the start.</p><p>H6 produced the genuinely useful finding: the crystallised skill transferred. Applied to new vendors (SwiftShip, NexaLog, BridgeCargo), it maintained 0.900 quality with no additional effort. This is skill memory working as intended &#8212; not as a substitute for good initial design, but as a mechanism for preserving and reusing good design once it exists.</p><p>This maps directly to what Addy Osmani describes in his agent skills framework: <strong>skills are the reusable workflow chunks that get progressively disclosed into the system prompt</strong>. The skill crystallised from H5 is exactly this &#8212; a compact, tested behavioural specification that can be loaded on demand.</p><h3>H7 Through H9: The Complexity Plateau</h3><p>Model routing (H7), HITL confidence gating (H8), and sub-agent specialisation (H9) all clustered around 0.815 to 0.840. They added significant latency and token cost without improving quality beyond H5.</p><p>H7 is particularly instructive.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kNaV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d75d642-2424-4c85-b013-48668d204267_762x738.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kNaV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d75d642-2424-4c85-b013-48668d204267_762x738.png 424w, https://substackcdn.com/image/fetch/$s_!kNaV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d75d642-2424-4c85-b013-48668d204267_762x738.png 848w, https://substackcdn.com/image/fetch/$s_!kNaV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d75d642-2424-4c85-b013-48668d204267_762x738.png 1272w, https://substackcdn.com/image/fetch/$s_!kNaV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d75d642-2424-4c85-b013-48668d204267_762x738.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kNaV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d75d642-2424-4c85-b013-48668d204267_762x738.png" width="762" height="738" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0d75d642-2424-4c85-b013-48668d204267_762x738.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:738,&quot;width&quot;:762,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:86330,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d75d642-2424-4c85-b013-48668d204267_762x738.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kNaV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d75d642-2424-4c85-b013-48668d204267_762x738.png 424w, https://substackcdn.com/image/fetch/$s_!kNaV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d75d642-2424-4c85-b013-48668d204267_762x738.png 848w, https://substackcdn.com/image/fetch/$s_!kNaV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d75d642-2424-4c85-b013-48668d204267_762x738.png 1272w, https://substackcdn.com/image/fetch/$s_!kNaV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d75d642-2424-4c85-b013-48668d204267_762x738.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p> The theory was that routing cheap extraction tasks to a lighter model would reduce token costs while maintaining quality. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Dv_S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Dv_S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png 424w, https://substackcdn.com/image/fetch/$s_!Dv_S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png 848w, https://substackcdn.com/image/fetch/$s_!Dv_S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png 1272w, https://substackcdn.com/image/fetch/$s_!Dv_S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Dv_S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png" width="660" height="741" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:741,&quot;width&quot;:660,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:63616,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Dv_S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png 424w, https://substackcdn.com/image/fetch/$s_!Dv_S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png 848w, https://substackcdn.com/image/fetch/$s_!Dv_S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png 1272w, https://substackcdn.com/image/fetch/$s_!Dv_S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b44cc9f-b156-44bf-9828-0435c168ec03_660x741.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In practice, the <strong>routing overhead &#8212; orchestration calls, context passed between model tiers &#8212; outweighed the savings</strong>. Total tokens (4,384) nearly tripled compared to H2 (1,556) for a lower quality score. <strong>Model routing only wins on cost when cheap tasks are large in volume, not merely low in complexity</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FCl1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FCl1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png 424w, https://substackcdn.com/image/fetch/$s_!FCl1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png 848w, https://substackcdn.com/image/fetch/$s_!FCl1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png 1272w, https://substackcdn.com/image/fetch/$s_!FCl1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FCl1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png" width="892" height="637" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:637,&quot;width&quot;:892,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:106228,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FCl1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png 424w, https://substackcdn.com/image/fetch/$s_!FCl1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png 848w, https://substackcdn.com/image/fetch/$s_!FCl1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png 1272w, https://substackcdn.com/image/fetch/$s_!FCl1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ff7cef-d076-4ea3-963b-6d739ad6edce_892x637.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>H10: The Meta-Harness Failure (0.230)</h3><p>H10 ran 20 mixed tasks through a type-aware routing system (EVALUATE, EXTRACT, RESEARCH, GENERATE, COMPRESS), then ran the same 20 tasks through a single generalist configuration for comparison. The meta-harness scored 0.230. The generalist scored 0.473. The sophisticated routing system was outperformed by a single config by 0.243 points.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gmq8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gmq8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png 424w, https://substackcdn.com/image/fetch/$s_!gmq8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png 848w, https://substackcdn.com/image/fetch/$s_!gmq8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png 1272w, https://substackcdn.com/image/fetch/$s_!gmq8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gmq8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png" width="1012" height="732" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:732,&quot;width&quot;:1012,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:124499,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gmq8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png 424w, https://substackcdn.com/image/fetch/$s_!gmq8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png 848w, https://substackcdn.com/image/fetch/$s_!gmq8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png 1272w, https://substackcdn.com/image/fetch/$s_!gmq8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12db551-3ff3-4d88-8c83-19285cff5d58_1012x732.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The immediate cause was a mismatch between the routing table&#8217;s max_tokens settings and the rubric&#8217;s expectations. EXTRACT and COMPRESS tasks were given 400-600 tokens, which produced thin outputs that the rubric &#8212; calibrated for detailed vendor analysis &#8212; scored at zero. But the root cause was more fundamental: <strong>the routing table was designed for one task type and applied to five. The meta-harness amplified the mismatch rather than correcting for it.</strong></p><p>This is the H10 lesson that applies most broadly:<strong> adaptive routing without per-type calibration produces worse results than a well-tuned generalist. The architecture was correct in principle. The implementation lacked the rubric alignment that would make it function.</strong></p><h2><strong>What to Take Forward</strong></h2><p><strong>For harness design generally:</strong> The data supports a simple hierarchy. Get behavioral specification right first &#8212; that is H2. Only add mechanism when you have a specific failure mode that mechanism addresses. Tools for external state. Parallelism for genuine independence. Evaluation loops for tasks where the acceptance criterion is not fully specifiable upfront. Sub-agents for tasks that require domain specialization the orchestrator cannot hold in one context.</p><p><strong>For the H10 failure specifically:</strong> If you want to rerun it properly, the fix is not a better routing table &#8212; it is separate rubrics per task type. EXTRACT tasks should score on precision and completeness of extracted fields, not on recommendation quality. COMPRESS tasks should score on information density and accuracy of reduction. Running the existing rubric against EXTRACT/COMPRESS output is category error scoring.</p><p><strong>For research purposes:</strong> This experiment produced a clean, citable empirical result &#8212; that prompt engineering at H2 quality outperforms multi-agent architectures on single-context reasoning tasks, and that complexity additions carry measurable quality penalties rather than gains until the task architecture genuinely requires them. That finding directly supports my reframe analysis: <strong>capability and deployment strategy are different optimization targets. Here, reasoning capability and agentic architecture are different optimization targets, and conflating them produces worse outcomes than addressing each on its own terms.</strong></p><p>The <a href="https://interestingengineering.substack.com/p/the-architecture-of-awareness-design">ASCRS case study would show the inverse pattern</a> &#8212; where <strong>the task genuinely requires parallelism (multiple concurrent supply chain checks), external tool state (live logistics APIs), and iterative evaluation (threshold-based rerouting decisions), and where H4/H5/H9 architectures would outperform H2 because the task structure matches the harness structure</strong>. That contrast &#8212; same harness stack, different task profile, inverted performance ranking &#8212; is the analytically interesting piece.</p><h2>The Architecture: How Claude Code Fits In</h2><p>To understand what these findings mean in context, it helps to understand the stack these experiments ran on &#8212; and what I believe Addy Osmani means when he calls Claude Code a HaaS runtime.</p><pre><code><code>&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;                Claude Code (HaaS Runtime)                   &#9474;   &#8592; Outer Harness-as-a-Service
&#9474;  &#8226; ReAct / Agent Loop                                       &#9474;
&#9474;  &#8226; Tool Registry + Permission Gates                         &#9474;
&#9474;  &#8226; Context Assembly + Compaction                            &#9474;
&#9474;  &#8226; 3-Layer Memory System (in-context + MEMORY.md + files)   &#9474;
&#9474;  &#8226; Sub-agent / Swarm Orchestration                          &#9474;
&#9474;  &#8226; Safety, Hooks, Streaming, Sandboxes                      &#9474;
&#9474;  &#8226; Persistent Filesystem + Execution Environment            &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;   (Configure + Extend)
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;           Pre-Experiment Harness (Lab OS / Foundation)      &#9474;   &#8592; Standardized Configuration Surface
&#9474;  &#8226; shared/client.js (model routing)                         &#9474;
&#9474;  &#8226; shared/scorer.js + rubric system                         &#9474;
&#9474;  &#8226; shared/self_heal.js + logger                             &#9474;
&#9474;  &#8226; Root CLAUDE.md (constitution + lab-wide rules)           &#9474;
&#9474;  &#8226; Memory conventions + task templates                      &#9474;
&#9474;  &#8226; Observability &amp; results pipeline                         &#9474;
&#9474;  &#8226; Common tools &amp; utilities                                 &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;   (Swappable)
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;           Experiment Harness Layer (H1&#8211;H10)                 &#9474;   &#8592; Domain-Specific Scaffolding
&#9474;                                                             &#9474;
&#9474;  H1: Prompt + Constitution                                  &#9474;
&#9474;  H2: Reflection + Self-Critique Loop                        &#9474;
&#9474;  H3: Sequential Tool-Use                                    &#9474;
&#9474;  H4: Parallel Fan-Out + Merge                               &#9474;
&#9474;  H5: Eval + Revision Loop                                   &#9474;
&#9474;  H6: Skill / Memory Crystallization                         &#9474;
&#9474;  H7: Model Routing / Tiered                                 &#9474;
&#9474;  H8: HITL + Confidence Gating                               &#9474;
&#9474;  H9: Sub-Agent Swarm                                        &#9474;
&#9474;  H10: Meta-Router / Adaptive                                &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;                     Core Model                              &#9474;
&#9474;                gpt-oss-120b / Claude Sonnet etc.            &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;</code></code></pre><p><strong>Harness-as-a-Service</strong> is a framing that distinguishes between the platform you use and the configuration you build on top of it. Claude Code is the platform. It provides the execution loop, the tool registry, the memory system, the safety gates, and the context management machinery. You do not build those things. You configure them &#8212; through CLAUDE.md files, through tool definitions, through the structure of your prompts and the organisation of your files.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!J-4p!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!J-4p!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png 424w, https://substackcdn.com/image/fetch/$s_!J-4p!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png 848w, https://substackcdn.com/image/fetch/$s_!J-4p!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png 1272w, https://substackcdn.com/image/fetch/$s_!J-4p!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!J-4p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png" width="1170" height="361" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:361,&quot;width&quot;:1170,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59935,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!J-4p!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png 424w, https://substackcdn.com/image/fetch/$s_!J-4p!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png 848w, https://substackcdn.com/image/fetch/$s_!J-4p!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png 1272w, https://substackcdn.com/image/fetch/$s_!J-4p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca169c81-b99b-4c5d-a879-0d0429ebe367_1170x361.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: center;"><em>Table 2. The four-layer HaaS architecture. Each layer has a single, non-overlapping responsibility.</em></p><p>The <strong>March 2026 Claude Code source leak</strong> &#8212; when a debugging artifact was accidentally included in an npm release, exposing the internal architecture &#8212; confirmed what practitioners had suspected: <strong>Claude Code is a sophisticated single-primary-loop agent with approximately 40-54 permission-gated tools, a three-layer memory system (in-context state, a MEMORY.md index file, and a background daemon), and a QueryEngine for large-context retrieval. It is not a simple chatbot with file access. It is a production-grade agentic runtime.</strong></p><p>Boris Cherny, the engineer who created Claude Code, describes his vision in terms that connect to this experiment&#8217;s findings. He imagines a world where anyone can build software &#8212; where the mechanical translation between intent and implementation can be delegated to an AI agent, much as the printing press delegated the mechanical reproduction of text. <strong>The constraint is not model capability. It is the clarity of the specification.</strong></p><blockquote><p><em><strong>&#8220;I imagine a world where everyone is able to program. Anyone can just build software anytime.&#8221; &#8212; Boris Cherny, Anthropic</strong></em></p></blockquote><p>This is precisely what H2 demonstrated at the experiment scale. The bare model (H1) had all the capability needed to complete the task correctly. What it lacked was clear specification. Forty additional tokens of role, weights, and output schema lifted quality from 0.665 to 0.920. The bottleneck was never intelligence. It was instruction.</p><h2>Thoughts on Claude Code, ForgeCode and the Alternative Agent Harness Architecture</h2><p>I also started looking into Terminal Bench 2.0 and note how different Agent Harnesses affect model outcome:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LM6m!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LM6m!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png 424w, https://substackcdn.com/image/fetch/$s_!LM6m!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png 848w, https://substackcdn.com/image/fetch/$s_!LM6m!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png 1272w, https://substackcdn.com/image/fetch/$s_!LM6m!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LM6m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png" width="943" height="713" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:713,&quot;width&quot;:943,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:166814,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LM6m!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png 424w, https://substackcdn.com/image/fetch/$s_!LM6m!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png 848w, https://substackcdn.com/image/fetch/$s_!LM6m!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png 1272w, https://substackcdn.com/image/fetch/$s_!LM6m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8637c4-70ee-48bd-af16-b574419b1d98_943x713.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WpUy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WpUy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png 424w, https://substackcdn.com/image/fetch/$s_!WpUy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png 848w, https://substackcdn.com/image/fetch/$s_!WpUy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png 1272w, https://substackcdn.com/image/fetch/$s_!WpUy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WpUy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png" width="917" height="663" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:663,&quot;width&quot;:917,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:162344,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WpUy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png 424w, https://substackcdn.com/image/fetch/$s_!WpUy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png 848w, https://substackcdn.com/image/fetch/$s_!WpUy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png 1272w, https://substackcdn.com/image/fetch/$s_!WpUy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7c5ef7-5282-41d0-a6b4-05e5184c3e99_917x663.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Separately On X:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kyOv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kyOv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png 424w, https://substackcdn.com/image/fetch/$s_!kyOv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png 848w, https://substackcdn.com/image/fetch/$s_!kyOv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png 1272w, https://substackcdn.com/image/fetch/$s_!kyOv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kyOv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png" width="832" height="837" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:837,&quot;width&quot;:832,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:231469,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kyOv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png 424w, https://substackcdn.com/image/fetch/$s_!kyOv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png 848w, https://substackcdn.com/image/fetch/$s_!kyOv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png 1272w, https://substackcdn.com/image/fetch/$s_!kyOv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12140603-967f-4edd-8124-ec9b4d03ad23_832x837.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A useful contrast between Claude Code&#8217;s architecture &#8212; I describe as a Solo Genius, reactive and deep &#8212; and ForgeCode&#8217;s Multi-Agent Factory architecture, which uses specialised roles (Muse for planning, Forge for execution, Sage for review) with proactive state mapping, bounded context, and T3 verification gates.</p><p>This contrast maps precisely onto the experiment&#8217;s H4 failure. ForgeCode&#8217;s Sage role is a designated reviewer that checks sub-agent outputs for consistency before they are merged. A naive parallel fan-out, which is what H4 implemented, has no equivalent. The coherence collapse in H4 would not have occurred &#8212; or would have been caught before scoring &#8212; in a properly implemented specialised-role architecture.</p><p>The Terminal-Bench 2.0 data makes this concrete. Claude Opus 4.6 running in Claude Code scores 58.0%. The same model running in ForgeCode scores 79.8% &#8212; a 21.8-point improvement from harness architecture alone. One team moved a coding agent from the top forty to the top five by changing only the harness. The model was identical. The instructions around it were not.</p><p>The key insight from Osmani&#8217;s harness engineering framework: every component in a good harness should be traceable to a specific failure you observed. You do not add a verification gate because verification gates are fashionable. You add one because you watched a model declare a task complete when it was not. ForgeCode&#8217;s Sage exists because parallel coherence failures are a documented failure mode in multi-agent systems. H4 demonstrated exactly that failure mode without a Sage to catch it.</p><p>I could hypothetically update and refine the full stack, incorporating the key differences between Claude Code (Flat/Reactive Loop) and ForgeCode (Governed/Multi-Agent Factory) while staying true to Addy Osmani&#8217;s Harness-as-a-Service philosophy.</p><p><strong>Refined High-Level Stack (HaaS-Aligned)</strong></p><pre><code><code>&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;          Outer HaaS Runtime (Choose One)                    &#9474;   &#8592; Platform Layer
&#9474;                                                             &#9474;
&#9474;   &#8226; Claude Code: Flat ReAct Loop + Memory.md                &#9474;
&#9474;     &#8594; "Solo Genius" &#8212; Fast, reactive, high reasoning        &#9474;
&#9474;     &#8594; Strong on deep context but higher invocation errors   &#9474;
&#9474;                              OR                             &#9474;
&#9474;   &#8226; ForgeCode: Multi-Agent Swarm (Manager + Librarian +     &#9474;
&#9474;     Worker + Critic / Muse + Forge + Sage)                  &#9474;
&#9474;     &#8594; "Managed Factory" &#8212; Proactive mapping, T3 verification&#9474;
&#9474;     &#8594; Higher reliability &amp; lower execution errors           &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;   (Configure &amp; Extend)
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;        Pre-Experiment Harness (Lab OS / Foundation)       &#8592; Standardized Configuration Surface
&#9474;  &#8226; shared/client.js (model + provider routing)              &#9474;
&#9474;  &#8226; Root CLAUDE.md + AGENTS.md style constitution            &#9474;
&#9474;  &#8226; Scorer, Self-Heal, Logger, Memory conventions            &#9474;
&#9474;  &#8226; Task templates + multi-task support                      &#9474;
&#9474;  &#8226; State Manifest / Pre-mapping utilities                 &#8592; Especially powerful with ForgeCode
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;   (Swappable Scaffolding)
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;           Experiment Harness Layer (H1&#8211;H10)                &#8592; Testable Domain-Specific Patterns
&#9474;                                                             &#9474;
&#9474;  H1: Prompt + Constitution          H2: Reflection Loop     &#9474;
&#9474;  H3: Sequential Tools               H4: Parallel + Merge    &#9474;
&#9474;  H5: Eval + Revision                H6: Skill Crystallization&#9474;
&#9474;  H7: Model Routing                  H8: HITL Gating         &#9474;
&#9474;  H9: Sub-Agent Swarm                H10: Meta-Router        &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                               &#9474;
&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;                     Core Model                              &#9474;
&#9474;          gpt-oss-120b, Claude Sonnet/Opus, etc.             &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;</code></code></pre><p><strong>Key Insights from the Redesign</strong></p><ul><li><p><strong>Outer Layer is now explicitly swappable</strong> &#8212; You can run the exact same middle layers (Pre-Experiment + Experiments) on top of either Claude Code or ForgeCode. This makes your lab a true harness comparison platform.</p></li><li><p><strong>ForgeCode&#8217;s strengths (Librarian-style pre-mapping, multi-agent specialization, strong verification gates) are reflected by enhancing the Pre-Experiment Harness with better state manifests and tool registries</strong>. This turns the &#8220;Dark Room&#8221; problem into a &#8220;Mapped Lab.&#8221;</p></li><li><p><strong>Claude Code&#8217;s strengths (strong single-agent reasoning) pair beautifully with simpler Experiment Harnesses like H1 and H2.</strong></p></li><li><p><strong>The Experiment Layer remains the fungible testing ground</strong>. You can now compare how the same H9 (Sub-Agent Swarm) performs differently when the outer runtime is Claude Code vs ForgeCode.</p></li><li><p><strong>Pre-Experiment Harness gains a State Manifest component </strong>&#8212; this is inspired by ForgeCode&#8217;s proactive environment mapping and makes the whole stack more robust regardless of which outer harness you use.</p></li></ul><h2>What This Means: The Deeper Pattern</h2><p>The lift table tells a story. But the story worth telling is not which harness scored highest &#8212; it is what the pattern of results reveals about where AI systems actually fail, and what actually fixes them.</p><h3>The Specification Problem Is the Problem</h3><p>H1 to H2 is the biggest single quality jump in the entire experiment: 0.665 to 0.920, a gain of 0.255 points. Every subsequent experiment, including ones that added substantially more architectural complexity, failed to produce a comparable improvement. The prompt harness is the most efficient intervention in the stack.</p><p>This is not a finding unique to this experiment. Osmani documents it in his LLM coding workflow: </p><blockquote><p><strong>The biggest lever in AI-assisted work is not the model choice or the architecture choice &#8212; it is the quality of the specification. Vague instructions produce vague results. Precise instructions produce precise results. This is obvious in principle and systematically ignored in practice, because adding architectural complexity feels like doing something while improving a specification feels like writing documentation.</strong></p></blockquote><h3>Complexity Has a Cost That Is Rarely Measured</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y8vr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y8vr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png 424w, https://substackcdn.com/image/fetch/$s_!Y8vr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png 848w, https://substackcdn.com/image/fetch/$s_!Y8vr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png 1272w, https://substackcdn.com/image/fetch/$s_!Y8vr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y8vr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png" width="1258" height="498" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:498,&quot;width&quot;:1258,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:44063,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y8vr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png 424w, https://substackcdn.com/image/fetch/$s_!Y8vr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png 848w, https://substackcdn.com/image/fetch/$s_!Y8vr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png 1272w, https://substackcdn.com/image/fetch/$s_!Y8vr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d055e48-8e40-44ec-b93a-4b88a473317a_1258x498.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The results table shows total token counts alongside quality scores. This is not just a cost metric. Tokens are a proxy for how much cognitive overhead the architecture imposes. H9 (sub-agent swarm) consumed 3,831 tokens to score 0.815. H2 consumed 1,556 tokens to score 0.920. The swarm was 2.5 times more expensive and produced worse results.</p><p>What those additional tokens represent is coordination overhead &#8212; the messages between orchestrator and sub-agents, the context that must be rebuilt at each handoff, the tool call scaffolding that wraps each action. This is the cost of complexity, and it is rarely included in the analysis when multi-agent architectures are proposed.</p><p>Osmani&#8217;s framing of context rot is relevant here: model performance degrades as the context window fills, well before the hard limit. A long, complex multi-agent session is not just more expensive than a short, focused single-agent session &#8212; it may be actively worse, because the model at turn 40 is operating with degraded coherence compared to the model at turn 5.</p><h3>The Task Has to Match the Architecture</h3><p>This is the qualification that prevents the experiment from being misread as an argument against multi-agent systems generally. The vendor evaluation task is fundamentally <strong>a single-context reasoning problem. All the information needed to solve it is available at the start. There are no external systems to query, no genuinely parallel workstreams, no acceptance criterion that cannot be specified upfront.</strong></p><p><strong>Multi-agent architectures exist because some tasks genuinely require them: long-horizon software projects where different specialists own different layers; research tasks that require simultaneous querying of multiple external sources; production systems that need continuous monitoring and autonomous response. These tasks have structures that map to multi-agent architectures.</strong></p><p><em><strong>The vendor evaluation task does not. Deploying a sub-agent swarm on it is like hiring a project management team to write a one-page memo</strong></em>. The overhead is real. The memo is not better for it.</p><blockquote><p><em><strong>&#8220;The right harness for your codebase is shaped by your failure history. You cannot download it.&#8221; &#8212; Addy Osmani</strong></em></p></blockquote><p>The H10 failure makes this concrete. Routing 20 mixed tasks to type-specific configurations was correct in architectural principle. It failed because the rubric was not calibrated per type. An EXTRACT task should be judged on precision and completeness of extracted fields. A COMPRESS task should be judged on information density and accuracy of reduction. Running them both through an evaluation rubric designed for detailed analytical reasoning was category error scoring. The architecture was not wrong. The measurement harness was wrong.</p><p>This was my directory structure:</p><pre><code><code>harness-lab/
&#9500;&#9472;&#9472; IMPLEMENTATION_GUIDE.md
&#9500;&#9472;&#9472; CLAUDE.md                          # Root constitution
&#9500;&#9472;&#9472; package.json
&#9500;&#9472;&#9472; .env.example
&#9500;&#9472;&#9472; .env
&#9500;&#9472;&#9472; .gitignore
&#9474;
&#9500;&#9472;&#9472; shared/                            # Pre-Experiment Harness (Lab OS)
&#9474;   &#9500;&#9472;&#9472; client.js
&#9474;   &#9500;&#9472;&#9472; task.md                        # Default task (vendor selection)
&#9474;   &#9500;&#9472;&#9472; tasks/                         # Multi-task support
&#9474;   &#9500;&#9472;&#9472; rubric.json
&#9474;   &#9500;&#9472;&#9472; gold_answer.md
&#9474;   &#9500;&#9472;&#9472; scorer.js
&#9474;   &#9500;&#9472;&#9472; self_heal.js
&#9474;   &#9500;&#9472;&#9472; logger.js
&#9474;   &#9500;&#9472;&#9472; memory/                        # Crystallized skills
&#9474;   &#9492;&#9472;&#9472; tools/
&#9474;
&#9500;&#9472;&#9472; experiments/
&#9474;   &#9500;&#9472;&#9472; pre-experiment/                # Foundation test
&#9474;   &#9500;&#9472;&#9472; h1-prompt-constitution/
&#9474;   &#9500;&#9472;&#9472; h2-reflection-loop/
&#9474;   &#9500;&#9472;&#9472; h3-sequential-tools/
&#9474;   &#9500;&#9472;&#9472; h4-parallel-merge/
&#9474;   &#9500;&#9472;&#9472; h5-eval-revision/
&#9474;   &#9500;&#9472;&#9472; h6-skill-memory/
&#9474;   &#9500;&#9472;&#9472; h7-model-routing/
&#9474;   &#9500;&#9472;&#9472; h8-hitl-gating/
&#9474;   &#9500;&#9472;&#9472; h9-subagent-swarm/
&#9474;   &#9492;&#9472;&#9472; h10-meta-router/
&#9474;
&#9500;&#9472;&#9472; results/
&#9474;   &#9492;&#9472;&#9472; results.jsonl
&#9474;
&#9492;&#9472;&#9472; scripts/
    &#9500;&#9472;&#9472; run_all.js
    &#9492;&#9472;&#9472; compare.js</code></code></pre><h3>What Changes With Better Models</h3><p>There is an important caveat to the experiment&#8217;s findings: <strong>model capability interacts with harness design in ways that shift with each new model generation</strong>. </p><p><strong>This experiment ran on gpt-oss-120b</strong>. A frontier model might handle H4&#8217;s coherence problem better &#8212; not because parallelism is inherently coherent, but because a stronger model might infer the need for consistent scoring scales without being told.</p><p>This is the trajectory Boris Cherny describes when he talks about coding being largely solved. <strong>As models improve, some harness components that exist to compensate for model limitations become redundant. The history of software is full of abstractions that existed to manage hardware constraints that eventually became cheap &#8212; the constraints disappeared and the abstractions were abandoned.</strong></p><p>But<strong> the specification problem is not a model limitation problem. It is a human communication problem. A model can only do what it is asked to do</strong>. The precision with which you articulate what you want does not become less important as models become more capable &#8212; it may become more important, because more capable models execute imprecise specifications more confidently and at greater scale.</p><h2>Running the Experiment</h2><p>For me, the Harness Lab project is designed to be reproduced. The full implementation guide &#8212; which served as the specification Claude Code read to build the project &#8212; is approximately 2,300 lines of structured Markdown covering every file in the project. Claude Code built the entire thing from that document in under ten minutes.</p><p>The practical setup: <em>VS Code, the Claude Code extension, and an OpenRouter API key. </em></p><p>One practical note on model selection: <strong>using the same model as both the experiment model and the scorer model produces inflated scores, because models tend to grade their own outputs generously. Setting a different model as scorer &#8212; or using a model from a different provider &#8212; produces more discriminating evaluations.</strong></p><p>The bootstrap process is a single prompt to Claude Code:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TIM5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TIM5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png 424w, https://substackcdn.com/image/fetch/$s_!TIM5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png 848w, https://substackcdn.com/image/fetch/$s_!TIM5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png 1272w, https://substackcdn.com/image/fetch/$s_!TIM5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TIM5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png" width="1065" height="167" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:167,&quot;width&quot;:1065,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22472,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/197169160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TIM5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png 424w, https://substackcdn.com/image/fetch/$s_!TIM5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png 848w, https://substackcdn.com/image/fetch/$s_!TIM5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png 1272w, https://substackcdn.com/image/fetch/$s_!TIM5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc789fdf-179f-4651-97bb-c2e9dd3076a7_1065x167.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Claude Code then builds the entire project autonomously, creating over 30 files across the directory structure. Each experiment subsequently runs as a single npm command: npm run h1 through npm run h10. The compare script prints the full lift table after all experiments complete.</p><h2>Conclusions: What the Numbers Say</h2><p>The lift table for this experiment tells a story in numbers. The story in words is this:</p><p>&#8211; <strong>Instruction quality is the highest-leverage variable in AI system performance</strong>, and it costs almost nothing relative to architectural complexity.</p><p>&#8211; <strong>Multi-agent architectures add value when the task genuinely requires parallelism, external state, or specialised domain knowledge that cannot be held in one context.</strong> They add overhead, coherence risk, and cost when the task does not.</p><p>&#8211; <strong>Evaluation design is inseparable from system design</strong>. A rubric that cannot detect quality differences makes every architecture look equivalent. The first version of this experiment&#8217;s rubric produced exactly that false equivalence.</p><p>&#8211; <strong>Skill crystallisation works.</strong> A prompt optimised through an eval loop and saved as a skill file transferred cleanly to new data. This is the practical mechanism behind Osmani&#8217;s agent skills framework.</p><p>&#8211; <strong>Adaptive routing requires per-type calibration throughout the entire stack, including the evaluation rubric.</strong> A routing table without rubric alignment will amplify mismatches rather than correct for them.</p><p>None of these findings are arguments against building sophisticated AI systems. They are arguments for <strong>building the right system for the task</strong>, measuring the outcome honestly, and being honest about what the complexity is actually buying you.</p><p>Boris Cherny&#8217;s vision is one where everyone can build software &#8212; where the barrier is not technical skill but clear communication of what you want. The experiment confirms this from the other direction: when the communication is clear, the architecture becomes almost irrelevant. When the communication is unclear, no architecture compensates for it.</p><div id="youtube2-SlGRN8jh2RI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;SlGRN8jh2RI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/SlGRN8jh2RI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>The harness is not the intelligence. The harness is the frame around the intelligence. And a frame that does not fit the picture makes the picture worse, not better.</p><h3><strong>POST-NOTE: Would /goal Have Changed the Results?</strong></h3><p><em>On Ralph Loop 2.0, Persistent Goals, and What Automated Continuation Actually Fixes</em></p><p>Since this article was completed, <code>/goal</code> has shipped as a first-class feature in both <a href="https://www.mindstudio.ai/blog/codex-goal-ralph-loop-14-hour-autonomous-task">Codex CLI</a> and <a href="https://hermes-agent.nousresearch.com/docs/user-guide/features/goals">Hermes Agent</a> (Nous Research&#8217;s independent implementation). Both teams describe it as their take on the <strong>Ralph Loop</strong> &#8212; a pattern named after a fictional agent who simply keeps working until the job is done. The 2.0 designation distinguishes it from the original concept by adding a structural element: a separate judge model that evaluates after every turn whether the goal has been achieved, rather than relying on the agent to decide when to stop.</p><p>After every turn, Hermes calls an auxiliary model with the standing goal text, the agent&#8217;s most recent final response, and a system prompt telling the judge to reply with strict JSON: <code>{"done": bool, "reason": "one-sentence rationale"}</code>. The judge is deliberately conservative &#8212; it marks a goal done only when the response explicitly confirms completion, when the final deliverable is clearly produced, or when the goal is unachievable. If the judge errors for any reason, Hermes treats the verdict as continue rather than stop, so a broken judge never wedges progress. The turn budget is the real backstop, defaulting to 20 continuation turns. <a href="https://haverin.substack.com/p/what-if-pandora-had-found-a-harness">Substack</a></p><p>The question for this experiment is specific: would <code>/goal</code> have moved the needle on the results that mattered?</p><p><strong>Where it would have directly helped:</strong></p><p><em>H5 (Eval Loop, 0.840).</em> H5 manually implemented a generate-score-revise loop capped at three generations. <code>/goal</code> would have done this automatically and without the artificial ceiling. You would set the goal as &#8220;produce a vendor evaluation that scores above 0.70 on all rubric criteria&#8221; and the judge would keep iterating until that threshold was met or the turn budget was exhausted. Crucially, the judge is a separate model from the experiment model &#8212; structurally identical to the scorer-as-judge pattern the experiment already used, but now embedded in the loop rather than bolted on externally. H5&#8217;s manual N=3 cap was arbitrary; <code>/goal</code> removes that arbitrariness.</p><p><em>H10 (Meta-harness, 0.230).</em> The meta-harness declared completion on EXTRACT and COMPRESS tasks that scored zero because they were being judged by the wrong rubric. A <code>/goal</code> judge set to &#8220;all 20 tasks must produce a non-zero score&#8221; would have caught those failures before the session closed, flagged the zero-scoring task types, and continued working. Whether the continuation would have fixed the underlying rubric mismatch is a separate question &#8212; but it would at minimum have prevented premature declaration of completion on demonstrably incomplete work.</p><p><em>The weak verification failure class (~15% of Terminal-Bench failures).</em> The <code>/goal</code> judge is, architecturally, an automated verification gate. It is the same structural intervention as H8&#8217;s HITL checkpoint but with an auxiliary model playing the reviewer role rather than a simulated human pass. Every experiment that scored poorly on the rubric&#8217;s &#8220;confidence calibration&#8221; criterion &#8212; agents declaring high confidence on thin outputs &#8212; would have been extended rather than closed.</p><p><strong>Where it would not have helped:</strong></p><p><em>H4 (Parallel fan-out, 0.440).</em> The coherence collapse in H4 was not a stopping problem. Three agents calibrated their 0-10 scoring scales independently and the inconsistency was baked in before any output was produced. A <code>/goal</code> judge running after the merge would have caught the low score and continued &#8212; but continuation here means asking the same incoherent merge to try again, not fixing the underlying calibration problem. The fix for H4 is architectural: a verification gate between fan-out and merge (ForgeCode&#8217;s Sage role), not a persistence mechanism after the fact.</p><p><em>H1 (Bare model, 0.665).</em> The bare model did not fail because it stopped too early. It failed because it had no specification. <code>/goal</code> would have kept it working longer toward a goal it lacked the instructions to meet. More turns of a poorly specified task produces more poorly specified output, not better output. This is the same principle that explains why the experiment found no correlation between episode count and success rate in Terminal-Bench: persistence cannot substitute for specification.</p><p><em>H2 (Prompt harness, 0.920 &#8212; the winner).</em> H2 already reached its quality ceiling in a single turn. A persistence mechanism adds nothing to a task that is already solved on the first attempt.</p><p><strong>The architectural insight:</strong></p><p><code>/goal</code> is described by Hermes as tasks where &#8220;you&#8217;d otherwise have to say &#8216;keep going&#8217; three times.&#8221; That framing is precise and useful. It targets the session persistence problem &#8212; the gap between what a model can accomplish in a single turn and what it can accomplish if allowed to iterate toward a stated objective. What it does not target is the specification problem (H1 and H2), the coherence problem (H4), or the rubric calibration problem (H10&#8217;s real failure). <a href="https://haverin.substack.com/p/what-if-pandora-had-found-a-harness">Substack</a></p><p>The distinction maps onto two different categories of harness failure that the experiment separated empirically: failures of <em>continuation</em> (the agent stopped before the goal was reached) and failures of <em>specification</em> (the agent never had a clear enough goal to reach). Ralph Loop 2.0 is a structural solution to the first category. It does not address the second. The experiment&#8217;s most important finding &#8212; that H2&#8217;s clear instruction outperformed every architecture including H9&#8217;s sub-agent swarm &#8212; belongs entirely to the specification category that <code>/goal</code> does not touch.</p><p>That said, combining <code>/goal</code> with H5&#8217;s eval loop would produce something meaningfully more powerful than either alone: a persistent goal with an automated judge that also revises the approach, not just continues the attempt. That combination &#8212; goal persistence plus approach revision &#8212; is the architecture the experiment was approximating manually in H5. It would be the correct H11 if the experiment were extended.</p><p><em>Hermes Agent </em><code>/goal</code><em> documentation:</em> <a href="https://hermes-agent.nousresearch.com/docs/user-guide/features/goals">hermes-agent.nousresearch.com/docs/user-guide/features/goals</a> &#8212; <em>Codex CLI </em><code>/goal</code><em> (Eric Traut, OpenAI):</em> <a href="https://github.com/openai/codex">github.com/openai/codex</a></p><h2>References and Further Reading</h2><p>The following sources informed this analysis:</p><p><strong>Harness Engineering &amp; HaaS</strong></p><p>Addy Osmani &#8212; Agent Harness Engineering: <a href="https://addyosmani.com/blog/agent-harness-engineering/">addyosmani.com/blog/agent-harness-engineering</a></p><p>Addy Osmani &#8212; Long-running Agents: <a href="https://addyo.substack.com/p/long-running-agents">addyo.substack.com/p/long-running-agents</a></p><p>Addy Osmani &#8212; Agent Skills: <a href="https://addyosmani.com/blog/agent-skills/">addyosmani.com/blog/agent-skills</a></p><p>Addy Osmani &#8212; Future of Agentic Coding (Conductors to Orchestrators): <a href="https://addyosmani.com/blog/future-agentic-coding/">addyosmani.com/blog/future-agentic-coding</a></p><p>Martin Fowler / Birgitta B&#246;ckeler &#8212; Agent Harness Engineering (ThoughtWorks): <a href="https://martinfowler.com/articles/agent-harness.html">martinfowler.com/articles/agent-harness</a></p><p>Complete Claude Code Harness Engineering Guide (5 Layers): <a href="https://dev.to/shipwithaiio/the-complete-claude-code-harness-engineering-guide-5-layers-8-deep-dives-3d4j">dev.to/shipwithaiio/harness-engineering-guide</a></p><p><strong>Boris Cherny &amp; Claude Code</strong></p><p>Boris Cherny at AI Ascent 2026 &#8212; Why Coding Is Solved, and What Comes Next: <a href="https://www.youtube.com/watch?v=SlGRN8jh2RI">youtube.com/watch?v=SlGRN8jh2RI</a></p><p>The Claude Code Handbook (freeCodeCamp): <a href="https://www.freecodecamp.org/news/claude-code-handbook/">freecodecamp.org/news/claude-code-handbook</a></p><p>8 Insights from Boris Cherny (Waydev): <a href="https://waydev.co/8-game-changing-insights-from-anthropic-claudecode-boris-cherny/">waydev.co/boris-cherny-insights</a></p><p>Great Claude Code Leak of March 2026 &#8212; Denser.ai deep-dive: <a href="https://denser.ai/blog/claude-code-leak/">denser.ai/blog/claude-code-leak</a></p><p><strong>Terminal-Bench &amp; Benchmark Research</strong></p><p>Terminal-Bench 2.0 &#8212; Benchmarking Agents on Hard, Realistic Tasks (arXiv): <a href="https://arxiv.org/abs/2601.11868">arxiv.org/abs/2601.11868</a></p><p>Terminal-Bench Leaderboard: <a href="https://www.tbench.ai/leaderboard/terminal-bench/2.0">tbench.ai/leaderboard/terminal-bench/2.0</a></p><p><strong>Agentic Engineering Context</strong></p><p>From Vibe Coding to Agentic Engineering: <a href="https://dev.to/jasonguo/from-vibe-coding-to-agentic-engineering-when-coding-becomes-orchestrating-agents-1b0n">dev.to/jasonguo/agentic-engineering</a></p><p>OpenRouter Free Models: <a href="https://openrouter.ai/collections/free-models">openrouter.ai/collections/free-models</a></p><p><strong>Key arXiv / Research Papers</strong></p><ol><li><p><strong>Towards a Science of Scaling Agent Systems</strong> (Google DeepMind + collaborators, Dec 2025)<br>Link: <a href="https://arxiv.org/html/2512.08296v1">https://arxiv.org/html/2512.08296v1</a><br>Relevance: The strongest and most cited empirical study. Tested 180+ configurations across models and benchmarks. Overall, MAS showed -3.5% mean performance vs single agents. Multi-agent hurt sequential planning by 39&#8211;70%, with massive error amplification (up to 17.2&#215; in independent swarms). Coordination overhead dominated on tool-heavy or high baseline tasks. Strong support for &#8220;simple when possible.&#8221;</p></li><li><p><strong>Single-agent or Multi-agent Systems? Why Not Both? </strong>(May 2025)<br>Link: <a href="https://arxiv.org/html/2505.18286v1">https://arxiv.org/html/2505.18286v1</a><br>Relevance: MAS incurs 4&#8211;220&#215; more tokens and significantly higher latency/cost. Even with strong models, the coordination overhead often negates benefits. Includes failure examples (e.g., debate-style agents reinforcing wrong answers).</p></li><li><p><strong>Why Your Multi-Agent System is Failing: Escaping the 17x Error Trap </strong>(Jan 2026)<br>Link: <a href="https://towardsdatascience.com/why-your-multi-agent-system-is-failing-escaping-the-17x-error-trap-of-the-bag-of-agents/">https://towardsdatascience.com/why-your-multi-agent-system-is-failing-escaping-the-17x-error-trap-of-the-bag-of-agents/</a>Published: January 30, 2026 (Towards Data Science)</p><p>Relevance: Directly calls out the &#8220;bag of agents&#8221; problem &#8212; unstructured multi-agent setups amplify errors dramatically without proper topology. Emphasizes that coordination costs often outweigh decomposition benefits.</p></li></ol><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Two Architectures of Control]]></title><description><![CDATA[Bounded Evolution vs. Composable Constraints in Agentic Systems]]></description><link>https://interestingengineering.substack.com/p/two-architectures-of-control</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/two-architectures-of-control</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Tue, 05 May 2026 16:16:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tQyg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tQyg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tQyg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png 424w, https://substackcdn.com/image/fetch/$s_!tQyg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png 848w, https://substackcdn.com/image/fetch/$s_!tQyg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png 1272w, https://substackcdn.com/image/fetch/$s_!tQyg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tQyg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png" width="1405" height="380" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:380,&quot;width&quot;:1405,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:818961,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tQyg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png 424w, https://substackcdn.com/image/fetch/$s_!tQyg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png 848w, https://substackcdn.com/image/fetch/$s_!tQyg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png 1272w, https://substackcdn.com/image/fetch/$s_!tQyg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71e751af-eda2-4137-9f6d-49a4397ea5c9_1405x380.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I recently wrote <a href="https://interestingengineering.substack.com/p/what-should-and-should-not-evolve">&#8216;What Should &#8212; and Should Not &#8212; Evolve in Self-Improving Multi-Agent Systems?&#8217;</a>, which builds a four-tier safety taxonomy from a convergent body of academic research spanning Columbia, Princeton, Renmin University, and Anthropic &#8212; arguing that certain components of any self-improving agentic system must be architecturally frozen, and others must evolve only under strict governance conditions. I do apply this framework, which I find creates amazing discipline in harness engineering or at least how multi-agent systems guardrails apply. I have additionally taken note of some critical findings (in <a href="https://interestingengineering.substack.com/p/when-the-recipe-gets-in-the-way">When The Recipe Gets In The Way</a>) depending on how much thought-chaining and multi-step approaches are necessary depending on domain (within which it is being applied to) vs what the base model can perhaps already respond to - To <strong><a href="https://interestingengineering.substack.com/p/when-the-recipe-gets-in-the-way">ensure the instruction (skills or markdown) file(s) do not overwrite any latent competence the model already possesses.</a></strong></p><p>Note that I treat everything &#8220;agentic&#8221; as fast evolving, from which there are many wonderful learning lessons (let me refer to them simply as &#8220;adaptations&#8221; or &#8220;evolutions&#8221;, no different from the systems we use, study, or put into place). Therefore many, manyyyy things we figure out, as we get into the weeds of various ongoing projects. Adapted to circumstances and &#8220;fit for purpose&#8221; requirements.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;cd600863-50ae-4b54-b263-7b352b6f3ed5&quot;,&quot;caption&quot;:&quot;A Core Tension&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;WHAT SHOULD &#8212; AND SHOULD NOT &#8212; EVOLVE IN SELF-IMPROVING MULTI-AGENT SYSTEMS?&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-04-28T16:50:34.221Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!qoGV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123b7ec0-079c-4261-9dc4-8e7c863863da_1183x665.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/what-should-and-should-not-evolve&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:195763857,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:0,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>So Matt Pocock&#8217;s skills repository, additionally, came as a fantastic, relevant, recent find - offering <a href="https://github.com/mattpocock/skills/tree/main/skills/engineering">9 Core Engineering markdown files</a> and a one-line npm installer, arguing that <em><strong>the best way to maintain control over a coding agent is to stay simple, composable, and close to the decision chain. </strong></em>There is much elegance in the way he addresses agentic construct.</p><p>In truth, these two positions are not in direct conflict. They are operating at different altitudes. But placing them in dialogue surfaces something important: the architecture article builds, and the practical control the skills approach exercises, are describing the same problem from the top down and the bottom up, respectively. From my perspective, understanding where they agree, where they diverge, and what each gets wrong is more useful than treating them as separate concerns. </p><blockquote><p>The <a href="https://interestingengineering.substack.com/p/what-should-and-should-not-evolve">multi-agent article</a> asks: <em><strong>how do you govern a system that can modify itself? Pocock asks: how do you stay in control of a tool that can act for you? They are both correct that the answer is the same &#8212; you fix what must not move, and you let everything else evolve.</strong></em></p></blockquote><h2>The <a href="https://interestingengineering.substack.com/p/what-should-and-should-not-evolve">Multi-Agent Article</a>: A Taxonomy of Bounded Evolution</h2><p>The article&#8217;s central claim is structural: <em><strong>in any multi-agent system capable of self-modification, components must be classified before they are built according to how dangerous it is to allow them to evolve</strong></em>. The four-tier taxonomy it proposes assigns everything from skill libraries (Tier 1, safe to evolve autonomously) to constitutional constraints and audit logs (Tier 4, never modified under any circumstances) into a clear governance hierarchy.</p><p>The architecture principle the article derives is somewhat elegant: fix the axioms, fix the verifier, and let everything else evolve within those constraints. The verifier &#8212; whatever checks whether an evolution is safe &#8212; must itself be outside the evolution loop. If the system can modify its own safety checks, the safety boundary collapses. The article calls this the <strong><a href="https://interestingengineering.substack.com/p/what-should-and-should-not-evolve">Golden Rule of agentic evolution</a></strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Zdo0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Zdo0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Zdo0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Zdo0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Zdo0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Zdo0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1117991,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Zdo0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Zdo0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Zdo0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Zdo0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473b7a74-2db8-4061-ab1e-f1edb2274255_3822x2134.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://interestingengineering.substack.com/p/what-should-and-should-not-evolve">Article</a></figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!I97S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!I97S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!I97S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!I97S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!I97S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!I97S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1110195,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!I97S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!I97S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!I97S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!I97S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74b60c96-fe36-4911-9148-9b52b6b56587_3822x2134.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://interestingengineering.substack.com/p/what-should-and-should-not-evolve">Article</a></figcaption></figure></div><p>Supporting this argument are two specific research findings the article leans on heavily. The Statistical Limits Theorem, attributed to Columbia University (2025), claims that simultaneous unconstrained improvement across five axes of self-improvement is statistically impossible without system instability &#8212; meaning at least one axis must be frozen, and the article argues that axis should be alignment. The Safety Vanishing problem, from Renmin University and BAAI (2026), documents that safety specifications drift toward reduced restrictiveness over time in self-evolving multi-agent societies, even without any malicious intent &#8212; simply through the accumulated effect of individually reasonable adaptations.</p><h2>Matt Pocock&#8217;s Skills: Composable Practitioner Control!!!</h2><p>Pocock&#8217;s argument is operational (taking more control back), non-theoretical - i enjoyed recent sharings of his here (100% watch and take notes). </p><div id="youtube2-v4F1gFy-hqg" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;v4F1gFy-hqg&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/v4F1gFy-hqg?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div id="youtube2--QFHIoCo-Ko" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;-QFHIoCo-Ko&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/-QFHIoCo-Ko?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>His repository is a collection of <strong>9 Core Engineering markdown files &#8212; skills</strong> &#8212; that a developer loads into a .claude or agents directory and invokes as slash commands. <strong>Skills like grill me, TDD, and diagnose are designed to solve specific, recurring failure modes: the agent going in the wrong direction, the code not working, the codebase accumulating structural debt.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x0mL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x0mL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png 424w, https://substackcdn.com/image/fetch/$s_!x0mL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png 848w, https://substackcdn.com/image/fetch/$s_!x0mL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png 1272w, https://substackcdn.com/image/fetch/$s_!x0mL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!x0mL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png" width="1138" height="474" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:474,&quot;width&quot;:1138,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:60480,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x0mL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png 424w, https://substackcdn.com/image/fetch/$s_!x0mL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png 848w, https://substackcdn.com/image/fetch/$s_!x0mL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png 1272w, https://substackcdn.com/image/fetch/$s_!x0mL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124134e5-be6c-464e-8014-cb66e84db35f_1138x474.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://github.com/mattpocock/skills/tree/main/skills/engineering">Github</a></figcaption></figure></div><p>The philosophy is explicit: no orchestrator, no planner, no spec kit, no runtime. <strong>Heavy-process frameworks like BMAD or GSD, in Pocock&#8217;s view, take control away from the engineer and make debugging harder. The value of skills is precisely their simplicity &#8212; they are small, composable, and adapt to the developer&#8217;s workflow rather than forcing the developer to adapt to a rigid framework.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yjOH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yjOH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png 424w, https://substackcdn.com/image/fetch/$s_!yjOH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png 848w, https://substackcdn.com/image/fetch/$s_!yjOH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png 1272w, https://substackcdn.com/image/fetch/$s_!yjOH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yjOH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png" width="908" height="461" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:461,&quot;width&quot;:908,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:50501,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yjOH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png 424w, https://substackcdn.com/image/fetch/$s_!yjOH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png 848w, https://substackcdn.com/image/fetch/$s_!yjOH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png 1272w, https://substackcdn.com/image/fetch/$s_!yjOH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4dcaa059-c75c-4b54-b2b5-0b048f930654_908x461.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://github.com/mattpocock/skills">Github</a></figcaption></figure></div><p>The <strong>grill me skill &#8212; which interrogates the developer to resolve decision trees before any code is written &#8212; is the most architecturally significant. It functions as a context construction program that front-loads disambiguation, preventing the agent from building the wrong thing</strong>. <a href="https://tosea.ai/blog/matt-pocock-skills-claude-code-guide">Tosea.ai</a> have a guide <a href="https://tosea.ai/blog/matt-pocock-skills-claude-code-guide">here</a>. </p><h2>The Composable Safety Stack</h2><p>The central insight of the comparative analysis is this: Pocock&#8217;s skills are an excellent implementation of Tier 1 in the 4-Tier taxonomy. They are precisely the kind of safely-evolvable, practitioner-tunable, verification-gated (by the human developer) components that the ISR article argues should be autonomous. </p><p>The synthesis framework proposed here &#8212; the <strong>Composable Safety Stack </strong>&#8212; treats the two approaches not as alternatives but as complementary layers in a single architecture, with the selection of which layers to activate governed by task complexity, autonomy level, and consequence severity.</p><blockquote><p><em>The Composable Safety Stack: Start with Pocock&#8217;s skills as the T1 evolvable layer. Add governance layers incrementally as task complexity, autonomy, and consequence severity increase. Never skip a level.</em></p></blockquote><h3>The Stack</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VPv9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VPv9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png 424w, https://substackcdn.com/image/fetch/$s_!VPv9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png 848w, https://substackcdn.com/image/fetch/$s_!VPv9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png 1272w, https://substackcdn.com/image/fetch/$s_!VPv9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VPv9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png" width="1044" height="594" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:594,&quot;width&quot;:1044,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:63055,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VPv9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png 424w, https://substackcdn.com/image/fetch/$s_!VPv9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png 848w, https://substackcdn.com/image/fetch/$s_!VPv9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png 1272w, https://substackcdn.com/image/fetch/$s_!VPv9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187ef3e0-ec26-4ec7-9509-2138da9f6cbc_1044x594.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>When to Add Each Layer</h3><p>The practitioner&#8217;s question is not &#8216;which tier taxonomy applies to my system?&#8217; but &#8216;at what point do I need to add the next layer?&#8217; The Composable Safety Stack answers this with a simple escalation decision tree.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qOXs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qOXs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png 424w, https://substackcdn.com/image/fetch/$s_!qOXs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png 848w, https://substackcdn.com/image/fetch/$s_!qOXs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png 1272w, https://substackcdn.com/image/fetch/$s_!qOXs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qOXs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png" width="1160" height="631" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:631,&quot;width&quot;:1160,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:733567,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qOXs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png 424w, https://substackcdn.com/image/fetch/$s_!qOXs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png 848w, https://substackcdn.com/image/fetch/$s_!qOXs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png 1272w, https://substackcdn.com/image/fetch/$s_!qOXs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9596bb9a-d12d-4f6e-8df8-62b4c87b7890_1160x631.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Layer 0 is always present. </strong>It is the default. Any developer using Claude Code, Cursor, or a similar tool with markdown skill files is operating at Layer 0. This is the Pocock starting point.</p><p><strong>Add Layer 1 when: </strong>the task involves domain-specific language, multi-session continuity, or onboarding a new engineer. The contract.mmd pattern is the minimal viable implementation.</p><p><strong>Add Layer 2 when: </strong>the agent&#8217;s output will be used by another person, published, committed to a shared codebase, or used as input to a downstream automated process. A lightweight eval step &#8212; even a structured human review checklist &#8212; is sufficient at this level.</p><p><strong>Add Layer 3 when: </strong>the agent can take actions with external consequences: writing to production, executing API calls that affect real data, spawning sub-agents, or operating with reduced human supervision. This is where the ISR article&#8217;s harness architecture becomes mandatory rather than optional.</p><p><strong>Add Layer 4 when: </strong>the system is multi-agent, partially autonomous, or capable of self-modification. At this level, the Anthropic Constitutional AI layer in the underlying model is not sufficient &#8212; it must be supplemented by explicit architectural enforcement.</p><p>To make the comparison concrete, we follow a single software development task through all three approaches - Pocock&#8217;s skills, Multi-Agent 4 Tier Framework and then a blend of both cases. </p><h2>PART II</h2><p>The task is this: a developer working on a customer-facing SaaS application needs to integrate Stripe payments &#8212; handling checkout sessions, processing webhooks, and writing payment records to the application&#8217;s database.</p><p>This task is a useful test case for several reasons. It involves <em><strong>multiple sequential steps that build on each other. It requires the agent to read and write real files in an existing codebase. It involves external API calls and, at the end, code that runs in production where mistakes can charge customers or corrupt financial records. And it is the kind of task a developer might reasonably hand to a capable coding agent today.</strong></em></p><blockquote><h6><em><strong>Task</strong></em></h6><p><em>Integrate Stripe payments (checkout + webhooks + database writes) into an existing Node.js/Express application</em></p><h6><em><strong>Developer</strong></em></h6><p><em>Solo developer, familiar with the codebase, first time working with Stripe</em></p><h6><em><strong>Agent</strong></em></h6><p><em>Claude Code or equivalent coding agent with file read/write and bash access</em></p><h6><em><strong>Stakes</strong></em></h6><p><em>Medium-high: errors could affect live users; payment data requires care; some actions are hard to reverse</em></p><h6><em><strong>Duration</strong></em></h6><p><em>Estimated 2&#8211;4 hours with agent assistance</em></p></blockquote><p><em>Each approach below follows the same task from first prompt to shipped code. Pay attention to where the human makes decisions, where the agent acts autonomously, and where the system stops and asks for confirmation.</em></p><h3><strong>APPROACH 1: POCOCK&#8217;S SKILLS</strong></h3><p><strong>The Practitioner Approach</strong></p><p><em>Composable markdown skills &#183; Developer stays in the loop &#183; No framework, no runtime</em></p><p>The developer opens their project in their editor (VS Code with Claude Code, or Cursor). They have installed the Pocock skills via the npx installer, which has placed the skill files in the .claude/commands directory. There is no daemon running, no background process, no orchestrator. The developer types a slash command to begin.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kcQi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kcQi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png 424w, https://substackcdn.com/image/fetch/$s_!kcQi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png 848w, https://substackcdn.com/image/fetch/$s_!kcQi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png 1272w, https://substackcdn.com/image/fetch/$s_!kcQi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kcQi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png" width="783" height="486" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:486,&quot;width&quot;:783,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:79147,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kcQi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png 424w, https://substackcdn.com/image/fetch/$s_!kcQi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png 848w, https://substackcdn.com/image/fetch/$s_!kcQi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png 1272w, https://substackcdn.com/image/fetch/$s_!kcQi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd18d475-75d4-4d59-bd2f-a26d9c29db3f_783x486.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7I33!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7I33!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png 424w, https://substackcdn.com/image/fetch/$s_!7I33!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png 848w, https://substackcdn.com/image/fetch/$s_!7I33!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png 1272w, https://substackcdn.com/image/fetch/$s_!7I33!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7I33!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png" width="780" height="488" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:488,&quot;width&quot;:780,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73338,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7I33!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png 424w, https://substackcdn.com/image/fetch/$s_!7I33!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png 848w, https://substackcdn.com/image/fetch/$s_!7I33!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png 1272w, https://substackcdn.com/image/fetch/$s_!7I33!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2989f951-e19d-4150-bbbe-dc8883e8174a_780x488.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!V0Sf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!V0Sf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png 424w, https://substackcdn.com/image/fetch/$s_!V0Sf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png 848w, https://substackcdn.com/image/fetch/$s_!V0Sf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png 1272w, https://substackcdn.com/image/fetch/$s_!V0Sf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!V0Sf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png" width="787" height="488" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:488,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:79179,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!V0Sf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png 424w, https://substackcdn.com/image/fetch/$s_!V0Sf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png 848w, https://substackcdn.com/image/fetch/$s_!V0Sf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png 1272w, https://substackcdn.com/image/fetch/$s_!V0Sf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33774d95-29bd-4c8d-a427-a8aa6d37cff4_787x488.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VFCa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VFCa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png 424w, https://substackcdn.com/image/fetch/$s_!VFCa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png 848w, https://substackcdn.com/image/fetch/$s_!VFCa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png 1272w, https://substackcdn.com/image/fetch/$s_!VFCa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VFCa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png" width="786" height="456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:456,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:67041,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VFCa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png 424w, https://substackcdn.com/image/fetch/$s_!VFCa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png 848w, https://substackcdn.com/image/fetch/$s_!VFCa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png 1272w, https://substackcdn.com/image/fetch/$s_!VFCa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18a486bb-b17f-4ab8-b3e5-6701f08ee7ff_786x456.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>APPROACH 2: THE HARNESS STACK</strong></h3><p><strong>The Governed Architecture</strong></p><p><em>External evaluation harness &#183; Multi-agent pool &#183; Immutable safety layer &#183; Structured governance</em></p><p>In a harness-based system, the same task is handled by a coordinated team of specialist agents, each with a specific role, governed by an external evaluation layer that the agents cannot modify. The human engineer submits a task specification; the system does not return an output until that output has passed structured quality and safety gates.</p><p><em><strong>This architecture is overkill for a solo developer building a Stripe integration on a Tuesday afternoon. It becomes appropriate &#8212; and eventually mandatory &#8212; when the agent is operating at reduced human supervision, when the codebase is shared across a team, when the output goes to production automatically, or when the task involves consequential external actions at scale.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XeFO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XeFO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png 424w, https://substackcdn.com/image/fetch/$s_!XeFO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png 848w, https://substackcdn.com/image/fetch/$s_!XeFO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png 1272w, https://substackcdn.com/image/fetch/$s_!XeFO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XeFO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png" width="784" height="491" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:491,&quot;width&quot;:784,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:82963,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XeFO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png 424w, https://substackcdn.com/image/fetch/$s_!XeFO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png 848w, https://substackcdn.com/image/fetch/$s_!XeFO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png 1272w, https://substackcdn.com/image/fetch/$s_!XeFO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb86307b3-69d8-4861-8717-a921bfcf3c6f_784x491.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OLkZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OLkZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png 424w, https://substackcdn.com/image/fetch/$s_!OLkZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png 848w, https://substackcdn.com/image/fetch/$s_!OLkZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png 1272w, https://substackcdn.com/image/fetch/$s_!OLkZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OLkZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png" width="787" height="574" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:574,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:96952,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OLkZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png 424w, https://substackcdn.com/image/fetch/$s_!OLkZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png 848w, https://substackcdn.com/image/fetch/$s_!OLkZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png 1272w, https://substackcdn.com/image/fetch/$s_!OLkZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24629824-78fc-442d-8b96-1b8686e3bd4c_787x574.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_cY_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_cY_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png 424w, https://substackcdn.com/image/fetch/$s_!_cY_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png 848w, https://substackcdn.com/image/fetch/$s_!_cY_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png 1272w, https://substackcdn.com/image/fetch/$s_!_cY_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_cY_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png" width="786" height="223" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:223,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41281,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_cY_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png 424w, https://substackcdn.com/image/fetch/$s_!_cY_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png 848w, https://substackcdn.com/image/fetch/$s_!_cY_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png 1272w, https://substackcdn.com/image/fetch/$s_!_cY_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2550909e-330b-4faf-a493-b7b4e9b6a483_786x223.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-v4F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe718130b-66ba-49f2-b947-74feb8737406_785x739.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-v4F!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe718130b-66ba-49f2-b947-74feb8737406_785x739.png 424w, https://substackcdn.com/image/fetch/$s_!-v4F!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe718130b-66ba-49f2-b947-74feb8737406_785x739.png 848w, https://substackcdn.com/image/fetch/$s_!-v4F!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe718130b-66ba-49f2-b947-74feb8737406_785x739.png 1272w, https://substackcdn.com/image/fetch/$s_!-v4F!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe718130b-66ba-49f2-b947-74feb8737406_785x739.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-v4F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe718130b-66ba-49f2-b947-74feb8737406_785x739.png" width="785" height="739" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e718130b-66ba-49f2-b947-74feb8737406_785x739.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:739,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:129941,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe718130b-66ba-49f2-b947-74feb8737406_785x739.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-v4F!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe718130b-66ba-49f2-b947-74feb8737406_785x739.png 424w, https://substackcdn.com/image/fetch/$s_!-v4F!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe718130b-66ba-49f2-b947-74feb8737406_785x739.png 848w, https://substackcdn.com/image/fetch/$s_!-v4F!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe718130b-66ba-49f2-b947-74feb8737406_785x739.png 1272w, https://substackcdn.com/image/fetch/$s_!-v4F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe718130b-66ba-49f2-b947-74feb8737406_785x739.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dTYV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dTYV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png 424w, https://substackcdn.com/image/fetch/$s_!dTYV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png 848w, https://substackcdn.com/image/fetch/$s_!dTYV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png 1272w, https://substackcdn.com/image/fetch/$s_!dTYV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dTYV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png" width="786" height="222" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:222,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:47261,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dTYV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png 424w, https://substackcdn.com/image/fetch/$s_!dTYV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png 848w, https://substackcdn.com/image/fetch/$s_!dTYV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png 1272w, https://substackcdn.com/image/fetch/$s_!dTYV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfee7bd1-c47d-4daf-8f57-14aa9ed71f1e_786x222.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3><strong>APPROACH 3: THE SYNTHESIS STACK</strong></h3><p><strong>The Layered Approach</strong></p><p><em>Start simple &#183; Escalate deliberately &#183; Add governance exactly where consequences require it</em></p><p>The Composable Safety Stack does not ask the developer to choose between Pocock&#8217;s skills and a full harness architecture. It asks a different question: <em><strong>at which point in this specific task does the next governance layer become necessary? The answer is determined by three factors &#8212; autonomy (how much the agent is acting without direct human oversight), consequence severity (how hard is it to reverse a mistake), and coordination scope (are multiple agents involved?).</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UKr6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UKr6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png 424w, https://substackcdn.com/image/fetch/$s_!UKr6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png 848w, https://substackcdn.com/image/fetch/$s_!UKr6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png 1272w, https://substackcdn.com/image/fetch/$s_!UKr6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UKr6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png" width="1150" height="624" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:624,&quot;width&quot;:1150,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:976786,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UKr6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png 424w, https://substackcdn.com/image/fetch/$s_!UKr6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png 848w, https://substackcdn.com/image/fetch/$s_!UKr6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png 1272w, https://substackcdn.com/image/fetch/$s_!UKr6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a0900e5-7d0c-429b-9f9f-209b1d91f676_1150x624.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For the Stripe integration, the synthesis approach starts exactly where Pocock starts &#8212; with a grill me session and a TDD discipline. The difference appears at the moment the developer&#8217;s code will process real financial data on behalf of real users. That is the escalation trigger.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eLVi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eLVi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png 424w, https://substackcdn.com/image/fetch/$s_!eLVi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png 848w, https://substackcdn.com/image/fetch/$s_!eLVi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png 1272w, https://substackcdn.com/image/fetch/$s_!eLVi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eLVi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png" width="790" height="470" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:470,&quot;width&quot;:790,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:76858,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eLVi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png 424w, https://substackcdn.com/image/fetch/$s_!eLVi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png 848w, https://substackcdn.com/image/fetch/$s_!eLVi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png 1272w, https://substackcdn.com/image/fetch/$s_!eLVi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd69ea6d-7389-4d88-b3d6-8fbba56ad081_790x470.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-p-v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-p-v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png 424w, https://substackcdn.com/image/fetch/$s_!-p-v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png 848w, https://substackcdn.com/image/fetch/$s_!-p-v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png 1272w, https://substackcdn.com/image/fetch/$s_!-p-v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-p-v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png" width="789" height="118" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:118,&quot;width&quot;:789,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16936,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-p-v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png 424w, https://substackcdn.com/image/fetch/$s_!-p-v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png 848w, https://substackcdn.com/image/fetch/$s_!-p-v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png 1272w, https://substackcdn.com/image/fetch/$s_!-p-v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa33479ef-63b5-4138-b4c8-1ccefa931a04_789x118.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KhRB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KhRB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png 424w, https://substackcdn.com/image/fetch/$s_!KhRB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png 848w, https://substackcdn.com/image/fetch/$s_!KhRB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png 1272w, https://substackcdn.com/image/fetch/$s_!KhRB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KhRB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png" width="785" height="701" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:701,&quot;width&quot;:785,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:113384,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KhRB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png 424w, https://substackcdn.com/image/fetch/$s_!KhRB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png 848w, https://substackcdn.com/image/fetch/$s_!KhRB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png 1272w, https://substackcdn.com/image/fetch/$s_!KhRB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b1477c-0c5b-467c-af65-8acaa42f7c30_785x701.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y-5J!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y-5J!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png 424w, https://substackcdn.com/image/fetch/$s_!y-5J!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png 848w, https://substackcdn.com/image/fetch/$s_!y-5J!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png 1272w, https://substackcdn.com/image/fetch/$s_!y-5J!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y-5J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png" width="786" height="679" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:679,&quot;width&quot;:786,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:112313,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y-5J!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png 424w, https://substackcdn.com/image/fetch/$s_!y-5J!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png 848w, https://substackcdn.com/image/fetch/$s_!y-5J!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png 1272w, https://substackcdn.com/image/fetch/$s_!y-5J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3541b2e-f8a1-45dd-92df-824954f016e0_786x679.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s-3y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s-3y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png 424w, https://substackcdn.com/image/fetch/$s_!s-3y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png 848w, https://substackcdn.com/image/fetch/$s_!s-3y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png 1272w, https://substackcdn.com/image/fetch/$s_!s-3y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s-3y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png" width="709" height="742" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:742,&quot;width&quot;:709,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:112255,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s-3y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png 424w, https://substackcdn.com/image/fetch/$s_!s-3y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png 848w, https://substackcdn.com/image/fetch/$s_!s-3y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png 1272w, https://substackcdn.com/image/fetch/$s_!s-3y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c56b863-c8df-4bc4-a6e6-58742aff960d_709x742.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>SIDE BY SIDE</strong></h3><p>The table below compares the three approaches across the dimensions that matter most for a practitioner deciding which to use.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G7in!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G7in!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png 424w, https://substackcdn.com/image/fetch/$s_!G7in!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png 848w, https://substackcdn.com/image/fetch/$s_!G7in!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png 1272w, https://substackcdn.com/image/fetch/$s_!G7in!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G7in!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png" width="784" height="522" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:522,&quot;width&quot;:784,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:69811,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196496789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!G7in!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png 424w, https://substackcdn.com/image/fetch/$s_!G7in!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png 848w, https://substackcdn.com/image/fetch/$s_!G7in!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png 1272w, https://substackcdn.com/image/fetch/$s_!G7in!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F508cc0a0-6d46-4386-b528-5e827efe97a7_784x522.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>A NOTE ON READING THE WORKFLOWS</strong></p><p>Three things are worth noticing across these walkthroughs.</p><p>First, the Pocock approach and the Synthesis Stack look almost identical for the first two steps. The divergence only appears when the task crosses a consequence threshold. This is the point: the Synthesis Stack does not add overhead upfront &#8212; it adds it precisely when the task requires it.</p><p>Second, the Harness Stack&#8217;s overhead is real and significant. The <strong>benefit is that the output arrives with a verifiable provenance trail and structured safety review that no human reviewer can replicate at the speed an agent team operates. For solo developers, this is unnecessary. For teams deploying code automatically, it is the baseline.</strong></p><p>Third, <strong>none of these approaches removes the developer from the process entirely. Even the full harness architecture requires human review at T2/T3 boundaries and explicit sign-off for production deployment. The question each approach answers differently is not &#8216;how do we remove the human?&#8217; but &#8216;where in the process does the human&#8217;s attention have the most leverage?&#8217; Whatever said and done - HITL always! At least for now&#8230;.</strong></p><blockquote><p><em><strong>Pocock: the developer&#8217;s attention is everywhere, always. Harness: the developer&#8217;s attention is concentrated at the governance gates. Synthesis: the developer decides where their attention is needed and escalates to governance exactly there.</strong></em></p></blockquote><p style="text-align: center;"></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[When The Recipe Gets In The Way]]></title><description><![CDATA[A fascinating field experiment reveals the uncomfortable limits of structured AI instruction &#8212; and what comes after]]></description><link>https://interestingengineering.substack.com/p/when-the-recipe-gets-in-the-way</link><guid isPermaLink="false">https://interestingengineering.substack.com/p/when-the-recipe-gets-in-the-way</guid><dc:creator><![CDATA[Interesting Engineering ++]]></dc:creator><pubDate>Thu, 30 Apr 2026 17:39:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!67O7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!67O7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!67O7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!67O7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!67O7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!67O7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!67O7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1423359,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!67O7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!67O7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!67O7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!67O7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa67e2de-d8bd-4b7f-a1f2-ac431615c944_3822x2134.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I published &#8220;<a href="https://interestingengineering.substack.com/p/the-recipe-for-thinking">The Recipe for Thinking</a>&#8221; arguing that SKILL.md files &#8212; structured instruction documents that guide AI agents how to approach a domain &#8212; represent a genuine democratisation of specialised knowledge. Write the recipe well, and any capable model becomes your logistics expert, your travel planner, your compliance officer. No unnecessary fine-tuning required.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;fc3503f2-d6c7-4903-a3cf-c5bae4ef07ba&quot;,&quot;caption&quot;:&quot;Do androids dream of electric sheep? Maybe, they Autodream? A simulation?&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;THE RECIPE FOR THINKING?&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:124460392,&quot;name&quot;:&quot;Interesting Engineering ++&quot;,&quot;bio&quot;:&quot;I spend my time learning about, and understanding our complex world better. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/977225f0-cc19-41f4-9df4-e21d01541411_347x347.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-04-27T17:32:34.705Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!gTsK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bdc3e42-66c5-4acb-a7b6-63840831d230_3822x2075.jpeg&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://interestingengineering.substack.com/p/the-recipe-for-thinking&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:195646005,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:4,&quot;comment_count&quot;:1,&quot;publication_id&quot;:1335585,&quot;publication_name&quot;:&quot;Interesting Engineering++&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!-M9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05150353-1bdc-48d2-b72c-c0bd499513eb_1024x1024.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p style="text-align: justify;">Yesterday, wonderfully by chance, I read a fascinating blog post by <a href="https://alexzhang13.github.io/blog/2026/longcot-rlm/">Alex Zhang and Omar Khattab</a> that made me reconsider one of my quieter assumptions. Not the central thesis &#8212; but a corollary I had taken for granted.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p style="text-align: justify;">Their experiment and review on <a href="https://raw.works/longcot-a-benchmark-worthy-of-a-rlms-attention/">Raymond Waitekamp&#8217;s experiment </a>was blunt and empirical. They looked at a hard benchmark &#8212; LongCoT, which tests <strong>whether AI can solve multi-step problems whose parts depend on each other like a chain of dominoes &#8212; and tried to improve performance through better structured reasoning prompts</strong>. <strong>The result was not a uniform improvement.</strong> In two task categories, the structured approach made things significantly worse. The base model, left to its own devices, actually outperformed the carefully prompted one.</p><p style="text-align: justify;">In a sense, the recipe had gotten in the cook&#8217;s way.</p><p style="text-align: justify;"><strong>Important Note:</strong> </p><blockquote><p style="text-align: justify;">A SKILL.md file <em>is</em> a structured reasoning prompt, just packaged differently. Both are <strong>text-based harnesses</strong> that tell a model how to decompose and approach a problem before it acts. Zhang&#8217;s decomposition tips (&#8221;break the problem into sub-tasks, handle dependencies in this order&#8221;) are functionally identical to a SKILL.md&#8217;s workflow section (&#8221;Step 1: extract data, Step 2: validate, Step 3: route&#8221;). The delivery mechanism differs &#8212; one is injected at inference time, the other is a file the agent loads &#8212; but the effect on the model is the same: <strong>you are pre-specifying a reasoning path</strong>.</p><p>So the failure mode Zhang identifies transfers directly. When you write a SKILL.md procedure for something the model already handles well through its own latent strategy, you risk the same thing his experiment showed &#8212; the imposed path overrides a better implicit one.</p><p>The one honest caveat worth keeping: Zhang&#8217;s experiment is at the <strong>reasoning decomposition</strong> layer (how to think), while SKILL.md also carries <strong>domain knowledge and policy</strong> (what to know, what rules apply). That second layer has fewer failure modes from over-specification &#8212; a model genuinely doesn&#8217;t know your company&#8217;s customs codes or SLA windows. The risk zone is specifically the <em><strong>procedural reasoning steps</strong></em>, not the domain-specific facts.</p><p>So I hold to my thoughts and conclusions, just with that distinction made: be careful prescribing <em>how to think</em>, generous with <em>what to know</em>.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UihJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UihJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!UihJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!UihJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!UihJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UihJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1636242,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UihJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!UihJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!UihJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!UihJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa12ad3bd-f2e1-4399-a080-e6eb6abec82f_3822x2134.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Experiment and What It Found</h2><p style="text-align: justify;">Zhang and Khattab&#8217;s research blog is framed around what they call the <em>&#8220;<strong>Mismanaged Geniuses Hypothesis&#8221;</strong></em><strong> (MGH): the idea that AI models are systematically underestimated because of how we deploy them, not because of their actual capability</strong>. Their test case was the <strong>LongCoT benchmark</strong> &#8212; a set of compositional reasoning problems involving graphs of sub-tasks, each of which feeds into the next.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WArn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WArn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WArn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WArn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WArn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WArn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1316811,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WArn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WArn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WArn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WArn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b06c6f9-888b-4e63-85ce-fd11f9e60a4d_3822x2084.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The headline result was encouraging: <strong>by prompting an RLM with well-structured decomposition tips &#8212; a primitive, session-level SKILL.md &#8212; overall performance jumped from 38.7% to 65.6%. Harness engineering working exactly as I described it.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yIiE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yIiE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png 424w, https://substackcdn.com/image/fetch/$s_!yIiE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png 848w, https://substackcdn.com/image/fetch/$s_!yIiE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png 1272w, https://substackcdn.com/image/fetch/$s_!yIiE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yIiE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png" width="1108" height="707" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:707,&quot;width&quot;:1108,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:257774,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yIiE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png 424w, https://substackcdn.com/image/fetch/$s_!yIiE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png 848w, https://substackcdn.com/image/fetch/$s_!yIiE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png 1272w, https://substackcdn.com/image/fetch/$s_!yIiE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55af1845-d58b-48dc-a0dd-55013061f30b_1108x707.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://alexzhang13.github.io/blog/2026/longcot-rlm/">Alex L Zhang&#8217;s Blog</a></figcaption></figure></div><p style="text-align: justify;">But here is the result I could not stop thinking about. <strong>On mathematical and computer science sub-tasks</strong>, <strong>the structured reasoning model scored 5.6% and 11.0% respectively. The base model, with no special prompting, scored 26.0% and 40.4% on the same tasks. The recipe &#8212; in these specific domains &#8212; cut performance by a factor of four.</strong></p><p><em>&#8220;On mathematical tasks, the carefully prompted model scored 5.6%. The unprompted base model scored 26.0%. The recipe cut performance by a factor of four.&#8221;</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NW02!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NW02!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg 424w, https://substackcdn.com/image/fetch/$s_!NW02!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg 848w, https://substackcdn.com/image/fetch/$s_!NW02!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!NW02!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NW02!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg" width="1456" height="792" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:792,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1153330,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NW02!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg 424w, https://substackcdn.com/image/fetch/$s_!NW02!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg 848w, https://substackcdn.com/image/fetch/$s_!NW02!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!NW02!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65ae84a4-0a3c-46f1-83cb-70eb506faf24_3822x2079.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The reason turned out to be mechanical: <strong>the structured approach forced the model into a brute-force execution loop that repeatedly crashed the coding environment. The model&#8217;s own implicit strategy for maths &#8212; flexible, non-procedural &#8212; was better than the one being imposed on it. The instruction file had overwritten a latent competence the model already possessed.</strong></p><h2>The Competence Band</h2><p style="text-align: justify;">This suggests a refinement to the SKILL.md thesis that I think is important to state clearly. <strong>The value of a structured instruction harness is not uniform. It depends on where a task sits relative to what the model can already do.</strong></p><p style="text-align: justify;">I find it useful to think in terms of three zones:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VTy6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VTy6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png 424w, https://substackcdn.com/image/fetch/$s_!VTy6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png 848w, https://substackcdn.com/image/fetch/$s_!VTy6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png 1272w, https://substackcdn.com/image/fetch/$s_!VTy6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VTy6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png" width="794" height="336" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:336,&quot;width&quot;:794,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:35209,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VTy6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png 424w, https://substackcdn.com/image/fetch/$s_!VTy6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png 848w, https://substackcdn.com/image/fetch/$s_!VTy6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png 1272w, https://substackcdn.com/image/fetch/$s_!VTy6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9873b6-54c4-42f0-b90d-d74c319ed24d_794x336.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: center;"><em>Figure 1 &#183; The three zones of harness value. SKILL.md provides maximum leverage only in the middle band.</em></p><p style="text-align: justify;">Most enterprise use cases &#8212; domain-specific compliance checks, structured document handling, specialised customer workflows &#8212; sit squarely in the middle band. The model has general capability but lacks domain context. A SKILL.md file provides exactly the scaffolding it needs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4cPe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4cPe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4cPe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4cPe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4cPe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4cPe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1438278,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4cPe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4cPe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4cPe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4cPe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4233c48-05d7-4691-ae9a-b1f6d1e00df4_3822x2084.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The below-band failure mode is what Zhang&#8217;s experiment exposed: <strong>domains where the model&#8217;s latent knowledge is already highly developed (formal mathematics, structured code), and where procedural instructions impose an inferior strategy</strong>. This is the trap of <strong>over-engineering a harness for something the model already handles better without one.</strong></p><p style="text-align: justify;">The above-band failure mode is subtler and more interesting. <strong>For genuinely complex compositional tasks, rigid step-by-step instructions calcify into the wrong approach. The model needs exploratory latitude, not a script. This is where autonomous skill discovery &#8212; systems like SkillWeaver that generate their own operational approaches from interaction &#8212; starts to look not just interesting but necessary.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!v3ZD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!v3ZD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!v3ZD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!v3ZD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!v3ZD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!v3ZD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1350565,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!v3ZD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!v3ZD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!v3ZD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!v3ZD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f7c48ee-36f3-402e-b8cc-45afa73d3268_3822x2134.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Tuba!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Tuba!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Tuba!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Tuba!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Tuba!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Tuba!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1647604,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Tuba!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Tuba!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Tuba!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Tuba!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80778cb-6a04-466a-98be-8b8a9b6c1aba_3822x2134.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Recipe That Writes Itself</h2><p style="text-align: justify;">The most quietly significant finding in Zhang&#8217;s experiment was not the performance numbers. It was this: <strong>the improved decomposition prompt &#8212; the one that produced the jump to 65.6% &#8212; was generated by Claude Code, not by a human.</strong></p><p style="text-align: justify;"><strong>Zhang asked </strong><em><strong>Claude Code to analyse the failure traces and write tips</strong></em><strong> for how not to make the same mistakes. The result outperformed everything a human had designed. In other words, the language model recognised the decomposition strategy it needed better than any external author could have specified it.</strong></p><p style="text-align: justify;">This is the self-writing recipe card I gestured at, but it is now an empirical result rather than a projection. And it has a specific implication for how I think about the SKILL.md trajectory.</p><blockquote><p style="text-align: justify;"><em><strong>The recipe file does not have to be authored by a human expert. It can emerge from the model&#8217;s own reflection on its performance. The human&#8217;s job shifts from writing instructions to curating, validating, and constraining what the model has learned about itself. This is a different &#8212; and arguably more sustainable &#8212; division of labour.</strong></em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8OZv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8OZv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png 424w, https://substackcdn.com/image/fetch/$s_!8OZv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png 848w, https://substackcdn.com/image/fetch/$s_!8OZv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png 1272w, https://substackcdn.com/image/fetch/$s_!8OZv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8OZv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png" width="784" height="342" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:342,&quot;width&quot;:784,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41454,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8OZv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png 424w, https://substackcdn.com/image/fetch/$s_!8OZv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png 848w, https://substackcdn.com/image/fetch/$s_!8OZv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png 1272w, https://substackcdn.com/image/fetch/$s_!8OZv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09c3bdf5-4202-4987-a394-77074ba24c3b_784x342.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: center;"><em>Figure 2 &#183; The evolution of skill creation, execution, and oversight across four eras. The ISR Part One thesis covers Eras 1&#8211;3; Zhang&#8217;s experiment is live evidence from the Era 3 transition.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Fvjc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Fvjc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Fvjc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Fvjc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Fvjc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Fvjc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1423359,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Fvjc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Fvjc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Fvjc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Fvjc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1115de4-60a4-4a5b-82e5-2a5aebd6c9a5_3822x2134.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Move 37 and What We Are Actually Building Toward</h2><p style="text-align: justify;">Zhang ends his post with an aspiration borrowed from the history of Go. In 2016, AlphaGo played a move &#8212; Move 37 in its second game against Lee Sedol &#8212; that no human would have chosen. It looked wrong. It turned out to be decisive. The machine had found a strategy that simply did not exist in the human repertoire.</p><p style="text-align: justify;">Zhang wants an RLM system that reaches its own Move 37: decisions <em>&#8220;we do not understand, but which are significantly better than the decompositions we come up with.&#8221;</em> He is honest that we are not there yet &#8212; and that in the short term, human-written steering remains the practical approach.</p><p style="text-align: justify;">I think this is exactly right, and it maps cleanly onto the latent-direction trajectory I described in my article. <strong>The SKILL.md file is the bridge between where we are and where we are going. It is the interface through which human expertise and machine discovery converge.</strong></p><blockquote><p style="text-align: justify;">But Zhang&#8217;s experiment adds a warning that I did not emphasise enough: <em><strong>the bridge has a direction. You cross it toward reduced instruction and greater model latitude. You do not fortify it indefinitely. Organisations that treat SKILL.md files as a permanent architecture rather than a transitional scaffold will find themselves in the below-band trap &#8212; prescribing procedures to a system that has grown beyond them.</strong></em></p></blockquote><p><em>&#8220;The recipe file is a bridge, not a fortress. It has a direction. You cross it toward reduced instruction &#8212; you do not fortify it indefinitely.&#8221;</em></p><h2>What I Am Taking Forward</h2><p style="text-align: justify;">Publishing The Recipe For Thinking, <em><strong>I was confident in the core claim: structured skill files democratise domain expertise and reduce the cost of building capable agentic systems. I remain confident.</strong></em> The experiment Zhang describes is evidence for that thesis, not against it &#8212; <em><strong>the harness-engineered result beat the baseline overall</strong></em>.</p><p style="text-align: justify;">But the experiment also taught me to be more precise about three things:</p><p>&#8226; Where in the competence band your task sits before you decide how much structure to impose.</p><p>&#8226; Whether the skill file you are writing is adding context the model lacks, or overwriting competence it already has.</p><p>&#8226; Whether you are treating your SKILL.md as a living document that the model itself should help evolve, or as a fixed specification that humans maintain alone.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YLo1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YLo1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!YLo1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!YLo1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!YLo1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YLo1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1446932,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://interestingengineering.substack.com/i/196019662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!YLo1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg 424w, https://substackcdn.com/image/fetch/$s_!YLo1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg 848w, https://substackcdn.com/image/fetch/$s_!YLo1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!YLo1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22414e65-f7ac-4796-b1b4-5feb1e0311ac_3822x2134.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Zhang and Khattab&#8217;s Mismanaged Geniuses Hypothesis is, at its core, a call for intellectual honesty about what constrains AI performance. More often than we admit, the constraint is us &#8212; our prompts, our instructions, our assumptions about what the model needs to be told.</p><p style="text-align: justify;"><em><strong>The best skill file, in the end, may be the one that knows when to get out of the way</strong></em>.</p><h3><strong>References</strong></h3><p><strong>[1] </strong>Zhang, A. &amp; Khattab, O. &#8220;A Mini Exercise on the Mismanaged Geniuses Hypothesis (RLMs on LongCoT).&#8221; Personal research blog, April 26, 2026. <a href="https://alexzhang13.github.io/blog/2026/longcot-rlm/">https://alexzhang13.github.io/blog/2026/longcot-rlm/</a></p><p><strong>[2] </strong>Motwani, S., et al. &#8220;LongCoT: A Long-Horizon Compositional Reasoning Benchmark.&#8221; arXiv preprint arXiv:2604.14140, 2026. <a href="https://arxiv.org/abs/2604.14140">https://arxiv.org/abs/2604.14140</a></p><p><strong>[3] </strong>Zhou, S., et al. &#8220;SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills.&#8221; arXiv preprint arXiv:2504.07079, 2025. <a href="https://arxiv.org/abs/2504.07079">https://arxiv.org/abs/2504.07079</a></p><p><strong>[4] </strong>Wang, L., et al. &#8220;Beyond Pipelines: A Survey of the Paradigm Shift toward Model-Native Agentic AI.&#8221; arXiv preprint arXiv:2510.16720, 2025. <a href="https://arxiv.org/abs/2510.16720">https://arxiv.org/abs/2510.16720</a></p><p><strong>[5] </strong>&#8220;<a href="https://interestingengineering.substack.com/p/the-recipe-for-thinking">The Recipe for Thinking: How AI Agents Learned to Follow Instructions &#8212; and Why They&#8217;re About to Forget Them.</a>&#8221; </p><p>[6] Reading List and References from Alex L Zhang&#8217;s blog:</p><ul><li><p>Trajectories and visualizer for the main experiment above: <a href="https://github.com/alexzhang13/longcot-mini-rlm-results">https://github.com/alexzhang13/longcot-mini-rlm-results</a></p></li><li><p>LongCoT Dataset: <a href="https://huggingface.co/datasets/LongHorizonReasoning/longcot">https://huggingface.co/datasets/LongHorizonReasoning/longcot</a></p></li><li><p>LongCoT Repository: <a href="https://github.com/LongHorizonReasoning/longcot">https://github.com/LongHorizonReasoning/longcot</a></p></li><li><p>LongCoT paper: <a href="https://arxiv.org/abs/2604.14140">https://arxiv.org/abs/2604.14140</a></p></li><li><p>Raymond Weitekamp&#8217;s blog on RLMs: <a href="https://raw.works/longcot-a-benchmark-worthy-of-a-rlms-attention/">https://raw.works/longcot-a-benchmark-worthy-of-a-rlms-attention/</a></p></li><li><p>Recursive Language Models (RLM) paper: <a href="https://arxiv.org/abs/2512.24601">https://arxiv.org/abs/2512.24601</a></p></li><li><p>My RLM implementation: <a href="https://github.com/alexzhang13/rlm">https://github.com/alexzhang13/rlm</a></p></li><li><p>Prime Intellect&#8217;s RLM implementation in verifiers: <a href="https://github.com/PrimeIntellect-ai/verifiers/blob/main/verifiers/envs/experimental/rlm_env.py">https://github.com/PrimeIntellect-ai/verifiers/blob/main/verifiers/envs/experimental/rlm_env.py</a></p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://interestingengineering.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Interesting Engineering++! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item></channel></rss>