
The journey from simple code completion tools to autonomous AI factories represents one of the most rapid technological evolutions in software engineering history. Between November 2022 and February 2026, the field progressed through ten distinct phases, each attempting to answer Ulderico Cirello's fundamental question: "How can I stop doing the work myself?"
This research documents the major milestones, key players, and conceptual shifts that led to the emergence of AI factories. Ulderico Cirello's Sgai represents a significant contribution to this lineage, arriving at a pivotal moment in February 2026 as the field transitions from orchestrated factories to fully autonomous "dark factories."
Agent Swarms <- Just Throw Tokens into the Problem
Orchestrators <- Human is almost entirely removed from the mix
→ Dark Factories (Specs-driven, lights-out - ideal when you know exactly what you want)
→ Complex Orchestrators (Fine-grained control, sophisticated workflows - ideal for teams requiring that level of control)
→ Interactive Factories (Conversational development - ideal when you discover what you want by seeing it)
Tool Helps You Drive Multiple AI Agents <- My Job In The New Age is to Manage Agents
Extend and TurboCharge the tool <- Human is very much present in the mix
Launch: March 12, 2024
Creator: Cognition Labs (Scott Wu, founder)
Tagline: "The world's first fully autonomous AI software engineer"
Devin AI emerged as the first major commercial product claiming full autonomy in software engineering. The platform could write, run, test, and deploy code independently, setting new state-of-the-art performance on the SWE-bench coding benchmark. Developed in approximately five months, Devin represented a brute-force approach to autonomy: throw sufficient computational resources at the problem until it works.
The platform's significance lies in establishing "AI software engineer" as a product category and demonstrating commercial viability. Enterprise deployments at Goldman Sachs and Visma validated the market, while backing from major investors including DoorDash founder Tony Xu signaled Silicon Valley's confidence in autonomous coding.
However, Devin exemplified the "just throw tokens at the problem" approach that Ulderico Cirello identified as a category. High computational costs, questions about actual autonomy versus marketing claims, and limitations imposed by the era's context windows constrained its practical application. Nevertheless, Devin sparked an industry-wide race to build similar tools and established benchmarks for autonomous agent performance.
Launch: Founded 2023, GA May 28, 2025
Creators: Matan Grinberg and Eno Reyes
Tagline: "Agent-Native Software Development"
Factory AI introduced "Droids"—autonomous AI agents that automate coding, testing, and deployment across the entire software development lifecycle. The platform integrated directly into IDEs and CI/CD pipelines, positioning itself as a complete SDLC solution rather than a mere coding assistant.
With $50M in Series B funding at a $300M valuation from NEA, Sequoia Capital, J.P. Morgan, and Nvidia, Factory demonstrated strong enterprise appetite for autonomous coding platforms. The company's leadership on the Terminal-Bench benchmark and emphasis on fine-grained controls and guardrails addressed enterprise concerns about safety and reliability.
Factory's influence extends beyond its technical capabilities. By establishing "agent-native development" as a paradigm and validating the business model for AI coding platforms, Factory helped legitimize the entire category. The platform showed that enterprises were willing to fundamentally rethink their development processes to incorporate autonomous agents.
Launch: Founded January 2025, public launch September 1, 2025
Creators: Chamath Palihapitiya (founder), Sina Sojoodi (co-founder)
Tagline: "Software Factory" / "AI-native SDLC orchestration platform"
8090 AI distinguished itself by extending beyond developers to include product managers, designers, engineers, and QA in its AI-native SDLC orchestration platform. Integration with external development tools via MCP (Model Context Protocol) provided extensibility, while the Planner module enabled task orchestration across the entire team.
Chamath Palihapitiya's high-profile involvement brought significant attention, particularly after he criticized 2025 as a "Year of Letdowns" in AI before launching 8090 on September 1, 2025. The platform emphasized reliability over flashiness, positioning itself as enterprise software replacement rather than proof-of-concept.
Critics dismissed 8090 as "vague and quite ordinary," questioning whether it represented true innovation or merely requirements management with AI branding. Nevertheless, 8090's influence lies in bringing business and product perspectives to the AI factory concept and demonstrating that autonomous agents could serve entire organizations, not just engineering teams.
Launch: January 1, 2026
Creator: Steve Yegge
Tagline: "Gas Town" (deliberately provocative)
Steve Yegge's GasTown represents the first system that Ulderico Cirello's timeline identifies as a "notorious Factory." Built on Yegge's Beads memory framework, GasTown coordinates 20-30 Claude Code instances in parallel, with each agent operating in an isolated git worktree.
The Hook system enforces autonomous execution with a simple principle: "If there is work on your Hook, YOU MUST RUN IT." This architectural decision removes waiting for external input, enabling truly autonomous distributed work. Yegge described the system as a "CNC machine" for code, with agents working like "residents of a bustling frontier town."
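The Hook principle can be illustrated with a minimal sketch. The names below (`Hook`, `Agent`, `run_until_idle`) are hypothetical and are not taken from GasTown's code; the sketch only assumes that each agent owns a queue of pending work and never waits for outside input while that queue is non-empty.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


@dataclass
class Hook:
    """Hypothetical per-agent work queue (not GasTown's actual data structure)."""
    pending: Deque[str] = field(default_factory=deque)

    def attach(self, task: str) -> None:
        self.pending.append(task)


@dataclass
class Agent:
    name: str
    hook: Hook = field(default_factory=Hook)
    done: List[str] = field(default_factory=list)

    def run_until_idle(self, execute: Callable[[str], None]) -> int:
        """Drain the hook: while work is present, run it without asking anyone."""
        count = 0
        while self.hook.pending:
            task = self.hook.pending.popleft()
            execute(task)
            self.done.append(task)
            count += 1
        return count


if __name__ == "__main__":
    agent = Agent("worker-1")
    agent.hook.attach("fix failing test")
    agent.hook.attach("update changelog")
    agent.run_until_idle(lambda task: print(f"{agent.name} runs: {task}"))
```

The point of the design is that the only reason for an agent to stop is an empty hook, so throughput depends on how work is distributed rather than on anyone answering a prompt.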
The deliberately provocative naming—Yegge described it as "lifting my leg and ripping a fart" in the "Church of Public Opinion on AI-Assisted Coding"—reflected his conviction that the industry had been approaching AI-assisted coding incorrectly. GasTown's influence lies in proving that coordination matters more than raw agent capability and demonstrating that autonomous distributed work was not just theoretically possible but practically achievable.
Launch: Initial release 2025, v2.0.0 Alpha July 5, 2025
Creator: Reuven Cohen (ruvnet)
Tagline: "The leading agent orchestration platform for Claude"
Claude Flow transformed Claude Code from a single agent into a comprehensive multi-agent development platform. The 64-agent system provided enterprise-grade orchestration with hive-mind swarm intelligence, workflow orchestration supporting parallel execution and dependency management, and memory persistence across sessions.
The MCP server with 87 tools demonstrated the breadth of capabilities possible in an agent ecosystem. WASM integration provided performance optimization, while neural networks enabled sophisticated coordination. Integration with Flow Nexus and ruv-swarm created a gamified development environment that extended beyond pure coding.
Claude Flow's influence lies in establishing patterns for agent coordination and demonstrating the path from simple agents to complex systems. The platform showed that enterprise-scale orchestration was achievable and created an ecosystem around Claude Code that extended its capabilities far beyond Anthropic's original vision.
Launch: February 2026 (published ~February 7, 2026)
Creator: StrongDM AI team
Tagline: "First Notorious DARK Factory" (per Ulderico's timeline)
Attractor represents the culmination of the AI factory evolution: a true "dark factory" where humans are completely removed from the implementation loop. StrongDM's revolutionary decision to release Attractor as NLSpecs (Natural Language Specifications) rather than code created a meta-innovation—specifications for building your own software factory.
The DOT-based pipeline runner uses directed graphs (Graphviz syntax) to orchestrate multi-stage AI workflows. An autonomous agent writes implementation code while the DTU (verification system) runs thousands of tests, iterating until convergence. StrongDM described this as "lights-out" development where humans are banned from writing code.
The emphasis on verification and testing, with agentic testing driving convergence through iteration, addresses the fundamental challenge of autonomous systems: how do you ensure quality without human review? Attractor's answer is exhaustive automated testing with iterative refinement.
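As a rough illustration of the two ideas above, the sketch below orders the stages of a small DOT graph and then repeats an attempt until no tests fail. It is not derived from Attractor's NLSpecs: the stage names, the `stage_order` and `iterate_until_green` helpers, and the restriction to simple `a -> b` edges are all assumptions made for the example.

```python
import re
from graphlib import TopologicalSorter
from typing import Callable, List

# Hypothetical pipeline; stage names are illustrative only.
PIPELINE_DOT = """
digraph pipeline {
    spec -> implement;
    implement -> test;
    test -> release;
}
"""

EDGE = re.compile(r"(\w+)\s*->\s*(\w+)")


def stage_order(dot: str) -> List[str]:
    """Return stages in dependency order; handles simple `a -> b` edges only."""
    sorter: TopologicalSorter = TopologicalSorter()
    for src, dst in EDGE.findall(dot):
        sorter.add(dst, src)  # dst depends on src
    return list(sorter.static_order())


def iterate_until_green(attempt: Callable[[int], int], max_rounds: int = 10) -> int:
    """Call attempt(round_no) until it reports zero failing tests.

    Returns the number of rounds used; raises if the cap is reached first.
    """
    for round_no in range(1, max_rounds + 1):
        if attempt(round_no) == 0:
            return round_no
    raise RuntimeError("did not converge within max_rounds")


if __name__ == "__main__":
    print(stage_order(PIPELINE_DOT))
    failures_per_round = [3, 1, 0]  # made-up numbers for the demo
    print(iterate_until_green(lambda n: failures_per_round[n - 1]))
```

The `max_rounds` cap is a deliberate choice in the sketch: a loop that only stops on success needs some bound, otherwise a specification the agent cannot satisfy would run forever.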
While some observers questioned whether the "dark factory" concept truly works at scale, Attractor's influence lies in pushing the boundaries of what autonomous systems can achieve and inspiring debate about the human role in software development. The meta-approach of distributing specifications rather than implementation code suggests a future where development methodologies themselves become open-source artifacts.
Launch: February 2026 (v0.0.0-20260209225942 published February 2, 2026)
Creator: Ulderico Cirello
Tagline: "Sandgarden AI Software Factory"
Ulderico Cirello's Sgai represents a synthesis of the lessons learned throughout the AI factory evolution. The platform's elegant simplicity—centered on a single GOAL.md file where developers define their objectives—reflects deep understanding of what developers actually need: clear abstraction rather than additional complexity.
The web dashboard provides transparency into multi-agent collaboration, addressing the "black box" concern that plagued earlier autonomous systems. GitHub integration with native support for branches and pull requests ensures that Sgai fits naturally into existing developer workflows. The rich Markdown editor for goal editing makes the interface accessible while maintaining precision.
Sgai's significance lies in its positioning at a pivotal moment in the AI factory evolution. Released in February 2026 alongside Attractor and the Codex app, Sgai represents Ulderico's answer to the fundamental question that has driven this entire evolution: "How can I stop doing the work myself?" The answer: define your goals clearly, and let coordinated agents handle the implementation.
Launch: Founded 2021, public launch July 15, 2025
Creators: Gabriel Gordon-Hall and Louis Knight-Webb (London, UK)
Tagline: "Orchestrate AI Coding Agents" / "Get 10X more out of Claude Code"
VibeKanban brought a familiar interface—the Kanban board—to the novel challenge of managing multiple AI coding agents. The Y Combinator-backed platform supported Claude Code, Codex, Gemini CLI, and Amp, with real-time monitoring and control over agent execution.
The platform's significance lies in making multi-agent management accessible through a familiar UI pattern. By the time of launch, over 50% of VibeKanban's own code was written by AI agents—a compelling demonstration of the platform's capabilities. The emphasis on planning, review, and safe execution addressed concerns about autonomous agents running unchecked.
VibeKanban validated "agent manager" as a viable role for developers and demonstrated that orchestration tools could have significant market demand. The platform's open-source nature and active community contributed to the broader ecosystem of agent orchestration tools.
Launch: Founded 2024, public launch October 2025
Creators: Jackson de Campos and Charlie Holtz (San Francisco)
Tagline: "Run a team of coding agents on your Mac"
Conductor offered a polished Mac-native experience for running multiple Claude Code or Codex agents in parallel. Each agent operated in an isolated git worktree, providing elegant separation of concerns. The platform worked entirely on the local Mac, cloning repositories and managing workspaces without external dependencies.
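The worktree-per-agent pattern can be sketched with the standard `git worktree add` command. This is an illustration under stated assumptions, not Conductor's implementation: the directory layout, the `agent/<id>` branch naming, and the helper names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def worktree_command(repo: Path, agent_id: str, base: str = "main") -> List[str]:
    """Build the `git worktree add` command for one agent.

    The sibling "<repo>-worktrees" directory and the branch name are
    illustrative assumptions.
    """
    workdir = repo.parent / f"{repo.name}-worktrees" / agent_id
    branch = f"agent/{agent_id}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(workdir), base]


def create_worktree(repo: Path, agent_id: str, base: str = "main") -> None:
    """Create the worktree; raises CalledProcessError if git fails."""
    subprocess.run(worktree_command(repo, agent_id, base), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so the demo has no side effects.
    for agent in ("agent-1", "agent-2"):
        print(" ".join(worktree_command(Path("/tmp/demo"), agent)))
```

Because each agent gets its own checkout and branch, parallel agents cannot overwrite each other's uncommitted files, and their results can be merged or discarded independently.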
The "conductor" metaphor resonated with developers: humans orchestrated AI teams rather than writing code themselves. Integration with Gemini CLI for context-driven development and real-time visibility into agent activity provided the control developers needed to trust autonomous agents.
Conductor's influence lies in demonstrating the value of native app experiences and establishing the git worktree pattern that other tools adopted. The platform showed the path from terminal-based tools to polished GUI applications and validated agent management as a product category worthy of dedicated applications.
Launch: February 2, 2026
Creator: OpenAI
Tagline: "A powerful command center for building with agents"
OpenAI's Codex app represented the company's answer to Claude Code's desktop experience. The macOS application provided a focused interface for managing multiple agents running in parallel across projects and within the same codebase. Built-in worktree support, cloud environments, and Git functionality created a comprehensive development environment.
The "command center" positioning reflected a fundamental shift in how OpenAI viewed developer tools. Rather than chat interfaces or code completion, the Codex app positioned developers as commanders of agent teams. Integration with the broader Codex ecosystem (CLI, IDE, app) and support for GPT-5.3-Codex (billed as the most capable agentic coding model at the time) provided a complete solution.
The Codex app's significance lies in OpenAI's validation of the multi-agent paradigm and establishment of "command center" as a product category. By demonstrating that agent management was core to developer workflow rather than an experimental add-on, OpenAI legitimized the entire category and set standards for desktop agent experiences.
Launch: October 2025, major updates January 2026
Creator: Jesse Vincent (obra)
Tagline: "An agentic skills framework & software development methodology that works"
Jesse Vincent's Superpowers took a different approach to enhancing coding agents: rather than building new orchestration platforms, it extended existing tools through a skills framework. The plugin system for Claude Code (with manual setup for Codex and OpenCode) provided a core skills library with 42+ official documentation files and specialized capabilities like Chrome browser control via DevTools Protocol.
The emphasis on methodology rather than just tools distinguished Superpowers from other projects. The developing-for-Claude-Code skill enabled plugin development, while the experimental skills lab tested new techniques. Community-editable skills (511 stars, 118 forks on GitHub) demonstrated active ecosystem participation.
Superpowers' influence lies in establishing the skills framework pattern and demonstrating that extensibility could be as valuable as orchestration. The platform showed a path to "supercharging" existing tools rather than replacing them, and the community-driven development model created a marketplace of capabilities that developers could mix and match.
Launch: December 2025 - January 2026
Creator: Kim Yeon-gyu (code-yeongyu)
Tagline: "the best agent harness" / "This is coding on steroids"
Kim Yeon-gyu's Oh-My-OpenCode demonstrated that sophisticated orchestration didn't require a separate platform—it could be a plugin layer on top of existing tools. Running on top of OpenCode, Oh-My-OpenCode enabled multi-agent coordination with background agents, pre-built LSP/AST/MCP tools, and curated hooks and automations.
The failure tracking system (two strikes, continuation stops) addressed a common problem with autonomous agents: knowing when to give up. The ability to build features from descriptions and work with ChatGPT subscriptions made sophisticated multi-agent orchestration accessible to developers without enterprise budgets.
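A two-strikes rule of this kind can be sketched in a few lines. The `StrikeTracker` class is hypothetical and does not reflect Oh-My-OpenCode's code; in particular, counting strikes per task and clearing them on success are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StrikeTracker:
    """Stop retrying a task once it has failed `limit` times in a row."""
    limit: int = 2
    strikes: Dict[str, int] = field(default_factory=dict)

    def record(self, task_id: str, succeeded: bool) -> bool:
        """Record an outcome; return True if the agent may keep working on the task."""
        if succeeded:
            self.strikes.pop(task_id, None)  # a success clears the count
            return True
        self.strikes[task_id] = self.strikes.get(task_id, 0) + 1
        return self.strikes[task_id] < self.limit


if __name__ == "__main__":
    tracker = StrikeTracker()
    print(tracker.record("build", succeeded=False))  # first strike: continue
    print(tracker.record("build", succeeded=False))  # second strike: stop
```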
Oh-My-OpenCode's influence lies in demonstrating the plugin-as-orchestrator pattern and showing that OpenCode's extensibility could support enterprise-grade multi-agent coordination. The international contribution (from a Korean developer disrupting the AI coding tool market) highlighted the global nature of the AI factory evolution.
Launch: Originated mid-2025, went viral late December 2025 / early January 2026
Creator: Geoffrey Huntley
Tagline: "Ship working code while you sleep"
The Ralph Wiggum technique represents one of the most influential innovations in the AI factory evolution, despite being conceptually simple. Named after the simple-minded character from The Simpsons, the technique embraced a counterintuitive insight: persistence matters more than sophistication.
At its core, Ralph Wiggum is a bash loop that feeds an AI agent's output (errors and all) back into itself until a stop condition is met. Each iteration is a fresh session with no long-term memory; the agent only sees previous work via git history. The technique supported Claude Code, OpenCode, Codex, Cursor, and Amp, and could read tasks from PRD files, YAML, or GitHub Issues.
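The loop described above is small enough to sketch directly. The version below is written in Python rather than bash so that it can be tested in isolation; `ralph_loop`, `run_agent`, the prompt wording, and the stop condition are illustrative assumptions, and no real agent CLI or its flags are implied.

```python
import subprocess
from typing import Callable, List


def run_agent(command: List[str], prompt: str) -> str:
    """Run one fresh agent session; return stdout and stderr combined."""
    result = subprocess.run(command, input=prompt, capture_output=True, text=True)
    return result.stdout + result.stderr


def ralph_loop(
    task: str,
    step: Callable[[str], str],
    is_done: Callable[[str], bool],
    max_iterations: int = 50,
) -> int:
    """Feed each iteration's output (errors included) into the next prompt.

    Returns the number of iterations used; raises if the cap is reached.
    """
    prompt = task
    for iteration in range(1, max_iterations + 1):
        output = step(prompt)
        if is_done(output):
            return iteration
        prompt = f"{task}\n\nOutput of the previous attempt:\n{output}"
    raise RuntimeError("stop condition not met within max_iterations")


if __name__ == "__main__":
    # Stand-in for a real agent: fails twice, then succeeds.
    canned = iter(["error: tests failed", "error: lint failed", "DONE"])
    print(ralph_loop("Implement the task list", lambda p: next(canned), lambda o: o == "DONE"))
```

In real use, `step` would wrap `run_agent` with whatever command starts the chosen agent. The sketch keeps no state between iterations beyond the last output, which mirrors the fresh-session behavior; anything durable is expected to live in the repository itself.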
The economics were stunning: one frequently cited example completed a $50,000 contract for $297 in API costs, a cost reduction of more than 99%. The technique enabled autonomous overnight coding, with developers setting tasks before bed and waking to completed work.
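The percentage follows directly from the two figures quoted above. The helper name `cost_reduction` is only for illustration.

```python
def cost_reduction(baseline_usd: float, actual_usd: float) -> float:
    """Fractional saving relative to the baseline price."""
    if baseline_usd <= 0:
        raise ValueError("baseline must be positive")
    return 1.0 - actual_usd / baseline_usd


if __name__ == "__main__":
    print(f"{cost_reduction(50_000, 297):.2%}")  # prints 99.41%
```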
The Ralph Wiggum technique's influence extends far beyond its technical implementation. As VentureBeat noted, it became "the biggest name in AI" through a meta-twist: it didn't just generate code, it generated a market. The technique inspired tools like Loopy and Laravel Ralph, and the "everything is a ralph loop" meme captured a broader realization about the power of simple iteration.
Its cultural impact was equally significant. Ralph Wiggum changed the conversation from "smart agents" to "persistent agents" and demonstrated that autonomous overnight coding was achievable through relentless iteration rather than sophisticated intelligence.
The first attempts to reduce developer workload came through machine learning-based code completion. Tabnine, launched in 2019, pioneered local model deployment for autocomplete functionality. Kite, another early entrant, shut down in 2022—a cautionary tale of being too early to market. These tools answered the question with "Let AI finish my sentences," but remained limited to line-by-line suggestions without broader context.
GitHub Copilot marked a watershed moment when it launched as a technical preview on June 29, 2021, reaching general availability on June 21, 2022. Developed by GitHub and OpenAI, Copilot introduced the metaphor of an "AI pair programmer" and achieved remarkable adoption: 400,000 subscribers in its first month alone. This established the expectation that AI could be a collaborative coding partner rather than just an autocomplete tool.
Cursor, founded by Anysphere in 2022 and publicly launched in January 2023, took a different approach by creating an AI-native code editor forked from VS Code. Rather than being a plugin, Cursor demonstrated that AI could be fundamental to the editor experience itself. By October 2025, with the release of Cursor 2.0 and Composer, the platform had evolved into a sophisticated AI-native development environment.
Windsurf, launched by Codeium on November 14, 2024, coined the term "agentic IDE" with its Cascade agent that provided deep codebase understanding. Positioning itself as "the first AI agent-powered IDE that keeps developers in the flow," Windsurf represented the bridge between pair programming assistants and truly autonomous agents.
The release of GPT-4 on March 14, 2023, supercharged experimentation with autonomous agents. AutoGPT, released on March 30, 2023, by Toran Bruce Richards of Significant Gravitas Ltd, became the first viral autonomous agent experiment. It could break down objectives into sub-tasks and execute them independently, though it often went in circles and consumed tokens at alarming rates.
BabyAGI, created by Yohei Nakajima shortly after AutoGPT in April 2023, offered a cleaner architecture with three specialized agents: execution, task creation, and prioritization. While more educational and better structured than AutoGPT, it still struggled with reliability and practical application. These experiments proved the concept of autonomous agents but revealed that autonomy alone was insufficient—orchestration would be essential.
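The three-role structure can be sketched as a loop over a task queue. This is a simplified illustration, not BabyAGI's code: the function name, the callback signatures, and the `max_steps` cap are assumptions, and in a real system each callback would be a language-model call.

```python
from collections import deque
from typing import Callable, Deque, List


def run_task_loop(
    objective: str,
    first_task: str,
    execute: Callable[[str, str], str],
    create: Callable[[str, str, str], List[str]],
    prioritize: Callable[[str, List[str]], List[str]],
    max_steps: int = 5,
) -> List[str]:
    """Alternate the three roles; return the tasks executed, in order."""
    queue: Deque[str] = deque([first_task])
    executed: List[str] = []
    while queue and len(executed) < max_steps:
        task = queue.popleft()
        result = execute(objective, task)                    # execution role
        queue.extend(create(objective, task, result))        # task-creation role
        queue = deque(prioritize(objective, list(queue)))    # prioritization role
        executed.append(task)
    return executed


if __name__ == "__main__":
    done = run_task_loop(
        "ship a demo",
        "start",
        execute=lambda obj, task: task.upper(),
        create=lambda obj, task, res: ["b", "a"] if task == "start" else [],
        prioritize=lambda obj, tasks: sorted(tasks),
    )
    print(done)
```

The `max_steps` cap reflects the reliability problem noted above: without a bound, a task-creation step that keeps producing new work never lets the loop finish.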
OpenCode emerged in June-July 2024 as an open-source alternative to proprietary coding agents. Created by Jay, Frank, and Dax Raad, OpenCode distinguished itself through its terminal-first architecture and model-agnostic approach. Developers could swap between providers, use local models, or connect to custom endpoints—a flexibility that proprietary tools lacked. By January 2026, it was being called "the most popular open source coding agent in the world," with debates raging about whether it was "twice as good as Claude Code" or whether "Claude wrote better code."
Devin AI, launched on March 12, 2024, by Cognition Labs, marked the beginning of the "headless agents" era. Branded as "the world's first fully autonomous AI software engineer," Devin represented a brute-force approach: deploy massive computational resources to achieve autonomy. Founded by Scott Wu and backed by major investors including DoorDash founder Tony Xu, Devin set new benchmarks on the SWE-bench coding assessment and secured enterprise deployments at Goldman Sachs and Visma.
Factory AI, founded in 2023 and reaching general availability on May 28, 2025, introduced the concept of "agent-native software development" with its "Droids" that automated coding, testing, and deployment across the entire SDLC. With $50M in Series B funding at a $300M valuation from NEA, Sequoia Capital, J.P. Morgan, and Nvidia, Factory demonstrated strong enterprise appetite for autonomous coding platforms.
8090 AI, founded in January 2025 by Chamath Palihapitiya and publicly launched on September 1, 2025, positioned itself as an "AI-native SDLC orchestration platform" that extended beyond developers to include PMs, designers, and QA. After Chamath criticized 2025 as a "Year of Letdowns" in AI, 8090's Software Factory emphasized reliability over flashiness, though critics dismissed it as "vague and quite ordinary."
These agent swarms shared a common limitation: they were constrained by the context windows of their era. GPT-4's initial 8K context (later expanded to 32K and eventually 128K) simply wasn't enough for real-world software projects. The approach worked in demos but struggled in production.
Two pivotal releases in early 2025 transformed the landscape. Claude Code launched on February 24, 2025, as a research preview (reaching general availability on May 22, 2025). Developed by Anthropic, Claude Code was an agentic coding tool that could read codebases, edit files, and run commands across terminal, IDE, browser, and desktop environments. Its plugin marketplace and extensibility made it the foundation for numerous orchestration tools.
The true inflection point came with GPT-5, announced on July 24, 2025, and launched on August 7, 2025. With a context window of 256K tokens (some sources reported up to 400K), GPT-5 represented what Ulderico Cirello's timeline marks as the "OH SHIT, I CAN USE THIS" moment. The dramatic expansion in context window made autonomous agents practical rather than merely interesting. Multi-file, multi-agent workflows suddenly became viable for real-world applications.
With capable foundation models finally available, the focus shifted to managing multiple agents effectively. VibeKanban, founded in 2021 by Gabriel Gordon-Hall and Louis Knight-Webb and launched publicly on July 15, 2025, brought a familiar Kanban board interface to agent orchestration. The Y Combinator-backed platform supported Claude Code, Codex, Gemini CLI, and Amp, positioning developers as "agent managers" rather than coders. By the time of launch, over 50% of VibeKanban's own code was written by AI agents.
Conductor, founded in 2024 by Jackson de Campos and Charlie Holtz and launched in October 2025, offered a polished Mac-native experience for running multiple Claude Code or Codex agents in parallel. Each agent operated in an isolated git worktree, providing elegant separation of concerns. The "conductor" metaphor resonated: humans orchestrated AI teams rather than writing code themselves.
OpenAI's Codex App, released on February 2, 2026, represented OpenAI's answer to the Claude Code desktop experience. Positioned as "a powerful command center for building with agents," it featured built-in worktree support, cloud environments, and automations. The Codex app validated multi-agent orchestration as a core developer workflow and established "command center" as a product category.
While sophisticated orchestration platforms were emerging, a simpler approach was gaining traction. The Ralph Wiggum technique, created by Geoffrey Huntley and first mentioned in June 2025, went viral in late December 2025 and early January 2026. Named after the simple-minded character from The Simpsons, the technique embraced a counterintuitive insight: persistence matters more than sophistication.
Ralph Wiggum is essentially a bash loop that feeds an AI agent's output (errors and all) back into itself until a stop condition is met. Each iteration is a fresh session with no long-term memory; the agent only sees previous work via git history. The technique supported Claude Code, OpenCode, Codex, Cursor, and Amp, and could read tasks from PRD files, YAML, or GitHub Issues.
The economics were stunning: one frequently cited example completed a $50,000 contract for $297 in API costs, a cost reduction of more than 99%. The technique's viral spread reflected a broader realization: autonomous overnight coding was achievable not through sophisticated intelligence but through relentless iteration. As VentureBeat noted, Ralph Wiggum became "the biggest name in AI" through a meta-twist: it didn't just generate code, it generated a market.
GasTown, released by Steve Yegge on January 1, 2026, marked what Ulderico Cirello's timeline identifies as the "first notorious Factory." The legendary engineer (formerly at Amazon and Google) built GasTown on his Beads memory framework to coordinate 20-30 Claude Code instances in parallel. Each agent operated in an isolated worktree, and the Hook system enforced autonomous execution: "If there is work on your Hook, YOU MUST RUN IT."
Yegge's deliberately provocative naming—describing GasTown as "lifting my leg and ripping a fart" in the "Church of Public Opinion on AI-Assisted Coding"—reflected his conviction that coordination mattered more than raw agent capability. The metaphor of agents as "residents of a bustling frontier town" captured the distributed, autonomous nature of the system. GasTown proved that autonomous distributed work was not just possible but practical.
Claude Flow, created by Reuven Cohen (ruvnet), evolved through multiple versions in 2025, with v2.0.0 Alpha launching on July 5, 2025. The comprehensive orchestration framework featured a 64-agent system with hive-mind swarm intelligence, workflow orchestration supporting parallel execution and dependency management, and an MCP server with 87 tools. WASM integration and neural networks demonstrated enterprise-scale orchestration, while the Flow Nexus gamified development environment showed how agent ecosystems could extend beyond pure coding.
Sgai (Sandgarden AI Software Factory), created by Ulderico Cirello and published on February 2, 2026, represents Ulderico's answer to the fundamental question driving this entire evolution. Sgai's elegant interface centers on a single GOAL.md file where developers define their objectives. A web dashboard provides transparency into agent collaboration, while GitHub integration enables native workflows with branches and pull requests. The simplicity of goal-driven development—specify what you want, watch agents build it—reflects deep understanding of what developers actually need: not more complexity, but clearer abstraction.
Attractor, published by StrongDM around February 7, 2026, represents what Ulderico's timeline marks as the "First Notorious DARK Factory." In a revolutionary move, StrongDM released Attractor not as code but as NLSpecs (Natural Language Specifications)—instructions for building your own version of their software factory.
Attractor uses DOT-based pipeline runners (Graphviz syntax) to orchestrate multi-stage AI workflows. The autonomous agent writes implementation code while the DTU (verification system) runs thousands of tests, iterating until convergence. StrongDM described this as "lights-out" development where humans are banned from writing code—they write specifications, and the factory handles everything else.
The meta-innovation of distributing specifications rather than implementation code sparked intense debate. As one observer noted, "I do not think they've succeeded in getting the 'dark factory' concept to work," yet the very attempt pushed the boundaries of what autonomous systems could achieve. The emphasis on verification and testing, with agentic testing driving convergence through iteration, represented a maturation of the autonomous coding concept.
OpenAI launched ChatGPT as a free research preview on November 30, 2022, built on a model from the GPT-3.5 series, and in doing so democratized access to powerful language models. With an initial context window of 4K tokens (approximately 3,000 words, later expanded to 16K), GPT-3.5-turbo enabled early experiments with AI coding assistance.
Ulderico Cirello's timeline marks this as "Very Small Context" leading to "/squad"—early multi-agent experiments that were severely constrained by the limited context window. Nevertheless, GPT-3.5-turbo established the foundation for what would follow and created widespread awareness of AI capabilities.
GPT-4's launch on March 14, 2023, represented a significant leap in capabilities, with multimodal understanding and improved reasoning. However, the initial context window of 8K tokens (later expanded to 32K and eventually 128K with GPT-4 Turbo) still fell into what Ulderico's timeline marks as "Very Small Context."
GPT-4 supercharged the autonomous agent experiments of early 2023, enabling AutoGPT and BabyAGI. The model's improved capabilities made autonomous coding seem tantalizingly close, but the context window limitations prevented practical application. GPT-4 set the stage for the 2024 "headless agents" era but couldn't fully deliver on the promise.
GPT-5's announcement on July 24, 2025, and launch on August 7, 2025, represented the breakthrough that made AI factories practical. With a context window of 256K tokens (some sources reported up to 400K), GPT-5 provided the "Small Context" that Ulderico's timeline marks as the "OH SHIT, I CAN USE THIS" moment.
The dramatic expansion in context window enabled multi-file, multi-agent workflows for real-world applications. Pricing started at $1.25 per million input tokens and $10.00 per million output tokens, making large-scale agent orchestration economically viable. GPT-5 provided state-of-the-art performance across coding, math, and writing, with improved multimodal capabilities.
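To make the pricing concrete, the sketch below estimates the cost of a multi-agent run at the two rates quoted above. The workload in the demo (20 agents, 2 million input tokens and 200,000 output tokens each) is an invented example, not a measurement.

```python
INPUT_USD_PER_MILLION = 1.25    # rate quoted in the text
OUTPUT_USD_PER_MILLION = 10.00  # rate quoted in the text


def run_cost_usd(input_tokens: int, output_tokens: int, agents: int = 1) -> float:
    """Estimated cost in USD when each agent uses the given token counts."""
    per_agent = (
        input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    )
    return per_agent * agents


if __name__ == "__main__":
    # Hypothetical workload: 20 agents, 2M input and 200K output tokens each.
    print(f"{run_cost_usd(2_000_000, 200_000, agents=20):.2f}")  # prints 90.00
```

Output tokens cost eight times as much as input tokens at these rates, so workloads that read large codebases but write comparatively little code benefit most.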
The July 2025 announcement created the inflection point even before the August launch. Developers and companies began planning for the expanded context window, leading to the wave of agent management tools and orchestration platforms that launched in the second half of 2025.
Anthropic's Claude Code, launched on February 24, 2025, as a research preview (reaching general availability on May 22, 2025), provided an alternative foundation for AI factories. The agentic coding tool could read codebases, edit files, and run commands across terminal, IDE, browser, and desktop environments.
Claude Code's plugin marketplace and extensibility made it the foundation for numerous orchestration tools, including GasTown, Claude Flow, VibeKanban, and Conductor. The platform's emphasis on agentic behavior rather than just code completion aligned with the broader shift toward autonomous agents.
The November 2025 release with Claude Opus 4.5 further enhanced capabilities, leading to debates about whether Claude Code had "killed" competitors like Cursor. Regardless of the competitive dynamics, Claude Code's influence on the AI factory evolution was profound—it provided the substrate on which many orchestration innovations were built.
The evolution from AI assistants to AI factories represents a fundamental shift in the human-AI relationship. Assistants like GitHub Copilot suggest code while humans decide what to implement. Agents like Claude Code execute tasks while humans review the results. Autonomous agents like those in GasTown complete tasks while humans set goals. Factories like Attractor coordinate agents while humans define specifications.
This progression reflects increasing trust in AI capabilities and decreasing human involvement in implementation details. Each stage required not just better AI models but also new interaction paradigms and orchestration mechanisms.
The temporal dimension of human-AI collaboration evolved dramatically. Synchronous interaction required humans to wait for AI responses. Asynchronous workflows allowed AI to work while humans focused on other tasks. Overnight coding (exemplified by Ralph Wiggums) enabled AI to work while humans slept. Lights-out factories (exemplified by Attractor) removed humans from the execution loop entirely.
This temporal evolution enabled dramatic productivity gains. The ability to set tasks before bed and wake to completed work fundamentally changed the economics of software development. The lights-out factory concept pushed this further, suggesting that humans could define specifications and let autonomous systems handle everything else.
The organizational complexity of AI systems grew from single agents handling one task to parallel agents handling multiple tasks simultaneously. Coordinated agents shared goals and communicated, while orchestrated agents followed complex workflows with dependencies. Factory systems featured self-coordinating agent ecosystems that required minimal human intervention.
This organizational evolution required new architectural patterns. Git worktrees provided isolation for parallel agents. Hook systems enabled autonomous execution. MCP (Model Context Protocol) enabled tool integration. DOT-based pipeline runners visualized complex workflows. Each innovation built on previous patterns while adding new capabilities.
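As a rough illustration of the DOT-based pipeline idea, the sketch below describes a workflow as a dependency map, renders it as Graphviz DOT text, and derives a valid execution order. The step names, the `PIPELINE` structure, and both helper functions are hypothetical; this is not the format used by Attractor or any other tool named here.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
PIPELINE = {
    "plan": set(),
    "implement": {"plan"},
    "test": {"implement"},
    "review": {"implement"},
    "merge": {"test", "review"},
}


def to_dot(pipeline: dict[str, set[str]], name: str = "pipeline") -> str:
    """Render the dependency map as Graphviz DOT text."""
    lines = [f"digraph {name} {{"]
    for step in sorted(pipeline):
        lines.append(f'    "{step}";')
        for dependency in sorted(pipeline[step]):
            # An edge points from a prerequisite to the step that needs it.
            lines.append(f'    "{dependency}" -> "{step}";')
    lines.append("}")
    return "\n".join(lines)


def execution_order(pipeline: dict[str, set[str]]) -> list[str]:
    """Return one ordering in which every step follows its prerequisites."""
    return list(TopologicalSorter(pipeline).static_order())


print(to_dot(PIPELINE))
print(execution_order(PIPELINE))
```

Steps with no dependency between them, such as `test` and `review`, may appear in either order; those are the steps an orchestrator could hand to parallel agents, for instance in separate Git worktrees.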
The abstraction level at which humans engaged with software development rose dramatically. Hand-coding required humans to write every implementation detail. Pair programming had AI suggest code while humans wrote. Agent coding had AI write while humans reviewed. Goal-driven development had humans define goals while AI implemented. Spec-driven development had humans write specifications while AI built and verified.
This abstraction evolution reflects the fundamental answer to Ulderico Cirello's question: "How can I stop doing the work myself?" The answer evolved from "let AI help me write code" to "let AI write code I review" to "let AI achieve goals I define" to "let AI build from specifications I write."
The human role in the development process evolved from in-loop (making every decision) to on-loop (monitoring and intervening) to out-of-loop (setting goals and checking results). The dark factory concept pushed this to its logical conclusion: humans write specifications and the system runs autonomously.
This evolution required not just better AI but also better verification systems. Attractor's emphasis on agentic testing with thousands of test runs reflects the recognition that removing humans from the loop requires robust automated quality assurance.
Ulderico Cirello's work on Sgai represents a significant contribution to the AI factory lineage, arriving at a pivotal moment in February 2026. While Attractor pushed toward fully autonomous dark factories and the Codex app provided a polished command center experience, Sgai offered a third path: goal-driven orchestration with elegant simplicity.
The choice to center the interface on a single GOAL.md file reflects deep understanding of developer needs. Rather than complex configuration files, multi-step workflows, or elaborate plugin systems, Sgai asks developers to do one thing well: define their goals clearly. The web dashboard provides transparency without overwhelming complexity, while GitHub integration ensures natural workflow fit.
Sgai's positioning in Ulderico's timeline is telling. It appears alongside GasTown, Claude Flow, and Attractor in the "Orchestrators" category, where "Human is almost absolutely removed from the mix." This placement reflects Sgai's philosophy: humans should define goals, not manage implementation details.
Ulderico's framing of the core problem—"How can I stop doing the work myself?"—captures the motivation that has driven the entire AI factory evolution. His answer with Sgai is neither the brute-force "throw tokens at it" approach of early agent swarms nor the complete human removal of dark factories. Instead, Sgai represents a balanced vision: humans provide direction through clear goal specification, while coordinated agents handle the complexity of implementation.
The timing of Sgai's release in February 2026 positions it at the culmination of the AI factory evolution. With GPT-5's expanded context window making multi-agent workflows practical, with orchestration patterns established by GasTown and Claude Flow, and with the Ralph Wiggums technique demonstrating the power of persistent iteration, Sgai synthesizes these lessons into an elegant, accessible platform.
Ulderico Cirello's contribution extends beyond Sgai itself. His articulation of the problem space, his categorization of approaches (Agent Swarms, Orchestrators, Management Tools, Extensions), and his identification of key inflection points in the timeline provide a framework for understanding the broader evolution. This intellectual contribution—making sense of a rapidly evolving field—may prove as influential as any specific tool.
The AI factory evolution raises profound questions about the future of software development. If humans are removed from the implementation loop, what becomes of software engineering as a profession? The shift "From Coder to Orchestrator" suggests one answer: developers become goal-setters and quality assessors rather than code writers.
The economic implications are equally significant. The Ralph Wiggums technique's 99% cost reduction suggests that software development economics may fundamentally change. If a $50,000 contract can be completed for $297 in API costs, what happens to software pricing, developer compensation, and industry structure?
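The two dollar figures above can be checked against the "99%" claim with a few lines of arithmetic. This is only a sketch: it compares a contract price with API spend, and ignores human time spent writing prompts and reviewing output, which the text does not quantify. The function name `cost_reduction` is hypothetical.

```python
def cost_reduction(baseline: float, actual: float) -> float:
    """Return the fractional saving of `actual` relative to `baseline`."""
    return 1 - actual / baseline


CONTRACT_PRICE = 50_000.00  # dollars, as quoted in the text
API_COST = 297.00           # dollars, as quoted in the text

reduction = cost_reduction(CONTRACT_PRICE, API_COST)
ratio = CONTRACT_PRICE / API_COST

print(f"cost reduction: {reduction:.2%}")      # cost reduction: 99.41%
print(f"price-to-cost ratio: {ratio:.0f}x")    # price-to-cost ratio: 168x
```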
The verification challenge looms large. Attractor's emphasis on thousands of automated tests reflects the recognition that autonomous systems require robust quality assurance. But can automated testing truly replace human judgment about software quality, user experience, and architectural soundness?
The open-source dimension adds another layer of complexity. Attractor's release as specifications rather than code, Sgai's open-source nature, and the community-driven development of tools like Superpowers and Oh-My-OpenCode suggest that the AI factory evolution may be fundamentally open and collaborative. Yet proprietary platforms like Devin AI, Factory AI, and 8090 AI demonstrate that commercial opportunities remain significant.
The regulatory and security dimensions remain largely unexplored. What happens when autonomous agents write code that affects critical systems? Who is responsible when a dark factory produces software with security vulnerabilities? How do we audit AI-generated code for compliance with regulations?
These questions suggest that the AI factory evolution is far from complete. The period from November 2022 to February 2026 established the foundations, but the implications will unfold over years or decades. Ulderico Cirello's Sgai and his articulation of the core problem—"How can I stop doing the work myself?"—will likely be remembered as important contributions to this ongoing transformation.
The evolution from simple code completion to autonomous AI factories represents one of the most rapid technological transformations in software engineering history. In just over three years, the field progressed from Tabnine's line-by-line autocomplete to Attractor's lights-out dark factory.
This transformation required not just better AI models but also new interaction paradigms, orchestration mechanisms, and conceptual frameworks. The progression from assistants to agents to factories, from synchronous to asynchronous to overnight to lights-out, and from code to goals to specifications reflects a fundamental rethinking of the human role in software development.
Ulderico Cirello's Sgai arrives at a pivotal moment in this evolution, offering a synthesis of lessons learned: the importance of clear goal specification, the value of transparency through dashboards, and the need for natural workflow integration. By positioning Sgai within the broader historical context—from GitHub Copilot's pair programming through GasTown's notorious factory to Attractor's dark factory—we can appreciate both its specific contributions and its place in the larger trajectory.
The question Ulderico posed—"How can I stop doing the work myself?"—has driven this entire evolution. The answer has progressed from "let AI help" to "let AI do" to "let AI coordinate" to "let AI run autonomously." Sgai represents Ulderico's answer: define your goals clearly, provide transparency into agent collaboration, and integrate naturally into existing workflows.
As the field continues to evolve, the contributions documented here—from Steve Yegge's GasTown to Geoffrey Huntley's Ralph Wiggums technique to Ulderico Cirello's Sgai—will be remembered as foundational innovations that transformed software development from a human-centered craft to a human-directed, AI-executed process.