Most software companies build tools for their clients. We built something bigger first: an AI operating system called Genie that we use internally to design, scaffold, and ship entire SaaS products from scratch. It is open source, it runs on a single command, and it is the engine behind everything we build at Heinrichs Software Solutions.
What Is Genie?
Genie is a self-improving AI operating system built on Node.js. It runs 47 specialist agents across 10 departments, maintains persistent memory across sessions, tracks goals across restarts, and, from a single command, can generate a production-grade AWS SaaS system: CDK infrastructure, a Node.js backend, Stripe billing, Cognito auth, and a GitHub Actions pipeline.
It is not a chatbot wrapper. It is not a prompt library. It is a structured multi-agent runtime where agents have defined roles, share memory, call each other as tools, and improve their own output over time through a dedicated reflection engine.
The full project is open source on GitHub: github.com/Drock91/Genie
Why We Built It
When you are building AI-powered software products for small businesses, speed and quality both matter. A client who needs an AI chatbot integrated into their existing website does not want to wait six weeks. A client who needs a custom SaaS dashboard built for their team cannot afford enterprise consulting rates.
We needed a way to move fast without cutting corners. That meant solving three problems:
- Context loss — AI agents forget everything between sessions. Every conversation starts from zero.
- Single-agent bottlenecks — One generalist AI trying to write code, review security, write copy, and architect infrastructure at the same time produces mediocre output across the board.
- Repetitive infrastructure work — Setting up VPCs, ECS clusters, Aurora databases, Cognito user pools, and Stripe webhooks from scratch on every new project is tedious and error-prone.
Genie solves all three.
The 47-Agent Architecture
Genie organizes its agents into 10 departments, each with a clear lane of responsibility. There is no one agent trying to do everything.
- Executive — CEO, Strategy, and Decision agents handle high-level planning and tradeoff resolution
- Engineering — Code, DevOps, API, Frontend, Backend, Database, and Test agents divide the technical work by domain
- QA and Security — Dedicated agents for quality assurance, security review, threat modeling, and compliance audit
- Research and Finance — Market research, data analysis, budget estimation, and pricing strategy agents
- Marketing and Legal — Content, SEO, contract, and compliance agents that run in parallel with the technical work
- Intelligence Layer — Six specialized agents that manage memory, self-reflection, goal tracking, perception, and infrastructure generation
When Genie receives a task, it routes the work to the right department. Agents can call each other as tools. The ToolRegistry handles parallel execution, consensus aggregation, and result merging without the developer needing to orchestrate any of it manually.
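To make the routing idea concrete, here is a minimal sketch of a registry that fans one task out to several agents in parallel and aggregates a consensus. The `ToolRegistry` name comes from Genie; the method names and shapes below are illustrative assumptions, not Genie's actual API:

```javascript
// Sketch of parallel fan-out and naive consensus aggregation.
// Everything except the ToolRegistry name is an assumption for illustration.
class ToolRegistry {
  constructor() {
    this.agents = new Map(); // name -> async (task) => result
  }

  register(name, handler) {
    this.agents.set(name, handler);
  }

  // Run one task against several agents concurrently and collect results.
  async fanOut(agentNames, task) {
    const runs = agentNames.map(async (name) => {
      const handler = this.agents.get(name);
      if (!handler) throw new Error(`Unknown agent: ${name}`);
      return { agent: name, result: await handler(task) };
    });
    return Promise.all(runs);
  }

  // Naive consensus: return the result the most agents agree on.
  consensus(results) {
    const counts = new Map();
    for (const { result } of results) {
      counts.set(result, (counts.get(result) ?? 0) + 1);
    }
    return [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0];
  }
}
```

The key design point is that agents are just async functions behind a uniform interface, so the caller never orchestrates individual agents by hand.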
Persistent Memory: The Part Most AI Systems Skip
The most important thing Genie does differently is memory. Not conversation history. Actual structured memory that persists across sessions, gets searched automatically, and gets injected into agent context before every task runs.
Genie uses three memory types:
- Episodic memory — A JSONL log of every workflow outcome. What task ran, which agent handled it, what the result was, and how it scored. Stored for 90 days and pruned automatically.
- Semantic memory — A searchable knowledge base built from project research, documentation, and extracted lessons. Uses vector embeddings (OpenAI) with a TF-IDF fallback if no embedding API is available.
- Procedural memory — Proven strategies that worked, stored with an effectiveness score that updates on a weighted average each time they are reused. The system gets better at the things it does repeatedly.
Before any agent runs, the MemoryManager queries all three stores and injects relevant context automatically. The agent knows what worked before without being told. That is the foundation of the self-improvement loop.
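A rough sketch of how a three-store lookup and the weighted-average effectiveness update might work. The `MemoryManager` name and the three store types come from the article; the field names and keyword matching below are placeholder assumptions, not Genie's real schema:

```javascript
// Sketch of the three memory stores and the pre-task context lookup.
// Store shapes and method names are assumptions for illustration.
class MemoryManager {
  constructor() {
    this.episodic = [];   // [{ task, agent, outcome, score }]
    this.semantic = [];   // [{ topic, text }]
    this.procedural = []; // [{ strategy, effectiveness, uses }]
  }

  recordEpisode(episode) { this.episodic.push(episode); }

  addStrategy(strategy, initialScore) {
    this.procedural.push({ strategy, effectiveness: initialScore, uses: 1 });
  }

  // Weighted average: each reuse nudges the score toward the new outcome.
  reinforceStrategy(strategy, newScore) {
    const entry = this.procedural.find((s) => s.strategy === strategy);
    if (!entry) return;
    entry.effectiveness =
      (entry.effectiveness * entry.uses + newScore) / (entry.uses + 1);
    entry.uses += 1;
  }

  // Pull relevant context from all three stores before an agent runs.
  contextFor(keyword) {
    const match = (text) => text.toLowerCase().includes(keyword.toLowerCase());
    return {
      episodes: this.episodic.filter((e) => match(e.task)),
      knowledge: this.semantic.filter((s) => match(s.text)),
      strategies: this.procedural
        .filter((s) => match(s.strategy))
        .sort((a, b) => b.effectiveness - a.effectiveness),
    };
  }
}
```

In the real system the semantic lookup would be a vector or TF-IDF search rather than substring matching; the substring match here just keeps the sketch self-contained.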
The Reflection Engine
After every workflow completes, a ReflectionAgent scores the output on a 0-10 scale, identifies what worked, identifies what failed, and extracts any reusable strategy. That strategy gets written to procedural memory with an initial effectiveness score.
The next time a similar task runs, the strategy is already there. The agent does not repeat the same mistakes. Over time, across hundreds of tasks, the system builds a library of battle-tested approaches specific to the kinds of work it does for HSS clients.
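The reflection step can be sketched as a pure function over a workflow outcome. The 0-10 score and the promotion of a strategy into procedural memory come from the article; the promotion threshold and field names are assumptions, and the real ReflectionAgent uses an LLM rather than this stand-in logic:

```javascript
// Sketch of one reflection pass: clamp the score to 0-10, separate what
// worked from what failed, and promote a reusable strategy when the score
// clears a threshold. Threshold and field names are illustrative assumptions.
function reflect(outcome, promoteThreshold = 7) {
  const score = Math.max(0, Math.min(10, outcome.score));
  const reflection = {
    score,
    worked: outcome.successes ?? [],
    failed: outcome.failures ?? [],
    strategy: null,
  };
  if (score >= promoteThreshold && outcome.approach) {
    // Seed procedural memory with the score as the initial effectiveness.
    reflection.strategy = { text: outcome.approach, effectiveness: score };
  }
  return reflection;
}
```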
This is not a gimmick. It is the difference between a tool that stays static and one that compounds.
One-Command SaaS Generation
The capability we use most for client work is the SaasArchitectAgent and AwsInfraAgent combination. Given a product name and a feature list, Genie generates:
- AWS CDK TypeScript — VPC, ECS Fargate with autoscaling, Aurora PostgreSQL Multi-AZ, ElastiCache Redis, S3 and CloudFront CDN, Cognito User Pool, CloudWatch dashboards and alarms
- Node.js ESM Backend — Multi-tenant JWT auth with refresh tokens, Stripe subscriptions and webhook handling, S3 presigned uploads, SES email, Redis feature flags, Zod input validation, rate limiting, health check endpoints, and PostgreSQL data models
- DevOps pipeline — Dockerfile, docker-compose, and GitHub Actions CI/CD configured and ready to push
The output lands in my-workspace/output/<project-name>/ and is production-ready infrastructure, not pseudocode or a template. It is the actual CDK stack and the actual backend that gets deployed.
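Conceptually, the generation step maps a product spec to a concrete file tree under the output directory. The sketch below illustrates that idea only; it is not the interface of SaasArchitectAgent or AwsInfraAgent, and every path beyond the `my-workspace/output/` root is a made-up example:

```javascript
// Conceptual sketch: product spec in, file manifest out.
// Paths below the output root are illustrative, not Genie's actual layout.
function planScaffold(spec) {
  const base = `my-workspace/output/${spec.name}`;
  const files = [
    `${base}/infra/lib/${spec.name}-stack.ts`, // CDK: VPC, ECS, Aurora, Cognito
    `${base}/backend/src/server.js`,           // Node.js ESM entry point
    `${base}/backend/src/auth.js`,             // multi-tenant JWT + refresh tokens
    `${base}/.github/workflows/ci.yml`,        // GitHub Actions pipeline
    `${base}/Dockerfile`,
  ];
  // Each requested feature adds its own backend module.
  for (const feature of spec.features) {
    files.push(`${base}/backend/src/features/${feature}.js`);
  }
  return files;
}
```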
What used to take days of scaffolding work now takes minutes. That time savings goes directly into customization, testing, and client-specific features.
Multi-Provider LLM Consensus
Genie supports seven LLM providers: Groq, OpenAI, Anthropic, Google Gemini, Mistral, AI21, and xAI Grok. Only Groq is required to run the system since it has a free tier. Every other provider is optional.
For high-stakes tasks, Genie can run consensus mode: send the same prompt to multiple providers, compare outputs, and synthesize the best result. For cost-sensitive tasks, it routes to the cheapest capable model. There is a configurable monthly budget cap that prevents runaway API costs.
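A minimal sketch of cost-aware routing under a budget cap. The provider names and the cheapest-capable-model policy come from the article; the prices, class name, and method signatures are placeholder assumptions:

```javascript
// Sketch of cheapest-capable routing with a monthly budget cap.
// Prices and method names are illustrative assumptions.
class LlmRouter {
  constructor(monthlyBudgetUsd) {
    this.budget = monthlyBudgetUsd;
    this.spent = 0;
    this.providers = []; // [{ name, costPer1kTokens, capable: (task) => bool }]
  }

  addProvider(provider) { this.providers.push(provider); }

  // Route to the cheapest provider that can handle the task within budget.
  pick(task, estTokens) {
    const affordable = this.providers
      .filter((p) => p.capable(task))
      .map((p) => ({ ...p, cost: (estTokens / 1000) * p.costPer1kTokens }))
      .filter((p) => this.spent + p.cost <= this.budget)
      .sort((a, b) => a.cost - b.cost);
    if (affordable.length === 0) throw new Error("Budget cap reached");
    this.spent += affordable[0].cost;
    return affordable[0].name;
  }
}
```

Consensus mode would call several of these providers with the same prompt instead of just the cheapest one; the budget check works the same way either way.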
This is the same failover philosophy we use in our client-facing chatbot infrastructure. No single provider dependency means no single point of failure.
Multi-Modal Perception
Genie's PerceptionAgent handles inputs beyond text. Drop in a screenshot and it extracts UI structure. Send a PDF and it summarizes and answers questions about the content. Pass an audio file and it transcribes it through OpenAI Whisper. The agent auto-detects input type and routes to the right processor.
In practice this means we can feed Genie a client's existing website screenshot and have it reason about what to build. Or a PDF contract and have the LegalAgent extract obligations. Or a voice memo and have it become a spec document. The system meets the input where it is.
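The auto-detection step reduces to mapping input types onto processors. A sketch of that dispatch, where the extension list and processor labels are assumptions rather than the PerceptionAgent's actual routing table:

```javascript
// Sketch of input-type detection and routing to a processor.
// Extensions and processor labels are illustrative assumptions.
function routeInput(filename) {
  const ext = filename.slice(filename.lastIndexOf(".") + 1).toLowerCase();
  const routes = {
    png: "vision", jpg: "vision", jpeg: "vision",            // screenshots -> UI extraction
    pdf: "document",                                          // PDFs -> summarize / Q&A
    mp3: "transcribe", wav: "transcribe", m4a: "transcribe",  // audio -> Whisper
  };
  return routes[ext] ?? "text";
}
```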
Cross-Session Goal Tracking
Goals in Genie are not to-do lists. They are tracked objects with deadlines, blockers, and attached tactics. They persist across every CLI restart and server reboot. At the start of every workflow, the GoalScheduler runs a tick that checks deadlines, flags anything overdue, and surfaces active blockers to the relevant agents.
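The scheduler tick described above can be sketched as a pass over the goal store. The deadline check, overdue flagging, and blocker surfacing come from the article; the goal field names below are assumptions:

```javascript
// Sketch of one GoalScheduler tick: flag overdue goals and surface blockers.
// Goal field names are illustrative assumptions.
function tick(goals, now = new Date()) {
  const overdue = [];
  const blocked = [];
  for (const goal of goals) {
    if (goal.deadline && new Date(goal.deadline) < now && !goal.done) {
      overdue.push(goal.title);
    }
    if (goal.blockers && goal.blockers.length > 0) {
      blocked.push({ title: goal.title, blockers: goal.blockers });
    }
  }
  return { overdue, blocked };
}
```

Because the goal store persists on disk, running this at the start of every workflow is what lets the system surface a slipping deadline unprompted.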
For a small team building multiple client projects simultaneously, this is the difference between losing track of something and having the system surface it unprompted.
How This Powers HSS Client Work
Every managed AI solution we deliver for clients benefits from Genie under the hood. When we scope a new chatbot integration, Genie's research and architecture agents help us design the right system fast. When we build custom SaaS features, the infrastructure generation eliminates the scaffolding phase entirely. When we review security on a client deployment, the SecurityResearchAgent and ThreatModelingAgent run a structured analysis rather than an ad hoc checklist.
Genie is not a product we sell. It is the capability advantage that lets us deliver better work, faster, at a price point that works for small businesses.
It Is Open Source
Genie v2 is fully open source. Developers can clone it, run it locally with a free Groq API key, and use it for their own projects. The full source, documentation, and CLI reference are available at github.com/Drock91/Genie.
If you are building AI-powered products and want to talk about what Genie can do for your development workflow, or if you are a small business that wants a team running this kind of infrastructure on your behalf, reach out.
Want AI Like This Working for Your Business?
We use Genie and the same multi-agent architecture to build and manage AI solutions for small businesses. Start with a free trial of our AI chatbot and see what a real system looks like.
Start Free Trial · Talk to Us