The Macro: CI Failures Are the Silent Productivity Killer
Every engineering team knows the pain. A CI build goes red. Someone has to stop what they are doing, investigate the failure, figure out if it is a real issue or a flaky test, and fix it. This can take minutes or hours. Multiply that by dozens of engineers and hundreds of builds per day, and CI failures become one of the biggest drains on engineering productivity.
The numbers are damning. Commonly cited estimates put the time developers spend dealing with broken builds, flaky tests, and CI infrastructure issues at 15-25%. That is time not spent building features or fixing real bugs. And the problem compounds. When CI is unreliable, developers lose trust in the system. They start ignoring failures. They merge without green builds. Code quality degrades.
Flaky tests are particularly insidious. A test that passes 90% of the time fails often enough to create noise but rarely enough that nobody prioritizes fixing it. Over time, the number of flaky tests grows, CI run times increase, and the feedback loop that makes CI valuable in the first place breaks down.
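To see why a handful of 90%-reliable tests is enough to break that feedback loop, here is a rough back-of-the-envelope sketch. It assumes each flaky test fails independently of the others, which real suites often violate, and the function names are illustrative only.

```python
def clean_run_probability(pass_rate: float, flaky_tests: int) -> float:
    """Chance that every flaky test passes in a single CI run.

    Assumes the tests fail independently of one another (a simplification).
    """
    return pass_rate ** flaky_tests


def expected_false_reds(pass_rate: float, flaky_tests: int, builds_per_day: int) -> float:
    """Expected number of builds per day that go red purely because of flakiness."""
    return builds_per_day * (1 - clean_run_probability(pass_rate, flaky_tests))


if __name__ == "__main__":
    # How quickly the odds of a clean run fall as flaky tests accumulate.
    for count in (1, 5, 10, 20):
        chance = clean_run_probability(0.9, count)
        print(f"{count:>2} flaky tests: {chance:.1%} chance of a clean run")
```

Under these assumptions, a single flaky test turns about one build in ten red for no real reason, and ten such tests leave only about a 35% chance that any given run is clean.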
The existing tools address parts of this problem. CI platforms like CircleCI and GitHub Actions run the builds. Test management tools track flaky tests. Monitoring tools alert on failures. But none of them fix the problems. They just report them. The actual diagnosis and repair still fall to human engineers.
Mendral, backed by Y Combinator, is building an AI DevOps engineer that does the fixing. It watches your CI pipeline, diagnoses failures, identifies root causes, and opens pull requests with proposed solutions. The pitch: five minutes to install, first fix within hours.
The Micro: Built by the People Who Built Docker
The founding story here is remarkable. Sam Alba was Docker’s first hire and served as VP of Engineering. Andrea Luzzardi wrote Docker’s first lines of code. Together, they co-founded Dagger, which went through YC’s W19 batch. These are not newcomers to DevOps tooling. They built the containerization infrastructure that most modern CI systems run on.
That pedigree matters because Mendral needs a deep understanding of CI systems, build processes, test frameworks, and deployment pipelines to diagnose failures accurately. The founders have been inside that machinery for over a decade.
The product works as a GitHub App with no infrastructure setup required. When a CI job fails, Mendral investigates the failure, identifies the root cause, and opens a pull request with a fix. It handles flaky tests, slow builds, broken releases, and even responds to code review feedback to iterate on its fixes.
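For readers curious what the "when a CI job fails" trigger can look like for a GitHub App in general, here is a minimal sketch. It is not Mendral's implementation, which is not public in this write-up. It assumes the app subscribes to GitHub's `workflow_run` webhook event and that the event name arrives in the `X-GitHub-Event` header; the function name `failed_run_summary` and the sample payload are hypothetical.

```python
from typing import Any, Optional


def failed_run_summary(event_name: str, payload: dict[str, Any]) -> Optional[dict[str, str]]:
    """Return a small summary if the webhook describes a failed workflow run, else None.

    Hypothetical helper: a real app would also verify the webhook signature
    and fetch the job logs before attempting any diagnosis.
    """
    if event_name != "workflow_run":
        return None
    if payload.get("action") != "completed":
        return None  # the run has not finished yet
    run = payload.get("workflow_run", {})
    if run.get("conclusion") != "failure":
        return None  # success, cancelled, skipped, and so on
    return {
        "workflow": run.get("name", ""),
        "branch": run.get("head_branch", ""),
        "commit": run.get("head_sha", ""),
        "url": run.get("html_url", ""),
    }


if __name__ == "__main__":
    # Made-up payload containing only the fields this sketch reads.
    sample = {
        "action": "completed",
        "workflow_run": {
            "name": "CI",
            "conclusion": "failure",
            "head_branch": "main",
            "head_sha": "abc123",
            "html_url": "https://example.com/runs/1",
        },
    }
    print(failed_run_summary("workflow_run", sample))
```

The filter is the easy part. The diagnosis and the proposed fix that follow it are where the product has to earn its keep.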
The customer list already includes PostHog, Metabase, Luminai, and Inngest, all of which are well-respected engineering organizations. If those teams trust Mendral with their CI pipelines, the product is doing something right.
The competition includes tools like Trunk, which focuses on merge queues and flaky test management, and Buildkite, which optimizes CI performance. But those tools manage the CI process. Mendral is trying to eliminate the human work of fixing failures entirely. That is a fundamentally different value proposition.
The risk is accuracy. If Mendral opens PRs with incorrect fixes, it creates more work, not less. Engineers will stop trusting the suggestions and ignore them. The bar for correctness is high because every bad fix erodes confidence in the system.
The Verdict
Mendral is going after a problem that every engineering team recognizes but nobody has truly solved. CI maintenance is grunt work that absorbs enormous amounts of engineering time, and automating it would free up significant capacity.
At 30 days: what percentage of CI failures is Mendral successfully diagnosing and fixing without human intervention? Even a 30% autonomous fix rate would be meaningful given the volume of CI failures at most companies.
At 60 days: how are the fixes being received by engineering teams? Are PRs being merged as-is, or are they requiring significant modification? The merge-without-changes rate is the real quality metric.
At 90 days: are customers seeing measurable reductions in time-to-green-build and developer hours spent on CI issues? The business case is straightforward: less time on CI means more time on product.
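The three checkpoints above reduce to three numbers a customer could track themselves. Here is a minimal sketch of how they might be computed, assuming a hypothetical `FixAttempt` record per CI failure. The field names and sample values are invented for illustration and do not come from Mendral.

```python
import statistics
from dataclasses import dataclass
from typing import Optional


@dataclass
class FixAttempt:
    """One CI failure and what happened to the automated fix (hypothetical schema)."""
    failure_id: str
    merged: bool                        # was the proposed PR merged?
    human_commits: int                  # commits a human added to the PR before merge
    minutes_to_green: Optional[float]   # None if the build never went green


def autonomous_fix_rate(attempts: list[FixAttempt]) -> float:
    """Share of all failures fixed by a PR merged with no human changes."""
    if not attempts:
        return 0.0
    fixed = sum(1 for a in attempts if a.merged and a.human_commits == 0)
    return fixed / len(attempts)


def merge_without_changes_rate(attempts: list[FixAttempt]) -> float:
    """Among merged PRs, the share that needed no human commits."""
    merged = [a for a in attempts if a.merged]
    if not merged:
        return 0.0
    return sum(1 for a in merged if a.human_commits == 0) / len(merged)


def median_minutes_to_green(attempts: list[FixAttempt]) -> Optional[float]:
    """Median time from red to green, ignoring builds that never recovered."""
    times = [a.minutes_to_green for a in attempts if a.minutes_to_green is not None]
    return statistics.median(times) if times else None


if __name__ == "__main__":
    # Invented sample data, only to exercise the functions.
    sample = [
        FixAttempt("f1", merged=True, human_commits=0, minutes_to_green=12.0),
        FixAttempt("f2", merged=True, human_commits=2, minutes_to_green=45.0),
        FixAttempt("f3", merged=False, human_commits=0, minutes_to_green=None),
        FixAttempt("f4", merged=False, human_commits=0, minutes_to_green=90.0),
    ]
    print(autonomous_fix_rate(sample))
    print(merge_without_changes_rate(sample))
    print(median_minutes_to_green(sample))
```

Note the different denominators: the autonomous fix rate divides by every failure, while the merge-without-changes rate divides only by merged PRs, so the second can look healthy even when the first is low.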
The founders’ background gives me confidence that Mendral understands CI systems at a depth that competitors will struggle to match. Docker changed how the industry builds and deploys software. If anyone can build an AI that fixes CI, it is the team that built the infrastructure CI runs on.