Hacker News | by bulba4aur | built with AI | scored by google/gemini-2.5-flash

KeelTest – AI-driven VS Code unit test generator with bug discovery

The take

effort: ~1-2 months

KeelTest is a VS Code extension that generates and executes pytest unit tests for Python code. It targets the 'AI hallucination' problem, where AI agents produce tests that fail when run or that are weakened until they pass. It uses static analysis to plan tests, generates them, runs them in a sandbox, and then either self-heals failing tests or flags bugs in the source code. It is in alpha and currently supports Python only, with Poetry, uv, or pip.

Demand & gap

Painkiller, demand 67

Buyer: Developers

The gap in what exists: An AI-driven test generator that produces reliable, runnable tests and gives actionable feedback. Existing AI agents often hallucinate tests or struggle to execute them.
Wedge to win: Target individual Python developers and small teams frustrated by unreliable AI-generated tests, offering a solution that reliably generates and validates tests, saving debugging time.
Reputation value (worth doing free for proof) — 70/100: Building a reliable, open-source-friendly (even if backend is proprietary) VS Code extension that solves a clear pain point for AI coding tool users would establish strong credibility as an AI tool builder.
Likely monetization: usage-based API / subscription (test files generated per month)
Incumbents to beat: Cursor, Claude Code, GitHub Copilot

Deliver it

A starter prompt for Claude Code, what you'll need, and how to reach them.

You are an expert full-stack developer. I need to build a robust VS Code extension, similar to KeelTest, that generates and executes pytest unit tests for Python code. My goal is to reliably generate valid, runnable tests and provide intelligent feedback when tests fail, either fixing the test or flagging a bug in the source code. I will use Next.js 16 App Router, React 19, Tailwind v4 for the web components (if any), and a Python backend (Flask/FastAPI) for test generation and execution, deployed on Vercel with Neon Postgres. The core logic will be in Python, and the VS Code extension will be TypeScript/JavaScript.

Focus on the MVP for Python/pytest:
1. **Python Backend (Core Logic):** Create a Python service (e.g., FastAPI) that exposes an API endpoint to:
   a. Receive Python source code for a function/class.
   b. Perform static analysis to identify dependencies and mockable components.
   c. Generate `pytest` unit tests for the provided code, focusing on common edge cases and valid assertions.
   d. Execute these generated tests in an isolated, sandboxed environment (e.g., using `subprocess` with `venv` or Docker).
   e. Analyze test failures: distinguish between errors in test generation and potential bugs in the original source code.
   f. Return the generated tests, execution results (pass/fail, error messages), and suggested fixes/bug flags.
2. **VS Code Extension (Frontend/Glue):** Develop a basic VS Code extension:
   a. Add a command accessible via the command palette to 'Generate Tests'.
   b. When invoked, the extension should identify the currently active Python file and selected function/class.
   c. Send the selected code to the Python backend API.
   d. Display the generated tests and execution results within a VS Code webview or output channel.
   e. Provide a basic mechanism to 'Apply Fix' if the backend suggests a test fix or 'Report Bug' if a source bug is flagged.

Start by outlining the FastAPI service for test generation/execution and the basic VS Code extension structure to call it. Assume the Python environment is set up with `pytest` and `ast` for static analysis. Prioritize secure sandboxed execution. What's the minimum viable set of files and code snippets to demonstrate generation, execution, and basic feedback?

How you'd build it

1. Develop a core Python test generation engine that takes a code snippet and returns pytest code, focusing on robust static analysis and dependency mapping.
2. Integrate a 'sandbox' execution environment to run generated tests and capture results/failures, including stdout/stderr.
3. Implement a 'self-healing' mechanism that analyzes test failures and attempts to refine the generated test code or identifies potential bugs in the source.
4. Build a VS Code extension (using TypeScript/JavaScript) that provides a UI to select code for testing, triggers the Python backend, and displays results/fixes.
5. Implement rate limiting and API key management for the hosted test generation service, offering a free tier with usage limits.
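
Step 2 might start as something like the sketch below, which runs generated tests in a throwaway directory with a timeout and captures stdout/stderr. It assumes the source is written to a file named `target.py` and that pytest is installed in the interpreter being used; the names `RunResult` and `run_generated_tests` are hypothetical. A temporary directory plus a subprocess is only process-level separation, not a security boundary, so a hosted service would still need containers or a similar isolation layer.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class RunResult:
    passed: bool
    timed_out: bool
    stdout: str
    stderr: str


def run_generated_tests(
    source: str,
    tests: str,
    runner: tuple[str, ...] = ("-m", "pytest", "-q"),  # assumes pytest is installed
    timeout: float = 30.0,
) -> RunResult:
    """Write source and tests to a temp dir, run them, and capture the output."""
    with tempfile.TemporaryDirectory() as workdir:
        Path(workdir, "target.py").write_text(source)
        Path(workdir, "test_target.py").write_text(tests)
        cmd = [sys.executable, *runner, "test_target.py"]
        try:
            proc = subprocess.run(
                cmd, cwd=workdir, capture_output=True, text=True, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            # The child is killed by subprocess.run before the exception is raised.
            return RunResult(False, True, "", f"timed out after {timeout}s")
    return RunResult(proc.returncode == 0, False, proc.stdout, proc.stderr)
```

The timeout matters as much as the output capture: a generated test that hangs should come back as a failure the caller can act on instead of blocking the service.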

Risks & moats

The 'bug discovery' and 'self-healing' aspects are hard; false positives or infinite loops could quickly frustrate users.
Reliably handling complex codebases, monorepos, and various project structures will be a significant engineering challenge.
Competing with direct AI agent capabilities (e.g., Cursor's built-in test generation) requires a demonstrable quality/reliability edge.
Ensuring the sandbox execution is secure and isolated is crucial, especially when running arbitrary generated code.
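
One way to contain the infinite-loop risk is to bound the repair loop and stop as soon as a repair repeats an earlier attempt. The sketch below is an assumption about how such a loop could look, not a description of KeelTest's implementation: `heal`, `Verdict`, and the three callbacks (`run`, `classify`, `repair`) are hypothetical names, and the callbacks stand in for the sandbox runner, the failure classifier, and the model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    status: str  # "passed", "source_bug", or "gave_up"
    tests: str
    attempts: int


def heal(
    tests: str,
    run: Callable[[str], tuple[bool, str]],   # returns (passed, output)
    classify: Callable[[str], str],           # "generation_error" or "source_bug"
    repair: Callable[[str, str], str],        # returns revised test code
    max_attempts: int = 3,
) -> Verdict:
    """Retry failing tests a bounded number of times, never touching the source."""
    seen = {tests}
    for attempt in range(1, max_attempts + 1):
        passed, output = run(tests)
        if passed:
            return Verdict("passed", tests, attempt)
        if classify(output) == "source_bug":
            # Stop and report instead of bending the test to fit the bug.
            return Verdict("source_bug", tests, attempt)
        candidate = repair(tests, output)
        if candidate in seen:
            # The repair step is cycling; more attempts would not help.
            return Verdict("gave_up", tests, attempt)
        seen.add(candidate)
        tests = candidate
    return Verdict("gave_up", tests, max_attempts)
```

Because the loop only ever rewrites the test code, it cannot "fix" the source to make tests pass; a suspected source bug ends the loop with a flag.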

Market it to your portfolio

fit 65

Forge Kit, MCP Kit, aimon

Reach out to developers and AI engineers who are deep into AI coding tools and agent development, showcasing how KeelTest solves a common pain point with AI-generated tests, which aligns with the operator's AI tooling and automation products.

Original context

I built this because Cursor, Claude Code and other agentic AI tools kept giving me tests that looked fine but failed when I ran them. Or worse - I'd ask the agent to run them and it would start looping: fix tests, those fail, then it starts "fixing" my code so tests pass, or just deletes assertions so they "pass".

Out of that frustration I built KeelTest - a VS Code extension that generates pytest tests and executes them, got hooked and decided to push this project forward...

When tests fail, it tries to figure out why:
- Generation error: Attempts to fix it automatically, then tries again
- Bug in your source code: flags it and explains what's wrong

How it works:
- Static analysis to map dependencies, patterns, services to mock.
- Generate a plan for each function and what edge cases to cover
- Generate those tests
- Execute in "sandbox"
- Self-heal failures or flag source bugs

Python + pytest only for now. Alpha stage - not all codebases work reliably. But testing on personal projects and a few production apps at work, it's been consistently decent. Works best on simpler applications, sometimes glitches on monorepo setups. Supports Poe
