KeelTest is a VS Code extension that generates and executes pytest unit tests for Python code, aiming to overcome the 'AI hallucination' problem where AI agents create failing or self-passing tests. It uses static analysis to plan tests, generates them, runs them in a sandbox, and attempts to self-heal failures or flag bugs in the source code. It's in alpha stage, currently supporting Python with Poetry/UV/pip.
A starter prompt for Claude Code, what you'll need, and how to reach them.
You are an expert full-stack developer. I need to build a robust VS Code extension, similar to KeelTest, that generates and executes pytest unit tests for Python code. My goal is to reliably generate valid, runnable tests and provide intelligent feedback when tests fail, either fixing the test or flagging a bug in the source code. I will use Next.js 16 App Router, React 19, Tailwind v4 for the web components (if any), and a Python backend (Flask/FastAPI) for test generation and execution, deployed on Vercel with Neon Postgres. The core logic will be in Python, and the VS Code extension will be TypeScript/JavaScript.
Focus on the MVP for Python/pytest:
1. **Python Backend (Core Logic):** Create a Python service (e.g., FastAPI) that exposes an API endpoint to:
a. Receive Python source code for a function/class.
b. Perform static analysis to identify dependencies and mockable components.
c. Generate `pytest` unit tests for the provided code, focusing on common edge cases and valid assertions.
d. Execute these generated tests in an isolated, sandboxed environment (e.g., using `subprocess` with `venv` or Docker).
e. Analyze test failures: distinguish between errors in test generation and potential bugs in the original source code.
f. Return the generated tests, execution results (pass/fail, error messages), and suggested fixes/bug flags.
2. **VS Code Extension (Frontend/Glue):** Develop a basic VS Code extension:
a. Add a command accessible via the command palette to 'Generate Tests'.
b. When invoked, the extension should identify the currently active Python file and selected function/class.
c. Send the selected code to the Python backend API.
d. Display the generated tests and execution results within a VS Code webview or output channel.
e. Provide a basic mechanism to 'Apply Fix' if the backend suggests a test fix or 'Report Bug' if a source bug is flagged.
Start by outlining the FastAPI service for test generation/execution and the basic VS Code extension structure to call it. Assume the Python environment is set up with `pytest` and `ast` for static analysis. Prioritize secure sandboxed execution. What's the minimum viable set of files and code snippets to demonstrate generation, execution, and basic feedback?
Reach out to developers and AI engineers who are deep into AI coding tools and agent development, showcasing how KeelTest solves a common pain point with AI-generated tests, which aligns with the operator's AI tooling and automation products.
I built this because Cursor, Claude Code and other agentic AI tools kept giving me tests that looked fine but failed when I ran them. Or worse - I'd ask the agent to run them and it would start looping: fix tests, those fail, then it starts "fixing" my code so tests pass, or just deletes assertions so they "pass". Out of that frustration I built KeelTest - a VS Code extension that generates pytest tests and executes them. I got hooked and decided to push this project forward...
When tests fail, it tries to figure out why:
- Generation error: attempts to fix it automatically, then tries again
- Bug in your source code: flags it and explains what's wrong
How it works:
- Static analysis to map dependencies, patterns, and services to mock
- Generate a plan for each function and what edge cases to cover
- Generate those tests
- Execute in "sandbox"
- Self-heal failures or flag source bugs
Python + pytest only for now. Alpha stage - not all codebases work reliably. But testing on personal projects and a few production apps at work, it's been consistently decent. Works best on simpler applications, sometimes glitches on monorepo setups. Supports Poe
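The execute-then-triage loop described in the post can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how KeelTest does it: the function names, the file names `target.py` and `test_target.py`, and the verdict labels are all hypothetical. A temporary directory plus a subprocess isolates files only and is not a security boundary; a real tool would add a container or similar. The triage relies on pytest's documented exit codes.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# pytest exit codes mapped to a coarse verdict (verdict labels are hypothetical).
VERDICTS = {
    0: "passed",
    1: "assertion_failure",  # tests ran and failed: wrong expectation OR source bug
    2: "generation_error",   # interrupted; in practice also collection errors
    3: "generation_error",   # internal error
    4: "generation_error",   # command-line usage error
    5: "generation_error",   # no tests collected
}


def run_in_tempdir(files, command, timeout=60):
    """Write `files` into a throwaway directory and run `command` there."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, content in files.items():
            (Path(tmp) / name).write_text(content, encoding="utf-8")
        try:
            proc = subprocess.run(
                command, cwd=tmp, capture_output=True, text=True, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            return None, "timed out"
        return proc.returncode, proc.stdout + proc.stderr


def classify(returncode):
    """Turn an exit code into a verdict; None means the run timed out."""
    if returncode is None:
        return "timeout"
    return VERDICTS.get(returncode, "generation_error")


def run_generated_tests(source, tests):
    """Run generated tests against the source (assumes pytest is installed)."""
    code, output = run_in_tempdir(
        {"target.py": source, "test_target.py": tests},
        [sys.executable, "-m", "pytest", "-q", "test_target.py"],
    )
    return {"verdict": classify(code), "output": output}
```

Only the `assertion_failure` verdict is ambiguous: the exit code alone cannot say whether the test's expectation or the source is wrong, so that case would still need a second pass (for example, asking the model to justify the expected value) before flagging a source bug.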
Reply in the Hacker News thread, linking to your initial prototype or demo.
“I've started building a prototype of an AI-driven VS Code test generator, inspired by your KeelTest, and have addressed some of the 'AI hallucination' issues with a reliable execution and feedback loop. I'd love to share my progress and get your thoughts on potential collaboration or further development.”