I Built ChatGPT for MongoDB in 5 Days (And Open-Sourced It)

How I turned our database into something our team can actually talk to


"Show me all React developers who interviewed last month with evaluation scores above 85."

At our recruitment SaaS, questions like this were becoming a daily headache. Every time someone needed insights from our MongoDB database, it meant an engineering ticket. We were getting 10+ of these requests daily.

So I spent 5 days building our own "ChatGPT for MongoDB" - a system that lets our team ask database questions in plain English and get instant answers.

It worked so well that I decided to open-source the whole thing.

The Problem That Pushed Me to Build This

I'm building a SaaS for recruitment. Our users manage thousands of job applications, interviews, and candidate profiles in MongoDB.

The constant requests kept coming:

  • "Show me candidates with React experience from recent applications"
  • "Which interviews had the highest scores?"
  • "Find applicants who might be good fits for this role"

Each question meant our engineering team had to:

  1. Understand what they actually wanted
  2. Write a custom MongoDB query
  3. Deploy it and explain the results
  4. Repeat for the next question

The numbers were brutal:

  • 10+ query requests per day
  • 1-2 hours of dev time each day
  • Product development was slowing down

What I Built: ChatGPT, But for Our Database

Instead of writing complex MongoDB queries like this:

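const lastMonth = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000); // 30 days ago
db.applicants.aggregate([
  { $match: { skills: "React", appliedDate: { $gte: lastMonth } } },
  // $lookup needs "as" to name the joined array
  { $lookup: { from: "interviews", localField: "_id", foreignField: "applicant", as: "interviews" } },
  // Keeps applicants with at least one interview scoring above 85
  { $match: { "interviews.score": { $gt: 85 } } }
])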

Our team can now just ask:

"Show me React developers who interviewed last month with scores above 85"

The system understands the question, plans the right queries, and responds conversationally with the data and insights.

The Technical Challenges I Had to Solve

Building a reliable "ChatGPT for databases" meant solving several problems:

1. The Schema Problem

MongoDB schemas change constantly. Hardcoded mappings break within days.

My solution: I built dynamic schema introspection that reads our Mongoose models at runtime, so the AI always knows our current database structure.

2. Complex Query Planning

Simple questions like "how many users?" are easy. But "show me React developers who aced their interviews" requires understanding relationships across multiple collections.

My solution: Used LangGraph to build an AI agent that can reason through multi-step database operations, just like a human developer would.

3. AI Hallucination Prevention

LLMs love making up field names and assuming data that doesn't exist in your actual database.

My solution: Schema-first approach where the AI always checks the real database structure before building any queries.
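One way to back up the schema-first rule is to validate a generated filter before it ever reaches MongoDB. The sketch below is not the project's code: findUnknownFields and the sample field names are hypothetical, and it only walks top-level keys plus array-valued logical operators such as $or and $and.

```typescript
type Filter = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Filter {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Collect every field path referenced in a filter.
// Keys starting with "$" are operators, not fields.
function collectFieldPaths(filter: Filter): string[] {
  const paths: string[] = [];
  for (const [key, value] of Object.entries(filter)) {
    if (key.startsWith("$")) {
      // Logical operators ($and, $or, $nor) hold arrays of sub-filters.
      if (Array.isArray(value)) {
        for (const sub of value) {
          if (isPlainObject(sub)) paths.push(...collectFieldPaths(sub));
        }
      }
      continue;
    }
    paths.push(key);
  }
  return paths;
}

// Returns the fields the model referenced that the schema does not define.
function findUnknownFields(filter: Filter, knownFields: Set<string>): string[] {
  return collectFieldPaths(filter).filter((path) => !knownFields.has(path));
}

// Hypothetical schema fields and a hypothetical AI-generated filter.
const knownFields = new Set(["skills", "appliedDate", "resumeAnalysis.score"]);
const generatedFilter: Filter = {
  skills: "React",
  $or: [
    { yearsOfExperience: { $gte: 3 } },
    { "resumeAnalysis.score": { $gte: 85 } },
  ],
};

const unknownFields = findUnknownFields(generatedFilter, knownFields);
console.log(unknownFields);
```

If the returned list is non-empty, the caller can reject the query or send the real schema back to the model and ask it to try again.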

4. Conversation Flow

Real usage isn't one-off questions. It's "show me top candidates" followed by "now show me their interview scores."

My solution: Redis-backed memory so the AI remembers context across the entire conversation.
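The shape of that memory can be sketched without Redis at all. The class below is an in-memory stand-in, not the project's implementation: ConversationMemory, its method names, and the message cap are assumptions for illustration. With Redis, the same behaviour could be built on a per-session list using commands such as RPUSH, LTRIM, and LRANGE.

```typescript
type Role = "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

// In-memory stand-in for a session store; a Map plays the role of Redis here.
class ConversationMemory {
  private readonly sessions = new Map<string, Message[]>();
  private readonly maxMessages: number;

  constructor(maxMessages: number) {
    this.maxMessages = maxMessages;
  }

  append(sessionId: string, message: Message): void {
    const history = this.sessions.get(sessionId) ?? [];
    history.push(message);
    // Keep only the most recent messages so the prompt stays bounded.
    this.sessions.set(sessionId, history.slice(-this.maxMessages));
  }

  history(sessionId: string): Message[] {
    // Return a copy so callers cannot mutate the stored history.
    return [...(this.sessions.get(sessionId) ?? [])];
  }
}

const memory = new ConversationMemory(4);
memory.append("session-1", { role: "user", content: "Show me top candidates" });
memory.append("session-1", { role: "assistant", content: "Here are your top 10 candidates..." });
memory.append("session-1", { role: "user", content: "Now show me their interview scores" });

const context = memory.history("session-1");
console.log(context.map((m) => `${m.role}: ${m.content}`).join("\n"));
```

Passing the stored history along with each new question is what lets a follow-up like "their interview scores" resolve to the candidates from the previous answer.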

The Architecture: How It Actually Works

Natural Language Question
        ↓
🤖 AI Agent (Claude 3.5)
        ↓
📋 Dynamic Schema Reader (checks current Mongoose models)
        ↓
🔧 MongoDB Query Tools (find/aggregate/count)
        ↓
💾 Redis Memory (maintains conversation context)
        ↓
📄 Smart Response (data + insights + explanations)

The Smart Schema System

Every query starts here:

// Reads our actual Mongoose models at runtime
const schema = extractCompleteMongooseSchema(ApplicantModel);
// Result: "skills: Array<String>, resumeAnalysis.score: Number(0-100)"

The AI knows exactly what fields exist, their types, and their constraints.
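The internals of extractCompleteMongooseSchema aren't shown here, so the following is only a rough sketch of how a summary string like the one above could be produced. It works on a simplified, hypothetical SchemaSpec object rather than a real Mongoose schema, and FieldSpec, describeField, and summarizeSchema are made-up names.

```typescript
// Simplified, hypothetical field description; a real Mongoose schema carries more detail.
interface FieldSpec {
  type: "String" | "Number" | "Date" | "Boolean";
  isArray?: boolean;
  min?: number;
  max?: number;
  enumValues?: string[];
}

type SchemaSpec = { [path: string]: FieldSpec };

function describeField(path: string, spec: FieldSpec): string {
  let text: string = spec.isArray ? `Array<${spec.type}>` : spec.type;
  // Numeric bounds become a compact range hint, e.g. Number(0-100).
  if (spec.min !== undefined && spec.max !== undefined) {
    text += `(${spec.min}-${spec.max})`;
  }
  // Enum values tell the model which literals are valid.
  if (spec.enumValues && spec.enumValues.length > 0) {
    text += ` enum[${spec.enumValues.join("|")}]`;
  }
  return `${path}: ${text}`;
}

function summarizeSchema(schema: SchemaSpec): string {
  return Object.entries(schema)
    .map(([path, spec]) => describeField(path, spec))
    .join(", ");
}

const applicantSchema: SchemaSpec = {
  skills: { type: "String", isArray: true },
  "resumeAnalysis.score": { type: "Number", min: 0, max: 100 },
};

const schemaSummary = summarizeSchema(applicantSchema);
console.log(schemaSummary);
// prints "skills: Array<String>, resumeAnalysis.score: Number(0-100)"
```

A compact one-line-per-field summary like this keeps the prompt small while still giving the model the types and ranges it needs.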

AI Agent Reasoning

Watch how the AI thinks through a complex question:

User: "Show me our top candidates"
AI: 🔍 Let me check the applicants schema first...
AI: 💭 Found 'compatibilityScore' field (0-100), I'll sort by that
AI: ⚡ Running query: find({}, {sort: {compatibilityScore: -1}, limit: 10})
AI: 💬 "Here are your top 10 candidates, mostly senior developers..."
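To make that trace more concrete, here is a toy version of the tool loop. It is a sketch under heavy assumptions: the tool names, the ToolCall shape, and the sample data are invented, and the plan is hardcoded where the real system would have the model decide each step.

```typescript
interface Candidate {
  name: string;
  compatibilityScore: number;
}

// Hypothetical tool calls an agent might emit; the names are illustrative.
type ToolCall =
  | { tool: "get_schema" }
  | { tool: "find"; sortBy: "compatibilityScore"; limit: number };

// Stand-in for the applicants collection.
const candidates: Candidate[] = [
  { name: "Ana", compatibilityScore: 72 },
  { name: "Bo", compatibilityScore: 91 },
  { name: "Cy", compatibilityScore: 85 },
];

function runTool(call: ToolCall): string {
  if (call.tool === "get_schema") {
    return "name: String, compatibilityScore: Number(0-100)";
  }
  const { sortBy, limit } = call;
  // Sort a copy in descending order, then keep the first `limit` rows.
  const top = [...candidates]
    .sort((a, b) => b[sortBy] - a[sortBy])
    .slice(0, limit);
  return JSON.stringify(top.map((c) => c.name));
}

// Step 1: read the schema. Step 2: query using a field the schema confirmed.
const plan: ToolCall[] = [
  { tool: "get_schema" },
  { tool: "find", sortBy: "compatibilityScore", limit: 2 },
];

const observations = plan.map(runTool);
console.log(observations);
```

Each observation is fed back to the model, which is how it can pick compatibilityScore from the schema before it writes the query.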

What I Learned Building This

Claude beats GPT for database work:

  • Claude 3.5 is much better at complex reasoning
  • GPT-4 makes up field names more often
  • Switching to Claude cut our error rate significantly

Token optimization matters:

  • Started by sending all schemas with every query (expensive!)
  • Now the AI only asks for schemas it actually needs
  • Cut token costs by 40%
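The on-demand idea can be sketched as a lookup tool that returns only the schemas the agent asks for. The collection summaries and the getSchemas name below are hypothetical, and character counts are only a rough proxy for tokens, so this illustrates the direction of the saving rather than the 40% figure.

```typescript
// Hypothetical schema summaries, keyed by collection name.
const schemaSummaries = new Map<string, string>([
  ["applicants", "skills: Array<String>, appliedDate: Date, resumeAnalysis.score: Number(0-100)"],
  ["interviews", "applicant: ObjectId, score: Number(0-100), scheduledAt: Date"],
  ["jobs", "title: String, department: String, status: String"],
]);

// Tool the agent can call: returns summaries only for the requested collections.
function getSchemas(collections: string[]): string {
  const parts: string[] = [];
  for (const name of collections) {
    const summary = schemaSummaries.get(name);
    // Unknown collection names are skipped rather than guessed.
    if (summary !== undefined) parts.push(`${name} { ${summary} }`);
  }
  return parts.join("\n");
}

const everything = getSchemas(Array.from(schemaSummaries.keys()));
const onDemand = getSchemas(["interviews"]);

console.log(`all schemas: ${everything.length} chars, on demand: ${onDemand.length} chars`);
```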

Error handling is crucial:

  • Complex aggregations sometimes fail
  • Built smart fallbacks: try aggregation → fallback to simple find()
  • If a field doesn't exist, re-check schema and retry
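The aggregation-to-find fallback can be expressed as a small wrapper. This is a minimal sketch, not the project's code: queryWithFallback and the two stand-in query functions are hypothetical, and the real system would pass functions that call MongoDB.

```typescript
type QueryResult<T> = { rows: T[]; strategy: "aggregate" | "find" };

async function queryWithFallback<T>(
  runAggregate: () => Promise<T[]>,
  runFind: () => Promise<T[]>,
): Promise<QueryResult<T>> {
  try {
    return { rows: await runAggregate(), strategy: "aggregate" };
  } catch (error) {
    // Degrade to the simpler query instead of failing the whole request.
    console.warn("Aggregation failed, falling back to find():", error);
    return { rows: await runFind(), strategy: "find" };
  }
}

// Stand-ins for real database calls.
const failingAggregate = async (): Promise<string[]> => {
  throw new Error("simulated pipeline failure");
};
const simpleFind = async (): Promise<string[]> => ["Ana", "Bo"];

const fallbackDemo = queryWithFallback(failingAggregate, simpleFind);
fallbackDemo.then((result) => console.log(result.strategy, result.rows));
```

Returning the strategy alongside the rows lets the response explain that a simpler query was used, which matters when the fallback answers a narrower question than the user asked.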

Conversation memory changes everything:

  • Users ask 3-4 follow-up questions on average
  • "Now show me their interview scores" should just work
  • Redis sessions make it feel like talking to a person

The Results That Matter

After deploying our "ChatGPT for MongoDB":

Engineering requests: 10/day → 0
Query response time: 2 hours → 30 seconds
Team productivity: Significantly improved
New insights: Users ask questions they never asked before

More importantly, our non-technical team members became confident exploring data themselves.

Open Source: Try Your Own ChatGPT for MongoDB

This solved our recruitment SaaS problem, but I realized every company with MongoDB probably faces similar challenges.

So I've open-sourced the complete system:

🔗 GitHub: https://github.com/salmankhan-prs/mongodb-nl-query-demo

What's included:

  • Full working system with demo e-commerce data
  • AI agent setup and prompt engineering
  • Dynamic schema introspection code
  • Redis conversation memory
  • Complete adaptation guide for your database

Built for real use:

  • TypeScript throughout for reliability
  • Production error handling and recovery
  • Rate limiting and security considerations
  • Token optimization for cost control

Quick Start: Get Your Own Running

# Clone and set up
git clone https://github.com/salmankhan-prs/mongodb-nl-query-demo
cd mongodb-nl-query-demo
pnpm install

# Add your API keys
cp .env.example .env
# Edit .env with MongoDB URI and Anthropic API key

# Load demo data and start
pnpm seed
pnpm start:dev

# Test it out
curl -X POST http://localhost:3000/api/query \
  -H "Content-Type: application/json" \
  -d '{"query": "Show me all users from USA"}'
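The same request can be made from application code. The sketch below mirrors the curl command's URL, header, and body; buildQueryRequest and askDatabase are hypothetical helper names, and treating the response as JSON is an assumption, since the response shape isn't described here.

```typescript
// Endpoint taken from the curl command in the quick start.
const QUERY_URL = "http://localhost:3000/api/query";

interface QueryRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildQueryRequest(question: string): QueryRequest {
  return {
    url: QUERY_URL,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ query: question }),
    },
  };
}

// Sends the question to a locally running server (requires a runtime with global fetch).
async function askDatabase(question: string): Promise<unknown> {
  const { url, init } = buildQueryRequest(question);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Query failed with status ${response.status}`);
  }
  // Assumption: the server replies with JSON; it is returned untyped.
  return response.json();
}

const exampleRequest = buildQueryRequest("Show me all users from USA");
console.log(exampleRequest.init.body);
```

With the dev server running, calling askDatabase("Show me all users from USA") should behave like the curl command.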

Try questions like:

  • "How many products do we have in each category?"
  • "Show me customers who spent the most money"
  • "Which orders were delivered successfully?"

Adapt It to Your Database

The magic is in the dynamic schema reading. To use your own data:

  1. Replace the models in src/models/ with your Mongoose schemas
  2. Update collection names in src/types/index.ts
  3. Run the schema generator: pnpm generate:schemas
  4. Start asking questions about your actual data

The system automatically discovers your field types, relationships, constraints, and enum values.

Why This Actually Matters

When anyone on your team can ask database questions directly:

  • Decisions happen faster (no engineering bottlenecks)
  • More insights get discovered (easier to explore data)
  • Engineering focuses on features (not custom queries)
  • Data becomes accessible (non-technical users gain confidence)

The Tech Stack That Worked

  • AI Model: Claude 3.5 Sonnet (superior reasoning for databases)
  • Agent Framework: LangChain + LangGraph
  • Memory: Redis for fast session storage
  • Backend: Express + TypeScript
  • Database: MongoDB + Mongoose (enables dynamic introspection)

What's Next

I'm excited to see what people build with this. Some ideas for extensions:

  • Write operations (insert, update, and delete with safety checks)
  • Web interface (React app for non-technical users)
  • Advanced analytics (trend analysis, predictive insights)

Try It Out

This represents 5 days of focused work solving a real problem we faced every day. If you're dealing with similar database query bottlenecks, maybe it'll help you too.

GitHub: https://github.com/salmankhan-prs/mongodb-nl-query-demo

Questions? Reach out on Twitter or LinkedIn. I'd love to hear what you build with it.

Built this because we needed it. Sharing it because you might too.