Cross-Model Interlocutor
I got tired of catching my own blind spots. So I built a tool that lets Claude ask ChatGPT to poke holes in its work.
The problem: I was building an AI exposure analysis for a consulting client — classifying 400 job titles by how much AI could automate them. Claude produced clean, confident output. I shipped it. A colleague ran it on a different dataset and found it was silently misclassifying edge cases that happened to be the most important ones.
The root cause wasn't that Claude was wrong. It was that I had no second opinion. I was reviewing AI output with the same brain that wrote the prompt, so my blind spots were Claude's blind spots.
So I built an MCP server that gives Claude two tools: ask_chatgpt sends Claude’s work to ChatGPT with a critique prompt, and compare_approaches gives both models the same task independently and returns a structured diff of where they agree and disagree.
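At its core, ask_chatgpt is a prompt wrapper: it packages Claude's output inside a critique request before sending it to the second model. A minimal sketch in Python of what that wrapping might look like (the function name, template wording, and context parameter are illustrative assumptions, not the server's actual implementation):

```python
def build_critique_prompt(work: str, context: str = "") -> str:
    """Wrap one model's output in a critique request for a second model.

    The template wording here is an illustrative guess at the kind of
    prompt the tool sends; the shipped server may phrase it differently.
    """
    return (
        "You are reviewing another model's work. Do not rewrite it.\n"
        "List concrete errors, unstated assumptions, and edge cases it misses.\n"
        f"Context: {context}\n"
        "--- WORK UNDER REVIEW ---\n"
        f"{work}"
    )

# Example: ask for a critique of a classification, with task context attached.
prompt = build_critique_prompt(
    "Product 8471.30: classified as 'high AI exposure'",
    context="HS code / AI-exposure classification for a consulting deliverable",
)
```

The key design point is that the second model is asked to find faults, not to produce its own answer; that keeps the critique independent of the first model's framing.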
The interpretation is simple: when both models agree with high confidence, use the answer. When they disagree, that disagreement IS the finding: it tells you exactly which inputs need human review. In the tariff case study, model disagreement on HS code classification flagged 34 products that would otherwise have been misclassified. That's not a failure of AI; it's the point of having two models check each other.
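The triage rule above can be sketched as a structured diff: run both models over the same items, accept where they agree, and queue disagreements for human review. A self-contained Python sketch (hypothetical helper, not the server's actual code; each model is assumed to be a callable from input to label):

```python
from typing import Callable, Hashable, Sequence


def compare_approaches(
    items: Sequence[str],
    model_a: Callable[[str], Hashable],
    model_b: Callable[[str], Hashable],
) -> dict:
    """Classify each item with both models independently, then split the
    results into agreements (usable directly) and disagreements (the
    items that need human review)."""
    agree: dict = {}
    needs_review: dict = {}
    for item in items:
        a, b = model_a(item), model_b(item)
        if a == b:
            agree[item] = a
        else:
            needs_review[item] = {"model_a": a, "model_b": b}
    return {"agree": agree, "needs_review": needs_review}
```

In the real tool the two callables are API calls to Claude and ChatGPT; here they could be any classifiers. The output deliberately keeps both labels for each disagreement, so the reviewer sees what each model thought rather than just a "mismatch" flag.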
The honest part: you don’t need this for most tasks. If you’re building a website or writing a script, single-model output is fine. This matters when the cost of being wrong is high and you can’t easily tell by looking — data classification, financial calculations, anything going into a client deliverable.
// MCP config: add to .claude/settings.json
{
  "mcpServers": {
    "chatgpt-interlocutor": {
      "command": "npx",
      "args": ["-y", "chatgpt-interlocutor"],
      "env": { "OPENAI_API_KEY": "your-key" }
    }
  }
}
// Then in any Claude Code session:
// "Ask ChatGPT to critique this HS code classification"
// "Compare approaches: have both models classify these 20 products"
// Claude calls the MCP tools automatically