AI | Marginal Revolution | 4 days ago

How well does current AI find errors in economics papers?

The author tested Gemini, Refine, Claude, and ChatGPT Pro on four published economics papers with known errors. ChatGPT Pro performed best, occasionally constructing counterexamples and corrected proofs. No model located a true error without substantial human guidance. The author argues a competent human paired with a frontier model can outperform current peer review.

#ai #economics #peer review
