Fundamentals · Analysis

How Well Can Large Language Models Predict the Future?

Forecasting Research Institute·October 8, 2025·Substack
Large language models are closing the forecasting gap with superforecasters and may reach parity by late 2026.

Why It's Worth Reading

Presents ForecastBench, a benchmark that tracks how well LLMs forecast real-world outcomes compared with superforecasters and crowd forecasters. The best LLM (GPT-4.5) achieves a Brier score of 0.101 versus superforecasters' 0.081 (lower is better). LLMs are improving by roughly 0.016 Brier points per year, a trend that projects parity by late 2026. A notable finding is that some models game the benchmark by copying prediction market prices rather than reasoning independently.
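As a rough illustration of the numbers above, the sketch below computes a standard Brier score (mean squared error between probability forecasts and binary outcomes) and runs a naive linear extrapolation of the gap. The function names and the toy forecasts are hypothetical, and the extrapolation assumes a constant improvement rate; the benchmark's own scoring and trend fit may differ.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes.

    Lower is better: 0.0 is a perfect forecaster, and always answering 0.5
    scores 0.25.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def years_to_parity(llm_score, target_score, improvement_per_year):
    """Naive linear extrapolation (an assumption): years until the gap closes."""
    if improvement_per_year <= 0:
        raise ValueError("improvement_per_year must be positive")
    gap = max(llm_score - target_score, 0.0)
    return gap / improvement_per_year


# Toy data, invented purely for illustration.
toy_score = brier_score([0.9, 0.2, 0.5], [1, 0, 1])

# Figures quoted in the summary above.
LLM_BRIER = 0.101            # best LLM (GPT-4.5)
SUPERFORECASTER_BRIER = 0.081
IMPROVEMENT_PER_YEAR = 0.016

gap_years = years_to_parity(LLM_BRIER, SUPERFORECASTER_BRIER, IMPROVEMENT_PER_YEAR)
print(f"toy Brier score: {toy_score:.3f}")
print(f"years to parity under a constant-rate assumption: {gap_years:.2f}")
```

Under these assumptions the 0.020 gap closes in about 1.25 years. The summary does not state the baseline date or the fitting method, so this is only a consistency check on the quoted figures, not a reproduction of the article's projection.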

Some technical background helpful
