Methodology

Honest about what this is — and what it isn't.

A short, technically honest account of how the simulator works, what it's good at, and where it should not be trusted blindly.

The model

The simulator runs on a frontier large-language reasoning model. We use it because it's genuinely good at synthesising large quantities of mainstream economics, political science, and historical case studies into a coherent multi-domain analysis. We don't use it because it's infallible; we use it because, in our testing, it's well calibrated, balanced, and willing to disagree with the user.

The structure we force on it

Free-form opinion is what makes most AI policy chat useless. So we don't ask for opinion. We ask for a structured JSON report with a fixed schema:

  • An executive summary and a net-outcome verdict.
  • A risk level and an explicit confidence percentage for the report as a whole.
  • Domain-by-domain effects (economic, social, political, environmental, and others), each with a probability, a magnitude, and a sentiment.
  • A stakeholder breakdown — who wins, who loses, who shifts politically.
  • A multi-horizon timeline (immediate, short-term, medium-term, long-term).
  • A risks register with mitigations.
  • Historical precedents: real countries, real outcomes, and why they're relevant.
  • A final recommendation written the way an advisor would write it.

The schema makes it impossible to hand-wave. Every claim has to live somewhere structured, with a probability attached.
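To make the fixed schema concrete, here is a minimal sketch of what a report of that shape might look like in code. The field and type names are illustrative assumptions, not the simulator's actual schema:

```python
from dataclasses import dataclass
from typing import List

# Illustrative sketch of the fixed report schema described above.
# Field names are hypothetical; the real schema may differ.

@dataclass
class DomainEffect:
    domain: str          # e.g. "economic", "social", "political"
    probability: float   # 0.0-1.0, the model's calibrated belief
    magnitude: str       # e.g. "minor", "moderate", "major"
    sentiment: str       # e.g. "positive", "negative", "mixed"

@dataclass
class Report:
    executive_summary: str
    net_outcome: str             # the net-outcome verdict
    risk_level: str
    confidence: float            # report-wide confidence percentage
    domain_effects: List[DomainEffect]
    stakeholders: List[dict]     # who wins, who loses, who shifts
    timeline: dict               # immediate / short / medium / long term
    risks: List[dict]            # risk register with mitigations
    precedents: List[dict]       # historical cases and why they apply
    recommendation: str
```

The point of the structure is visible even in this sketch: a claim like "employment effects are mixed" can't appear as free prose; it has to become a `DomainEffect` with a probability attached.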

Calibration & balance

The system prompt asks the model to behave like a non-partisan advisor: to surface unintended consequences for any proposal regardless of its ideological flavour, and to assign probabilities that reflect genuine uncertainty rather than the user's expectations. Probabilities are intentionally not pinned to round numbers; the model is encouraged to say 37% if 37% is what it actually believes.
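The calibration instructions above can be sketched as a system-prompt fragment. This is an illustrative paraphrase, not the exact wording the simulator ships with:

```
You are a non-partisan policy advisor. For every proposal, surface
unintended consequences regardless of its ideological flavour. Assign
probabilities that reflect your genuine uncertainty, not the user's
expectations. Do not round to convenient numbers: if your best
estimate is 37%, say 37%.
```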

What we deliberately don't do

  • We don't fetch live data per query. The model uses its training-time knowledge of economic theory and historical precedent.
  • We don't fine-tune it on a particular political worldview.
  • We don't claim numerical precision. A "65% probability" is the model's calibrated belief, not a statistical measurement.

Limitations to keep in mind

  • The model has a knowledge cutoff. Very recent legislation may not be reflected.
  • Edge-case countries (small island states, fragile states) are less well-represented in the training data than the OECD.
  • Probabilities are calibrated estimates, not measurements. Treat them as guidance, not truth.
  • The simulator is a thinking tool. It is not a substitute for domain experts, lived experience, or democratic deliberation.