New Data, New Results? How Data Vintaging Affects the Replicability of Research
Macroeconomic variables such as unemployment, inflation, trade, or GDP are not set in stone: they are preliminary estimates that are constantly revised by statistical agencies. These data revisions, or data vintages, often provide conflicting information about the size of a country’s economy or its level of development, reducing our confidence in established findings. Would researchers come to different conclusions if they used different vintages? To answer this question, I survey all articles published in a top political science journal between 2005 and 2020. I then replicate three prominent articles and find that the use of different vintages can lead to different statistical results, calling into question the robustness of otherwise rigorous empirical research. Given that much of this research is used to inform policy decisions, the discrepancies I identify can have wide-ranging real-world implications, including the adoption of policies based on outdated estimates.
GDP Growth in 2000 for Selected Countries: Difference in Values Reported by Different WDI Releases
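The kind of comparison this figure summarizes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's procedure: the function name `vintage_spread`, the release labels, the country codes, and every number below are hypothetical placeholders, not values taken from any WDI release.

```python
from typing import Dict


def vintage_spread(values_by_vintage: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """For each country, return the largest minus the smallest value
    reported for the same indicator and year across vintages.

    Countries that appear in fewer than two vintages are skipped,
    since no comparison is possible for them.
    """
    # Collect every country mentioned in any vintage.
    countries = set()
    for values in values_by_vintage.values():
        countries.update(values)

    spread = {}
    for country in sorted(countries):
        reported = [
            values[country]
            for values in values_by_vintage.values()
            if country in values
        ]
        if len(reported) >= 2:
            spread[country] = max(reported) - min(reported)
    return spread


# Hypothetical placeholder data: GDP growth (percent) for one fixed year,
# as it might be reported by two different releases of the same database.
example = {
    "release_A": {"AAA": 2.0, "BBB": 5.5},
    "release_B": {"AAA": 3.5, "BBB": 5.5, "CCC": 1.0},
}

spread = vintage_spread(example)
print(spread)
```

A spread of zero means the releases agree for that country; a nonzero spread means a researcher would feed different numbers into the same model depending on which release was downloaded.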