Recommendations for Increasing the Transparency of Analysis of Preexisting Data Sets

Sara J Weston; Stuart J Ritchie; Julia M Rohrer; Andrew K Przybylski

doi:10.1177/2515245919848684

Adv Methods Pract Psychol Sci. 2019 Sep;2(3):214-227. doi: 10.1177/2515245919848684. Epub 2019 Jun 11.

Authors

Sara J Weston¹, Stuart J Ritchie², Julia M Rohrer³ ⁴ ⁵, Andrew K Przybylski⁶ ⁷

Affiliations

¹ Department of Psychology, University of Oregon.
² Social, Genetic and Developmental Psychiatry Centre, King's College London.
³ Department of Psychology, University of Leipzig.
⁴ International Max Planck Research School on the Life Course, Max Planck Institute for Human Development.
⁵ German Institute for Economic Research (DIW Berlin), Berlin, Germany.
⁶ Oxford Internet Institute, University of Oxford.
⁷ Department of Experimental Psychology, University of Oxford.

Abstract

Secondary data analysis, or the analysis of preexisting data, provides a powerful tool for the resourceful psychological scientist. Never has this been more true than now, when technological advances enable both sharing data across labs and continents and mining large sources of preexisting data. However, secondary data analysis is easily overlooked as a key domain for developing new open-science practices or improving analytic methods for robust data analysis. In this article, we provide researchers with the knowledge necessary to incorporate secondary data analysis into their methodological toolbox. We explain that secondary data analysis can be used for either exploratory or confirmatory work, and can be either correlational or experimental, and we highlight the advantages and disadvantages of this type of research. We describe how transparency-enhancing practices can improve and alter interpretations of results from secondary data analysis and discuss approaches that can be used to improve the robustness of reported results. We close by suggesting ways in which scientific subfields and institutions could address and improve the use of secondary data analysis.

Keywords: bias; file drawer; p-hacking; panel design; preexisting data; preregistration; reproducibility; secondary analysis; transparency.

Grants and funding

R01 AG018436/AG/NIA NIH HHS/United States