Best practices for building and curating databases for comparative analyses

Publication
Journal of Experimental Biology

Abstract

Comparative analyses have long been used in macro-ecological and macro-evolutionary approaches to understand structure, function, mechanism, and constraint. As the pace of science accelerates, there is ever-increasing access to diverse types of data and open-access databases that are enabling and inspiring new research. Whether conducting a species-level trait-based analysis or a formal meta-analysis of study effect sizes, comparative approaches share a common reliance on reliable, carefully curated databases. Unlike many scientific tasks, building a database is something that many researchers undertake infrequently and without formal training. This commentary provides an introduction to building databases for comparative analyses and highlights challenges and solutions that we have encountered in our own work. We focus on four major tips: 1) carefully strategizing the literature search; 2) structuring databases for multiple uses; 3) establishing version control within (and beyond) your study; and 4) making databases accessible. We highlight how one’s approach to these tasks often depends on the goal of the study and the nature of the data. Finally, we assert that curating single-question databases has two main disadvantages: it limits the possibility of using databases for multiple purposes, and it decreases efficiency because independent researchers repeatedly sift through the same large volumes of raw information. We argue that curating databases that are broader than one research question can provide a large return on investment, and that research fields could increase efficiency if community curation of databases were established.
