So last night’s open data forum got me thinking. Open data seems great, almost imperative even, for fields collecting large volumes of data, such as bioinformatics, the social sciences, geography, and population medicine. Access to large volumes of sanitized, validated data could be a huge boon to researchers examining endpoints similar to those of published studies, or even just for surfacing interesting patterns hidden in that data that the original researchers may have missed.
That’s all well and good for larger fields, but what about smaller ones? Many, maybe even most, researchers work on a small scale, with small sample sizes, collecting only a limited amount of data. If all you have is a few Western blots or a few physiological variables, when does building the infrastructure to store and share that data become worth it? In aggregate, it might be: if enough researchers made their data available, meta-analyses could pool those small samples and start drawing connections where individual studies were underpowered or simply weren’t looking for a certain variable. But that’s a rather tenuous possibility at the moment.
For my field, I’m currently unconvinced. The systems we study are too inconsistent, even at the population level, for pooling data to be feasible, and the infrastructure needed to support open access would likely exceed the limited budget most researchers have. I’m curious what others think about their own fields, though.