Toward new data and information management solutions for data-intensive ecological research
Ecosystem health is deteriorating in many parts of the world due to direct and indirect anthropogenic pressures. Generating accurate, useful, and impactful models of past, current, and future states of ecosystem structure and function is a complex endeavor that often requires vast amounts of data from multiple sources and knowledge from collaborating researchers. Ecological data collection has improved rapidly over the past few decades due to the development, innovation, and large-scale deployment of automated sensors, which are capable of measuring a gamut of ecosystem properties over broad spatiotemporal scales. Although complex ecosystem models and analyses are increasingly parameterized with data from such sensors, the challenges of managing, analyzing, and sharing large data sets remain for this field of research. The goals of this research were to: 1) better identify and understand challenges that academic ecological research groups face when incorporating automated sensors into their research, and 2) improve capacities for the fusion and analysis of multifarious ecological data from multiple sources. To address the first goal, a nationwide survey of ecologists was conducted to elucidate how academic research groups are deploying sensors, managing sensor data, collaborating with major research networks, and publishing their data, results, and other findings. The survey feedback from over 100 research groups from 82 academic institutions showed that academic ecological research groups are collectively using thousands of sensors in the field (a number comparable to a large research network) and would like to more than double their sensor use. However, in addition to being limited by funding, they reported lacking the information management knowledge and tools that would help them permanently archive their data and make it available for reuse.
To address the second goal, a case study was performed to explore, identify, and prototype solutions to challenges faced by typical academic ecological research groups when streamlining data processing and management. By reviewing the operations of the heavily instrumented UTEP Systems Ecology Lab research site at the Jornada Experimental Range, NM, a need was identified for a web-based information management system that allows for interaction with spatial layers, imagery, and time-series data. Working collaboratively with a team of ecologists and computer scientists, a prototype web mapping and information system was developed using several free and/or open-source products that are freely available for use and modification by the ecological community. The system consists of i) a generic database well suited for storing and fusing multifarious ecological datasets from multiple sources and for supporting multiple interest areas and personnel; ii) a web-mapping application that allows users to query and dynamically view and interact with a variety of spatial data, imagery, and time-series data; and iii) specialized, open-source data analysis software (written in R, a programming language familiar to many ecologists) that can be implemented within the information management framework. The 'long tail' of ecological research, where many small research groups collectively make a large contribution to the body of knowledge that helps us understand, manage, and adapt to our changing earth system, is steadily becoming more data-intensive. This research highlights and addresses some of the challenges that need to be overcome by the academic ecological research community to make their data reusable for collaborative science. The dissertation concludes by discussing future research challenges associated with managing large, multifarious ecological data and with connecting the activities of numerous but relatively small academic research groups to national research efforts.
Biogeochemistry|Information Technology|Environmental science
Laney, Christine Marie, "Toward new data and information management solutions for data-intensive ecological research" (2013). ETD Collection for University of Texas, El Paso. AAI3609494.