This is the old site for the DataStaR project. We have changed the focus of the DataStaR project to that of a data set registry. More details can be found here.
Welcome to DataStaR, an experimental Data Staging Repository hosted by Albert R. Mann Library at Cornell University.
The purpose of DataStaR is to support collaboration and data sharing among researchers during the research process, and to promote publishing or archiving data and high-quality metadata to discipline-specific data centers, and/or to Cornell's own digital repository (eCommons).
The project includes two main proposed innovations. The first is the development of a metadata management architecture which would allow managers of data staging repositories to approach heterogeneous data and metadata in a more flexible way while still leveraging the significant investment that has already been made in discipline-specific metadata schemas. The second is a model for a local data staging repository that provides data curation services early in the research cycle, and then promotes the transmission of data to repositories better suited for long-term curation and preservation, thereby improving access to research data sets.
The conceptual model above shows the movement of metadata and data from individuals and research groups to systems supporting sharing with collaborators, and eventually with the public. The staging area represents local infrastructure to support sharing within a group of researchers. When data and metadata are ready for public release, they may be submitted to an institutional repository and/or discipline-specific repositories, which may in turn expose their content for harvesting by other repositories. In this particular example, the National Biological Information Infrastructure (NBII) harvests metadata submitted to the Knowledge Network for Biocomplexity (KNB), and Geospatial One Stop (GOS) harvests metadata from the Cornell Geospatial Information Repository (CUGIR). Institutional repositories may be indexed by web search engines.
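The harvesting relationships in this example can be sketched in a few lines of Python. This is only an illustrative toy model, not DataStaR's implementation: the `Repository` class, its `submit` and `expose` methods, the `release` helper, and the sample record name are all hypothetical, and real harvesting would go through each repository's own protocol rather than an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical stand-in for a repository that holds metadata records."""
    name: str
    records: list = field(default_factory=list)     # metadata records held here
    harvesters: list = field(default_factory=list)  # repositories that harvest from this one

    def submit(self, record: str) -> None:
        """Accept a metadata record that is ready for public release."""
        self.records.append(record)

    def expose(self) -> None:
        """Copy records to every harvesting repository, skipping duplicates."""
        for harvester in self.harvesters:
            for record in self.records:
                if record not in harvester.records:
                    harvester.records.append(record)


# Harvesting relationships named in the text: NBII harvests KNB, GOS harvests CUGIR.
nbii = Repository("NBII")
gos = Repository("GOS")
knb = Repository("KNB", harvesters=[nbii])
cugir = Repository("CUGIR", harvesters=[gos])


def release(record: str, targets: list) -> None:
    """Submit a staged record to each chosen repository, then let harvesters pick it up."""
    for repo in targets:
        repo.submit(record)
        repo.expose()


# A record staged locally is released to KNB only (record name is made up).
release("ecology-dataset-metadata", [knb])
print(nbii.records)
print(gos.records)
```

In this sketch, a record released to KNB also becomes visible in NBII, while GOS is unaffected because it harvests only from CUGIR.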
The DataStaR team is developing a technical architecture for a local data staging repository where a researcher can:
This work extends activities initiated under NSF grant 0437603 (Small Grant for Exploratory Research), "Planning Information Infrastructure through a New Library Research Partnership." Under that grant, we developed a conceptual model for library-laboratory collaborations in data curation, which is described more fully in our final report. In the DataStaR project, we propose to continue working with two research groups, as well as additional partners, to provide local assistance with research collaboration and data curation during the research process, using the proposed local data staging repository as a platform. Ultimately, the intent is to pass "publication-ready" data sets on to domain-specific repositories, or to Cornell's institutional repositories, as appropriate. If successful, this work will serve as a model for academic libraries that wish to provide a data staging repository for researchers at their institutions. The model leverages the ability of a researcher's local institution to provide accessible support and services for research data early in the research process, and it promotes the deposit of data in domain-specific repositories, making the data available to the larger research community.
DataStaR team members:
Cornell researchers with data to share with collaborators or make publicly available are invited to contact the DataStaR team.
This material is based upon work supported by the National Science Foundation under Grant No. III-0712989.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.