Requirements for Data Management and Sharing
Plan to manage your data in a way that makes it easier to meet eventual requirements for data management and sharing. Think early about how you will collect, organize, manage, store, secure, back up, preserve, and share your data.
The NIH requires investigators seeking $500,000 or more of direct costs in a year to include a description of how research data will be shared or why sharing is impossible, to comply with the agency’s data sharing policy.
- NIH Data Management and Sharing Frequently Asked Questions
- NIH guidelines on data management plans
- Genomic Data Sharing (GDS) Policy
- NIH Data Management Plan Template
The NSF requires that all requests for funding contain a data management plan (DMP) of no more than two pages addressing how the proposed project will comply with the agency’s data sharing policy. This document must outline the plan for data management or a justification as to why there is no need for such a plan.
- NSF Data Management and Sharing Frequently Asked Questions
- NSF guidelines on data management plans
- NSF Data Management Plan Templates
Sub-directorate guidance (not a comprehensive list):
Many other agencies and funders require descriptions of how data resulting from the funded project will be managed, preserved, and shared:
- DOE-Department of Energy Statement on Digital Data Management, Data Management Resources
- IES-Institute of Education Sciences (US Dept of Education)
- IMLS-Institute of Museum and Library Services Projects that Develop Digital Products
- NASA Earth Sciences
- NEH-National Endowment for the Humanities Office of Digital Humanities
- NIJ-National Institute of Justice Data Resources Program
- AHA-American Heart Association Open Science Policy
- Alfred P. Sloan Foundation-Specifically, in the Information Products appendix
- Gordon & Betty Moore Foundation
Journals such as Nature, Science, PNAS, and PLOS ONE are among those requiring that data underlying the articles they publish be made available. Journals published by the Ecological Society of America, such as Ecology and Ecosphere, also have requirements for data sharing. BMJ requires that all clinical trials be prospectively registered, in accordance with the International Committee of Medical Journal Editors recommendations, and that patient-level drug and device trial data be available upon reasonable request.
In the social sciences, there is a move for journals to endorse the Data Access and Research Transparency joint statement. Endorsers include: the American Political Science Review, the American Journal of Political Science, Comparative Political Studies, Political Analysis, and the Journal of Conflict Resolution.
Columbia University Requirements
Responsibility: Columbia’s Retention and Access to Research Data Policy (Columbia UNI required) names the Principal Investigator (PI) of a research project as responsible for determining what data need to be retained and for setting up systems for organizing and archiving project data.
Data Retention: The Retention and Access to Research Data policy is on the website of the Office of the Executive Vice President for Research. The policy includes the following statement: “Research data must be archived for a minimum of three years after the final project close-out, with original data retained wherever possible.” Other policies and laws requiring a longer period of data retention, such as the Health Insurance Portability and Accountability Act (HIPAA), may apply.
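As a rough illustration of the three-year minimum quoted above, the following Python sketch computes the date on which that minimum period has elapsed for a given close-out date. The function name and the example date are hypothetical, and the sketch covers only the three-year minimum; it does not account for sponsor rules or laws that require longer retention.

```python
from datetime import date

# Minimum retention period stated in the policy text quoted above.
MIN_RETENTION_YEARS = 3


def minimum_retention_end(closeout: date, years: int = MIN_RETENTION_YEARS) -> date:
    """Return the date on which the minimum retention period has elapsed.

    Hypothetical helper for illustration only; longer periods required by
    sponsors or by law are not considered here.
    """
    try:
        return closeout.replace(year=closeout.year + years)
    except ValueError:
        # The close-out fell on 29 February and the target year is not a
        # leap year, so roll forward to 1 March.
        return closeout.replace(year=closeout.year + years, month=3, day=1)


# Example with a made-up close-out date.
print(minimum_retention_end(date(2024, 6, 30)))  # 2027-06-30
```

Treat the result as the earliest point at which the minimum is met, not as a date on which data should be discarded.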
Data Ownership: Though sponsors grant research funds to the Trustees of Columbia University, usually the PI acts as steward of the research data and makes decisions on its use and distribution within the parameters of sponsor and Columbia guidelines. See the Intellectual Property section under “Obligations and Responsibilities of Officers of Instruction and Research” in the Faculty Handbook for more information on Columbia policies. Make sure you are aware of your obligations under these policies and those of your research sponsor.
Data Portability: You should not assume you can take data with you. Whether or not you can do so depends on many factors, including the status of the project and the policies of the research sponsor and Columbia. Contact Sponsored Projects Administration for more information. If you are a post-doctoral researcher, whether you can take data with you depends on the specific circumstances of your research. Talk to your work supervisor or contact the Office of Postdoctoral Affairs.
Use these templates to help create DMPs for your proposals. The templates contain suggested items to consider; not all questions will be appropriate or relevant to all projects. To create your Data Management Plan, select the appropriate template from the list below. Most templates have five or six main sections (indicated by bold headings). We suggest that you try to address the topic of each section. You will usually find more specific questions underneath the bold headings to help guide your writing. You can:
- Start by answering the questions within each section (under each section description).
- After answering all the relevant questions, remove the questions, leaving just your answers.
- Modify the answers into prose that makes sense as a paragraph below each Roman numeral header (include the bold text as the header to each of your sections in your Data Management Plan).
Your completed DMP will have the section headers followed by prose paragraphs describing how you plan to address each point.
If you’re looking for templates with less guidance, check out the DMPTool.
The DOE requires that requests for funding contain a data management plan (DMP).
- DOE-General DMP template
- The following Office of Science Program Offices have additional requirements:
The NIH requires a paragraph following the Research Plan Section to describe your data sharing plan: Office of Extramural Research guide.
Additionally, if you are working with large-scale genomic data, such as genome-wide association studies (GWAS), single nucleotide polymorphism (SNP) arrays, and genome sequence, transcriptomic, epigenomic, and gene expression data, you must also adhere to the Genomic Data Sharing (GDS) Policy. This requires both a brief description of your data sharing plan in your proposal and a more detailed just-in-time data sharing plan. The expectations for both of these are still evolving, but you should be prepared to address the considerations listed below.
The NSF requires that all requests for funding contain a data management plan (DMP) of no more than two pages.
- Guideline for American Heart Association Data Sharing Plan
- NEH-Office of Digital Humanities*
- IMLS-Projects that Develop Digital Products
* Adapted from work made available under the terms of the Creative Commons Attribution-ShareAlike 3.0 license, © 2012 by the Rector and Visitors of the University of Virginia.
Writing a DMP
Run through this Data Management Planning Checklist. You should expect to address the following points in any data management plan:
- What data types, from what sources, in what formats will this project produce? How much of it will there be?
- How will you describe or document your data? Are there standards you will be using for this?
- Will you be sharing your data? Do you have the rights to share the data? What did you tell the IRB?
- How often do you need to back up your files? How do you need to be able to access your files? How many backups will you have?
- How much storage space do you need? What is your budget for your storage?
- Where are you going to archive or store the data, and how will it be accessed?
- What are the roles and responsibilities around all of these things? That is, who is going to be doing all this?
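One way to use the checklist above while drafting is to track which points still lack an answer. The Python sketch below is only an illustration: the topic labels are shorthand invented here for the seven checklist points, and the `unanswered` helper and the sample draft are hypothetical.

```python
# Shorthand labels (invented for this sketch) for the seven checklist points.
CHECKLIST_TOPICS = [
    "data types, sources, formats, and volume",
    "description and documentation standards",
    "sharing, rights, and IRB",
    "backup frequency, access, and number of copies",
    "storage space and budget",
    "archiving location and access",
    "roles and responsibilities",
]


def unanswered(draft: dict[str, str]) -> list[str]:
    """Return checklist topics with no answer, or only whitespace, in the draft."""
    return [
        topic
        for topic in CHECKLIST_TOPICS
        if not draft.get(topic, "").strip()
    ]


# A made-up, partially completed draft.
draft = {
    "data types, sources, formats, and volume": "Survey responses in CSV, about 2 GB.",
    "storage space and budget": "   ",
    "roles and responsibilities": "The PI oversees data management.",
}

for topic in unanswered(draft):
    print("Still to address:", topic)
```

Not every point will apply to every project, so an empty answer may simply need a short note explaining why the point is not relevant.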
How to share
Requirements for sharing have been addressed above.
Considerations for sharing are addressed in Sharing.
Getting your data ready to share is addressed in Working.
Where to store your data for sharing is addressed in Finalizing.
These are all aspects of “How to Share.”