Research Data


On Campus Resources

Columbia Research Data Consulting Services: Research Data Services

Creating a Data Management Plan: Research and Data Integrity Program (ReaDI)

Data Security: Columbia Research

Funder Requirements: Office of Sponsored Projects Administration

Public Access Mandates: Columbia Research

Research Data Storage: Columbia Research

Research Integrity: ReaDI Program

Other Resources


Joint Data Archiving Policy

NIH Data Sharing Policy

NSF Data Sharing Policy and Data Management Plan Requirements

Why Share Data?

Increase the visibility of your research: Making your data available to other researchers through widely-searched and well-indexed repositories can help you demonstrate continued use of your data and make an argument for the relevance of your research.

Facilitate discovery: Enabling other researchers to use your data reinforces open scientific inquiry and can lead to new and unanticipated discoveries. Opening up your research data also prevents duplication of effort, so the research community can focus on results rather than on repeating data collection that has already been done.

Satisfy funder & journal requirements: Many funding agencies and some journals now require that researchers deposit data they collect during their research in an open-access repository. For more information about funder requirements, see the Office of Sponsored Projects Administration.

Establish priority: Data posted online can be timestamped to establish the date they were produced, which helps protect you against being "scooped."

Speed up essential research: Open data sharing can accelerate discovery rates and foster further inquiry.

Adapted from: MIT and UW

The FAIR Data Principles

FAIR data are data that are findable, accessible, interoperable, and reusable, and that can be acted upon by machines. The following is adapted from the in-depth FAIR Data Principles documentation.


Findable

Metadata and data should be easy to find for both humans and computers.

  • (Meta)data are assigned a globally unique and persistent identifier
  • Data are described with rich metadata
  • Metadata clearly and explicitly include the identifier of the data they describe
  • (Meta)data are registered or indexed in a searchable resource
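As a rough illustration of the findability points above, the following Python sketch checks a metadata record for a persistent identifier and basic descriptive fields. The record layout, the field names, the `findability_gaps` helper, and the DOI are all hypothetical placeholders chosen for this example; real repositories define their own metadata schemas.

```python
import re

# Hypothetical metadata record; the DOI is a placeholder, not a real dataset.
record = {
    "identifier": "https://doi.org/10.1234/example-dataset",
    "title": "Example survey responses",
    "description": "Placeholder description of what the dataset contains.",
    "keywords": ["survey", "example"],
}

# Loose pattern for a DOI written as an https://doi.org/ URL (an assumption
# for this sketch; other persistent identifier schemes exist).
DOI_URL = re.compile(r"^https://doi\.org/10\.\d{4,9}/\S+$")


def findability_gaps(rec):
    """Return a list of findability problems found in a metadata record."""
    gaps = []
    if not DOI_URL.match(rec.get("identifier", "")):
        gaps.append("no persistent identifier in DOI URL form")
    for field in ("title", "description", "keywords"):
        if not rec.get(field):
            gaps.append(f"missing descriptive field: {field}")
    return gaps


print(findability_gaps(record))
```

An empty list means the record passes these simple checks. It does not show that the data are registered or indexed in a searchable resource, which depends on the repository where they are deposited.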



Accessible

Once the user finds the required data, they need to know how they can be accessed, possibly including authentication and authorisation.

  • (Meta)data are retrievable by their identifier using a standardised communications protocol
  • The protocol is open, free, and universally implementable
  • The protocol allows for an authentication and authorisation procedure, where necessary
  • Metadata are accessible, even when the data are no longer available
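One common way to satisfy "retrievable by their identifier using a standardised communications protocol" is to resolve a DOI over HTTPS. The sketch below only builds the request and does not send it. The DOI is a placeholder, the `metadata_request` helper is hypothetical, and whether a resolver returns machine-readable metadata for the `Accept` header shown depends on the agency that registered the DOI, so treat that part as an assumption.

```python
from urllib.request import Request


def metadata_request(doi):
    """Build (but do not send) an HTTPS request for a DOI's metadata."""
    req = Request(f"https://doi.org/{doi}")
    # Ask for machine-readable metadata rather than a human landing page.
    # Support for this media type is assumed, not guaranteed.
    req.add_header("Accept", "application/vnd.citationstyles.csl+json")
    return req


req = metadata_request("10.1234/example-dataset")  # placeholder DOI
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the actual retrieval. Data behind access controls would additionally require whatever authentication the repository specifies.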



Interoperable

The data usually need to be integrated with other data, applications, or workflows.

  • (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
  • (Meta)data use vocabularies that follow FAIR principles
  • (Meta)data include qualified references to other (meta)data
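To make the interoperability points more concrete, here is a minimal sketch that describes a dataset in JSON-LD using the schema.org vocabulary. It is one possible choice of shared language and vocabulary, not the only one. The dataset name and both DOIs are placeholders invented for the example; `isBasedOn` stands in for a qualified reference to other data.

```python
import json

# Hypothetical dataset description using schema.org terms; all values
# below are placeholders, not real datasets.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "identifier": "https://doi.org/10.1234/example-dataset",
    "name": "Example survey responses",
    # Qualified reference: states how this dataset relates to another one.
    "isBasedOn": "https://doi.org/10.1234/example-source",
}

# Serialise to text for publication, then parse it back as a consumer would.
encoded = json.dumps(dataset, indent=2)
decoded = json.loads(encoded)

print(encoded)
```

Because the terms come from a published vocabulary, software that understands schema.org can interpret the record without knowing anything about the project that produced it.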



Reusable

The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well described so that they can be replicated and/or combined in different settings.

  • (Meta)data are richly described with a plurality of accurate and relevant attributes
  • (Meta)data are released with a clear and accessible data usage license
  • (Meta)data are associated with detailed provenance
  • (Meta)data meet domain-relevant community standards