NIH Policy on Data Management and Sharing Plan (2023)

The National Institutes of Health (NIH) has implemented a new Data Management and Sharing Policy (DMS Policy), which requires ALL new and competing proposals/renewals that will generate scientific data to include a detailed plan outlining how data will be stored, protected, and ultimately shared.

The DMS Policy has two main requirements:

  1. Submission of a data management and sharing plan (recommended two pages). Research proposals without a Plan will not be considered for funding.
  2. Compliance with the approved plan. Failure to provide updates in grant reporting may result in enforcement actions, including the addition of special terms and conditions or award termination. Failure to deposit data after the end of the funding period may negatively influence future opportunities.

Prospective grant applicants will need to submit their Data Management and Sharing Plan (DMSP) along with their application. The DMSP will be assessed by NIH Program Staff (peer reviewers will also have the opportunity to comment on the proposed data management budget). The Institute, Center, or Office (ICO)-approved plan becomes a Term and Condition of the Notice of Award. However, researchers will be able to update and amend their DMSP as their research plan changes and evolves.

The effective date of the DMS Policy is January 25, 2023, and applies to:

  • Competing grant applications and proposals that are submitted to the NIH for the January 25, 2023 deadline and subsequent receipt dates
  • Proposals for contracts that are submitted to NIH on or after January 25, 2023
  • Other funding agreements (e.g., Other Transactions) that are executed on or after January 25, 2023, unless otherwise stipulated by the NIH
  • Please see a complete list of NIH activity codes subject to the DMS Policy

DMSPs should be no more than two pages long and must address the following elements:

  • Data Type: Description of the data that will be generated, managed, preserved, and ultimately shared.
  • Related Tools, Software and Code: Explanation of any specialized or custom tools and software needed to access or manipulate shared data.
  • Standards: Detailed description of data standards applied to shared data and associated metadata, if applicable.
  • Data Preservation, Access, and Timelines: List of repositories, data lakes/warehouses where data will be archived, as well as documentation on how data will be discoverable and accessed, and when and how long it will remain available.
  • Data Access, Distribution, and Reusage Considerations: Explanation of any potential issues and restrictions affecting access, distribution and reusage of data.
  • Oversight of Data Management and Sharing: Explanation of how compliance and adherence will be monitored and managed.

During the funding period, compliance with the Plan will be determined by the NIH ICO. Compliance with the Plan, including any Plan updates, may be reviewed during regular reporting intervals (e.g., at the time of annual Research Performance Progress Reports (RPPRs)).
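As a drafting aid, the six elements above can be treated as a simple checklist. The following is a minimal sketch, not an NIH or Columbia tool: the function name and the dictionary layout are illustrative assumptions, and the sample text is hypothetical. It only checks that each element has some text. It does not judge whether the text is adequate.

```python
# Element names follow the list above; everything else here is a
# hypothetical illustration, not an official validation tool.
REQUIRED_ELEMENTS = [
    "Data Type",
    "Related Tools, Software and Code",
    "Standards",
    "Data Preservation, Access, and Timelines",
    "Data Access, Distribution, and Reusage Considerations",
    "Oversight of Data Management and Sharing",
]


def missing_elements(draft: dict[str, str]) -> list[str]:
    """Return the required elements that are absent or blank in a draft plan."""
    return [name for name in REQUIRED_ELEMENTS if not draft.get(name, "").strip()]


# Hypothetical partial draft: one element filled in, one left blank.
draft = {
    "Data Type": "De-identified survey responses in CSV format.",
    "Standards": "",
}
for name in missing_elements(draft):
    print("Still to write:", name)
```

Running a check like this before submission can catch an omitted element, but the plan still needs to be specific to the project.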

The DMS Policy does not currently mandate any particular repository for data, but some individual NIH institutes have specified required repositories. In general, researchers are encouraged to deposit their data into repositories that support effective data discovery and reuse. The NIH does provide guidance for selecting a repository and provides a list of recommended domain-specific and generalist data repositories. Researchers are expected to discuss their plan for preservation of access to data resulting from the project.

Data management and sharing costs are allowable costs that can be included in grant budgets. These can include personnel costs for data management and sharing activities, curation costs, data deposit fees, and long-term data preservation costs. Costs for data storage beyond the period of the grant may be charged to the grant, but only if they are paid during the award period. Costing approaches continue to be the subject of discussion, and more information may become available in the future.

Budgeting for Data Management and Sharing

Investigators may request funds towards data management and sharing in the budget justification section of their application.

Allowable Costs

  • Curating data
  • Developing supporting documentation
  • Formatting data according to accepted community standards, or for transmission to and storage at a selected repository for long-term preservation and access
  • De-identifying data
  • Preparing metadata to foster discoverability, interpretation, and reuse
  • Local data management considerations, such as unique and specialized information infrastructure necessary to provide local management and preservation (for example, before deposit into an established repository).
  • Preserving and sharing data through established repositories, such as data deposit fees (If the Data Management & Sharing (DMS) plan proposes deposition to multiple repositories, costs associated with each proposed repository may be included).

NOTE: All allowable costs submitted in budget requests must be incurred during the performance period, even for scientific data and metadata preserved and shared beyond the award period.

Unallowable Costs | Budget requests must NOT include

  • Infrastructure costs that are included in institutional overhead (for instance, Facilities and Administrative costs)
  • Costs associated with the routine conduct of research, including costs associated with collecting or gaining access to research data.
  • Costs that are double charged or inconsistently charged as both direct and indirect costs

Definitions

  • Scientific Data: The recorded factual material commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications. Scientific data do not include laboratory notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for future research, peer reviews, communications with colleagues, or physical objects, such as laboratory specimens.
  • Data Management: The process of validating, organizing, protecting, maintaining, and processing scientific data to ensure the accessibility, reliability, and quality of the scientific data for its users.
  • Data Sharing: The act of making scientific data available for use by others (e.g., the larger research community, institutions, the broader public), for example, via an established repository.
  • Metadata: Data that provide additional information intended to make scientific data interpretable and reusable (e.g., date, independent sample and variable construction and description, methodology, data provenance, data transformations, any intermediate or descriptive observational variables).

Data Management Help and Consultations

Yes.  DMPTool is a guided tool designed to help researchers draft data management and sharing plans.  Columbia is a DMPTool partner institution.  Use your [email protected] email address to access the tool and Columbia-specific guidance through single sign-on (SSO). 

Please note: Logging in using other types of email addresses, such as [email protected] or [email protected], will prevent you from accessing useful Columbia-specific information to assist you in completing your plan. Simply use [email protected] and select “Sign in with Institution (SSO).” You may log in with your usual Columbia credentials to access Columbia-specific information.

Yes. Email [email protected] to request a consultation. 

NIH has extensive resources at sharing.nih.gov. Visit data.research.columbia.edu for Columbia-specific information about data management and sharing, including a dedicated page regarding NIH’s specific policy.

Choosing a Repository

  1. Check the funding opportunity announcement (FOA) to see whether it specifies depositing data in a specific repository. 
  2. If the FOA does not specify a repository, check whether the NIH institute to which you’re applying has its own policy for data management and sharing. Examples of Institutes with their own data sharing policies include NIMH and NIAAA.
  3. If the NIH institute does not mandate a particular repository, review the list of repositories supported by NIH to see whether any are appropriate for your research data.
  4. If there is no particular repository that is appropriate, consider a generalist repository. One such repository is Dryad. Columbia affiliates can deposit data in Dryad at no cost. There is an upload limit of 300 GB per publication through the web interface. Submitters should contact Dryad for assistance with larger datasets.

Note that you may choose multiple repositories to accommodate different data types, if appropriate.

More information is available on NIH’s Selecting a Data Repository webpage.
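The four steps under Choosing a Repository amount to an ordered fallback: use the first option that applies. The sketch below illustrates that order only. The function and parameter names are hypothetical, each argument stands for the result of a check you make yourself (reading the FOA, the institute's policy, and NIH's list), and the repository names in the usage lines are placeholders.

```python
from typing import Optional


def choose_repository(
    foa_repository: Optional[str],
    institute_repository: Optional[str],
    nih_supported_match: Optional[str],
    generalist: str = "Dryad",
) -> str:
    """Return the first repository that applies, following steps 1 to 4 in order."""
    # Steps 1-3: FOA requirement, then institute policy, then NIH-supported list.
    for candidate in (foa_repository, institute_repository, nih_supported_match):
        if candidate:
            return candidate
    # Step 4: no specific repository fits, so fall back to a generalist one.
    return generalist


# Hypothetical cases; "Institute Archive" is a placeholder name.
print(choose_repository(None, "Institute Archive", None))
print(choose_repository(None, None, None))
```

This simplification returns a single repository. As noted above, a plan may name several repositories when different data types call for it.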

The CU Data Storage Finder summarizes Columbia-supported data storage and sharing resources, including information about whether they are approved for sensitive data, what costs you might incur, and other important information.

Yes.  Columbia has an enterprise license to LabArchives, an electronic lab notebook and research collaboration tool.  LabArchives is free to anyone with a Columbia UNI.  See the CU Data Storage Finder for more options.

No.  Although it has the word “Lab” in its name, LabArchives is a collaboration tool that may be useful for any team that needs to share data or documents, including researchers in any discipline and also administrators and educators.

Training for Researchers

Yes, the Office of Research Compliance and Training and Libraries have created a 12-minute video tutorial that demonstrates how to use DMPTool.

Yes.  Slides are available on the Research Data at Columbia webpage.

The Office of Research Compliance and Training has created the following optional trainings available in Rascal:

  • TC2650 | Best Practices for Data Management When Using Instrumentation
  • TC2651 | Good Laboratory Notebook Practices
  • TC3250 | Guidelines on the Organization of Samples in a Laboratory

There is also a checklist for keeping a lab notebook and a data-to-figure mapping tool that researchers can use to document the underlying raw and processed data for published figures. You may also request an in-person training for your team by emailing [email protected].

Writing a Data Management Plan

Yes. DMPTool is a guided tool for drafting data management and sharing plans. Columbia is a DMPTool partner institution. Log in with your [email protected] email address and use the single sign-on (SSO) option to access Columbia-specific guidance in DMPTool.

No. NIH has explained that “These sample DMS Plans are provided for educational purposes to assist applicants with developing Plans but are not intended to be used as templates and their use does not guarantee approval by NIH. Note that the sample DMS Plans provided below may reflect additional expectations established by NIH or specific NIH Institutes, Centers, or Offices that go beyond the DMS Policy.” 

Your data management and sharing plan needs to be specific to your project and consistent with Columbia’s resources, policies, and practices.  DMPTool is a good starting point and contains Columbia-specific guidance.

Oversight and Compliance

In general, the principal investigator is responsible for compliance with the data management and sharing plan, although some responsibility may be delegated to, e.g., a data manager or lab manager. 

In addition, several offices within the Office of the Executive Vice President for Research can provide support to the principal investigator on request, to help ensure compliance.  These include the Office of Research Compliance and Training, Sponsored Projects Administration, and the Human Research Protection Office. 

The following is template language that can be adapted to the needs of a particular project and included in Element 6 of the data management and sharing plan:

"Prof. ______ will be responsible for ensuring compliance with the data management plan described here, and, if needed, will seek support from relevant offices within Columbia’s Office of the Executive Vice President for Research, which has overall responsibility for the University’s research enterprise."

Budget Issues

NIH’s most recent FAQ specifies that all costs for data management and sharing activities, including salary and fringe corresponding to the time it takes personnel to undertake data activities (e.g., formatting, curating, developing supporting documentation), must be included in the single line item on the R&R Budget Form in Section F, Other Direct Costs. NIH states:

Do not include personnel costs related to data management and sharing activities in section A., Senior/Key Person or Section B. Other Personnel.

Supporting details, including a breakdown of any personnel effort, must be included in the budget justification.

Examples of costs that can be directly charged to your NIH project include:  special services from a consultant or data manager, specific to the particular project; data curation costs; and repository fees. See NIH’s listing of allowable costs.

Once your project ends, you can no longer charge the project for data management or any other expense. Per NIH, “Note that all allowable costs submitted in budget requests must be incurred during the performance period, even for scientific data and metadata preserved and shared beyond the award period.” However, if the service provider allows it, you can pre-pay expenses and charge them to the project before it ends.
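The timing rule above can be checked mechanically when assembling a budget. The following is a minimal sketch under stated assumptions: the function name, the tuple layout, and all dates and amounts are hypothetical. It tests only whether a cost falls inside the performance period. It says nothing about whether the cost is otherwise allowable.

```python
from datetime import date


def split_by_performance_period(costs, period_start, period_end):
    """Separate costs incurred inside the performance period from those outside it."""
    inside, outside = [], []
    for description, incurred_on, amount in costs:
        if period_start <= incurred_on <= period_end:
            inside.append((description, incurred_on, amount))
        else:
            outside.append((description, incurred_on, amount))
    return inside, outside


# Hypothetical dates and amounts, for illustration only.
period_start, period_end = date(2023, 7, 1), date(2026, 6, 30)
costs = [
    ("Repository deposit fee, pre-paid", date(2026, 5, 15), 1200.00),
    ("Storage renewal invoiced after closeout", date(2027, 1, 10), 300.00),
]
inside, outside = split_by_performance_period(costs, period_start, period_end)
print("Within period:", [c[0] for c in inside])
print("Outside period:", [c[0] for c in outside])
```

In this example the pre-paid deposit fee falls within the period, while the later storage renewal does not and could not be charged to the project.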

Timing for Data Sharing

No. Data needs to be shared at the time of an associated peer-reviewed publication or by the end of the performance period of the grant, whichever comes first. Since preprints are not considered to be peer-reviewed publications, they do not trigger the data-sharing requirement.
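The "whichever comes first" rule can be written as a short calculation. This is a sketch under stated assumptions: the function name and the dates are hypothetical, and `publication_date` means the date of a peer-reviewed publication only, so a preprint date should not be passed in.

```python
from datetime import date
from typing import Optional


def sharing_deadline(
    performance_period_end: date,
    publication_date: Optional[date] = None,
) -> date:
    """Return the earlier of the peer-reviewed publication date and the period end."""
    if publication_date is None:
        # No peer-reviewed publication yet: the end of the period governs.
        return performance_period_end
    return min(publication_date, performance_period_end)


# Hypothetical dates, for illustration only.
period_end = date(2026, 6, 30)
print(sharing_deadline(period_end, date(2025, 3, 1)))
print(sharing_deadline(period_end))
```

With these sample dates, a paper published in March 2025 moves the deadline forward to the publication date. Without a publication, the deadline is the end of the performance period.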

Human Subjects Research Data

Yes, the IRB has developed guidance and sample language that is posted on the HRPO website and available in the Rascal Consent Form module. This language is based on the sample future use and data sharing language provided by the NIH in the resource Informed Consent for Secondary Research with Data and Biospecimens.

Information is considered sensitive if the loss of confidentiality, integrity, or availability could be expected to have a serious, severe, or catastrophic adverse effect on organizational operations, organizational assets, or individuals. [Source: Guide for Identifying and Handling Sensitive Information at the NIH] NIH expects scientific data to be shared to the extent possible, but acknowledges that privacy, security, informed consent, and proprietary issues must be considered. These are particularly relevant when the data include sensitive human subjects research data such as certain genomic data and data about individuals belonging to smaller populations and minority groups who may be more likely to be re-identified and potentially experience greater harm in the event that their data are re-identified.  If protective measures such as having a Certificate of Confidentiality or deidentification of the data and biospecimens would not be sufficient to safeguard confidentiality of research participants, and they would be at greater risk of harm as a result, the data management and sharing plan should describe such limitations.

If the research is currently NIH funded, enrollment is open and a competitive renewal is anticipated, the consent forms should be revised to include future use and data sharing information that meets the requirements of the NIH Data Management and Sharing Policy, if not already included.  At such time as the renewal is submitted, the Policy requirements apply. It is good practice and will allow the greatest flexibility moving forward if all consent forms, regardless of current funding status, include future use and data sharing information. This is particularly relevant if future NIH funding is anticipated.

No, but both invoke requirements for sharing of data so consent requirements are similar.  Consent forms for research to which both policies apply must include language that will satisfy the requirements of each policy.  The IRB has prepared sample language that can be used in this situation.

IP-Related Data Sharing

Yes, sharing your data before filing a patent application will jeopardize Columbia’s ability to obtain patent protection worldwide. Accordingly, we suggest contacting Columbia Technology Ventures (CTV) before sharing your data to discuss whether filing a patent application would be worthwhile.

In general, allow for up to about four months for Columbia to obtain patent protection.  If you expect a potentially patentable invention to result from your project, you should build this time into your DMSP.  You may want to consider including language such as:  “In the event that patentable intellectual property results from this project, Columbia University may require some additional months to protect the intellectual property before sharing the relevant data, in accordance with University policy and the Bayh-Dole Act.”

If it turns out that you need to delay data sharing in order to protect intellectual property and you have not included the above language in your DMSP, you may need to revise your DMSP.  Contact your SPA Project Officer to facilitate a request to your NIH program official to modify the approved DMSP to accommodate intellectual property protection, if necessary.  Any revision to your DMSP requires NIH prior approval.  Also, bear in mind that, as mentioned above, NIH expects data to be shared at the time of publication or at the end of the project (including the first no-cost extension), whichever is sooner. 

Finally, if your project generates genomic data, it’s important to note that NIH strongly discourages the use of patents as a means to block sharing genomic or phenotypic data. As such, researchers are expected to share these data types without restrictions.

  • Session 1: Thursday, October 27, 2022; 12:00pm – 1:00pm EST | Overview of the Policy (Slides now available)
  • Session 2: Friday, November 18, 2022; 2:00pm – 3:00pm EST | Budgeting for the DMS Plan (Slides now available)
  • Session 3: Thursday, December 1, 2022; 12:30pm – 1:30pm EST | DMP Tool Demo (Slides now available)
  • Session 4: Wednesday, January 11, 2023; 1:00pm – 2:30pm EST | Recap, Consent Form Guidance, and new FAQs (Slides now available)

Note: This page is intended to inform the Columbia University research community about the new NIH policy and to highlight appropriate information and resources, as they become available.