Data Management

Data management comprises the actions taken at each stage of the data lifecycle that determine how data are collected, stored, secured, and disseminated. Best practices for data management vary by discipline, PI, and project.

A fundamental understanding of data management helps when writing a Data Management Plan (DMP), in addition to ensuring data accessibility and integrity. To learn more about writing a DMP, including templates and what federal agencies require, visit the Writing a Data Management Plan Webpage.

Data Management Questions

Data management begins with asking the right questions about how data will be collected, stored, shared, and organized. The questions below can help you get started:

  • What types of data are collected?
     
  • How much data (file size) will be collected?
     
  • How quickly will data accumulate?
     
  • What are the likely file formats?
     
  • How unique are the data and how often will backups be performed?
     
  • Will the data be collected from a third-party source?
     
  • What data tools are available?
     
  • Are the data part of a collaboration that needs to be shared regularly and frequently?
     
  • Who needs access to the data?
     
  • How long will the data need to be kept?
     
  • What are the data retention policies of the funder, the journal, and Columbia University?
     
  • Who owns the data?
     
  • What is the classification of the data, and what security measures need to be put in place?
     
  • Will the data be shared in a public database?
     
  • What sort of problems have been encountered previously with managing data?
     
  • What kind of DMP does the funder require?
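
Several of the questions above (data volume, rate of accumulation, and backups) can be combined into a rough storage estimate. The Python sketch below is only an illustration: the function name, its parameters, and the example figures are hypothetical assumptions, not Columbia guidance or a funder requirement.

```python
def estimate_storage_gb(gb_per_month, collection_months, backup_copies=2):
    """Rough total storage (in GB) needed at the end of data collection.

    All parameters are illustrative assumptions:
    gb_per_month      -- expected rate at which new data accumulate
    collection_months -- length of the collection phase, in months
    backup_copies     -- number of full backup copies kept in addition
                         to the primary copy
    """
    if gb_per_month < 0 or collection_months < 0 or backup_copies < 0:
        raise ValueError("inputs must be non-negative")
    primary = gb_per_month * collection_months
    # Each backup copy is assumed to be a full copy of the primary data.
    return primary * (1 + backup_copies)


if __name__ == "__main__":
    # Hypothetical project: 50 GB per month for 24 months, two backup copies.
    total = estimate_storage_gb(50, 24, backup_copies=2)
    print(f"Estimated storage needed: {total} GB")
```

An estimate like this is only a starting point. Retention periods, file format changes, and derived data sets can all increase the real figure.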

Data Management Resources

Tutorials and Guidelines
  • The ReaDI Program has created several tutorials (below) and identified guidelines to aid in the management of data during the collection phase of research. The ReaDI Program is available for data management consulting and presentations (Columbia researchers only).
Columbia University Data Management Consulting Services
  • Statistical Analysis Center Data Management Services are available to anyone at Columbia. They can help with all aspects of data management, including administrative systems. Their services include:
    • Case report form design 
    • Database design 
    • Database hosting 
    • Custom user interface design (web, desktop, telephone, etc.) 
    • Data system design (data for analysis, logistical data, personnel data, financial data, etc.) 
    • Report design 
    • Database querying and data set generation 
    • REDCap hosting and development
       
  • Research Data Services, jointly supported by CUIT and Columbia Libraries, is available to help with many aspects of the research data lifecycle, including research data management, finding data, recommendations for cleaning and understanding data, and mapping and visualizing your data.
     
  • The Irving Institute for Clinical and Translational Research offers a free one-hour consultation to discuss data management requirements, help design a data management plan with associated budget requirements, or provide guidelines for moving data into a properly formatted, secure environment.