Data Management and Security

Research sponsors, scholarly journals, and the general public are demanding greater access to research data, especially if the data have been collected with public funds. This new focus on data accessibility means that effective data management, which has always been a crucial aspect of the research process, has new urgency for researchers and research institutions.

Data management is the set of processes that describe how data will be collected, stored, secured, and disseminated.

A data management plan has the following components (from NISO):

  1. Description of types of data collected and/or generated
  2. Standards that will be used for data and their metadata
  3. Description of policies regarding all data
  4. Plans for archival and preservation
  5. Description of resources required for data management, including software, hardware, budget considerations, and personnel

Careful planning for data management can help researchers fulfill the requirements of their sponsors and increase the accessibility, usability, and impact of their work. See below for several resources to help researchers fulfill data management and sharing requirements.

Introduction to Data Management Resources

An important part of the research data collection process is developing a system that specifies how files will be named, how laboratory notes will be maintained, and so on.
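
As an illustration of what a file naming system might look like, the Python sketch below builds file names from a project name, a short description, a date, and a version number. The field order, the separators, and the function name build_filename are assumptions made for this example, not a convention prescribed by this page; a lab should adapt them to its own needs.

```python
import re
from datetime import date


def build_filename(project: str, description: str, when: date,
                   version: int, ext: str) -> str:
    """Build a name of the form project_description_YYYYMMDD_vNN.ext.

    This layout is a hypothetical convention chosen for illustration.
    """
    def slug(text: str) -> str:
        # Lowercase the text and collapse anything that is not a letter
        # or digit into a single hyphen, so names sort and parse predictably.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    extension = ext.lstrip(".").lower()
    return f"{slug(project)}_{slug(description)}_{when:%Y%m%d}_v{version:02d}.{extension}"


print(build_filename("Mouse Study", "Western Blot raw", date(2024, 1, 31), 2, ".CSV"))
# prints: mouse-study_western-blot-raw_20240131_v02.csv
```

Putting the date in year-month-day order and zero-padding the version number keeps files in chronological order when they are sorted by name.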

Understanding the type of data you are collecting will help you decide how it should be stored.

Data fall into two types over the course of a research lifecycle:

Archived Data: Data that are no longer being used (e.g., old data sets or data published many years ago) but that researchers want to maintain in long-term storage

Working Data: Data that are produced for publications, grant submissions, presentations, etc. but that have not yet been formally published

Ideally, there should be three copies of data:

  1. The original copy
  2. A copy nearby
  3. A copy in a geographically different location
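
Keeping three copies only helps if the copies stay identical to the original. One common way to check this is to compare checksums. The Python sketch below is a minimal example under stated assumptions: the function names sha256_of and copies_match, and the paths in the usage comment, are hypothetical, and SHA-256 is one reasonable choice of checksum rather than a requirement of any policy described here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original: Path, copies: list[Path]) -> dict[str, bool]:
    """Map each copy's path to True if it exists and matches the original."""
    expected = sha256_of(original)
    return {
        str(copy): copy.exists() and sha256_of(copy) == expected
        for copy in copies
    }


# Hypothetical usage: one nearby copy and one copy at a remote mount point.
# report = copies_match(
#     Path("data/results.csv"),
#     [Path("/mnt/lab-backup/results.csv"), Path("/mnt/offsite/results.csv")],
# )
```

A copy that is missing or has changed is reported as False, which signals that it should be replaced from a copy that is still intact.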

Resources on Data Storage

The ReaDI Program has compiled a PDF of the options available for data storage:

Data Storage Options Table

Some research data are highly sensitive, such as Protected Health Information (PHI), including names or addresses associated with clinical information, or Personally Identifiable Information (PII), such as Social Security numbers, credit card numbers, or personal financial data. The release of such data can lead to harm such as privacy violations, identity theft, financial liability for the University, and, in some cases, individual liability for the person who released the data.

All researchers should be aware that sensitive information is highly regulated by federal laws, such as HIPAA and HITECH, and by University policy, such as the Electronic Information Resources Security Policy.  As the Policy states: "Individuals who access or control University electronic information resources must take appropriate and necessary measures to ensure the security, integrity, and protection of these resources, using appropriate physical and logical security measures."

Breaches and even suspected breaches must be reported to the Information Technology Security and Policy Office and to the local system administrator.  At CUMC, breaches must be reported to the CUMC Privacy and Information Security Officers at hipaa@columbia.edu.  Anyone with questions concerning Protected Health Information privacy or security requirements and HIPAA policies should visit the CUMC HIPAA webpage.

Resources on Data Security at Columbia University

Many funders require that publications and data be made publicly accessible.

There are a number of ways to share your data in order to make it available to the scholarly community and the broader public.  Check out the links below to find more information about data storage and publication.