ReaDI Program

The ReaDI Program

Resources for the Research Lifecycle

About

In 2014, the University launched the Research and Data Integrity (ReaDI) Program to enhance research integrity, data management, and data quality across the institution. Housed within our Office of Research Compliance and Training, the ReaDI Program engages proactively with Columbia’s research community in three ways:

It maintains a wide-ranging, web-based repository of essential resources and tools to support robust science across the research lifecycle, from experimental design through data collection and management, to analysis and publication. As Monya Baker points out in her 2016 Nature article, finding these types of resources challenges many researchers, but the ReaDI Program offers a one-stop shop for authentication methods, information on statistical consulting services, literature on reproducibility, lab management tools, and many other items. New resources are routinely added, and existing resources are continuously updated. The ReaDI resources are openly available for use by any institution.

It provides outreach, training, and courses on topics including safeguarding research and data integrity, and rigor and reproducibility. The ReaDI Program proactively reaches Columbia’s graduate students and postdoctoral researchers at resource fairs, orientation presentations, and department-specific seminars and Journal Clubs focused on the critical evaluation of the literature. In 2024, the program launched a first-of-its-kind web-based training on proper handling of digital scientific images, with strong uptake and evaluations.

It offers individualized consultations on data management and good laboratory practices. These consultations are available to principal investigators at all levels and are customized to meet the principal investigator’s needs and to maximize efficiency and research quality.

How to Navigate

Each tab represents a distinct phase of the Research Lifecycle. In each section, you will find a wealth of resources tailored to support you at that specific stage of your research journey. Click on the short video below for a walkthrough!

Transcript

The ReaDI Program streamlines the research process with a comprehensive workflow designed for all researchers.
By offering tools and resources for every stage of the research lifecycle, ReaDI ensures you have the support you need, from launching your project to disseminating the final results.
It all starts with “Get Started,” where you focus on building a solid foundation for your research.
This includes assembling the right team, ensuring everyone has the necessary training, and creating standard operating procedures that guide daily activities.
By clearly defining roles, expectations, and SOPs at the outset, you establish a culture of rigor and reproducibility that sets the stage for success.
Investing time here helps prevent future challenges, and paves the way for a seamless research journey.
Next comes “Propose & Plan,” which emphasizes writing clear, robust research proposals and—crucially—developing data management plans.
Good data management goes beyond simple storage.
It includes organizing, documenting, and securing your data to support reproducibility and prevent issues like data loss or misinterpretation.
After carefully planning your research, it’s time to “Execute.”
In this phase, you’ll run experiments, collect and organize data, and, if needed, consult with experts on statistical analysis to maintain high-quality outcomes.
Staying on top of data management here is especially crucial—proper documentation and version control can prevent errors and make your results more reproducible.
In “Disseminate & Preserve,” the focus is on maximizing the impact of your findings while ensuring long-term accessibility.
This phase goes beyond simply choosing the right journals—it covers best practices for preparing manuscripts, formatting data for public release, and adhering to open-access or funder requirements.
Finally, the "End of the Line" phase provides tools and resources to ensure projects conclude ethically and compliantly.
For more resources and support, researchers are encouraged to contact [email protected].

1. Get Started

Beginning your Research Journey

This section lays the groundwork for initiating a successful research project by focusing on team formation and preparation. It includes guidance on hiring the right people with the necessary skills and expertise. It also covers the essential training for team members on rigor and reproducibility, ensuring that the team understands the principles of research integrity and collaboration. Finally, it provides tools for the PI to develop and communicate their own lab procedures for all team members to follow. This section is crucial for setting the stage for a successful research project by building a strong, well-prepared team.

Instilling Scientific Rigor in the Lab (Neuroline)
How to Build a Lab Culture that Promotes Scientific Rigor (Neuroline)
Enhancing Reproducibility Using Interprofessional Team Best Practices (J Clin Transl Sci. 2021)
Catalyzing Communities of Research Rigour Champions (Brain Communications 2024)
On Being a Scientist: Guides to Responsible Conduct in Research (Committee on Science) | An overview of the professional standards of science that explains why adherence to those standards is essential for continued scientific progress.

Training Finder Tool | The Finder creates a personalized chart of required and recommended training courses, with links to the training and the responsible offices. The identified courses can be added to your Rascal Training To-do List.
Recommended Training
- Responsible Conduct of Research (RCR) | Rascal Course TC0094
- Good Laboratory Notebook Practices | Rascal Course TC2651
- Responsible and Ethical Conduct of Research (RECR) for Faculty and Other Senior Personnel | Rascal Course TC7000
- Recognizing Influences and Biases in Research | Rascal Course TC4900
- Robust Science: Problems and Solutions | Rascal Course TC4901
- Best Practices for Data Management When Using Instrumentation | Rascal Course TC2650
- Guidelines on the Organization of Samples in a Laboratory | Rascal Course TC3250
- Handling Digital Scientific Images Dos & Don’ts | Rascal Course TC7350

Whether you call it a lab manual, an onboarding document, or standard operating procedures, a written document can streamline research processes and ensure high-quality research. The document should include your expectations for conducting research (including quality controls) and maintaining data. Other principal investigators have found these documents helpful for communicating with their group members. Free SOP consulting and drafting services are available for Columbia PIs.

The webpage by Organizo LLC offers a suite of practical tools and templates designed to streamline organization in research labs. From managing inventories and orders to tracking applications and outlining team responsibilities, these resources aim to improve efficiency and clarity in daily operations. All documents are downloadable and customizable, making them versatile for various professional and lab settings.

Organizational Spreadsheets

Ordering Spreadsheet: Tracks order statuses to help lab members plan their work based on item arrivals. Can be customized for any group ordering system.
-80 Freezer Inventory: Simplifies freezer organization for -80°C storage and can be adapted for other temperature-controlled storage (-20°C, 4°C).
Antibody Inventory: Helps locate lab items like antibodies quickly with a system to define storage locations and contents.

Application Management

Application Tracker: A spreadsheet for managing deadlines and requirements for funding applications, shareable with institutional grant teams.

Human Resource Tools

Employee Expectations: A template outlining job expectations for lab aides, customizable for various research roles.
Lab Responsibilities: A document detailing responsibilities for different roles (e.g., Lab Aide, Research Technician, Animal Technician) to ensure smooth lab operations.

2. Propose & Plan

Preparing for all Aspects of your Research Project

In this section, the focus is on the critical planning stages of your research project. It includes resources and guidance on writing strong research proposals, designing robust research methodologies, and creating comprehensive data management plans. Additionally, it covers the importance of experimental design to ensure your study is well-structured and scientifically sound. This section helps researchers lay a solid foundation for their projects by carefully planning and proposing their research strategies.

Secondary Analysis
- Guidance on Secondary Analysis of Existing Data Sets (University of Connecticut)
- Secondary Analysis – A How-to Guide (J Adv Pract Oncol. 2019)
- Protecting against researcher bias in secondary data analysis: challenges and potential solutions (Eur J Epidemiol. 2022)
Research Hypothesis
- Step-by-Step Guide: How to Craft a Strong Research Hypothesis
- Formulating Hypotheses for Different Study Designs (J Korean Med Sci. 2021)
- Research Proposal Checklist (Texas A&M)

Research Rigor and Reproducibility: Columbia's Leadership in Research Quality and Gold Standard Science: This "Facilities and Resources" insert for grant applications highlights the ReaDI Program's training courses, online resources, and one-on-one consultations, along with Columbia's triennial symposium series that brings together researchers to discuss reproducibility and emerging challenges in research integrity.
Scientific Rigor and Reproducibility Toolbox (American Physiological Society Science Policy Committee)
Rigor of Prior Research:
- 6 WAYS To Access Rigor of Prior Research
- Six red flags for suspect work
Authentication Plans
- General guidelines for authentication plan (NIH)
- Authentication plan examples (NIH)
- Antibody Validation
  - Best Practices for Antibodies (Sigma Aldrich)
  - Antibody Validation (The Human Protein Atlas)
  - Common pitfalls when working with antibodies & common practices for validating antibodies (Biotechniques, 2010)
  - CiteAb - Database that ranks antibodies by the number of times they’ve been cited in publications
- Cell Line Validation
  - Cell Line Checklist for Manuscripts and Grant Applications (ICLAC)
  - Register of Misidentified Cell Lines (ICLAC)
  - Guidelines for the use of cell lines in biomedical research (Br J Cancer. 2014)
  - Standards for Cell Line Authentication and Beyond (PLoS Biol. 2016)
- Sex as a Biological Variable (SABV)
  - Sex as a Biological Variable – A Primer (NIH)
  - Gender Decision Tree; SABV in Biomedicine Checklist
- Sample Authentication Plans: Example 1 (NIH); Example 2 (IDEXX Bioresearch)

Training and Other Resources for Rigor and Reproducibility (NIH): Resources and training on many aspects of rigor and reproducibility, including sex as a biological variable, research methods, reviewer guidance and more.
Scientific Rigor Examples (NIH)
Resources and Tools for Rigorous Experimental Design (NIH)
- The Experimental Design Assistant – EDA
- EQUATOR Network Reporting Guidelines
Experimental Design Checklist (From Casadevall and Fang, How-To Guide)
Let’s Experiment: A Guide for Scientists Working at the Bench (iBiology) | Scientists from a variety of backgrounds give concrete steps and advice to help you build a framework for how to design experiments in biological research.
6 Rules of Thumb for Determining Sample Size and Statistical Power (The Abdul Latif Jameel Poverty Action Lab)
Sample Size and Power Estimate Online Calculator | Free, Online, Easy-to-Use Power and Sample Size Calculators.
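As a rough illustration of what the sample size and power calculators listed above compute, the sketch below estimates the number of subjects per group for a two-sided comparison of two means. It uses the standard normal approximation rather than any particular calculator's method, and the function name and default values (alpha of 0.05, power of 0.80) are illustrative assumptions. Exact t-based methods give slightly larger numbers, so treat the result as a starting point for a conversation with a statistician.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    This is a hypothetical helper for illustration, not an official tool.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must be between 0 and 1")

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = standard_normal.inv_cdf(power)          # quantile matching the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return ceil(n)  # round up so the target power is not undershot


if __name__ == "__main__":
    # Smaller expected effects require many more subjects per group.
    for d in (0.2, 0.5, 0.8):
        print(f"effect size {d}: {sample_size_per_group(d)} per group")
```

Because the required sample size scales with the inverse square of the effect size, halving the expected effect roughly quadruples the number of subjects needed.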

It's never too early to start thinking about how your data will be managed throughout your research. Planning ensures that your data is collected, organized, and stored in a way that maximizes its value and utility, both for your current projects and future research. By establishing clear protocols for data handling from the outset, you can avoid common pitfalls like data loss, ethical breaches, or difficulties in data sharing and reuse. Thoughtful data management also facilitates collaboration, enhances the reproducibility of your work, and ensures compliance with institutional and funding agency requirements. Taking the time now to consider how your data will be handled can save you significant effort later and contribute to the overall integrity and impact of your research.

What types and formats of data will our lab collect?
What ethical considerations must we address when working with human or animal subjects, and what steps will we take to ensure privacy, confidentiality, and compliance?
What documentation and metadata standards will we use to organize and describe our data?
Who will need access to our data, and how can we ensure it is usable for future research or collaboration?
What access restrictions will we apply to protect sensitive data?
Where will we store our data during and after our research projects?
What are the projected costs associated with managing, documenting, storing, and preserving our lab's data?

3. Execute

Collecting and Analyzing Data

This section covers the execution phase of your research project. It provides effective practices for managing protocols and collecting data, ensuring data integrity is maintained throughout the process. It also includes resources on statistical analysis and interpretation to help make sense of the data collected. Furthermore, it emphasizes the importance of handling and storing research data correctly to preserve its quality and reliability. This section is critical for the accurate and ethical execution of research activities.

Protocols.io | A free, up-to-date, crowd-sourced protocol repository for researchers.
Protocol Exchange from Nature Protocols | Protocol Exchange is an open repository of community-contributed protocols sponsored by Nature Protocols.
Bio-protocol | Bio-protocol is an online peer-reviewed protocol journal. Its mission is to make life science research more efficient and reproducible by curating and hosting high quality, free access protocols.
Current Protocols (Wiley) | The Current Protocols collection includes nearly 20,000 step-by-step techniques, procedures, and practical overviews that provide researchers with reliable, efficient methods to ensure reproducible results and pave the way for critical scientific discovery.
Springer Nature Experiment | The largest available collection of protocols and methods from Nature Methods, Nature Protocols, Nature Research, and Springer Protocols.

De-identification and Anonymization of Data
- Strategies for De-Identification and Anonymization of Electronic Health Record Data for Use in Multicenter Research Studies (Kushida CA et al., 2012)
- A scalable software solution for anonymizing high-dimensional biomedical data (Meurers T et al., 2021)
- De-identification methods for open health data: the case of the Heritage Health Prize claims dataset (El Emam K et al., 2012)
- What do I need to know about protecting study participants' privacy, HIPAA, and subject de-identification for dbGaP data submissions? The database of Genotypes and Phenotypes (dbGaP)
- Protecting privacy using k-anonymity (El Emam K and Dankar FK, 2008)
- 5 steps for removing identifiers from datasets (Johns Hopkins Sheridan Libraries)
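To make the k-anonymity idea from the list above concrete: a dataset is k-anonymous when every record shares its combination of quasi-identifier values with at least k - 1 other records. The sketch below measures k for a small table. It is a minimal illustration, not the method of any cited paper; the function name, the column names, and the toy records are all made up for the example.

```python
from collections import Counter
from typing import Mapping, Sequence


def k_anonymity(records: Sequence[Mapping[str, str]], quasi_identifiers: Sequence[str]) -> int:
    """Return k: the size of the smallest group of records that share
    identical values for all of the given quasi-identifier columns."""
    if not records:
        raise ValueError("records must not be empty")
    group_sizes = Counter(
        tuple(record[column] for column in quasi_identifiers) for record in records
    )
    return min(group_sizes.values())


# Toy, made-up records (hypothetical columns; not real data).
records = [
    {"age_band": "30-39", "zip3": "100", "diagnosis": "A"},
    {"age_band": "30-39", "zip3": "100", "diagnosis": "B"},
    {"age_band": "40-49", "zip3": "100", "diagnosis": "A"},
    {"age_band": "40-49", "zip3": "100", "diagnosis": "C"},
    {"age_band": "40-49", "zip3": "112", "diagnosis": "B"},
]

if __name__ == "__main__":
    print(k_anonymity(records, ["age_band", "zip3"]))  # one record is unique on this pair
    print(k_anonymity(records, ["age_band"]))          # coarser identifiers give larger groups
```

A result of 1 means at least one record is unique on the chosen quasi-identifiers and could be singled out. Generalizing values (wider age bands, shorter ZIP prefixes) or suppressing rare records raises k, at the cost of detail.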
Best Practices for Data and Code Management (2015, Innovations for Poverty Action) | This guide outlines best practices in data and code management. The scope of the guide is to cover the principles of organizing and documenting materials at all steps of the project lifecycle with the goal of making research reproducible.
Guide to writing README for metadata (Cornell University)
Find Files Faster: How to Organize Files and Folders
Best Practices for Data Management when Using Instrumentation | Tips and effective practices for collecting, saving, and processing data collected from instruments. Also available on Rascal Training - TC2650.

Notebooks
- Good Laboratory Notebook Practices | Tips and best practices for maintaining a laboratory notebook | Rascal Course TC2651.
- Laboratory Notebook Checklist | Adapted from the tutorial on Good Laboratory Notebook Practices, this checklist can be used by researchers to manage their notebooks or by PIs who may review group members' notebooks.
- Electronic Laboratory Notebook (ELN) | Columbia University provides a free Electronic Research Notebook service for researchers, instructors, and students. This service is provided by LabArchives. LabArchives may be used on all CU campuses and is approved for PII, RHI, and PHI. It is registered in RSAM #5644.
Code management and sharing
- Best Practices for Writing Reproducible Code (Workshop from Utrecht University) | Workshop to help researchers not only make their work reproducible but also increase the efficiency of their workflow.
- How to connect Figshare with your GitHub account
- Issuing a persistent identifier for your GitHub repository with Zenodo
- Connect GitHub to an OSF Project
- TL;DR Legal - Software Licenses in Plain English
- Choose an open-source license

Practical tips combining animal welfare and experimental rigor to improve reproducibility in behavioral neuroscience (Morais Loss C et al., 2021)
Guidelines on the Organization of Samples in a Laboratory | Tips on managing, identifying, and preserving research samples (non-clinical). Also available on Rascal Training - TC3250

Columbia Consulting Services for Statistical Analysis | Services below are provided to Columbia researchers, ranging from no-cost to fee-for-service.
- The Biostatistics, Epidemiology, and Research Design Resource (BERD) | Provides a wide range of design, statistical, and analytical support services to assist CUIMC faculty members in garnering grant support and publishing study results. In conjunction with the Department of Biostatistics of the Mailman School of Public Health, BERD provides support through consultations and educational initiatives.
- Department of Statistics Consulting Services | The Department of Statistics offers free statistical consulting to the Columbia community. Consulting is available by appointment only.
- Columbia University Department of Psychiatry Biostatistics Consulting | Mental Health Data Science provides comprehensive statistical consulting to all Columbia University Department of Psychiatry and NYSPI employees.
Courses and Lectures
- Biostatistics for Clinical Researchers | Part of the “Biostatistics in Action: Tips for Clinical Researcher” lecture series sponsored by the Irving Institute for Clinical and Translational Research - Biostatistics, Epidemiology and Research Design resource, which is supported in part by an NIH Clinical and Translational Science Award (CTSA) through its Center for Advancing Translational Sciences (Grant No. UL1TR001873). The speaker, Cody Chiuzan, PhD, is an Assistant Professor in the Department of Biostatistics at the Mailman School of Public Health.
- Statistical Software Mini-Courses | A two-part mini-course on getting started with statistical software. The mini-course covers the basics of statistical programming in R and SAS. Topics include data manipulation, descriptive statistics, and basic analyses. Statistical Software Mini-courses are offered once per year. Open to the Columbia community at no cost.
- Johns Hopkins University Data Science Lab | The major educational initiative of the JHUDSL is to create open-source online courses delivered through a range of platforms, including YouTube, GitHub, Leanpub, and Coursera.

Trainings | Research Data Services provides a range of services and programs, and hosts events and workshops to help researchers find, evaluate, understand, steward, and use data.
The Digital Science Center provides a wide range of software to support research and coursework in several science and engineering disciplines. All of the software below is available for use on the computers located in the Science & Engineering Library.

Research Tools and Solutions Supported by Columbia
- LabArchives | Paperless research notebook and lab manual solution for Columbia's researchers.
- GraphPad Prism Discount | Graphing and statistical software for creating publication-quality graphs and analyzing scientific data with t-tests, ANOVA, linear and nonlinear regression, survival analysis.
- SnapGene | Molecular biology software for planning, visualizing, and documenting DNA cloning and PCR; allows feature annotation and primer design.
- ChemDraw | A program for drawing chemical structures. ChemDraw is the drawing tool of choice for chemists to create publication-ready, scientifically intelligent drawings – ChemDraw Activation Code.
- NVivo | NVivo is a software program used for qualitative and mixed-methods research. Specifically, it is used for the analysis of unstructured text, audio, video, and image data, including (but not limited to) interviews, focus groups, surveys, social media, and journal articles – Activation Code.
- CrystalMaker | An efficient way to visualize crystal and molecular structures. Its interactive design lets you "see the wood for the trees" and build your own visual understanding of complex materials – CrystalMaker License.
- Schrodinger PyMol License Access | Molecular visualization system on an open-source foundation, maintained and distributed by Schrödinger – Schrodinger PyMol License Access.
Security and Privacy
- Globus | Secure, efficient and reliable file transfer service for large, non-sensitive data transfers within Columbia and to external collaborators.
- WinSCP | Secure FTP program, recommended by CUIT for file transfers to the cunix.cc.columbia.edu server.
- CUSpider | Windows application for scanning for Personally Identifiable Information (PII) such as Social Security numbers.
- Malwarebytes | Virus and spyware scanning program.
- Remote Access | Remote access to network files and administrative applications on the Columbia network via VPN and Citrix.
Writing
- Overleaf | Collaborative LaTeX editor for writing, editing and producing research papers and project reports (Overleaf Professional license).
- Turnitin | Plagiarism Detection Services.

8 Types of Research Bias and How to Avoid Them? (appinio.com)
Identifying and Avoiding Bias in Research (Pannucci CJ and Wilkins EG, 2011)
The roles, challenges, and merits of the p-value (Chén OY et al., 2023)
Moving beyond P values in The Journal of Physiology: A primer on the value of effect sizes and confidence intervals (Williams S et al., 2023)
Blind analysis: Hide results to seek the truth (MacCoun R and Perlmutter S, 2015)
The fickle P value generates irreproducible results (Halsey LG et al., 2015)

To mitigate the risk of disruption, it is recommended that all principal investigators develop research continuity plans for their laboratories and research teams.

4. Disseminate & Preserve

Sharing and Storing your Research Outcomes

This section addresses the essential processes of publishing and preserving your research outcomes, including data. It offers guidance on manuscript preparation, including checklists, ethical considerations for digital images, templates for organizing data for publication, and choosing the right repository for your data. It also covers copyright and plagiarism, providing resources to help researchers understand and avoid plagiarism while managing citations effectively. Tutorials on citation management software are included to support proper citation practices.

Checklist for manuscript preparation | This outlines the necessary steps and requirements that authors need to fulfill when submitting their work.
Data handling and figure preparation
- Handling Digital Scientific Images: Dos & Don'ts | The course addresses the ethical considerations and challenges of digital image manipulation in scientific research. It covers the importance of using image editing responsibly to enhance clarity without compromising data integrity.
- Community-developed checklists for publishing images and image analyses | These checklists offer authors, readers, and publishers key recommendations for image formatting and annotation, color selection, data availability, and reporting image-analysis workflows.
- Data-to-Figure Map | This template is designed to aid in the organization of raw and manipulated data files as you prepare for publication or presentations and to fulfill requirements for open access policies.

Columbia University Copyright Advisory Services can address issues surrounding the use of scholarly materials by faculty and students in the course of research, teaching, and communicating scholarship.
Understanding What Plagiarism Is and How to Avoid It
GSAS Resources for Plagiarism Education | Columbia's Graduate School of Arts and Sciences website on academic integrity has compiled several resources for plagiarism education.
Indiana University has created comprehensive tutorials and an exam regarding plagiarism. The exam may be a useful risk assessment tool.
ORI's 28 Guidelines at a Glance on Avoiding Plagiarism
Knowing and Avoiding Plagiarism During Scientific Writing (Kumar PM et al., 2014)

Sometimes, plagiarism results from mismanaged or improper citation and source management. Citation management software can help avoid such problems.

Questionnaire to determine the right reporting checklist for your work (EQUATOR library)
Instant feedback for your manuscript (Penelope) - Checks academic manuscripts in Microsoft Word. It assesses structure, declarations, statistics, referencing, and other common reporting errors in seconds.
Principles and Guidelines for Reporting Preclinical Research (NIH)
ARRIVE Guidelines (Animal Research: Reporting of In Vivo Experiments) The ARRIVE guidelines, originally published in PLOS Biology, were developed in consultation with the scientific community as part of an NC3Rs initiative to improve the standard of reporting of research using animals.
Reporting Standards for Research in Psychology (APA)
Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals (ICMJE)
Research Reporting Guidelines and Initiatives: By Organization - This chart lists the major biomedical research reporting guidelines that provide advice for reporting research methods and findings (NIH - NLM)

What Constitutes Authorship? A discussion document for authorship from COPE
NIH Guidelines for Authorship

Predatory journals and publishers often operate under the auspices of open-access publishing. They charge authors fees without reviewing research for quality or providing editorial and publishing services. Below are questions and resources to help you determine if a journal is predatory.

If the journal is open access, is it registered with the Directory of Open Access Journals?
Does the journal list the names of its editorial and advisory boards?
Are the journal's peer review and editorial policies openly available?
Do you recognize the names of current contributors as scholars in your field?
Do you recognize the journal's publisher? Is this information easy to find? Is that publisher a member of COPE (the Committee on Publication Ethics)?
- Resources
  - Columbia University Libraries Scholarly Communication is available to help! Email questions to: [email protected]
  - Think. Check. Submit. Helps researchers identify trusted journals for their research. Through a range of tools and practical resources, this international cross-sector initiative aims to educate researchers, promote integrity, and build trust in credible research and publications.
- Selected Literature
  - “Predatory” vs trustworthy journals: What do they mean for the integrity of science? by Sacha Boucherie
  - Academics and Scientists: Beware of Predatory Journal Publishers from Federal Trade Commission
  - Beware of Predatory Journals by Anders Rydholm
  - How I became Easy Prey to a Predatory Publisher by Alan H. Chambers

In February 2013, the White House Office of Science and Technology Policy (OSTP) issued a memo with the purpose of increasing access to federally funded research. This memo required any Federal agency that awards at least $100 million/year in support of research to develop a plan that would increase public access to publications and data resulting from federally funded projects. In response to this memo, a number of public and private funders have established new requirements for researchers.

Items to Consider in Data Sharing Plan*:

What data will be shared?
Who will have access?
Where will shared data be located?
When will data be shared?
How will the data be located and accessed?

Additional information about the requirements for NIH and NSF is available here:

For more information please visit the Public Access Mandates and Resources page.

*Source: NIH Data Sharing Plan. Individual funders may have different requirements for data sharing plans.

Public Access Mandates and Resources

AHRQ
- Publication Access: PubMed
- Data Access: AHRQ will coordinate upon receipt of funding
- Documentation: Policy
CDC
- Publication Access: PubMed
- Data Access: Repository of choice* (special considerations for sensitive data)
- Documentation: Plan
DoD
- Publication Access: DTIC
- Data Access: Repository of choice*
- Documentation: Policy, Plan
DOE
- Publication Access: PAGES
- Data Access: OpenEI
- Documentation: Plan
DOT
- Publication Access: National Transportation Library (NTL)
- Data Access: Data Repository conformant with DOT Public Access Policy
- Documentation: DOT's What You Need to Know
FDA
- Publication Access: PubMed
- Data Access: Discipline specific repository
- Documentation: Plan, OpenFDA
NASA
- Publication Access: NASA PubSpace
- Data Access: Repository of choice*
- Documentation: Plan, NASA Public Access to Results
NIH
- Publication Access: PubMed
- Data Access: Appropriate NIH repository (e.g., NLM, NCBI, BMIC)
- Documentation: Plan, Public Access Policy, Data Sharing Policy, NIH Sharing Policies and Guidance, CU's Resource Page
- Note: Institutes, centers, and offices may have different data sharing policies
NIAAA
- Publication Access: PubMed
- Data Access: Human subjects data: NIAAA DA
- Documentation: Data Sharing Policy: NOT-AA-19-020
NIMH
- Publication Access: PubMed
- Data Access: NIMH NDA
- Documentation: Data Sharing Policy: NOT-MH-19-033
NIJ
- Publication Access: NACJD
- Data Access: Data Archiving Plans
- Note: Not subject to OSTP memo
NIST
- Publication Access: PubMed
- Data Access: NIST data through data.gov
- Documentation: NIST Plan, NIST Public Access
NOAA
- Publication Access: NOAA Institutional Repository
- Data Access: NOAA Data Catalog
- Documentation: Plan, National Centers for Environmental Information
NSF
- Publication Access: NSF-PAR, Preparing for manuscript deposit
- Data Access: Appropriate repository by research field as described in data management plan*, Dissemination and Sharing of Research Results FAQs
- Documentation: Plan, Public Access, Public Access FAQs
ASPR (The Office of the Assistant Secretary for Preparedness and Response)
- Publication Access: PubMed
- Data Access: Repository of choice as described within data sharing plan*
- Documentation: Plan
USAID
- Publication Access: Development Experience Clearinghouse (DEC)
- Data Access: Development Data Library (DDL)
- Documentation: Plan, Data policy FAQ
USDA
- Publication Access: PubAg
- Data Access: Repository of choice as described within data management plan*, Ag Data Commons Beta Site
- Documentation: Plan
USGS
- Publication Access: USGS Publications Warehouse
- Data Access: USGS accepted data repository
- Documentation: Public Access, USGS Data Management resources
VA
- Publication Access: PubMed
- Data Access: N/A
- Documentation: Public Access
Notes:
- A list of possible data repositories is available through re3data.org
- Instructions for uploading to PubMed are available

US Private Funders

Alfred P. Sloan
- Publication Access: "Information Products" to be disseminated as outlined by "IP Plan"
- Data Access: "Information Products" to be disseminated as outlined by "IP Plan"
- Documentation: Grant Proposal Guidelines
Autism Speaks
- Publication Access: PubMed
- Data Access: Not specified
- Documentation: Policy
Ford Foundation
- Publication Access: CC-BY license
- Data Access: CC-BY license
- Documentation: Terms and Conditions of Use
Bill and Melinda Gates Foundation
- Publication Access: CC-BY license
- Data Access: CC-BY license
- Documentation: Policy
Hewlett Foundation
- Publication Access: CC-BY license
- Data Access: CC-BY license
- Documentation: Press Release, Guiding Principles
Howard Hughes Medical Institute (HHMI)
- Publication Access: PubMed
- Data Access: Data supporting publications to be made available at no cost. Choose an appropriate discipline specific repository (if available).
- Documentation: Policy, Publication Policy
MacArthur Foundation
- Publication Access: CC-BY license
- Data Access: CC-BY license
- Documentation: Policy
Microsoft Research
- Publication Access: Microsoft Research open-access repository
- Data Access: Not specified
- Documentation: Policy
Gordon and Betty Moore Foundation
- Publication Access: Prospective grantees to develop a Data Sharing and/or Intellectual Property Plan
- Data Access: Prospective grantees to develop a Data Sharing and/or Intellectual Property Plan
- Documentation: Policy
World Bank
- Publication Access: Open Knowledge Repository
- Data Access: Not specified
- Documentation: Policy
Note: The above list of private funders is not comprehensive. Additional private funders' open access policies can be found in the Registry of Open Access Repository Mandates and Policies.

US Directives

Open access (OA) is the free and unrestricted availability of digital content online. It can apply to any type of content, including scholarly work, software, audio, video, and more.

Features of open access

Free: OA content is available at no cost to the reader.
Unrestricted: OA content has few restrictions on its use or reproduction.
Digital: OA content is available in a digital format.
Open licenses: OA content often uses open licenses, like Creative Commons licenses, which allow for more reuse and sharing.

Benefits of open access

Increased access: OA provides greater access to information for the general public, students, teachers, and libraries.

Increased visibility: OA increases the visibility of research outputs, which can lead to a greater impact.

Increased transparency: OA makes scientific research more transparent and accessible.

Open Access Policies at Columbia

For more information contact Scholarly Communication & Publishing.

There are a number of ways to maintain and share your data in order to make it available to the scholarly community and the broader public. Check out the links and resources below to find more information about managing and sharing data, and to find a repository that is right for you and your research.

CU-Supported Data Repositories
- A list of the options available to Columbia researchers for storing, sharing, and transferring digital research data. This table is maintained by the ReaDI Program. For more detail regarding the resources listed in the table below, please download the research data storage options PDF. All systems located at Columbia University’s Morningside Heights or Manhattanville Campus that process, transmit and/or store Sensitive Data must be registered with the CU Information Security Office. All Systems located at CUMC (“CUMC Systems”) must be registered with the CUMC Information Security Office. See the Data Security webpage for more information.
- See RSAM User Guide for Registering your Device.
Repositories for Sharing Scientific Data (NIH)
- Browse through this listing of NIH-supported repositories to learn more about some places to share scientific data. Select the link provided in the “Data Submission Policy” column to find data submission instructions for each repository.
- Learn more on how to evaluate and select appropriate data repositories.
Open Domain-Specific Data Sharing Repositories (BioMedical Informatics Coordinating Committee - BMIC)
- Domain-specific repositories are typically limited to data of a certain type or related to a certain discipline.
Generalist Repositories (BioMedical Informatics Coordinating Committee - BMIC)
- Generalist repositories accept data regardless of data type, format, content, or disciplinary focus.
- We currently recommend that researchers use Dryad, which is freely available to all CU researchers. You may also want to refer to the CU Data Repository Finder for other repositories that meet the NIH’s suggested requirements.

5. End of the Line

Ending your Research Journey

The final section addresses the closing stages of a research project. This section ensures that your research is concluded responsibly, with all necessary procedures followed for a clean and ethical project wrap-up.

There are a number of action items that need to be completed before a staff member leaves a research group. Below are some resources to help a PI ensure they obtain the necessary data and protocols before a group member leaves the University, as well as procedures when a PI is vacating laboratory space.

Laboratory Checklist for Departing Group Members | A checklist designed for PIs managing a laboratory-based research group for departing group members, such as graduate students and/or postdocs.
Data Management Departure Guidelines | This document outlines the official data management and departure procedures for Columbia University researchers, detailing responsibilities, approvals, and compliance steps necessary to ensure secure handling, retention, and transfer of research data and materials upon leaving the institution.
Checklist for Radiation Safety Personnel Change | Columbia's EH&S office provides a checklist to ensure a seamless and safe transition among personnel handling radioactive materials.
Procedures for Vacating a Lab Space | Required procedures list for either renovating, relocating or vacating a Columbia University lab space. Clearance will not be issued by EH&S until all these procedures are met.
Transition of Knowledge and Materials | Employees who are leaving should allocate time to review their work-related information and materials, and ensure that these materials are copied or ownership is transferred to their manager, unit administrator, or appropriate coworkers. Potential policies may include (From LabArchives):

Discipline-Specific Resources

Subject Area Resources

This section provides specialized research integrity and data management guidance tailored to different academic fields. Each discipline area includes curated collections of best practices, methodological guidance, data repositories, specialized tools and software, training tutorials, reporting standards, and relevant professional community resources. These resources address the unique research challenges, data types, and methodological considerations specific to each field, helping researchers find the most relevant and applicable guidance for their particular area of study.

Best Practices

Best Practices for Scientific Computing by Greg Wilson, et al.
Tidy Data by Hadley Wickham
A Quick Guide to Organizing Computational Biology Projects by William Stafford Noble

Tools and Resources

Reproducibility in Computational and Experimental Mathematics Lecture Videos from ICERM
Tools for Reproducible Research - a collection of resources from Karl Broman
Scientific workflows for computational reproducibility in the life sciences: Status, challenges and opportunities by Sarah Cohen-Boulakia, et al.
The Digital Science Center provides a wide range of software to support research and coursework in several science and engineering disciplines. All of the software below is available for use on the computers located in the Science & Engineering Library.
List of software and tools either available on the machines in the Research Data Service, or tools available online with the level of support available for each

Data and Sample Repositories

Repository Finder Decision Tree: A pilot project of the Enabling FAIR Data Project, led by the American Geophysical Union (AGU) in partnership with DataCite and the Earth, space and environment sciences community, that can help you find an appropriate repository in which to deposit your research data.
EarthChem: A suite of data systems that assist geoscientists with accessing, sharing, and using geochemical, petrological, and geochronological data.
Geochron: A database system designed to capture complete data and metadata to document geochronologic age estimation, allowing future reuse, recalculation, and integration with other data.
Marine Geology and Geophysics: The Marine Geoscience Data System (MGDS) provides a suite of tools and services for free public access to marine geoscience research data acquired throughout the global oceans and adjoining continental margins.
SESAR: A centralized registry that provides and administers unique identifiers for geoscience samples

Tutorials

Sample Management: Training Module for Rock Outcrop Samples: Presented as a three-part training module for student researchers to learn sample collection, processing, and management protocols, including proper sample documentation, registration, and tracking in a small academic department collaborative research setting.
Sample Management: Training Module for Soil Cores: This module describes the steps for collecting, processing and managing soil cores sampled manually by augering. The purpose is to provide a protocol for collecting, processing and managing this type of physical sample to facilitate future discovery and use, including assigning International Geosample Numbers (IGSN).
Digital Sample Management for SESAR: Tutorials available from System for Earth Sample Registration (SESAR)

Data Management

Data Management Training Clearinghouse: A registry for online learning resources about research data management. It was created in a collaboration between the U.S. Geological Survey's Community for Data Integration, the Earth Sciences Information Partnership (ESIP), and DataONE.

Resources from Special Interest Groups and Communities

COPDESS: The Coalition for Publishing Data in the Earth and Space Sciences connects Earth and space science publishers and data facilities to help translate the aspirations of open, available, and useful data from policy into practice.
DataONE: A community driven project providing access to data across multiple member repositories, supporting enhanced search and discovery of Earth and environmental data. DataONE promotes best practices in data management through responsive educational resources and materials.
EarthCube: Initiated by NSF in 2011 to transform geoscience research by developing cyberinfrastructure to improve access, sharing, visualization, and analysis of all forms of geosciences data and related resources.
ESIP: Earth Science Information Partners (ESIP) is a 501(c)(3) nonprofit, volunteer and community-driven organization that advances the use of Earth science data.
IEDA: IEDA systems serve as primary community data collections for global geochemistry and marine geoscience research and support the preservation, discovery, retrieval, and analysis of a wide range of observational field and analytical data types.
USGS Community for Data Integration: A dynamic community of practice working together to grow USGS knowledge and capacity in scientific data and information management and integration.

Functional MRI

The Secret Lives of Experiments: Methods Reporting in the fMRI Literature by Joshua Carp
Fostering Reproducible fMRI Research (Nature Neuroscience)
Reproducibility of fMRI in the Clinical Setting: Implications for Trial Designs by Rose Bosnell, et al.

Mixed Methods and Qualitative Research

Analysing Qualitative Data by Catherine Pope, Sue Ziebland, and Nicholas Mays
Assessing Quality in Qualitative Research by Nicholas Mays and Catherine Pope
Introductory Social and Behavioral Science Training Material from OBSSR (NIH)
Mixed Methods in Biomedical and Health Services Research by Dr. Leslie A. Curry, et al.
Qualitative and Mixed Methods Provide Unique Contributions to Outcomes Research by Dr. Leslie A. Curry, Dr. Ingrid M. Nembhard, Dr. Elizabeth H. Bradley
Using Qualitative Methods in Health Related Action Research by Julienne Meyer

Patient-Centered Outcomes Research and Observational Studies

Developing a Protocol for Observational Comparative Effectiveness Research: A User's Guide Edited by Priscilla Velentgas, Nancy A. Dreyer, Parivash Nourjah, Scott R. Smith, and Marion M. Torchia
Systematic Review and Evidence Integration for Literature-Based Environmental Health Science Assessments by Andrew A. Rooney, Abee L. Boyles, Mary S. Wolfe, John R. Bucher, and Kristina A. Thayer
Empirical confidence interval calibration for population-level effect estimation studies in observational healthcare data by Martijn J. Schuemie, George Hripcsak, Patrick B. Ryan, David Madigan, and Marc A. Suchard

Clinical Trial Design Learning Resources

Review of Recent Methodological Developments in Group-Randomized Trials: Part 1-Design by Elizabeth L. Turner, Fan Li, John A. Gallis, Melanie Prague and David M. Murray
Review of Recent Methodological Developments in Group-Randomized Trials: Part 2-Analysis by Elizabeth L. Turner, Melanie Prague, John A. Gallis, Fan Li, and David M. Murray
Clinical Study Design Checklist from SPIRIT
Improved Designs for Cluster Randomized Trials by Catherine M. Crespi
Reporting and methodological quality of sample size calculations in cluster randomized trials could be improved: a review by Clare Rutterford, Monica Taljaard, Stephanie Dixon, Andrew Copas, and Sandra Eldridge
Are missing data adequately handled in cluster randomised trials? A systematic review and guidelines by Karla Díaz-Ordaz, Michael G Kenward, Abie Cohen, Claire L Coleman, and Sandra Eldridge
Sample size calculations for the design of cluster randomized trials: A summary of methodology by Fei Gao, Arul Earnest, David B. Matchar, Michael J. Campbell, David Machin
GRT Sample Size Calculator
IRGT Sample Size Calculator
Online Courses
- Introduction to the Principles and Practice of Clinical Research (IPPCR). The NIH Clinical Center's Introduction to the Principles and Practice of Clinical Research (IPPCR) course trains registrants on how to effectively and safely conduct clinical research. The course focuses on the spectrum of clinical research and the research process by highlighting biostatistical and epidemiologic methods, study design, protocol preparation, patient monitoring, quality assurance, ethical and legal issues, and much more.
- Principles of Clinical Pharmacology. This course is an online lecture series covering the fundamentals of clinical pharmacology as a translational scientific discipline focused on rational drug development and utilization in therapeutics. The course focuses on the following core principles of pharmacology: pharmacokinetics; drug metabolism and transport; drug therapy in special populations; assessment of drug effects; drug discovery and development; pharmacogenomics and pharmacotherapy.
- Other clinical training opportunities are offered by NIH's Office of Clinical Research

Clinical Trial Protocol Development

NIH e-protocol writing tool: The electronic protocol writing tool aims to facilitate the development of protocols based on the Phase 2 and 3 IND/IDE Clinical Trial Protocol Template as well as the template for Behavioral and Social Sciences Research Involving Humans. The tool has been developed through the National Institutes of Health (NIH) Office of Science Policy.
SPIRIT Group resources: An international group of stakeholders working to improve the completeness and quality of trial protocols. (text adapted from website)

Reporting Guidelines

Questionnaire to determine the right reporting checklist for your work from the EQUATOR library
Instant feedback for your manuscript from Penelope: This tool checks academic manuscripts written in Microsoft Word. In seconds, it assesses structure, declarations, statistics, referencing and other common reporting errors.
Research Reporting Guidelines and Initiatives by Organization - This chart lists the major biomedical research reporting guidelines that provide advice for reporting research methods and findings (from NIH - NLM)

Retrospective Chart Review

Compilation of suggested practices for creation of a retrospective chart review form
The Retrospective Chart Review: Important Methodological Considerations by Matt Vassar and Matthew Holzmann

Scientific Integrity

A Multidisciplinary Approach to Ensure Scientific Integrity in Clinical Research by Dr. Ko Bando, et al.

Simulation-Based Research

The INSPIRE network has collaborated with global partners (including four influential journals: Simulation in Healthcare, BMJ Simulation, Clinical Simulation in Nursing, and Advances in Simulation) to develop extensions specific to simulation-based research for both the CONSORT and STROBE statements.

Statistics

Statistics in Medicine publishes statistical tutorials
The Biostatistics in Action Lectures YouTube Channel
Strategies for Integration of Gender and Sex in Research
- Essential metrics for assessing sex & gender integration in health research proposals involving human participants by Suzanne Day, et al.
- Sex and Gender Equity in Research: rationale for the SAGER guidelines and recommended use by Shirin Heidari, et al.

Systematic Reviews

Covidence - The cloud-based software supports import and de-duplication of citations, title, abstract and full-text screening, risk-of-bias assessment and data extraction. Columbia researchers can sign in for free with their UNI (provided by Augustus C. Long Health Sciences Library)
How to Do a Systematic Review: A Best Practice Guide for Conducting and Reporting Narrative Reviews, Meta-Analyses, and Meta-Syntheses by Siddaway AP, Wood AM, and Hedges LV
Clarifying differences between review designs and methods by David Gough, James Thomas & Sandy Oliver
What is a Systematic Review? from Curtin University

All About Generative AI

Generative artificial intelligence tools are rapidly reshaping how research is conducted, written, and reviewed, creating both new opportunities and new responsibilities for the scholarly community. As major funders such as NIH and NSF, and leading journals including Nature, Science, and Cell, have begun implementing formal policies on AI use, researchers must navigate an evolving landscape that consistently prioritizes transparency, human accountability, and intellectual originality. The resources and policies collected here are intended to help researchers understand current expectations and make informed, ethical decisions about when and how to use generative AI in their work.

Columbia University Generative AI Policy: Guidance for staff, faculty, students, and researchers on the reasonable use of generative AI. Please note that this policy is a “work in progress” as the technology, the law, and the Columbia community's usage evolve.

This landscape remains in flux, so checking each organization's current guidance before submission is always advisable.
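
The policy entries that follow share a fixed set of labeled fields (SECTION, Organization, Scope, Key Policy, Cited Passage, Effective Date, Source URL), so they can be loaded into a short script for filtering, for example to list only the publishers. The sketch below is illustrative rather than an official tool: it assumes entries are separated by blank lines and that each line has the form "Label: value". The function name parse_policy_records and the abbreviated sample text are hypothetical and exist only for this example.

```python
from typing import Dict, List

# Field labels used by the policy entries on this page.
FIELDS = ("SECTION", "Organization", "Scope", "Key Policy",
          "Cited Passage", "Effective Date", "Source URL")


def parse_policy_records(text: str) -> List[Dict[str, str]]:
    """Turn blank-line-separated 'Label: value' blocks into dictionaries."""
    records = []
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            # Split on the first ": " only, so URLs and quoted passages
            # keep any later colons intact.
            label, sep, value = line.partition(": ")
            if sep and label in FIELDS:
                record[label] = value.strip()
        if record:
            records.append(record)
    return records


# Abbreviated sample (some fields omitted) in the same layout as the entries.
SAMPLE = """\
SECTION: GOVERNMENT
Organization: NSF
Scope: Merit review process
Effective Date: Dec. 2023
Source URL: https://www.nsf.gov/news/notice-to-the-research-community-on-ai

SECTION: PUBLISHERS
Organization: IEEE
Scope: Journal manuscripts & peer review
Effective Date: 2023 (ongoing updates)
"""

records = parse_policy_records(SAMPLE)
publishers = [r["Organization"] for r in records if r.get("SECTION") == "PUBLISHERS"]
print(publishers)
```

A filtered list like this is only a convenience for navigating the entries; the linked source for each organization remains the authority on its current policy.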

SECTION: GOVERNMENT
Organization: NIH
Scope: Grant applications
Key Policy: Applications substantially developed by AI are not considered original; 6-application annual cap per PI (full cap in effect Jan. 1 2026); enforced via AI-detection software.
Cited Passage: NIH will not consider applications that are either substantially developed by AI, or contain sections substantially developed by AI, to be original ideas of applicants. — NOT-OD-25-132
Effective Date: Sept. 25 2025 (full cap Jan. 1 2026)
Source URL: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-25-132.html

SECTION: GOVERNMENT
Organization: NIH
Scope: Post-award enforcement
Key Policy: Post-award detection of AI use may result in grant suspension, cost disallowances, termination, and referral to the Office of Research Integrity.
Cited Passage: NIH will not consider applications that are either substantially developed by AI, or contain sections substantially developed by AI, to be original ideas of applicants. If the detection of AI is identified post award, NIH may refer the matter to the Office of Research Integrity to determine whether there is research misconduct while simultaneously taking enforcement actions including but not limited to disallowing costs, withholding future awards, wholly or in part suspending the grant, and possible termination. — NOT-OD-25-132
Effective Date: Sept. 25 2025
Source URL: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-25-132.html

SECTION: GOVERNMENT
Organization: NSF
Scope: Merit review process
Key Policy: Reviewers barred from uploading proposals to public AI tools (treated as a confidentiality breach); proposers encouraged to disclose AI use in project descriptions.
Cited Passage: NSF reviewers are prohibited from uploading any content from proposals, review information and related records to non-approved generative AI tools. Proposers are encouraged to indicate in the project description the extent to which, if any, generative AI technology was used and how it was used to develop their proposal. — NSF Notice Dec. 2023
Effective Date: Dec. 2023
Source URL: https://www.nsf.gov/news/notice-to-the-research-community-on-ai

SECTION: GOVERNMENT
Organization: NSF (PAPPG 24-1 Supp. 1)
Scope: Proposals & awards
Key Policy: AI-assisted proposal preparation must be disclosed; AI tool use in fabrication, falsification, or plagiarism is explicitly defined as research misconduct.
Cited Passage: RESEARCH MISCONDUCT means fabrication, falsification, or plagiarism, whether committed by an individual directly or through the use or assistance of other persons, entities, or tools, including artificial intelligence (AI)-based tools, in proposing or performing research funded by NSF, reviewing research proposals submitted to NSF, or in reporting research results funded by NSF. — PAPPG 24-1 Supp. 1 Chapter XII.C
Effective Date: Dec. 8 2025
Source URL: https://www.nsf.gov/policies/document/pappg24-1-supplement-1

SECTION: PUBLISHERS
Organization: Nature Portfolio (Springer Nature)
Scope: Journal manuscripts & peer review
Key Policy: No AI authorship; LLM use documented in Methods section; AI copy-editing exempt from disclosure; AI-generated images banned; reviewers must not upload manuscripts to AI tools.
Cited Passage: Large Language Models (LLMs), such as ChatGPT, do not currently satisfy our authorship criteria. Notably an attribution of authorship carries with it accountability for the work, which cannot be effectively applied to LLMs. Use of an LLM should be properly documented in the Methods section… The use of an LLM (or other AI-tool) for 'AI assisted copy editing' purposes does not need to be declared. — Nature Portfolio Editorial Policies
Effective Date: 2023 (updated 2025)
Source URL: https://www.nature.com/nature-portfolio/editorial-policies/ai

SECTION: PUBLISHERS
Organization: Science / AAAS
Scope: Journal manuscripts
Key Policy: No AI authorship; full disclosure of prompt, tool, and version required in cover letter, acknowledgments, and methods; AI-generated images banned without explicit editor permission.
Cited Passage: Authors who use AI-assisted technologies as components of their research study or as aids in the writing or presentation of the manuscript should note this in the cover letter and in the acknowledgments section of the manuscript. The full prompt used in the production of the work, as well as the AI tool and its version, should be disclosed. Editors may decline to move forward with manuscripts. — Science/AAAS Policy
Effective Date: 2023 (updated)
Source URL: https://www.science.org/content/blog-post/change-policy-use-generative-ai-and-large-language-models

SECTION: PUBLISHERS
Organization: Elsevier
Scope: Journal manuscripts & peer review
Key Policy: No AI authorship; AI writing use declared in a separate AI declaration statement; AI-generated images banned; editors and reviewers prohibited from using AI on submitted manuscripts. Policy updated September 2025.
Cited Passage: Authors preparing a manuscript for an Elsevier journal can use AI Tools to support them. However, these tools must never be used as a substitute for human critical thinking, expertise and evaluation… Ultimately, authors are responsible and accountable for the contents of their work. — Elsevier Generative AI Policies for Journals (updated Sept. 2025)
Effective Date: 2023 (updated Sept. 2025)
Source URL: https://www.elsevier.com/about/policies-and-standards/generative-ai-policies-for-journals

SECTION: PUBLISHERS
Organization: ACS Publications
Scope: Journal manuscripts & peer review
Key Policy: All AI tool use disclosed in Acknowledgments; extensive AI use (e.g., generating literature reviews) may result in manuscript rejection; reviewers barred from sharing manuscripts with AI tools.
Cited Passage: The use of AI tools for text or image generation should be disclosed in the manuscript within the Acknowledgment section with a description of when and how the tools were used. The editor may, at their discretion, determine that the AI use in a given submission is too extensive… This determination may result in manuscript rejection or a request for revision to remove or reduce AI-generated portions of the manuscript. — ACS AI Policy (last updated Dec. 13 2024)
Effective Date: Ongoing (updated Dec. 13 2024)
Source URL: https://researcher-resources.acs.org/publish/aipolicy

SECTION: PUBLISHERS
Organization: Wiley
Scope: Journal manuscripts & peer review
Key Policy: No AI authorship; AI use described transparently in Methods or Acknowledgements; authors must review AI tool terms for intellectual-property issues; reviewers prohibited from uploading manuscripts to AI tools.
Cited Passage: If an author has used AI Technology to develop any portion of a manuscript, its use must be described, transparently and in detail, in the Methods section (or via a disclosure or within the Acknowledgements section, as applicable). The author is fully responsible for the accuracy of any information provided by the tool… The final decision about whether use of a GenAI tool is appropriate or permissible in the circumstances of a submitted manuscript or a published article lies with the journal's editor. — Wiley Best Practice Guidelines on Research Integrity and Publishing Ethics (updated March 3 2025)
Effective Date: 2020 (updated March 3 2025)
Source URL: https://authorservices.wiley.com/ethics-guidelines/copyright-and-intellectual-property.html

SECTION: PUBLISHERS
Organization: Taylor & Francis
Scope: Journal manuscripts
Key Policy: No AI authorship; all AI tool use must be acknowledged including full name and purpose; AI-generated images prohibited; grammar and spelling tools exempt from disclosure.
Cited Passage: You must not list AI tools as a co-author of your article… You must clearly acknowledge within your article use of Generative AI tools. Please add a statement in the Methods or Acknowledgments section which includes: the full name of the tool used (with version number), how it was used, and the reason for use. — Taylor & Francis Authorship Policy
Effective Date: 2023 (ongoing updates)
Source URL: https://taylorandfrancis.com/our-policies/ai-policy/

SECTION: PUBLISHERS
Organization: SAGE Publishing
Scope: Journal manuscripts & peer review
Key Policy: No AI authorship; distinguishes between assistive AI (no disclosure required) and generative AI (must be disclosed in Methods or Acknowledgements); editors and reviewers prohibited from using AI to generate decision letters or review reports.
Cited Passage: The use of AI tools that can produce content such as generating references, text, images or any other form of content must be disclosed when used by authors or reviewers. Authors should cite original sources, rather than Generative AI tools as primary sources within the references. — SAGE Artificial Intelligence Policy
Effective Date: 2023 (ongoing updates)
Source URL: https://www.sagepub.com/journals/publication-ethics-policies/artificial-intelligence-policy

SECTION: PUBLISHERS
Organization: IEEE
Scope: Journal manuscripts & peer review
Key Policy: All AI-generated content (text, figures, images, code) must be disclosed in Acknowledgments including the AI system used and sections affected; reviewers prohibited from processing manuscripts through public AI platforms.
Cited Passage: The use of content generated by artificial intelligence (AI) in an article (including but not limited to text, figures, images, and code) shall be disclosed in the acknowledgments section of any article submitted to an IEEE publication. The AI system used shall be identified, and specific sections of the article that use AI-generated content shall be identified and accompanied by a brief explanation. — IEEE Submission and Peer Review Policies
Effective Date: 2023 (ongoing updates)
Source URL: https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/submission-and-peer-review-policies/

SECTION: PUBLISHERS
Organization: Cambridge University Press
Scope: Journal manuscripts
Key Policy: No AI authorship; all AI use must be declared and clearly explained; authors accountable for accuracy and integrity; AI-generated images require copyright clearance; individual journals may impose stricter requirements.
Cited Passage: AI does not meet the Cambridge requirements for authorship, given the need for accountability. AI and LLM tools may not be listed as an author on any scholarly work published by Cambridge. Authors are accountable for the accuracy, integrity and originality of their research papers, including for any use of AI. — Cambridge University Press Publishing Ethics Policy
Effective Date: 2023 (ongoing updates)
Source URL: https://www.cambridge.org/core/services/publishing-ethics/authorship-and-contributorship-journals

SECTION: PUBLISHERS
Organization: APA (American Psychological Association)
Scope: Journal manuscripts
Key Policy: No AI authorship; AI use disclosed in Methods section and cited using software citation template; full AI output must be uploaded as supplemental material.
Cited Passage: When a generative AI model is used in the drafting of a manuscript for an APA publication, the use of AI must be disclosed in the methods section and cited. AI cannot be named as an author on an APA scholarly publication. — APA Publishing Policies
Effective Date: 2023 (ongoing updates)
Source URL: https://www.apa.org/pubs/journals/resources/publishing-tips/generative-ai-policy

SECTION: PUBLISHERS
Organization: ACM (Association for Computing Machinery)
Scope: Journal manuscripts & conferences
Key Policy: AI-generated content permitted but must be fully disclosed in Acknowledgments; disclosure level should match proportion of AI-generated content; basic word processing tools exempt; no AI authorship.
Cited Passage: Generative AI tools and technologies, such as ChatGPT, may not be listed as authors of an ACM published Work. The use of generative AI tools and technologies to create content is permitted but must be fully disclosed in the Work… The level of disclosure should be commensurate with the proportion of new text or content generated by these tools. — ACM Policy on Authorship
Effective Date: 2023 (ongoing updates)
Source URL: https://www.acm.org/publications/policies/new-acm-policy-on-authorship

SECTION: PUBLISHERS
Organization: Cell Press
Scope: Journal manuscripts & figures
Key Policy: No AI authorship; AI-generated figures must be clearly labeled in figure legends; AI-generated images prohibited in figures representing primary experimental data; AI use restricted to readability improvements with standardized disclosure templates.
Cited Passage: Cell Press requires that AI-generated figures be clearly labeled as such in the figure legend and prohibits the use of AI-generated images in any figure that purports to represent primary experimental data. — Cell Press Editorial Policies
Effective Date: 2024 (updated 2025)
Source URL: https://www.cell.com/pb-assets/journals/research/cell/editorial-policies/cell-editorial-policies.pdf

SECTION: PUBLISHERS
Organization: PLOS (Public Library of Science)
Scope: Journal manuscripts & peer review
Key Policy: Full disclosure required for any AI use in submission content; reviewers and editors must not upload manuscripts to AI platforms; AI-generated fabrications or data misrepresentation treated as research misconduct.
Cited Passage: In cases where Large Language Model (LLM) AI tools or technologies contribute to generating text content for a PLOS submission, the article's authors are responsible for ensuring that all statements in the article… represent the authors' own ideas. The use of AI tools and technologies to fabricate or otherwise misrepresent primary research data is unacceptable. Noncompliance… will be considered misrepresentation of methods, contributions, and/or results. — PLOS Ethical Publishing Practice
Effective Date: 2023 (ongoing updates)
Source URL: https://journals.plos.org/plosone/s/ethical-publishing-practice

SECTION: PUBLISHERS
Organization: ICMJE (International Committee of Medical Journal Editors)
Scope: Cross-journal medical publishing ethics body
Key Policy: No AI authorship or citation of AI as a primary source; AI use disclosed in cover letter and manuscript; authors responsible for ensuring no plagiarism in AI-generated content; nondisclosure may constitute misconduct.
Cited Passage: Chatbots (such as ChatGPT) and other AI-assisted tools should not be listed as authors because they cannot be responsible for the accuracy, integrity, and originality of the work… Referencing AI-generated material as the primary source is not acceptable. Nondisclosure of AI use may require corrective action and may be construed as misconduct in some circumstances. — ICMJE Recommendations Section V.A (Use of AI by Authors)
Effective Date: Current (updated 2025)
Source URL: https://www.icmje.org/recommendations/browse/artificial-intelligence/ai-use-by-authors.html

SECTION: PUBLISHERS
Organization: COPE (Committee on Publication Ethics)
Scope: Cross-publisher ethics body
Key Policy: AI tools cannot be listed as authors; human authors bear full responsibility for all content, accuracy, and integrity; member publishers expected to implement compliant AI policies.
Cited Passage: The use of artificial intelligence (AI) tools such as ChatGPT or Large Language Models in research publications is expanding rapidly. COPE joins organisations, such as WAME and the JAMA Network among others, to state that AI tools cannot be listed as an author of a paper. — COPE Position on Authorship & AI (2024)
Effective Date: 2023–2024
Source URL: https://doi.org/10.24318/cCVRZBms

Updated on March 23, 2026

For researchers, preserving the credibility of science now demands more than simply disclosing when AI tools have been used; it calls for a thoughtful reckoning with how collaborating with AI shifts questions of responsibility, oversight, and scholarly norms. To help navigate these challenges, we highlight five key principles — drawn from Jamieson, Kearney, and Mazza (2024) — for mitigating the risks of scientific misconduct when using generative AI.

1. Transparent Disclosure and Attribution: Scientists should clearly disclose the use of generative AI in research, including the specific tools, algorithms, and settings employed. Human and AI contributions must be accurately distinguished, and prior literature should be properly cited even when AI omits those citations. Model creators should publish details about their models and training data and maintain long-term archives to enable replication. (See: McNutt et al., "Transparency in Authors' Contributions and Responsibilities to Promote Integrity in Scientific Publication," PNAS, 2018)

2. Verification of AI-Generated Content and Analyses: Scientists are accountable for the accuracy of data, imagery, and inferences drawn from generative models. This requires using appropriate methods to validate AI-assisted findings, disclosing supporting evidence, and monitoring for biases in AI output that could skew research outcomes. Model creators should disclose limitations and provide well-calibrated confidence assessments. (See: Fostering Responsible Computing Research: Foundations and Practices, NASEM, 2022)

3. Documentation of AI-Generated Data: All AI-generated or synthetic data, inferences, and imagery must be marked with provenance information so they are not mistaken for real-world observations. Model creators should annotate synthetic data used in training and monitor issues arising from the reuse of computer-generated content in future models. (See: Reproducibility and Replicability in Science, NASEM, 2019)

4. A Focus on Ethics and Equity: AI use should produce scientifically sound and socially beneficial results while mitigating risk of harm. Scientists and model creators should adhere to ethical guidelines around attribution, intellectual property, privacy, and consent, and promote equitable access to AI tools — particularly for underserved communities. AI should not be used without careful human oversight in peer review or funding decisions. (See: London, "A Justice-Led Approach to AI Innovation," Issues in Science and Technology, 2024; Parthasarathy & Katzman, "Bringing Communities In, Achieving AI for All," Issues in Science and Technology, 2024)

5. Continuous Monitoring, Oversight, and Public Engagement: Scientists, together with academia, industry, government, and civil society, should continuously evaluate AI's impact on the scientific process and adapt strategies as technologies evolve. Research communities must anticipate harmful uses, harness AI's societal potential, and solicit meaningful public participation in governance. (See: Gasser, "Governing AI with Intelligence," Issues in Science and Technology, 2024; Aidinoff & Kaiser, "Novel Technologies and the Choices We Make," Issues in Science and Technology, 2024)
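
Principles 1 and 3 above (disclosing tools and settings, and marking AI-generated material with provenance information) can be made concrete with a small machine-readable record stored next to each generated artifact. The following is a minimal sketch under stated assumptions, not a format required by Columbia, Jamieson et al., or any publisher: the `AIProvenance` class, its field names, and the helper functions are all hypothetical and chosen only for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class AIProvenance:
    """Hypothetical provenance record for one AI-generated artifact."""
    tool: str             # name of the generative AI tool used
    version: str          # model or tool version, as reported by the provider
    settings: dict        # configurable settings used (e.g. temperature)
    prompt: str           # prompt that produced the artifact
    sha256: str           # checksum tying this record to the exact content
    created_utc: str      # when the record was written (UTC, ISO 8601)
    synthetic: bool = True  # marks the artifact as not a real-world observation


def make_record(content: bytes, tool: str, version: str,
                settings: dict, prompt: str) -> AIProvenance:
    """Build a provenance record for the given generated content."""
    return AIProvenance(
        tool=tool,
        version=version,
        settings=settings,
        prompt=prompt,
        sha256=hashlib.sha256(content).hexdigest(),
        created_utc=datetime.now(timezone.utc).isoformat(),
    )


def to_sidecar_json(record: AIProvenance) -> str:
    """Serialize the record so it can be saved alongside the artifact."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

A lab might call `make_record(...)` whenever generated text, data, or imagery is saved, and write the output of `to_sidecar_json(...)` to a file next to it. The checksum lets a later reader confirm that the record describes the exact file in hand, and the `synthetic` flag keeps generated material from being mistaken for observed data.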

Additional Resources:

Explore a growing Columbia collection of resources on the responsible use of generative AI.

Teaching and Learning in the Age of AI: Thinking about the role of AI in your courses? Explore the following pedagogical resources and join us for workshops and events for strategies and perspectives on teaching and learning with generative AI.
AI Community of Practice: The community is a platform for learning, discussion, and application of AI principles across various fields of study at Columbia University. We aim to demystify AI, spur innovation, and approach challenges with a fresh, AI-centric perspective through regular meetings, workshops, and collaborative projects. To learn about joining, submit the interest intake form.
AI Services: CUIT is developing a suite of AI services designed to open new modes of discovery in interdisciplinary research and to enhance productivity. The services in development include advanced audio transcription, text anonymization, and automated text mining. The aim of this initiative is to make cutting-edge advances in AI and LLMs as accessible as possible to the community at Columbia.

Research Integrity

Research misconduct can occur at any stage of the research lifecycle, from proposing a study to reporting the results. Columbia University is committed to upholding the highest standards of integrity at every stage, from initial proposal and design through final publication and beyond. To this end, the University has established policies and procedures that define research misconduct, outline how allegations are investigated, and detail the consequences of misconduct.

1. Proposal Development and Design

Responsibilities
- Researchers are expected to develop protocols, methodologies, and grant proposals with accuracy and honesty.
Potential Risks
- Misrepresenting data or objectives, or plagiarizing background literature, in funding applications.
Preventing Misconduct
- The Office of Research Compliance and Training offers guidance on proper proposal practices and can clarify questions regarding research ethics.

2. Data Collection and Management

Responsibilities
- Ensure data collection methods are transparent, reproducible, and accurately recorded.
Potential Risks
- Fabrication (making up results) or falsification (altering data or results).
Preventing Misconduct
- Proper recordkeeping and secure data storage are integral. The Standing Committee on the Conduct of Research, in partnership with the Office of Research Compliance and Training, helps investigators implement best practices.

3. Analysis and Interpretation

Responsibilities
- Conduct unbiased analyses and interpret results responsibly.
Potential Risks
- Manipulating results, or reporting them selectively, in ways that skew findings.
Preventing Misconduct
- Researchers should adhere to rigorous scientific standards and consult with colleagues or the Office of Research Compliance and Training if questions arise.

4. Publication, Peer Review, and Reporting

Responsibilities
- Accurately present findings in manuscripts, conference presentations, and peer reviews.
Potential Risks
- Plagiarism of text or ideas and omission of critical information in publications.
Preventing Misconduct
- Clear citation practices, transparent presentation of data, and ethical peer-review processes help ensure the integrity of dissemination.

5. Post-Publication Oversight and Follow-Up

Responsibilities
- Address post-publication comments, correct any errors promptly, and preserve relevant data for future reference.
Potential Risks
- Failing to correct known inaccuracies or retaliating against whistleblowers.
Preventing Misconduct
- Institutional checks—such as the University’s Standing Committee on the Conduct of Research—support corrections and follow-up inquiries to maintain credibility and public trust in research.

Consequences of Misconduct

Research misconduct may lead to institutional sanctions, such as termination of grants or disciplinary action, and can result in federal penalties. By articulating a clear definition of misconduct and an established investigation process, Columbia reinforces accountability and maintains a culture of ethical, high-quality research.

If you have concerns or questions at any point in the research lifecycle, please contact the Office of Research Compliance and Training or consult the Institutional Policy on Misconduct in Research for further guidance.

Featured Resource

The Lab Data Management Plan (LDMP)

Good Research Data Management (GRDM) is a comprehensive process encompassing the collection, validation, storage, protection, sharing, and processing of data. It is crucial for ensuring the integrity, accessibility, and reliability of data. By adequately documenting and managing data, researchers can increase the reproducibility of their work, which helps validate their results and enhances their research impact. GRDM encourages sharing raw datasets, which can spur new discoveries and offers a valuable resource for less-funded researchers. GRDM can also prevent future problems (e.g., data loss, inaccurate data, or difficulty retrieving data), saving researchers time and money. Proper preservation in a data repository supports data longevity, safeguarding the researcher's contributions for future reference. As of 2023, agencies such as NIH and NSF require formal data management plans as part of the funding application. Additionally, many academic journals require authors to provide the raw research data supporting published articles, further highlighting the significance of GRDM.

We’ve developed a new guide to a comprehensive Lab Data Management Plan (LDMP) to help research teams proactively implement robust, everyday data practices. The LDMP is a structured framework designed to help research teams systematically organize, store, and manage their data throughout the research lifecycle. It outlines key practices such as data documentation, version control, storage and backup, team roles, and protocols for staff transitions—ensuring data is handled responsibly, remains reproducible, and complies with institutional and funding requirements.

This framework builds on best practice guides and provides a structured, practical approach to Good Research Data Management (GRDM), with the goal of reducing data mismanagement and strengthening research quality and integrity across labs.
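
One everyday practice that fits under the LDMP's storage and backup heading is a checksum manifest, which lets a team confirm that a backup copy still matches the original data. The sketch below is only an illustration under stated assumptions: the LDMP guide is not known to prescribe this approach, and the function names (`sha256_of`, `build_manifest`, `compare`) are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare(original: dict[str, str], backup: dict[str, str]) -> dict[str, list[str]]:
    """Report files that are missing from, changed in, or extra in the backup."""
    shared = original.keys() & backup.keys()
    return {
        "missing": sorted(set(original) - set(backup)),
        "changed": sorted(k for k in shared if original[k] != backup[k]),
        "extra": sorted(set(backup) - set(original)),
    }
```

Running `build_manifest` on the working data folder and on its backup, then passing both results to `compare`, gives a short report that can be reviewed on whatever schedule the lab's plan sets. Relative paths are used as keys so the two copies can live in different locations.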

Click here to start implementing your LDMP today!

We Want to Hear From You! The ReaDI Program is committed to bringing the most relevant and useful resources to the Columbia research community. Your feedback on these resources helps strengthen the ReaDI Program's repository. Fill out the resource feedback form.

If you have any questions or suggestions about the ReaDI Program, please email us at [email protected].

About

How to Navigate

1. Get Started

Beginning your Research Journey

Establishing a Culture of Rigor and Reproducibility

Training Staff on Rigor and Reproducibility

Writing Standard Operating Procedures (SOP), Protocols, and a Lab Data Management Plan

Streamlining Your Lab: Essential Organizational Tools and Templates

2. Propose & Plan

Preparing for all Aspects of your Research Project

Developing Sound Research Questions

Addressing Research Rigor and Transparency

Designing Rigorous Experiments

Planning a Data Management Strategy

3. Execute

Collecting and Analyzing Data

Managing Protocols

Collecting and Organizing Research Data

Managing Notebooks, Data and Code

Managing Research Samples and Reagents

Analyzing and Interpreting Data

Computational Research

Software and Tools

Avoiding Bias and Blinders

Planning for the Unexpected

4. Disseminate & Preserve

Sharing and Storing your Research Outcomes

Manuscript Preparation

Copyright and Plagiarism

Manage Sources and Citations Systematically

Reporting Guidelines

Authorship

Predatory Journals

Public Access

Public Access Mandates and Resources

US Private Funders

US Directives

Open Access Publishing

Features of open access

Benefits of open access

Open Access Policies at Columbia

Sharing Data and Finding the Right Repository

5. End of the Line

Ending your Research Journey

Offboarding of Staff

Discipline-Specific Resources

Subject Area Resources

Computational Research

Best Practices

Tools and Resources

Earth & Environmental Sciences

Data and Sample Repositories

Tutorials

Data Management

Resources from Special Interest Groups and Communities

Clinical and Health Sciences

Functional MRI

Mixed Methods and Qualitative Research

Patient-Centered Outcomes Research and Observational Studies

Clinical Trial Design Learning Resources

Clinical Trial Protocol Development

Reporting Guidelines

Retrospective Chart Review

Scientific Integrity

Simulation-Based Research

Statistics

Systematic Reviews

All About Generative AI

Generative AI Policies: NIH, NSF & Major Journals

This landscape remains in flux, so checking each organization's current guidance before submission is always advisable.

Gen AI and Research Integrity

Research Integrity

Research Integrity Across the Research Lifecycle

1. Proposal Development and Design

2. Data Collection and Management

3. Analysis and Interpretation