Statistical Analysis

The misuse of statistical analyses can cause irreproducible and misleading results (1). These resources have been selected to help researchers better understand the importance of choosing appropriate statistical analyses and are not intended to replace formal statistical training or consultation services.

1. "Weak statistical standards implicated in scientific irreproducibility" by Erika Check Hayden and "Scientific method: Statistical errors" by Regina Nuzzo


Guidelines, Literature and Blogs

  • The Interactive Statistical Pages project represents an ongoing effort to develop and disseminate statistical analysis software in the form of web pages.
    Utilizing HTML forms, CGI and Perl scripts, Java, JavaScript and other browser-based technologies, each web page contains within it (or invokes) all the programming needed to perform a particular computation or analysis.
  • Statistical Modeling, Causal Inference, and Social Science: a blog from Andrew Gelman, a professor of statistics and political science and director of the Applied Statistics Center at Columbia University.
  • Unbiased Research: Statistical Design and Analysis of Experiments. A blog written mostly by biomedical PhD students at Emory University who are taking a course on "Statistical Design and Analysis of Experiments."
  • StatsBlogs: syndicates posts from statistics-related blogs and brings traffic and user interaction to contributing blogs. It is a service by Talk Stats Forum, the #1 statistics forum with >10k visitors daily. We are dedicated to facilitating information sharing and exchange in the statistics community.
  • Simply Statistics: We are three biostatistics professors (Jeff Leek, Roger Peng, and Rafa Irizarry) who are fired up about the new era where data are abundant and statisticians are scientists.

Consulting Services, Resources and Tutorials

The services below are available to Columbia researchers and range from no-cost to fee-for-service.

The Biostatistics, Epidemiology and Research Design Resource (BERD) provides a wide range of design, statistical, and analytical support services to assist CUMC faculty members in garnering grant support and publishing study results. In conjunction with the Department of Biostatistics of the Mailman School of Public Health, BERD provides support through consultations and educational initiatives.

Biostatistics Resource in Design, Grants and Evaluation (BRIDGE) 

  • The consulting service is available at no cost to CUMC faculty for projects that require 1-2 sessions
  • Fee-for-service consulting is available for work that can be completed during the consulting service session
  • Collaborations with faculty members in the Department of Biostatistics are available if a project requires a longer-term statistical consulting relationship

Department of Statistics Consulting Services
The Department of Statistics offers free statistical consulting to the Columbia community.  Consulting is available by appointment only.

Statistical Analysis Center
The Statistical Analysis Center (SAC), at Columbia University’s Mailman School of Public Health, is an experienced team of experts dedicated to providing state-of-the-art statistical, data, logistical and regulatory support for clinical research. These services are available to anyone conducting clinical experiments and randomized clinical trials.

Center for Open Science (COS)

COS, which develops the Open Science Framework (OSF), has created a series of online workshops as part of its statistical and methodological consulting services. These materials are free and can be found here; the webinars are also available on OSF's YouTube channel.

edX Course: Principles, Statistical and Computational Tools for Reproducible Science

Learn skills and tools that support data science and reproducible research to ensure you can trust your research results, reproduce them yourself, and communicate them to others.

This free course covers fundamentals of reproducible science, case studies, data provenance, statistical methods for reproducible science, computational tools for reproducible science, and reproducible reporting science. These concepts are intended to translate to fields throughout the data sciences: physical and life sciences, applied mathematics and statistics, and computing.

Consider this course a survey of best practices that will help you create an environment in which you can easily carry out reproducible research and integrate with similar situations for your collaborators and colleagues.

Johns Hopkins University Data Science Lab
The major educational initiative of the JHUDSL is to create open-source online courses delivered through a range of platforms including YouTube, GitHub, Leanpub, and Coursera. We currently have four active MOOC programs that you can enroll in at any time. Join over 8 million other students in taking a course produced by the Johns Hopkins Data Science Lab!

Courses Available From Simply Stats

Statistical Resources by Discipline

  • Statistics for Biologists is a collection of articles that addresses important statistical issues biologists should be aware of and provides practical advice to help them improve the rigor of their work (text adapted from Nature)
  • Statistics for Experimental Biologists 

    This website was started to solve two related problems: 1. How to connect researchers with the information they need to do their jobs properly. 2. How to improve the quality of preclinical biomedical science. It has been developed specifically for laboratory-based experimental biologists, and therefore the examples will be familiar and relevant to anyone with such a background. The articles consist of "how-to" topics (including what not to do), key concepts and ideas, key papers and books (all suitable for biologists), and the occasional opinion piece. It is assumed that readers will have taken a first course in statistics and are familiar with t-tests, ANOVA, and regression.

  • Computing Workflows for Biologists: A Roadmap by Ashley Shade and Tracy K. Teal