Guidelines, Literature and Blogs
- Ten Simple Rules for Effective Statistical Practice by Robert E. Kass et al.
- Beyond Rigor: Appropriate Analysis by Patricia Campbell and Eric Jolly
- A statistical definition for reproducibility and replicability by Prasad Patil, Roger D. Peng, and Jeffrey Leek
- Beyond subjective and objective in statistics by Andrew Gelman and Christian Hennig
- The Statistics Decision Tree: The Decision Tree helps select statistics or statistical techniques appropriate for the purpose and conditions of a particular analysis, and helps select the MicrOsiris commands that produce them or find the corresponding SPSS and SAS commands.
- The ASA's Statement on p-Values: Context, Process, and Purpose by Wasserstein, RL and Lazar, NA
- Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations by Greenland, S. et al.
- Statisticians issue warning over misuse of P values by Monya Baker (Nature)
- Some natural solutions to the p-value communication problem—and why they won’t work by Andrew Gelman and John Carlin
- The Interactive Statistical Pages project represents an ongoing effort to develop and disseminate statistical analysis software in the form of web pages.
- Statistical Modeling, Causal Inference, and Social Science by Andrew Gelman, a professor of statistics and political science and director of the Applied Statistics Center at Columbia University.
- Unbiased Research: Statistical Design and Analysis of Experiments. A blog written mostly by biomedical PhD students at Emory University who are taking a course on "Statistical Design and Analysis of Experiments."
- StatsBlogs syndicates posts from statistics-related blogs and brings traffic and user interaction to contributing blogs. It is a service by Talk Stats Forum, the #1 statistics forum with >10k visitors daily. We are dedicated to facilitating information sharing and exchange in the statistics community.
- Simply Statistics: We are three biostatistics professors (Jeff Leek, Roger Peng, and Rafa Irizarry) who are fired up about the new era where data are abundant and statisticians are scientists.
Consulting Services, Resources and Tutorials
The services below are available to Columbia researchers and range from no-cost to fee-for-service.
The Biostatistics, Epidemiology and Research Design Resource (BERD) provides a wide range of design, statistical, and analytical support services to assist CUMC faculty members in garnering grant support and publishing study results. In conjunction with the Department of Biostatistics of the Mailman School of Public Health, BERD provides support through consultations and educational initiatives.
- The consulting service is provided at no cost to CUMC faculty for projects that require 1-2 sessions
- Fee-for-service consulting available for work that can be completed during the consulting service session
- Collaborations with faculty members in Department of Biostatistics are available if a project requires a longer term statistical consulting relationship
Department of Statistics Consulting Services
The Department of Statistics offers free statistical consulting to the Columbia community. Consulting is available by appointment only.
Statistical Analysis Center
The Statistical Analysis Center (SAC) at Columbia University’s Mailman School of Public Health is an experienced team of experts dedicated to providing state-of-the-art statistical, data, logistical, and regulatory support for clinical research. These services are available to anyone conducting clinical experiments and randomized clinical trials.
Center for Open Science (COS)
COS maintains the Open Science Framework (OSF) and has developed a series of online workshops as part of its statistical and methodological consulting services. These materials are free, and the webinars are also available on OSF's YouTube channel.
Learn skills and tools that support data science and reproducible research to ensure you can trust your research results, reproduce them yourself, and communicate them to others.
This free course covers fundamentals of reproducible science, case studies, data provenance, statistical methods for reproducible science, computational tools for reproducible science, and reproducible reporting science. These concepts are intended to translate to fields throughout the data sciences: physical and life sciences, applied mathematics and statistics, and computing.
Consider this course a survey of best practices that will help you create an environment in which you can easily carry out reproducible research and integrate with similar situations for your collaborators and colleagues.
Johns Hopkins University Data Science Lab
The major educational initiative of the JHUDSL is to create open-source online courses delivered through a range of platforms including YouTube, GitHub, Leanpub, and Coursera. We currently have four active MOOC programs that you can enroll in at any time. Join over 8 million other students in taking a course produced by the Johns Hopkins Data Science Lab!
Courses Available From Simply Stats
Data analysis for life sciences: A series of 7 classes that teach R and statistics for health sciences applications, with a particular focus on genomic technologies. The classes were built by Rafael Irizarry and Mike Love. You can find and sign up for all the classes on their web site.
Genomic Data Science Specialization on Coursera: A 7-course sequence focused on teaching tools for analyzing genomic data. The classes were built by Jeff Leek, Steven Salzberg, James Taylor, Ela Pertea, Liliana Florea, Ben Langmead, and Kasper Hansen. You can find and sign up for all the classes on Coursera.
Tips for working with a statistician
Statistical Resources by Discipline
- Degrees of Freedom in Planning, Running, Analyzing, and Reporting Psychological Studies: A Checklist to Avoid p-Hacking by Wicherts et al. (Frontiers in Psychology)
- A practical solution to the pervasive problems of p values by Wagenmakers (Psychonomic Bulletin & Review)
- False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant by Simmons, JP; Nelson, LD; Simonsohn, U (Psychological Science)
- The Misuse and Abuse of Statistics in Biomedical Research by Matthew S. Thiese, Zachary C. Arnold and Skyler D. Walker
- Know Your Chances: Understanding Health Statistics by Steven Woloshin, MD, MS, Lisa M. Schwartz, MD, MS, and H. Gilbert Welch, MD, MPH
- Statistics for Biologists is a collection of articles that addresses important statistical issues biologists should be aware of and provides practical advice to help them improve the rigor of their work (text adapted from Nature)
- Statistics for Experimental Biologists
This website was started to solve two related problems: 1. How to connect researchers with the information they need to do their jobs properly. 2. How to improve the quality of preclinical biomedical science. It has been developed specifically for laboratory-based experimental biologists, and therefore the examples will be familiar and relevant to anyone with such a background. The articles consist of "how-to" topics (including what not to do), key concepts and ideas, key papers and books (all suitable for biologists), and the occasional opinion piece. It is assumed that readers will have taken a first course in statistics and are familiar with t-tests, ANOVA, and regression.
- Computing Workflows for Biologists: A Roadmap by Ashley Shade and Tracy K. Teal