Shared Research Computing Policy Advisory Committee
I write to you with enthusiasm about the Shared Research Computing Policy Advisory Committee’s (SRCPAC) advancement of the University’s high-performance computing resources. From its humble beginnings in 2011, research computing at Columbia has grown into something new under the sun.
As SRCPAC’s Chair, I am tasked with representing faculty interests in the comprehensive governance of the Shared Research Computing Facility (SRCF). Our community, including rotating subcommittees and working groups devoted to multiple strategic initiatives, comprises more than 150 faculty, postdocs, staff, and students, and meets semiannually to review a wide range of topics, including cloud computing, educational workshops, facility operations, and policy changes. All Columbia faculty are invited and strongly encouraged to attend SRCPAC meetings; a faculty designee can attend in the event of scheduling conflicts. I hope that you will join us for the many discussions that the future holds.
Formed in 2011, SRCPAC is the manifestation of a movement many years in the making. It is a unified effort to further develop the physical infrastructure, administrative network, and governance policies that are fundamental to innovative computational research and to supporting the corresponding grant-making activities. Columbia is a global leader in integrating data science methodologies across all domains and disciplines; this leadership is powerfully represented by our Data Science Institute, among many other academic units. The University is committed to furthering this integration and capitalizing upon emerging opportunities in computationally driven discovery. This is SRCPAC.
In Fall 2016, SRCPAC achieved yet another seminal milestone: the installation of Habanero, Columbia's third high performance computing cluster for shared use among Columbia’s researchers, complementing the existing Yeti cluster. Funded at $1.5 million through 31 discrete group purchases, Habanero contains 222 compute nodes and 400+ terabytes of storage. This significant stride was made possible in no small part by the tireless efforts and commitments of the Faculty of Arts and Sciences, The Fu Foundation School of Engineering & Applied Science, and CUIT, making Habanero a collective achievement of which we should all be proud. The Habanero Operating Committee of users is chaired by my colleague, Dr. Kyle Mandli, Assistant Professor, Department of Applied Physics & Applied Mathematics.
We are at a pivotal time in research, both generally and especially at Columbia, and I encourage you to explore the wealth of information found below regarding SRCPAC’s mission, structure, and emergent themes.
Your inquiries and comments are welcome as we collectively decide how to navigate the future of research computing. As SRCPAC is a joint effort, there are two methods for communicating with staff resources:
- For technical questions related to HPC use or for general research computing support, please contact CUIT Research Computing Services at email@example.com;
- For policy, governance, and faculty affairs questions, please contact the Office of Research Initiatives at firstname.lastname@example.org.
Thank you again for joining us in this exciting endeavor – we look forward to working with you.
Chris Marianetti, PhD
Chair, Shared Research Computing Policy Advisory Committee
Associate Professor, Department of Applied Physics & Applied Mathematics
Excerpt from the SRCPAC Charter, November 9, 2011:
"The Shared Research Computing Policy Advisory Committee (SRCPAC) will be a faculty-dominated group focused on a variety of policy issues related to shared research computing on the Morningside campus. As the use of computational tools spreads to more disciplines to create, collaborate, and disseminate knowledge, there is a commensurate rise in the costs of establishing and maintaining these resources. Shared resources have proven to leverage those available to individuals or small groups, but require careful consideration of the policies governing the shared resource and the basis of the operating model.
While final authority and responsibility for such policies customarily rests with the senior administrators of the University, it is vital that the research faculty examine and recommend the policies and practices they deem best suited to accomplishing the research objectives."
For more information regarding shared research computing at Columbia University, or to register for the SRCPAC ListServ, please email email@example.com.
Thirty-one research groups from the University contributed a total of $1.5 million to purchase a new high performance computing (HPC) cluster, named Habanero. (News Announcement, February 13, 2017)
The new cluster comprises 222 computer systems, a high-speed local network, and a parallel storage server. Fourteen systems include GPU hardware accelerators, allowing certain highly parallelized applications to achieve performance levels far beyond what would be possible on conventional hardware. The service is managed by CUIT staff under the guidance of a faculty-led committee responsible for overseeing operations.
Participating researchers represent a wide range of disciplines, including astronomy, biochemistry, data science, engineering, neuroscience, oceanography, physics, statistics, and many others.
Habanero is the third generation of centrally-managed HPC clusters at Columbia. The first, named Hotfoot, was launched in 2009, expanded in 2011, and retired in 2015. The second, Yeti, launched in 2013 and is still in production.
The Habanero cluster entered service in November 2016.
- Fall 2011 Agenda and Minutes
- Spring 2012 Minutes
- Fall 2013 Minutes (Yeti Governance)
- Spring 2014 Minutes
- Fall 2014 Agenda and Minutes
- Spring 2015 Agenda and Minutes
- Fall 2015 Agenda and Minutes
- Spring 2016 Agenda and Minutes
- Fall 2016 Email Update (In Lieu of Meeting)
- Spring 2017 Minutes and Slides
- Fall 2017 Minutes and Slides
- Spring 2018 Minutes and Slides
Foundations for Research Computing provides informal training for Columbia University graduate students to develop fundamental skills for harnessing computation: core languages and libraries, software development tools, best practices, and computational problem-solving. Topics span the spectrum from beginner to advanced. Beyond training, the Foundations program aims to create a computational community at Columbia, bringing disparate researchers together with the common thread of computation.
Chair: Marc Spiegelman, Applied Physics & Applied Mathematics
Habanero Operating Committee
Chair: Kyle Mandli, Applied Physics & Applied Mathematics
Yeti Operating Committee
Chair: Greg Bryan, Astronomy
Columbia's centrally-managed High Performance Computing (HPC) resources on the Morningside campus are housed in the Shared Research Computing Facility (SRCF), which consists of a dedicated portion of the university data center. A project to upgrade the electrical infrastructure of the data center was completed in Summer 2013*.
*The Shared Research Computing Facility project is supported by NIH Research Facility Improvement Grant 1G20RR030893-01, and associated funds from the New York State Empire State Development, Division of Science Technology and Innovation (NYSTAR) Contract C090171, both awarded April 15, 2010.
The Columbia University Guidance on Retention of Research Data states that Principal Investigators are responsible for identifying, collecting, managing, and retaining Research Data as custodians for the University. More information regarding University-wide services related to research data can be found here: https://research.columbia.edu/content/research-data-storage
In addition, the University has an enterprise agreement with Amazon Web Services; Columbia researchers are encouraged to review the relevant services and explore opportunities for integrating AWS into their research programs:
- Account Information: https://cuit.columbia.edu/aws
- Cloud Computing Consulting: https://cuit.columbia.edu/cloud-research-computing-consulting
- Intercampus Subcommittee
- Columbia Survey Working Group
- Cloud Subcommittee
- External Peer Survey Working Group
- Hotfoot HPC Operations Committee
- Manhattanville Liaison Working Group
- Research Storage Working Group
SRCPAC meets every Fall and Spring semester for approximately 90 minutes, with select faculty, administrators, and leadership presenting updates pertaining to the University's shared research computing infrastructure. All Columbia faculty, research scientists, postdocs, students, and administrative staff are welcome to attend meetings.
Meetings are scheduled and announced via the SRCPAC ListServ. To be added to this ListServ, please contact firstname.lastname@example.org.
To give researchers access to High Performance Computing (HPC) clusters larger than they could typically afford, or would wish to acquire and maintain on their own, Columbia has created the Shared Research Computing Facility (SRCF), through which Morningside, Lamont, and Manhattanville researchers can jointly acquire and use HPC clusters.
We hope the following information will be useful as you develop your research program:
- If you wish to join the SRCPAC ListServ to keep informed of committee meetings and other important announcements pertaining to Columbia Shared HPC, please email email@example.com.
- The current machine is Habanero, a 5,328-core, 222-node cluster with 269 TFLOPS of processing power. This new cluster is not currently accepting new buy-ins, although rental and free tier options are available.
- Typically each Spring, to coincide with recruiting season, faculty are polled to see if there is interest in a joint expansion round or new system purchase. A good way to ensure you are aware of upcoming events is to join the SRCPAC ListServ by emailing firstname.lastname@example.org.
- Research Computing Services (RCS) within CUIT, the entity that administratively supports the SRCF, will hold open office hours for Yeti and Habanero users from 3:00 p.m. to 5:00 p.m. on the first Monday of each month (e.g. October 3rd, November 7th, etc.) at the Science & Engineering Library in the Northwest Corner Building. The RCS team is happy to answer questions about the SRCF, Columbia’s agreement with Amazon Web Services, and access to external government-supported resources (such as XSEDE).
If you have additional questions about this overview, please feel free to email email@example.com. We very much hope that you will become involved in governing and advancing the research computing infrastructure across Columbia University. Welcome!
Research conducted on the Habanero, Yeti, and/or Hotfoot machines has led to over 100 peer-reviewed publications in top-tier research journals. To view citations for these publications please visit:
To report new publications utilizing one or more of these machines, please email firstname.lastname@example.org.
Published research emerging out of computations run on the Habanero, Yeti, and/or Hotfoot machines must recognize the grants that have made this service possible. We ask that all related publications include the following acknowledgement text:
We acknowledge computing resources from Columbia University's Shared Research Computing Facility project, which is supported by NIH Research Facility Improvement Grant 1G20RR030893-01, and associated funds from the New York State Empire State Development, Division of Science Technology and Innovation (NYSTAR) Contract C090171, both awarded April 15, 2010.
The University’s shared research computing clusters are not authorized to host HIPAA-protected data. Therefore, the collection, storage, or transmission of Sensitive Data, as defined within the Columbia University Data Classification Policy, is strictly prohibited on Habanero and Yeti.
Habanero now includes an Education Tier for course instructors to use when educating students. Although previous shared high performance clusters offered capacity for classes using HPC, such use was always ranked below that of researchers. By contrast, Habanero's current high-priority Education Tier was made possible through the generous commitments of Mary Boyce, Dean of The Fu Foundation School of Engineering and Applied Science, and David Madigan, Executive Vice President and Dean of the Faculty of Arts and Sciences.
A number of no-cost internal and external resources exist to train new and existing users in computational methodologies, high-performance computing, and data science. Please click here to view a list of the resources available to Columbia students, faculty, and staff.