Shared Research Computing Policy Advisory Committee
The Ginsburg High Performance Computing cluster, a $1.4 million joint purchase by 33 research groups and departments, went live in February 2021. The system consists of 139 nodes with a total of 4,448 cores (32 cores per node), including 22 GPU-accelerated nodes that allow certain highly parallelized applications to achieve performance levels far beyond what conventional hardware can deliver.
Ginsburg Cluster Specifications
- 87 Standard Nodes (192 GB)
- 30 High Memory Nodes (768 GB)
- 18 RTX 8000 GPU nodes (2 GPU modules per server)
- 4 V100S GPU nodes (2 GPU modules per server)
- All servers are equipped with Dual Intel Xeon Gold 6226R processors (2.9 GHz)
- 570 TB of DDN ES7790 Lustre storage
- EDR Infiniband
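For orientation, here is a minimal sketch of a batch job request on a shared cluster of this kind, assuming a Slurm scheduler (standard on most shared HPC systems). The job name, module, and executable below are hypothetical placeholders; actual account and partition settings would come from RCS documentation.

```shell
#!/bin/bash
# Hypothetical Slurm batch script -- names are placeholders,
# not taken from Ginsburg documentation.
#SBATCH --job-name=my_job        # placeholder job name
#SBATCH --nodes=1                # one 32-core node
#SBATCH --ntasks-per-node=32     # use all cores on the node
#SBATCH --time=01:00:00          # one-hour wall-clock limit
#SBATCH --gres=gpu:1             # request one GPU (GPU nodes only)

./my_application                 # placeholder executable
```

The script would be submitted with `sbatch job.sh`, and `squeue -u $USER` shows its queue status; only the resource limits would differ between standard, high-memory, and GPU nodes.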
For more information, please visit Columbia's Shared High Performance Computing webpage.
Letter from Chair Marianetti to SRCPAC in lieu of meeting
Monday, December 7, 2020
Dear SRCPAC Members,
SRCPAC traditionally meets every Fall and Spring. As we have no pressing issues this Fall, I am writing to update you on items of interest, in lieu of yet another Zoom meeting. I will detail the following items:
- Our next HPC Cluster
- Habanero move
- Google Cloud Platform (GCP) for Research
- RCS Consulting
- Foundations for Research Computing
Next HPC Cluster
As voted on by you, the next shared HPC cluster will be named Ginsburg! We are targeting a January 2021 go-live for Ginsburg, only slightly behind schedule. This mild delay was due to a combination of the pandemic and a last-minute add-on of $300k worth of equipment from a new center grant within Columbia. Ginsburg will consist of 139 compute nodes (87 standard nodes, 30 high-memory nodes, 18 RTX 8000 GPU nodes, and 4 V100S GPU nodes) and 570 TB of storage. This compares with the initial Terremoto installation's 110 nodes (92 standard nodes, 10 high-memory nodes, 8 GPU nodes) and 430 TB of storage.
Habanero Move
The Habanero cluster currently sits in the Zuckerman Institute data center (ZIDC). It will reach its four-year end-of-life in December 2020. In the Spring, our solicitation of interest in extending the life of Habanero by one year garnered overwhelming participation. As a reminder, the highlights are:
- RCS will need to relocate Habanero from the ZIDC to CUIT’s Morningside data center.
- A charge of $250 per node will cover the move, installation, and warranty.
- If you choose not to pay to have your nodes moved, they will become part of the edu/free tier.
- Participants will still have the same rights and priorities to their nodes.
- The move will occur in February 2021, which will start the one-year extension.
- The charge for the extended year will be considered ‘services provided by CUIT’, not ‘equipment’.
We would like to thank Raj Bose and the ZIDC staff for their partnership and collaboration in hosting Habanero for the past four years.
Google Cloud Platform (GCP) for Research
CUIT can now provision you for Google Cloud Platform (GCP) for teaching, learning, and research. GCP offers compute, storage, database, server, and other services. Benefits include:
- CUIT’s enterprise agreement and BAA
- Automated project provisioning
- Access with your Columbia UNI and password
- Multifactor authentication
- Chartstring payment
- Built-in security and privacy controls
- GCP training and consulting are available
For more information, see https://cuit.columbia.edu/gcp.
RCS Consulting
As a reminder, you can consult with CUIT’s Research Computing Services (RCS) team on your research computing needs. Whether you have questions about cloud computing, on-prem HPC, or external resources such as XSEDE, please reach out to [email protected].
Foundations for Research Computing
The Foundations for Research Computing program has had to adapt to the pandemic by taking all in-person training online and updating offerings for a changing environment. This update shares how the program has shifted in response to the events of 2020 and how it has grown in significant ways despite these challenges. As always, we welcome any follow-up questions or discussion about Foundations for Research Computing.
As this is a longer update, we have broken it into three sections to focus on specific areas of interest:
- Remote Transition
- Partnership Pilots
- New Programming
Remote Transition
In mid-March, despite adverse circumstances and little time to prepare, Foundations for Research Computing instructors rose to the challenge and offered a full two-day bootcamp over Zoom. Helpers who would normally circulate in the room assisted through chat, and CUIT was instrumental in setting up accounts and technology on short notice to accommodate the 56 researchers who were able to participate during this difficult time. While the number of participants was smaller than originally planned, it still represents an increase from the 45 researchers served at the March 2019 bootcamp.
Following the March 16-17 bootcamp, all Foundations for Research Computing training has taken place online. Since this transition, the program has offered 24 events for 583 researchers, including the scheduled two-day bootcamp in August. This is a decrease from the 759 researchers trained over this period in 2019. This difference is mostly attributable to a reduction in the "intensives" category of full-day events for intermediate researchers, and we are optimistic that a reconfigured version of our intermediate intensives can resume in the fall.
Since March, Foundations for Research Computing has sought out best practices for online teaching and has developed or experimented with new practices where guidance was not available. The Carpentries has recently released additional guidance on conducting technical workshops remotely, and for the planned January bootcamp the program will pilot a format of four half-day sessions following their proposed model. Despite the challenges of online learning, feedback for online events has remained positive, and our August two-day bootcamp received a Net Promoter Score in the "excellent" range according to a survey of participants. Compared with the same period in 2019, participation in two-day introductory bootcamps since March has been approximately flat: 169 in 2019 and 171 in 2020.
In addition to our standard programming, Foundations for Research Computing trialed a Curriculum Innovation Grant (CIG) program last spring. This program is targeted at creating specialized and mid-scale mini-offerings. CIG offered grants to seven recipients to create and teach a technical workshop or workshop module. Though some planned components of the program could not be completed, five of the seven CIG recipients have taught a technical workshop, and most have served as helpers or otherwise participated in Foundations for Research Computing. We are currently evaluating the effectiveness of the CIG program and will make recommendations in the Spring for proposed modifications.
Partnership Pilots
In its third year of operation, the Foundations for Research Computing program is experimenting with partnerships with several groups around the university, enabling the partners to independently run technical training based on the Foundations for Research Computing model. In FY21, the program is piloting three partnerships with specific groups at Columbia: the Division of Cardiology at CUIMC, the Department of Mechanical Engineering, and a collective of interested faculty in the humanities. Foundations will provide support, resources, and expertise. Specifically, Foundations has prepared a packet of materials, including communications templates, forms, and checklists, to share with partners. The program also provides initial planning sessions, targeted strategic support for troubleshooting issues, and assistance in finding additional instructors. Partner groups receive up to two Carpentries training slots gratis, with an option for additional instructors to be trained at cost, and partner instructors are expected to participate in the larger instructor community.
These pilots are in an early stage but have already seen some success. On August 19-20, Mechanical Engineering ran a bootcamp on the Foundations for Research Computing model, overseen by Associate Professor Arvind Narayanaswamy. The bootcamp, "UNIX, Git, and Python For Mechanical Engineers," targeted an incoming class of 110 MS students in mechanical engineering. A primary goal for these partnerships has been to leverage existing program resources to create a broader impact on campus. In addition to providing Carpentries training, resources such as checklists and communications templates, and light logistical support, Foundations for Research Computing connected Arvind with two experienced graduate student instructors, who received a stipend for their work. As a partner, Arvind has significant experience in technical pedagogy, and it remains to be seen whether all such partnerships can run as independently. A bootcamp for the Reilly Lab in the Division of Cardiology will be run in January in cooperation with Roger Lefort in Research Compliance and Training. We look forward to giving a full report on these partnerships at the Spring SRCPAC meeting.
New Programming
In coordination with CUIT’s launch of the Google Cloud Platform (GCP) service, CUIT partnered with Foundations for Research Computing to offer two new workshops in a series on cloud computing this fall: Introduction to Cloud Computing for Research and High Performance Computing on Google Cloud Platform. We anticipate that these workshops will supplement the existing CUIT workshops that prepare researchers to access cluster resources.
This semester, the program has begun piloting a Research Computing Reading Group targeting researchers with intermediate to advanced experience in research computing. The group, convened by a graduate student paid as a Libraries Digital Intern, discusses current or historical technical papers in areas such as natural language processing and machine learning that are relevant to research computing. We look forward to presenting this pilot for evaluation in the spring to determine its impact on researchers and whether it offers continuing value to the program.
In 2020, despite challenges, the Foundations for Research Computing program has continued to offer computational training that allows Columbia researchers to access shared resources at the university. Thank you to SRCPAC for the committee's continued support. Please reach out at any time with questions, either to [email protected] or to the Foundations for Research Computing program coordinator at [email protected].
I look forward to seeing you early next semester. In the meantime, questions and comments welcome to SRCPAC at [email protected].
Chris Marianetti, PhD
Chair, Shared Research Computing Policy Advisory Committee (SRCPAC)
Associate Professor, Department of Applied Physics and Applied Mathematics
I write to you enthusiastic about the Shared Research Computing Policy Advisory Committee's (SRCPAC) advancement of the University's high-performance computing resources. From its humble beginnings in 2011, research computing at Columbia is now something new under the sun.
As SRCPAC’s Chair, I am tasked with representing faculty interests in comprehensive governance of the shared research computing facility (SRCF). Our community – including rotating subcommittees and working groups devoted to multiple strategic initiatives – is comprised of over 150 faculty, postdocs, staff, and students, and meets semiannually to review topics of considerable range, including cloud computing, educational workshops, facility operations, and policy changes. All Columbia faculty are invited and strongly encouraged to attend SRCPAC meetings; a faculty designee can attend in the event of scheduling conflicts. I hope that you will join us for the many discussions that the future holds.
Formed in 2011, SRCPAC is the manifestation of a movement many years in the making. It is a unified effort in further developing the physical infrastructure, administrative network, and governance policies that are fundamental to innovative computational research and supporting corresponding grant-making activities. Columbia is a global leader in integrating data science methodologies across all domains and disciplines; this leadership is powerfully represented by our Data Science Institute, among many other academic units. The University is committed to furthering this integration and capitalizing upon new emergent opportunities in computationally-driven discovery. This is SRCPAC.
In Fall 2016, SRCPAC achieved yet another seminal milestone: the installation of Habanero, Columbia's third high performance computing cluster for shared use among Columbia’s researchers. Installed at an initial cost of $1.5 million and expanded the following year, Habanero comprises 44 discrete group purchases, 302 compute nodes, and 800 terabytes of storage. Terremoto, Columbia's fourth HPC cluster to enter production, went live in December 2018 and was expanded in 2019. Terremoto is a joint purchase of 35 research groups and departments and consists of 137 compute nodes and over half a petabyte of storage. Most recently, the Ginsburg cluster, a joint purchase by 33 research groups and departments, was launched in February 2021. Ginsburg consists of 139 nodes, including 22 GPU-accelerated systems.
These significant strides were made possible in no small part by the tireless efforts and commitments of the Faculty of Arts and Sciences, The Fu Foundation School of Engineering & Applied Science, and CUIT, making Habanero and Terremoto collective achievements of which we should all be proud. The HPC Operating Committee of users is chaired by my colleague, Dr. Kyle Mandli, Assistant Professor, Department of Applied Physics & Applied Mathematics.
We are at a pivotal time in research – both generally and especially so at Columbia – and I encourage you to explore the wealth of information found below regarding SRCPAC’s mission, structure, and emergent themes.
Your inquiries and comments are welcome as we collectively decide how to navigate the future of research computing. As SRCPAC is a joint effort, there are two methods for communicating with staff resources:
- For technical questions related to HPC use or for general research computing support, please contact CUIT Research Computing Services at [email protected];
- For policy, governance, and faculty affairs questions, please contact the Office of Research Initiatives at [email protected].
Thank you again for joining us in this exciting endeavor – we look forward to working with you.
Chris Marianetti, PhD
Chair, Shared Research Computing Policy Advisory Committee
Associate Professor, Department of Applied Physics & Applied Mathematics
Excerpt from the SRCPAC Charter, November 9, 2011:
"The Shared Research Computing Policy Advisory Committee (SRCPAC) will be a faculty-dominated group focused on a variety of policy issues related to shared research computing on the Morningside campus. As the use of computational tools spreads to more disciplines to create, collaborate, and disseminate knowledge, there is a commensurate rise in the costs of establishing and maintaining these resources. Shared resources have proven to leverage those available to individuals or small groups, but require careful consideration of the policies governing the shared resource and the basis of the operating model.
While final authority and responsibility for such policies customarily rests with the senior administrators of the University, it is vital that the research faculty examine and recommend the policies and practices they deem best suited to accomplishing the research objectives."
For more information regarding shared research computing at Columbia University, or to register for the SRCPAC ListServ, please email [email protected].
- FY14 Annual Report
- FY15 Annual Report
- FY16 Annual Report
- FY17 Annual Report
- FY18 Annual Report
- FY19 Annual Report
- Fall 2011 Agenda and Minutes
- Spring 2012 Minutes
- Fall 2013 Minutes (Yeti Governance)
- Spring 2014 Minutes
- Fall 2014 Agenda and Minutes
- Spring 2015 Agenda and Minutes
- Fall, 2015 Agenda and Minutes
- Spring 2016 Agenda and Minutes
- Fall 2016 Email Update (In Lieu of Meeting)
- Spring 2017 Minutes and Slides
- Fall 2017 Minutes and Slides
- Spring 2018 Minutes and Slides
- Fall 2018 Minutes and Slides
- Spring 2019 Minutes and Slides
- Fall 2019 Minutes and Slides
- Spring 2020 Minutes and Slides
Foundations for Research Computing provides informal training for Columbia University graduate students to develop fundamental skills for harnessing computation: core languages and libraries, software development tools, best practices, and computational problem-solving. Topics are covered from across the spectrum, from beginner to advanced. Beyond training, the Foundations program aims to create a computational community at Columbia, bringing disparate researchers together with the common thread of computation.
Chair: Marc Spiegelman, Applied Physics & Applied Mathematics
HPC Operating Committee
Chair: Kyle Mandli, Applied Physics & Applied Mathematics
- Fall 2019 Minutes and Slides
- Spring 2019 Minutes and Slides (formerly Habanero Operating Committee)
- Spring 2018 Minutes and Slides (formerly Habanero Operating Committee)
- Fall 2017 Minutes and Slides (formerly Habanero Operating Committee)
- Spring 2017 Minutes and Slides (formerly Habanero Operating Committee)
- Spring 2016 Minutes and Slides (formerly Yeti Operating Committee)
- Fall 2015 Minutes and Slides (formerly Yeti Operating Committee)
- Spring 2015 Minutes and Slides (formerly Yeti Operating Committee)
The Shared Research Computing Facility (SRCF) consists of a dedicated portion of the university data center on the Morningside Campus. It houses shared computing resources managed by CUIT, such as Columbia's centrally managed High Performance Computing (HPC) clusters and the Secure Data Enclave (SDE).
A project to upgrade the electrical infrastructure of the data center was completed in Summer 2013*.
In 2018**, cooling was expanded to increase capacity to accommodate shared computing into the foreseeable future.
*The Shared Research Computing Facility project is supported by NIH Research Facility Improvement Grant 1G20RR030893-01, and associated funds from the New York State Empire State Development, Division of Science Technology and Innovation (NYSTAR) Contract C090171, both awarded April 15, 2010.
**The 2018 Cooling expansion is supported by joint contributions from CUIT, the Office of the Executive Vice President for Research, Arts and Sciences, and Engineering and Applied Science.
Columbia University Guidance on Retention of Research Data states that Principal Investigators are responsible for identifying, collecting, managing, and retaining Research Data as custodians for the University. More information regarding University-wide services related to research data can be found here: https://research.columbia.edu/content/research-data-storage
In addition, the University has an enterprise agreement with Amazon Web Services; Columbia researchers are encouraged to review the relevant services and explore opportunities for integrating AWS into their research programs:
- Account Information: https://cuit.columbia.edu/aws
- Cloud Computing Consulting: https://cuit.columbia.edu/cloud-research-computing-consulting
- Intercampus Subcommittee
- Columbia Survey Working Group
- Cloud Subcommittee
- External Peer Survey Working Group
- Hotfoot HPC Operations Committee
- Manhattanville Liaison Working Group
- Research Storage Working Group
SRCPAC meets every Fall and Spring semester for approximately 90 minutes, with select faculty, administrators, and leadership presenting updates pertaining to the University's shared research computing infrastructure. All Columbia faculty, research scientists, postdocs, students, and administrative staff are welcome to attend meetings.
Meetings are scheduled and announced via the SRCPAC ListServ. To be added to this ListServ, please contact [email protected].
To provide access to High Performance Computing (HPC) clusters larger than individual research groups can typically afford to acquire and maintain, Columbia has created the Shared Research Computing Facility (SRCF), through which Morningside, Lamont, and Manhattanville researchers jointly acquire and use HPC clusters.
We hope the following information will be useful as you develop your research program:
- If you wish to join the SRCPAC ListServ to keep informed of committee meetings and other important announcements pertaining to Columbia Shared HPC, please email [email protected].
- There are three active Shared HPC clusters: Terremoto, Habanero, and the newest system, Ginsburg. These clusters are governed by a faculty-led community, the Shared Research Computing Policy Advisory Committee (SRCPAC), and are administratively supported by full-time staff within the CUIT Research Computing Services team, who provide maintenance, technical support, software installation, and guidance on future computing needs.
- Typically each Spring, to coincide with recruiting season, faculty are polled to see if there is interest in a joint expansion round or new system purchase. A good way to ensure you are aware of upcoming events is to join the SRCPAC ListServ by emailing [email protected].
- Research Computing Services (RCS) within CUIT – the entity that administratively supports the SRCF – holds online Zoom office hours for HPC users from 3:00 p.m. to 5:00 p.m. on the first Monday of each month. Please RSVP if interested. The RCS team is happy to answer questions about the SRCF, Columbia’s agreements with Google Cloud Platform and Amazon Web Services, and access to external government-supported resources (such as XSEDE).
If you have additional questions about the above broad overview, please feel free to email [email protected]. We very much hope to have you involved in governing and advancing the research computing infrastructure across Columbia University, and welcome!
Research conducted on the Habanero, Yeti, and/or Hotfoot machines has led to over 100 peer-reviewed publications in top-tier research journals. To view citations for these publications please visit:
To report new publications utilizing one or more of these machines, please email [email protected].
Published research emerging out of computations run on the Habanero, Yeti, and/or Hotfoot machines must recognize the grants that have made this service possible. We ask that all related publications include the following acknowledgement text:
We acknowledge computing resources from Columbia University's Shared Research Computing Facility project, which is supported by NIH Research Facility Improvement Grant 1G20RR030893-01, and associated funds from the New York State Empire State Development, Division of Science Technology and Innovation (NYSTAR) Contract C090171, both awarded April 15, 2010.
The University’s shared research computing clusters are not authorized to host HIPAA-protected data. Therefore, the collection, storage, or transmission of Sensitive Data, as defined within the Columbia University Data Classification Policy, is strictly prohibited on Habanero and Yeti.
Habanero now includes an Education Tier for course instructors to use when educating students. Whereas previous shared high performance clusters offered capacity for classes deploying HPC, such use was always ranked below that of researchers. By contrast, Habanero's current high-priority Education Tier was made possible through the generous commitments of Mary Boyce, Dean of The Fu Foundation School of Engineering and Applied Science, and David Madigan, Executive Vice President and Dean of the Faculty of Arts and Sciences.
A number of no-cost internal and external resources exist to train new and existing users in computational methodologies, high-performance computing, and data science. Please click here to view a list of the resources available to Columbia students, faculty, and staff.