2019‑2020 |
Chief Scientist, Pittsburgh Supercomputing Center, Carnegie Mellon University
|
-
Developed a complete HPC+AI+Data ecosystem supporting research in medicine, science, and engineering, in coordination with diverse federal sponsors (NSF, NIH, DoD) and partnerships between academia and industry.
-
Architect, Principal Investigator, and Project Director for the Bridges-2 national supercomputer:
$10M acquisition, with expected total funding of $22.5M (NSF Office of Advanced Cyberinfrastructure (OAC) award 1928147). Led development of the successful proposal for Bridges-2, working with vendor partners and the national scientific community, and directed the Bridges-2 acquisition and early scientific outreach.
Developed in partnership with HPE, Bridges-2 is designed for rapidly evolving research, featuring 488 dual-socket AMD EPYC 7742 (Rome) nodes with 256 GB of RAM, 16 similar nodes with 512 GB of RAM, 4 large-memory nodes with 4 TB of RAM and 4 Xeon Platinum 8260M (Cascade Lake) CPUs, and 24 GPU nodes each with 8 NVIDIA Tesla V100-32GB SXM2 GPUs and 384–768 GB of CPU RAM.
Bridges-2 is interconnected by Mellanox HDR-200 InfiniBand, with dual rails to the GPU nodes to improve scaling for deep learning.
Bridges-AI, which still has considerable value, is planned to be federated with Bridges-2 when Bridges is decommissioned.
Bridges-2 introduces a flash filesystem to support training on large datasets and a hierarchical disk and tape storage system managed by HPE DMF (Data Management Framework).
-
Architect, Co-Principal Investigator, and Associate Director of Scientific and Broader Impact for Neocortex, a revolutionary AI supercomputer:
$5M acquisition, with expected total funding of $11.25M (NSF Office of Advanced Cyberinfrastructure award 2005597).
Neocortex couples two Cerebras CS-1 systems, each with a Wafer Scale Engine (WSE) deep learning processor, to a large-memory (24 TB) HPE Superdome Flex HPC server, with closely balanced bandwidths and integration with Bridges-2 for management of large data and complementary, general-purpose computing.
This unique system is designed to enable research into scaling across multiple CS-1 systems, streaming data at high bandwidth from the Superdome Flex's large memory to each CS-1 independently or to both together.
It is intended to accelerate deep learning by a factor of up to 1,000 while maintaining familiar, easy-to-use interfaces (TensorFlow and PyTorch), to be followed by an SDK and API for researchers developing fundamental algorithms.
-
Director, Advancing Cancer Biology at the Frontiers of Machine Learning and Mechanistic Modeling, supported by the National Cancer Institute.
NCI specifically invited me to direct this Innovation Lab, which brought together interdisciplinary experts who otherwise seldom work together.
The Lab was extremely successful: Of the nine ideas generated for pilot projects, six were deemed worthy of funding.
I issued four subawards through PSC for pilot projects that are expected to yield substantial R01 proposals, and NCI contributed additional support for two more.
-
Principal Investigator for Amplifying the Value of HuBMAP Data Through Data Interoperability and Collaboration, an NIH Common Fund Data Ecosystem (CFDE) project to apply machine learning to HuBMAP and Kids First (a complementary NIH CFDE consortium) data to understand causes of childhood cancers and structural birth defects and to amplify the FAIRness of HuBMAP data.
-
(Progression from prior appointment)
Hardware/Software Architect and Principal Investigator for Human BioMolecular Atlas Program (HuBMAP) Infrastructure and Engagement,
being developed with a hybrid HPC/cloud (currently AWS) approach to cost-effectively maximize capability, interoperability, and reliability: $5,710,740 awarded to date through the NIH Common Fund.
The HuBMAP Consortium consists of 19 lead institutions and approximately 35 collaborating institutions, with 11 more lead institutions soon to be added.
Led software development using an agile methodology to accommodate evolving requirements and the many developers of the Consortium.
Led the core Infrastructure and Engagement Component through the inaugural HuBMAP data release (September 1, 2020).
Complete information on HuBMAP is available through the HuBMAP Consortium Website, and HuBMAP data is available through the HuBMAP Data Portal.
-
(Progression from prior appointment)
Architect, Principal Investigator, and Project Director for Bridges: directed production operations, including systems support, scientific and educational outreach, and partnerships; successfully led annual NSF reviews of operations; and planned the transition to Bridges-2, including the federation of Bridges-AI with Bridges-2. Bridges has served over 2,100 research projects conducted by over 20,000 users at over 800 institutions (spanning academia, national labs and Federal Reserve banks, and industry) across the U.S. and their international collaborators.
-
Co-Principal Investigator for OCCAM+P38: Instantiating and Sustaining a Repository of Executable and Interactive Computer Architecture Experiments, supported by DoD. Directed the use of Bridges to support collaboration on computer architecture for Project 38, a set of architectural explorations involving DoD, the DOE Office of Science, and the DOE National Nuclear Security Administration (NNSA).
-
Strategic Planning, Center for AI Innovations in Medical Imaging (CAIIMI).
CAIIMI is a new research center launched in January 2020 with members representing UPMC (University of Pittsburgh Medical Center hospital system), University of Pittsburgh, Carnegie Mellon University, and the Pittsburgh Supercomputing Center (myself). Worked with the CAIIMI director and team to plan an ambitious proposal and provided discretionary access to Bridges-AI in support of CAIIMI research.
-
Enabled urgent COVID-19 research.
Provided discretionary allocations and priority scheduling on Bridges for the COVID-19 High Performance Computing Consortium (announced by the White House on March 23, 2020) as well as for COVID-19 projects external to the Consortium. The projects were in AI, genomics, and molecular dynamics. Significant advances made through those allocations included development of a fast (30-minute), low-cost, reliable test for COVID-19 (Mason group; Weill Cornell Medicine); a database of molecules that are candidates for therapeutics (Isayev group; Carnegie Mellon University); and another database of candidate therapeutics submitted to Europe's JEDI Grand Challenge: Billion Molecules against COVID-19.
|
2017‑2019 |
Interim Director, Pittsburgh Supercomputing Center, Carnegie Mellon University
|
-
Led the PSC team of approximately 60 FTEs and PSC's interactions with its parent universities and external stakeholders.
-
Created a new Artificial Intelligence and Big Data group to amplify PSC's ability to generate new opportunities and make valuable contributions.
Creation of the AI&BD group made possible the Bridges-AI expansion of Bridges, the award for the highly innovative Neocortex system, and numerous research collaborations.
-
Co-led the successful proposal for Bridges-AI, a $1.8M expansion of Bridges that increased aggregate AI capacity across NSF's national cyberinfrastructure by 283%.
Bridges-AI introduced the NVIDIA DGX-2 and NVIDIA Tesla V100 (“Volta”) GPUs to the NSF research community to enable research using scalable AI.
-
Co-founded the Compass Consortium, an innovative program through which industry participants could evaluate emerging AI technologies for their specific domains, learn best practices, and engage PSC and academic experts in Consortium (multi-partner, shared) and Pilot (single-partner) research projects.
-
Architect and Principal Investigator for Human BioMolecular Atlas Program (HuBMAP) Infrastructure and Engagement, being developed with a hybrid HPC/cloud approach to cost-effectively maximize capability, interoperability, and reliability: led development of the successful proposal to NIH; of the hybrid-cloud hardware and interoperating, microservices-based software architecture with federated Globus identity management; and of the core Infrastructure and Engagement Component of the Consortium (~8 FTEs).
-
Principal Investigator for Challenges and Opportunities in Scientific Data Discovery and Reuse in the Data Revolution: Harnessing the Power of AI.
Authored a successful proposal (NSF award 1839014, $50,000) to host the AIDR 2019: Artificial Intelligence for Data Discovery and Reuse conference, which was held on May 13–15, 2019 at Carnegie Mellon University.
-
Created the Pittsburgh Research Computing Initiative to catalyze data-, AI-, and HPC-driven collaboration and research across Carnegie Mellon University, the University of Pittsburgh, and the UPMC (University of Pittsburgh Medical Center) hospital system. Over 130 research groups participated.
-
(Progression from prior appointment)
Architect, Principal Investigator, and Project Director for Bridges:
Directed production operations, including systems support, scientific and educational outreach, and partnerships.
-
(Progression from prior appointment)
Co-Investigator for Big Data for Better Health (BD4BH).
Led data infrastructure architecture for applying machine learning to genomic and imaging data for breast and lung cancer research, supported by the Pennsylvania Department of Health. Results included methods for detecting anomalous gene expression, indexing to search expression data for expressed viruses, and identification of tumor progression features.
-
(Progression from prior appointment)
Principal Investigator for the Data Exacell.
Led a team of ~5 FTEs to successfully conclude pilot projects. Results included demonstration of PSC's filesystem technology that was then deployed in Bridges, distributed deployment of the Galaxy framework that was transitioned to Bridges large-memory nodes for production use, and independent acquisition of the PghBio filesystem to serve The Cancer Genome Atlas and other biomedical data for Big Data for Better Health, the Center for Causal Discovery, the Center for AI Innovations in Medical Imaging, and other strategic projects.
-
(Progression from prior appointment)
Co-Investigator and Software Performance Architect for the Center for Causal Discovery (CCD), an NIH Big Data to Knowledge (BD2K) Center of Excellence.
Successfully concluded the NIH BD2K project, which produced numerous advances in algorithms for causal analysis of genomic, imaging, and time series data; improved understanding of signaling pathways and mutations; biomarkers of typical and atypical brain activity; and other areas.
-
Member of the Pittsburgh Quantum Institute (PQI):
Advised on strategic directions for quantum applications and simulation.
|
2016‑2017 |
Sr. Director of Research, Pittsburgh Supercomputing Center, Carnegie Mellon University
|
-
Principal Investigator for the Data Exacell.
Transitioned from Co-PI to PI.
Led a team of approximately 5 FTEs to conduct pilot projects including the Pittsburgh Genome Research Repository (PGRR; through which a local, high-performance copy of The Cancer Genome Atlas (TCGA) was made available to researchers), a framework for reproducible science focusing on genomics (Galaxy), and a data-intensive workflow for radio astronomy (in collaboration with the National Radio Astronomy Observatory (NRAO)).
-
(Progression from prior appointment)
Architect, Principal Investigator, and Project Director for the Bridges national supercomputer:
directed installation and acceptance testing, including the world's first deployment of the Intel Omni-Path Architecture (OPA) fabric, and directed production operations, annual NSF reviews of the acquisition and operations, and approximately 30 PSC staff supported by the Bridges project.
-
(Progression from prior appointment)
Co-Investigator for Big Data for Better Health (BD4BH).
Led data and software architecture for applying machine learning to genomic and imaging data for breast and lung cancer research, supported by the Pennsylvania Department of Health.
-
(Progression from prior appointment)
Co-Investigator and software performance architect for the Center for Causal Discovery (CCD), an NIH Big Data to Knowledge (BD2K) Center of Excellence.
Advised on high-performance implementation of causal discovery algorithms, particularly for analysis of fMRI data to understand the brain causome.
Directed implementation on Bridges of the Causal Web, a browser-based portal to democratize access to sophisticated algorithms (e.g. FGES) on large-memory nodes, effectively delivering HPC Software as a Service.
|
2015‑2016 |
Director of Research, Pittsburgh Supercomputing Center, Carnegie Mellon University
|
-
Architect, Principal Investigator, and Project Director for the Bridges national supercomputer, which pioneered the convergence of HPC, AI, and Big Data.
Wrote the successful proposal that resulted in $20.9M of funding from the NSF Office of Advanced Cyberinfrastructure (award 1445606).
The Bridges architecture was designed to make HPC and AI accessible to “nontraditional” communities and applications that had never used HPC before.
It has been emulated multiple times around the world, and its success is substantiated by the recent NSF award for Bridges-2.
-
Co-Investigator and software performance architect for the Center for Causal Discovery (CCD), an NIH Big Data to Knowledge (BD2K) Center of Excellence.
The CCD developed highly efficient causal discovery algorithms that can be practically applied to very large biomedical datasets and, as a vehicle for algorithm development and optimization, applied them to three distinct biomedical questions: cancer driver mutations, lung fibrosis, and the brain causome.
The CCD also disseminated causal discovery algorithms, software, and tools and trained scientists and biomedical investigators in the use of CCD tools.
The CCD's graph algorithms made extensive use of PSC's large-memory Blacklight and Bridges systems, for which I was co-PI and PI, respectively.
September 29, 2014–August 31, 2019.
-
Co-Investigator for Big Data for Better Health (BD4BH).
Co-authored a successful proposal to the Pennsylvania Department of Health to apply state-of-the-art machine learning algorithms to genomic and imaging data for breast and lung cancer research, with Bridges providing data and AI capability to an interdisciplinary team including the University of Pittsburgh, CMU, UPMC, and PSC.
Awarded for June 1, 2015–May 31, 2018.
-
Co-Principal Investigator for Open Compass.
Co-authored the successful proposal “Open Compass: Leveraging the Compass AI Engineering Testbed to Accelerate Open Research” (NSF award 1833317; $300,000 awarded through NSF's EAGER program for high-risk, high-reward research; May 1, 2018–April 30, 2021), a project to evaluate emerging AI technologies for deep learning networks relevant to research (GCNs, 3D CNNs, CNNs for very large images, LSTMs for time series, etc.), develop and disseminate best practices, and conduct training.
-
Co-Principal Investigator for the Data Exacell.
Authored the successful proposal “CIF21 DIBBs: The Data Exacell” (NSF award 1261721, $8,914,035, October 1, 2013–September 30, 2018; “DIBBs” = NSF's Data Infrastructure Building Blocks program).
Originally envisaged as an exascale data management system to enable data-intensive research, the project shifted its emphasis through the cooperative agreement to exploring the potential of high-performance data analytics (HPDA) and novel data storage technologies.
Initial computational technologies included large memory (Blacklight; 2×16 TB of cache-coherent shared memory), Sherlock (a multithreaded graph processor; YarcData uRiKA, with a primarily RDF+SPARQL user interface), and a SLASH2-based filesystem.
I also authored a successful supplemental proposal to acquire PSC's first NVIDIA Tesla GPUs to add strong support for deep learning.
-
Co-Principal Investigator for Sherlock, a system for high-performance graph analytics.
Authored the successful proposal to NSF (award 1234749, $1,226,000, September 1, 2012–August 31, 2015) to acquire and make available to the research community Sherlock, a Cray YarcData uRiKA data appliance consisting of a next-generation Cray XMT supercomputer (NG-XMT) running the uRiKA application architecture, augmented by Cray XT5 compute nodes to broaden the range of applications that could address the challenges of graph-based data analytics.
It was based on the Cray XT5 infrastructure, specialized for graph analytics through the inclusion of Cray-proprietary Threadstorm 4.0 processors and AMD HyperTransport-attached SeaStar2 interconnect chips to provide a flat, globally-addressable memory.
-
Co-Principal Investigator for Blacklight, the world's largest shared-memory system.
Authored the successful proposal “Very Large Shared Memory System for Science and Engineering” (NSF award 1041726, $2,942,517, August 1, 2010–July 31, 2014).
Blacklight consisted of two SGI UV1000 systems, each with the maximum of 16TB (32TB aggregate) of hardware cache-coherent shared memory implemented on NUMAlink.
It prioritized user productivity by allowing OpenMP, Java, and MATLAB applications to scale to 16TB and 2,048 cores.
It also introduced the Nehalem microarchitecture via Intel Xeon X7560 CPUs to the national research community.
Blacklight was the preferred system nationwide for large-scale genome sequence assembly and was also in high demand for data analytics.
Its architecture, applications, and users informed the design of Bridges.
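For illustration only (a minimal, hypothetical sketch, not code from the Blacklight project), the shared-memory model it supported lets a single OpenMP loop operate on one large, coherent address space rather than on distributed partitions:

```c
/* Minimal OpenMP sketch of the shared-memory model Blacklight supported
 * (illustrative only; not code from the Blacklight project).
 * Build with, e.g.:  cc -fopenmp -O2 shared_sum.c -o shared_sum */
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* On a cache-coherent shared-memory system, one allocation can span
     * far more memory than a single node of a distributed cluster holds. */
    size_t n = (size_t)1 << 28;          /* ~2 GB of doubles; scale as memory allows */
    double *x = malloc(n * sizeof *x);
    if (!x) { perror("malloc"); return 1; }

    double sum = 0.0;

    /* All threads share one address space; OpenMP handles work sharing
     * and the reduction across threads. */
    #pragma omp parallel for reduction(+:sum)
    for (size_t i = 0; i < n; i++) {
        x[i] = (double)i;
        sum += x[i];
    }

    printf("threads=%d sum=%.6e\n", omp_get_max_threads(), sum);
    free(x);
    return 0;
}
```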
-
Co-Principal Investigator for SDCI HPC Improvement: High-Productivity Performance Engineering (Tools, Methods, Training) for NSF HPC Applications.
Applied, evaluated, and hardened performance engineering tools including TAU, PAPI, PerfSuite, KOJAK, Scalasca, Vampir, and IPM for scientific applications with diverse execution profiles: AMR hydro+N-body cosmology (ENZO), molecular dynamics (NAMD), and quantum simulation of nanoscale semiconductor devices (NEMO3D).
Co-led tutorials in performance engineering at the SC08, SC09, LCI 2009, and LCI 2010 conferences.
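For illustration only (a minimal sketch, not code from this project; the event choices are examples and availability varies by CPU), hardware-counter instrumentation with PAPI follows a pattern like this:

```c
/* Minimal PAPI hardware-counter sketch (illustrative only; not project code).
 * Link with -lpapi. */
#include <papi.h>
#include <stdio.h>

int main(void)
{
    int eventset = PAPI_NULL;
    long long counts[2];

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) {
        fprintf(stderr, "PAPI init failed\n");
        return 1;
    }
    PAPI_create_eventset(&eventset);
    PAPI_add_event(eventset, PAPI_TOT_CYC);   /* total cycles */
    PAPI_add_event(eventset, PAPI_FP_OPS);    /* floating-point operations */

    PAPI_start(eventset);

    /* Region of interest: a simple triad-like kernel. */
    enum { N = 1 << 20 };
    static double a[N], b[N], c[N];
    for (int i = 0; i < N; i++)
        a[i] = b[i] + 2.0 * c[i];

    PAPI_stop(eventset, counts);
    printf("cycles=%lld fp_ops=%lld (a[0]=%g)\n", counts[0], counts[1], a[0]);
    return 0;
}
```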
-
Principal Investigator for EAGER: Exploring the Potential of “Native Client” for Computational Science.
This NSF EAGER (rapid-turnaround, high-risk, high-reward) project tested the usability and affordability of Google's “Exacycle” cloud service using the “Native Client” (NaCl) programming model for scientific applications.
-
Lead, Advanced Computational Environments, DoD High Performance Computing Modernization Program (HPCMP) User Productivity Enhancement, Technology Transfer, and Training (PETTT).
Worked with the HPTi/DRC/Engility (successive acquisitions) team and multiple universities to drive effective use of HPC resources by DoD research scientists.
-
(Progression from prior appointment)
Led applications support for BigBen, PSC's Cray XT3 MPP system.
Oversaw optimization of job placement in the Cray XT3's 3D torus, improving performance by 4.7–11.7%.
Oversaw collaboration with a Cray hardware architect to optimize SeaStar router settings, improving throughput for communications-intensive applications by up to 36% through age-based arbitration and changes to the clock tick, and by a further 40% by enabling and balancing traffic across four virtual channels.
|
2004‑2020 |
Visiting Research Physicist, Physics Department, Carnegie Mellon University
|
- Taught Advanced Computational Physics (MPI, high-performance 3D FFTs, data analytics, etc.); a representative MPI example appears after this list.
- Guest lectured at the CMU Physics Upper Class Colloquium.
- Supported CMU Physics faculty in their use of PSC advanced computing resources and in competing for grants in Physics.
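The MPI example mentioned above: a representative classroom-style sketch (not actual course material) is the classic estimation of π by midpoint integration of 4/(1+x²) on [0, 1]:

```c
/* Classic MPI teaching example: estimate pi by midpoint integration
 * (a minimal sketch, not actual course material).
 * Build:  mpicc -O2 mpi_pi.c -o mpi_pi     Run:  mpiexec -n 4 ./mpi_pi */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    const long n = 100000000L;            /* number of integration intervals */
    double local = 0.0, pi = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const double h = 1.0 / (double)n;
    /* Each rank integrates a strided subset of the intervals. */
    for (long i = rank; i < n; i += size) {
        double x = h * ((double)i + 0.5);
        local += 4.0 / (1.0 + x * x);
    }
    local *= h;

    /* Combine the partial sums on rank 0. */
    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi ~= %.12f with %d ranks\n", pi, size);

    MPI_Finalize();
    return 0;
}
```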
|
2004‑2005 |
Manager, Strategic Applications, Pittsburgh Supercomputing Center, Carnegie Mellon University
|
-
Led the Strategic Applications Group to identify and drive important areas of research, software development, and collaboration.
-
Led applications support for BigBen, PSC's Cray XT3 MPP system.
When it was installed at PSC in 2005, BigBen was the world's first Cray XT3.
Four full-scale applications — cosmology (GADGET), molecular dynamics (CHARMM), weather (WRF), and quantum chemistry (GAMESS) — were prepared in advance and demonstrated at the SC04 conference.
Preparation began using an FPGA simulator for the Seastar router and the Catamount microkernel OS on machines at Cray.
We collaborated with Sandia National Laboratories (SNL), where Cray's Red Storm was developed, on scheduling; reliability, availability, and serviceability (RAS); and Catamount.
-
Led an HCI study of programming language productivity, focusing on X10, UPC, and MPI for high-concurrency applications, in collaboration with the IBM PERCS team and supported by the DARPA HPCS Program.
Developed the SUMS methodology, which applies statistical techniques to comprehensive, fine-grained instrumentation of programmers’ activities and enables objective evaluation of software development processes.
In addition to precursor activities, we conducted a 4.5-day, IRB-approved human subjects study. Ours was the only university project invited to present at IBM's DARPA reviews.
-
(Progression from prior appointment)
Co-Functional Area Point of Contact for Computational Chemistry and Materials Science, DoD High Performance Computing Modernization Program (HPCMP) User Productivity Enhancement and Technology Transfer (PET).
Mentored an advanced intern in performance engineering and development of a GPU version of VASP (Vienna Ab initio Simulation Package, an important application for materials science and chemistry).
|
1998‑2003 |
Sr. Scientific Specialist, Pittsburgh Supercomputing Center, Carnegie Mellon University
|
-
Principal Investigator for Collaborative Research: ITR/AP: Novel Scalable Simulation Techniques for Chemistry, Materials Science and Biology.
Co-developed a scalable, open-source, ab initio molecular dynamics (Car-Parrinello MD) code with applications to chemistry, materials science and engineering, geoscience, and biology. Implemented in Charm++, a parallel dialect of C++ that excels at latency hiding.
NSF award 0121367, October 1, 2001–September 30, 2006, $206,773 (PSC portion).
-
Co-Functional Area Point of Contact for Computational Chemistry and Materials Science, DoD High Performance Computing Modernization Program (HPCMP) User Productivity Enhancement and Technology Transfer (PET).
Led performance optimization, GPU acceleration, functionality enhancement, and applied research in collaboration with, and in support of, DoD research scientists. 2001–2009.
-
Ported and optimized a large number of applications and libraries for LeMieux,
which introduced clusters of commodity CPUs (DEC Alpha EV68) to the NSF high-performance computing community. LeMieux's 610 Compaq AlphaServer ES45 nodes were interconnected by dual-rail Quadrics.
At the time of its installation, LeMieux was #2 on the Top500 list, following only LLNL's ASCI White system.
|
1994‑1998 |
Scientific Specialist, Pittsburgh Supercomputing Center, Carnegie Mellon University
|
-
Developed parallel versions of GAMESS (quantum chemistry), AMBER (molecular dynamics),
and X-PLOR (X-ray crystallography) for the Cray T3D and T3E.
Parallelization was implemented using PVM (prior to the development of MPI).
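For context, PVM message passing used an explicit pack/send/receive/unpack idiom, as in this minimal sketch (illustrative only, not code from those ports; the spawned task name is hypothetical):

```c
/* Minimal PVM pack/send/unpack sketch (illustrative only; not code from the
 * GAMESS/AMBER/X-PLOR ports).  Link with -lpvm3; "pvm_sketch" is a
 * hypothetical executable name for the spawned worker. */
#include <pvm3.h>
#include <stdio.h>

int main(void)
{
    int mytid = pvm_mytid();        /* enroll this process in PVM */
    int parent = pvm_parent();      /* PvmNoParent if we are the master */
    double value = 3.14159;

    if (parent == PvmNoParent) {
        /* Master: spawn one worker and wait for its reply. */
        int tid;
        if (pvm_spawn("pvm_sketch", (char **)0, PvmTaskDefault, "", 1, &tid) == 1) {
            double reply;
            pvm_recv(tid, 1);               /* blocking receive, message tag 1 */
            pvm_upkdouble(&reply, 1, 1);    /* unpack one double */
            printf("master %d received %f\n", mytid, reply);
        }
    } else {
        /* Worker: pack a double and send it back to the master. */
        pvm_initsend(PvmDataDefault);
        pvm_pkdouble(&value, 1, 1);
        pvm_send(parent, 1);
    }

    pvm_exit();
    return 0;
}
```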
-
Taught and supported parallel programming for PSC's Cray T3D and
T3E (Jaromir).
-
Provided tier-3 consulting in quantum chemistry and numerical methods.
-
Managed applications and libraries for PSC's
Cray C916/512 supercomputer (Mario) and Cray T3D.
|
1992‑1994 |
Scientific Programmer, Pittsburgh Supercomputing Center, Carnegie Mellon University
|
-
Joint NSF-NASA Initiative in Evaluation (JNNIE):
Evaluated the capabilities and deficiencies of scalable parallel computing architectures using
a variety of application benchmarks.
-
Provided tier-3 consulting in quantum chemistry, numerical methods, and parallel programming.
-
Managed applications and libraries for PSC’s Cray C916/512 supercomputer (Mario) and Cray T3D.
|
1983‑1992 |
Software Engineer, Self-employed
|
-
Developed software for industrial simulation and metallurgical engineering, accounting systems, and enterprise databases.
|