ACES - Accelerating Computing for Emerging Sciences Grant

abstract

  • The ever-growing complexity of Science and Engineering (S&E) workflows and the expectations of Open Science have encouraged researchers to adopt new technologies, such as containerization, virtualization and composability, that enable them to respond to an increasingly complex cyberinfrastructure (CI) landscape while producing shareable and reproducible results. ACES (Accelerating Computing for Emerging Sciences), an innovative advanced computational prototype to be developed by Texas A&M University, aims to answer a fundamental question: how does one effectively offer a holistic computing platform that can simultaneously meet the needs of a continuum of users in diverse research communities with varying levels of computing adoption? The project will allow researchers to creatively develop new programming models and workflows that utilize these architectures while simultaneously advancing HPC (High Performance Computing) and data science projects.

    The ACES platform removes significant bottlenecks in advanced computing by introducing the flexibility to aggregate various components (i.e., processors, accelerators and memory) on an as-needed basis to solve problems that were previously not addressable. By letting researchers switch among and run on the accelerators best suited to their workflows, ACES will benefit many research and development projects in the fields of artificial intelligence and machine learning (AI/ML), cybersecurity, health population informatics, genomics and bioinformatics, human and agricultural life sciences, oil & gas simulations, de novo materials design, climate modeling, molecular dynamics, quantum computing architectures, imaging, smart and connected societies, geosciences, and quantum chemistry. To facilitate researcher use, ACES will offer avenues for interactive computing, portals, and cloud connectivity. ACES will support the national research community through coordination systems supported by the National Science Foundation (NSF). Finally, ACES will also leverage existing efforts that promote science and broaden participation in computing at the K-12, collegiate, and professional levels to have a transformative impact nationally by focusing on training, education and outreach. ACES activities are designed to expand the participation of traditionally underrepresented groups in computing and STEM (Science, Technology, Engineering and Mathematics), particularly at minority-serving institutions. ACES will offer fellowships to students, continue efforts to support teacher programs, and offer a number of formal and informal courses, whose materials will be offered to the national community for use free of charge.

    This project funds the development of a dynamically composable high-performance data analysis and computing platform, named ACES. AI and ML are integrated with traditional simulation and modeling approaches in the pursuit of innovation. Edge-computing and instrumental probes have pushed the need to verify, process, store, analyze, and query vast amounts of unstructured data in real time. The coupling of analytics with closely-situated data on highly-usable web-based technologies connected to a compute backend has led to a paradigm shift in expectations from research computing environments. The ACES innovative composable hardware platform helps accelerate transformative changes in research areas that can leverage novel High Bandwidth Memory (HBM) processors and accelerators for analytics and computing. ACES leverages Liqid’s composable framework via PCIe (Peripheral Component Interconnect Express) Gen5 on Intel’s HBM Sapphire Rapids processors to offer a rich accelerator testbed consisting of Intel Ponte Vecchio GPUs (Graphics Processing Units), Intel FPGAs (Field Programmable Gate Arrays), NEC Vector Engines, NextSilicon co-processors, and Graphcore IPUs (Intelligence Processing Units). The accelerators are coupled with Intel Optane memory and DDN Lustre storage interconnected with Mellanox NDR 400 Gbps (gigabit-per-second) InfiniBand to support workflows that benefit from optimized devices. ACES will enable applications and workflows to dynamically integrate the different accelerators, memory, and in-network computing protocols to glean new insights by rapidly processing large volumes of data, and provide researchers with a unique platform to produce complex hybrid programming models that effectively support calculations that were not feasible before.

date/time interval

  • 2021 - 2026