Lecture Descriptions
A Not So Simple Matter of Software by Jack Dongarra
For nearly forty years, Moore’s Law produced exponential growth in hardware performance, and during that same time most software failed to keep pace with these hardware advances. We will look at some of the algorithmic and software changes that have tried to keep up with the advances in hardware. In this talk, we examine how high performance computing has changed over the last 40 years and look toward the future in terms of trends. These changes have had, and will continue to have, a significant impact on our numerical scientific software. A new generation of software libraries and algorithms is needed to use today's high performance computing environments effectively.
Agent-Based Simulation for Earthquake Disaster Mitigation by Maddegedara Lalith
Considering the large number of lives involved, the widespread damage, and the impact on the economy, high-resolution models are essential for making comprehensive decisions in disaster mitigation. In this regard, significant progress has been made using HPC to simulate natural hazards (e.g., typhoons and earthquakes) and the damage they cause to the built environment. However, the use of HPC to comprehensively analyze the impact of these disasters on people and the economy is still in its infancy. To fill this gap, we have developed HPC-enhanced simulators to analyze large-scale evacuations and the impact of disasters on national economies. Our simulator for analyzing tsunami-triggered mass evacuations of coastal urban areas includes a high-resolution (e.g., 1 m x 1 m) 2D model of the urban environment and agents capable of recognizing features of the environment and interacting with the environment and other agents in a complex manner. With reasonably high strong scalability, it can accommodate several million agents spread over several hundred square kilometers. Our HPC-enhanced economic simulator is capable of including every economic entity (every household, every firm, government agencies, etc.) in a country and mimicking real-world economic interactions. We calibrated the model to the Japanese economy using data available at government data portals, and validated it by reproducing past observations of the national economy, of each industrial sector, and even of individual firms. High computational efficiency and scalability allow us to simulate a single period of the model of the Japanese economy, consisting of 130 million agents, within 2 minutes using 128 CPU cores. By integrating it with physics-based disaster simulators, we aim to simulate the impact of natural disasters on the national economy. In this lecture, we explain the details of these two agent-based models.
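To give a flavor of how such a model is structured (a minimal sketch for illustration only, not the lecturers' actual simulator; the grid size, obstacle, and shelter location are invented), the following Python snippet steps a population of evacuee agents across a 2D occupancy grid toward a shelter:

```python
# Minimal, illustrative agent-based evacuation sketch (not the lecturers' simulator).
# Agents move greedily on a coarse 2D occupancy grid toward a single shelter cell.
import numpy as np

rng = np.random.default_rng(0)

GRID = np.zeros((100, 100), dtype=bool)        # True = blocked (e.g., buildings)
GRID[40:60, 20:25] = True                      # a hypothetical obstacle
SHELTER = np.array([95, 95])                   # hypothetical shelter location

agents = rng.integers(0, 100, size=(1000, 2))  # 1,000 agents at random cells

def step(agents):
    """Move each agent one cell toward the shelter if the target cell is free."""
    moved = agents.copy()
    direction = np.sign(SHELTER - agents)        # -1, 0, or +1 per axis
    candidate = np.clip(agents + direction, 0, 99)
    free = ~GRID[candidate[:, 0], candidate[:, 1]]
    moved[free] = candidate[free]
    return moved

for t in range(300):
    agents = step(agents)

arrived = np.all(agents == SHELTER, axis=1).sum()
print(f"agents at shelter after 300 steps: {arrived}")
```

The real simulator adds far richer environment recognition, agent behavior, and parallelization, but the basic pattern of perceiving the environment and updating every agent at each time step is the same.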
Diagnostic Studies of High-Resolution Global Climate Models by Yoshiyuki Kajikawa
The development of HPC has brought higher resolution and greater sophistication to climate models, and we are now entering the era of “kilometre-scale” modeling of the climate system. With these developments, it is natural that climate model analysis should advance as well. In this lecture, diagnostic studies of high-resolution climate model simulations will be introduced, together with pioneering recent studies that exploit the benefits of high resolution. In particular, we will focus on the representation of convection in regional and global climate models, as well as its aggregation process. In the latter half of the lecture, we will also show how the reproducibility of climate fields and elements is improved across various spatio-temporal scales by resolving cumulus convection. We would like to share and discuss the direction of a renewed climate science built on high-resolution climate model simulations.
Earth Observation Data Analysis Using Machine Learning by Naoto Yokoya
This lecture focuses on applying machine learning techniques to Earth observation data analysis. We start by exploring different applications of Earth observation data, such as environmental monitoring, disaster management, and urban planning. After an introduction to machine learning, we delve into the basics of neural networks and how they can be applied to remote sensing data. Participants will take part in a hands-on session in which they build a neural network model for building damage classification. They will then explore automated mapping techniques using remote sensing imagery, with a particular focus on semantic segmentation for land cover mapping. Another hands-on session will guide participants through the practical steps of implementing land cover mapping with machine learning algorithms.
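As a rough preview of the damage-classification hands-on (a minimal sketch assuming RGB image patches and binary labels; the architecture and names are illustrative, not the course material), the following PyTorch snippet defines a tiny convolutional classifier and runs one training step:

```python
# Minimal, illustrative patch classifier for "damaged / not damaged" labels
# (a hypothetical stand-in for the hands-on model, not the official material).
import torch
import torch.nn as nn

class TinyDamageNet(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),               # global average pooling
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = TinyDamageNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Dummy batch of 8 RGB patches (64x64) with random labels, for shape checking only.
x = torch.randn(8, 3, 64, 64)
y = torch.randint(0, 2, (8,))
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print("training step ran, loss =", float(loss))
```

Semantic segmentation for land cover mapping follows the same training loop, but with a network that predicts a class for every pixel rather than one label per patch.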
From Sequence to Function: Applications of Protein Language Models in Protein Design by Camila Pontes
Proteins play a crucial role in various cellular processes, making their study vital for understanding biology and the mechanisms underlying disease. Over recent years, language models have been naturally adapted to understand the language of proteins. This is possible because the language of proteins has many similarities with natural languages: just as a phrase is composed of words according to a specific grammar, a protein is composed of different amino acids following biological rules. This session will give an overview of how different protein language models (pLMs) can be leveraged to obtain functional insights about protein families or to generate new-to-nature protein sequences. In the practical session, we will investigate the outputs (embeddings and logits) of a BERT-based protein language model called ESM2 and learn how to interpret them. Then, we will work through an example of how these outputs can be used to redesign a protein.
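To preview the practical, the sketch below loads a small ESM2 checkpoint through the Hugging Face transformers interface (one possible route; the session may use a different setup, and the checkpoint name and example sequence here are only illustrative) and extracts per-residue embeddings and logits:

```python
# Hedged sketch: inspecting ESM2 embeddings and logits via Hugging Face transformers.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "facebook/esm2_t6_8M_UR50D"         # smallest ESM2 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"    # arbitrary example sequence

inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Per-residue embeddings from the last hidden layer (includes special tokens).
embeddings = outputs.hidden_states[-1]           # shape: (1, length+2, hidden_dim)

# Logits over the amino-acid vocabulary at each position.
probs = torch.softmax(outputs.logits, dim=-1)

print("embedding shape:", tuple(embeddings.shape))
print("per-position vocabulary size:", probs.shape[-1])
```

The softmax of the logits gives, at each position, a probability distribution over amino acids, which is the kind of quantity typically inspected when scoring conservation or proposing residue redesigns.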
Hands-on for Scientific Benchmarking by Jens Domke
In this hands-on, we will go through one example of how to test and reach the peak performance of an important HPC proxy that exercises a major architectural feature of Fugaku. The attendees will analyze, modify, execute, and post-process an OpenMP-based benchmark, and learn how to approach the systematic benchmarking process. The attendees will be introduced to typical issues they might encounter with other applications and how to tackle those challenges. Ideally, the outcome of this session will be to enable the attendees to replicate the art of benchmarking on an HPC system at their home institute.
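The hands-on itself uses an OpenMP-based benchmark on Fugaku, but the overall workflow can be sketched in a few lines of Python (a simplified, hypothetical analogue: time a memory-bound kernel repeatedly, keep all measurements, and derive a bandwidth figure):

```python
# Simplified Python analogue of a bandwidth-style benchmark run; it only
# illustrates the workflow of repeating a kernel, keeping all timings, and
# post-processing them into a bandwidth number.
import time
import numpy as np

N = 20_000_000                      # ~160 MB per array (float64), exceeds caches
a = np.zeros(N)
b = np.random.rand(N)
scalar = 3.0

timings = []
for rep in range(10):
    t0 = time.perf_counter()
    np.multiply(b, scalar, out=a)   # STREAM "scale"-like kernel: a = scalar * b
    timings.append(time.perf_counter() - t0)

bytes_moved = 2 * N * 8             # read b, write a (float64)
best = min(timings)
print(f"best time: {best*1e3:.1f} ms, "
      f"bandwidth: {bytes_moved / best / 1e9:.1f} GB/s, "
      f"median: {np.median(timings)*1e3:.1f} ms over {len(timings)} reps")
```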
Integration of 3D Earthquake Simulation & Real-Time Data Assimilation by Kengo Nakajima
We propose an innovative method of computational science for the sustainable promotion of scientific discovery by supercomputers in the exascale era, based on the integration of simulation, data, and learning (S+D+L). We are developing a software platform, “h3-Open-BDEC”, for the integration of (S+D+L) and evaluating the effects of this integration on heterogeneous supercomputer systems. Our target system is the Wisteria/BDEC-01 system (33+ PF) at the University of Tokyo, which started operation in May 2021 and consists of computing nodes for CSE with A64FX processors and nodes for data analytics/AI with NVIDIA A100 GPUs. h3-Open-BDEC is designed to extract the maximum performance from supercomputers with minimum energy consumption. Japan is a country with many natural disasters, and the damage caused by earthquakes in particular is enormous; over the past 30 years, large earthquakes have occurred across Japan. Although it is extremely difficult to predict the occurrence of an earthquake, research is being actively conducted to minimize damage after an earthquake occurs. We have applied h3-Open-BDEC to Seism3D/OpenSWPC-DAF (Data-Assimilation-Based Forecast), which was developed by ERI/U.Tokyo for the integration of simulation and data assimilation. In this talk, we will describe a demonstration of real-time data assimilation with the developed code on Wisteria/BDEC-01 using h3-Open-BDEC and measured data delivered through JDXnet (Japan Data eXchange network) from 2,000+ high-sensitivity/broadband seismic observation stations in Japan.
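As a conceptual illustration of what a data-assimilation step does (a generic nudging update on synthetic data, not the actual Seism3D/OpenSWPC-DAF scheme; all sizes and values are invented), consider the following sketch, where a simulated field is pulled toward observations at the stations where data exist:

```python
# Generic, illustrative data-assimilation step (Newtonian relaxation / "nudging").
import numpy as np

rng = np.random.default_rng(1)

field = rng.normal(size=(200, 200))            # simulated field on a 2D grid
obs_idx = rng.integers(0, 200, size=(50, 2))   # 50 hypothetical station locations
obs_val = rng.normal(size=50)                  # observed values at those stations
alpha = 0.3                                    # relaxation (nudging) coefficient

def assimilate(field, obs_idx, obs_val, alpha):
    """Relax grid values at observation points toward the observed values."""
    updated = field.copy()
    i, j = obs_idx[:, 0], obs_idx[:, 1]
    updated[i, j] += alpha * (obs_val - field[i, j])
    return updated

field = assimilate(field, obs_idx, obs_val, alpha)
print("mean misfit at stations:",
      float(np.abs(field[obs_idx[:, 0], obs_idx[:, 1]] - obs_val).mean()))
```

In a real-time setting, such an update would be applied repeatedly as new observations stream in, between forward steps of the physics simulation.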
Introduction to HPC Applications and Systems by Bernd Mohr
In this introductory lecture, students will learn what "high performance computing" (HPC) means and what differentiates it from more mainstream areas of computing. They will also be introduced to the major application areas that use HPC in research and industry, and learn how AI and HPC interact with each other. The lecture will then present the major HPC system architectures needed to run these applications (distributed and shared memory, hybrid, and heterogeneous systems).
Introduction to HPC Programming by Bernd Mohr
In this second introductory lecture, students will be provided with an overview of the programming languages, frameworks, and paradigms used to program HPC applications and systems. They will learn how MPI can be used to program distributed memory systems (clusters), how OpenMP can be used for shared memory systems, and finally, how to program graphics processing units (GPUs) with OpenMP, OpenACC, or lower-level methods like CUDA or ROCm/HIP.
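For a first taste of the distributed-memory model, the following sketch uses mpi4py (a Python binding chosen here for brevity; the lecture's own examples may well be in C or Fortran) to split a sum across ranks and combine the pieces with a reduction:

```python
# Hedged illustration of the MPI message-passing model using mpi4py.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()      # this process's ID
size = comm.Get_size()      # total number of processes

# Each rank sums its own slice of 0..999, then the slices are combined.
n = 1000
local_sum = sum(range(rank, n, size))
total = comm.reduce(local_sum, op=MPI.SUM, root=0)

if rank == 0:
    print(f"{size} ranks computed total = {total}")   # expect 499500
```

Launched with, e.g., `mpiexec -n 4 python sum_example.py` (the file name is illustrative), each of the four processes computes its own slice and only rank 0 prints the combined result.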
Introduction to the Use of Fugaku by Jorji Nonaka
This session will provide an overview of the hardware and software resources of the supercomputer Fugaku available to users, as well as a hands-on introduction to their use via the traditional CLI (Command Line Interface) and a Web-based GUI (Graphical User Interface). The overview also covers information resources such as the user guides, operational status, and user support.
Parallel Programming on GPUs and/or HPC Python Programming by Bernd Mohr
In these lectures, students will learn how to program graphics processing units (GPUs) and will perform programming exercises on one or more GPU systems (e.g., NVIDIA or AMD) made available in the cloud. We also plan a lecture on how to use Python effectively on HPC systems. More details will be provided soon. For this lecture, students should have some basic experience with programming in C or C++.
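As one hedged example of what GPU programming from Python can look like (using CuPy, which requires an NVIDIA GPU; Numba and PyTorch are other common routes, and the course may use a different stack):

```python
# Hedged example of GPU programming from Python using CuPy. The array lives in
# GPU memory and the element-wise math runs as GPU kernels.
import numpy as np
import cupy as cp

x_gpu = cp.linspace(0.0, 1.0, 10_000_000)          # allocated on the GPU
y_gpu = cp.sin(x_gpu) ** 2 + cp.cos(x_gpu) ** 2    # computed on the GPU

# Copy a small summary back to the host (should be ~1.0 everywhere).
print("GPU mean of sin^2 + cos^2:", float(cp.asnumpy(y_gpu.mean())))

# The same expression with NumPy runs on the CPU, illustrating the near
# drop-in relationship between the two APIs.
x_cpu = np.linspace(0.0, 1.0, 10_000_000)
print("CPU check:", float((np.sin(x_cpu) ** 2 + np.cos(x_cpu) ** 2).mean()))
```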
Parallel Programming with MPI and OpenMP Hands-on by Jens Domke and Bernd Mohr
In this lecture, students will be provided with all the necessary details to perform parallel programming exercises with MPI and OpenMP on the Fugaku supercomputer of RIKEN, Japan, one of the fastest computers in the world. Ideally, students should have some basic experience with programming in C, C++, Fortran, or Python. Experience with the Linux operating system is also helpful, but not required.
Scientific Benchmarking by Jens Domke
In this lecture, the attendees will learn the dos and don'ts of the scientific benchmarking process. The lecture will explain why benchmarking is important in the current complex landscape of hardware and software, and will highlight concepts such as metrics, methodologies, and data visualization and interpretation. It will also showcase pitfalls and mitigation strategies based on the experience collected by the HPC/AI community over the years.
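As a small illustration of the kind of reporting such guidance leads to (a hedged sketch following common community practice rather than the lecture's exact recipe, with invented timings), repeated measurements are summarized by their median and spread rather than by a single best run:

```python
# Hedged illustration of summarizing benchmark repetitions: report the median
# and an interquartile spread rather than a single best run.
import numpy as np

# Hypothetical wall-clock times (seconds) from 15 repetitions of the same run.
times = np.array([1.92, 1.88, 1.95, 2.31, 1.90, 1.89, 1.93, 1.91,
                  1.94, 2.05, 1.90, 1.92, 1.96, 1.89, 1.91])

q1, median, q3 = np.percentile(times, [25, 50, 75])
print(f"n = {times.size}")
print(f"median = {median:.3f} s  (IQR {q1:.3f}-{q3:.3f} s)")
print(f"min = {times.min():.3f} s, max = {times.max():.3f} s")
# Reporting only the minimum would hide the outlier at 2.31 s caused by, e.g.,
# system noise; the median with spread characterizes the run more honestly.
```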
Solving 3D Puzzles of Biomolecular Interactions by Integrative Modeling by Alexandre Bonvin
The prediction of the quaternary structure of biological macromolecules is of paramount importance for the fundamental understanding of cellular processes and for drug design. One way of increasing the accuracy of the modelling methods used to predict the structure of biomolecular complexes is to include as much experimental or predictive information as possible in the process. For this purpose we have developed the versatile integrative modelling software HADDOCK (https://www.bonvinlab.org/software), available as a web service from https://wenmr.science.uu.nl. HADDOCK can integrate a large variety of information derived from biochemical, biophysical, or bioinformatics methods to enhance sampling, scoring, or both. The lecture (online before the school) will highlight some recent developments around HADDOCK and its new modular HADDOCK3 version and illustrate its capabilities with various examples, including, among others, recent work on modelling antibody-antigen interactions from sequence only, a class of complexes that is notoriously difficult for AI-based methods to predict. The practical session will demonstrate the use of the new modular HADDOCK3 version for predicting the structure of an antibody-antigen complex, using knowledge of the hypervariable loops on the antibody (i.e., the most basic knowledge) and epitope information identified from NMR experiments for the antigen to guide the docking.
The Digital Revolution of Earth System Modelling by Peter Dueben
This talk will outline three revolutions that have happened in Earth system modelling in the past decades. The quiet revolution has leveraged better observations and more compute power to allow for constant improvements in prediction quality over the last decades; the digital revolution has enabled us to perform km-scale simulations on modern supercomputers that further increase the quality of our models; and the machine learning revolution has now shown that machine-learned weather models are often competitive with physics-based weather models for many forecast scores while being easier, smaller, and cheaper. This talk will summarize the past developments, explain current challenges and opportunities, and outline what the future of Earth system modelling will look like, in particular regarding machine-learned foundation models in a physical domain such as Earth system modelling.
The Use of LLMs and Other AI Models in Science by Mohamed Wahib
The rapid advancement of large language models (LLMs) and other AI models has opened new frontiers in scientific research. In this talk, we will explore how these cutting-edge technologies are transforming various scientific disciplines. Across a wide range of science domains, LLMs and AI are enabling researchers to tackle complex problems with unprecedented speed and precision. We will first give an introduction to LLMs, then discuss practical applications in fields such as chemistry, biology, and physics, and finally examine the potential challenges and ethical considerations of integrating these AI models into scientific workflows.