Workshop course

The ALCF workshop provides researchers with AI and deep learning skills

As artificial intelligence (AI) continues to consolidate its place as a powerful tool for science, the Argonne Leadership Computing Facility (ALCF) remains committed to training researchers in the use of machine learning, deep learning and other emerging AI techniques on its world-class supercomputing resources. The ALCF is a U.S. Department of Energy Office of Science user facility at Argonne National Laboratory.

Huihuo Zheng from Argonne leads a session on Distributed Deep Learning at the ALCF 2021 Simulation, Data and Learning Workshop.

“As part of Argonne’s efforts to advance AI for science, the lab brings together an impressive array of AI technologies and expertise,” said Kyle Felker, an Argonne computer scientist. “The goal of the ALCF training program is to develop the community of researchers who can use our advanced computing resources to accelerate science.”

In October, Felker helped organize the ALCF Simulation, Data and Learning (SDL) Workshop, an annual event designed to help researchers improve the performance and productivity of simulation, data science and machine learning applications on ALCF systems. Participants had the opportunity to learn about cutting-edge AI methods and technologies while working directly with ALCF scientists in dedicated hands-on sessions.

“It was especially important for the workshop organizers to focus the event on interactive and hands-on teaching,” said Felker. “With the rapid pace of development of new deep learning models, software frameworks and hardware, we are forced to make major updates to event content every year, and we want attendees to immediately see how to map their scientific applications to these ALCF tools and resources. The addition of the Polaris and Aurora supercomputers next year represents a great challenge and an opportunity to make SDL 2022 a useful stepping stone to these new platforms.”

During the three-day virtual workshop, participants learned how to use deep learning tools, such as the Horovod framework, the DeepSpeed library, and the DeepHyper package developed by Argonne. They were able to test these tools in real time on ALCF computing resources, including ThetaGPU, an extension of the Theta supercomputer built for AI and simulation workloads.

“As the presenters led the sessions, I ran the presented application codes simultaneously,” said Smeet Chheda, a PhD student in computer science at Stony Brook University. “The hands-on experience with ThetaGPU was amazing and the presenters were knowledgeable about a wide range of topics. They helped me with small problems as well as big theoretical problems.”

Chheda, one of 60 researchers at this year’s workshop, attended the event to learn how to use large-scale systems like ThetaGPU to advance her machine learning research.

“I am already directly applying the distributed deep learning concepts that were presented in the workshop, and I look forward to also experimenting with what I learned about neural architecture search,” said Chheda. “In the future, I hope to continue working with ALCF on various scientific machine learning problems.”

Like Chheda, most of the attendees intended to continue their work at ALCF after the event ended. The workshop closed with a session detailing how researchers can apply for Director’s Discretionary projects to test and optimize their software and prepare for a future project award through allocation programs such as INCITE, ALCC and the ALCF Data Science Program.

“As our facility continues to expand the use of AI and machine learning across leadership computing resources, it is critical that we bring the ALCF user community with us,” said Ray Loy, ALCF lead for training, debuggers and math libraries. “Our workshops and training events are essential for educating current users and training a new set of researchers who can leverage AI and supercomputers for science at ALCF.”

The Argonne Leadership Computing Facility provides high-performance computing capabilities to the scientific and engineering community to advance fundamental discovery and understanding across a wide range of disciplines. Supported by the Advanced Scientific Computing Research (ASCR) program of the U.S. Department of Energy (DOE) Office of Science, the ALCF is one of two DOE Leadership Computing Facilities in the nation dedicated to open science.

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts cutting-edge fundamental and applied scientific research in virtually all scientific disciplines. Argonne researchers work closely with researchers from hundreds of businesses, universities, and federal, state, and municipal agencies to help them solve their specific problems, advance U.S. scientific leadership, and prepare the nation for a better future. With employees from over 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.

The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.


Source: Logan Ludwig, ALCF