Title

An effective 3-D fast Fourier transform framework for multi-GPU accelerated distributed-memory systems.

Authors

Zhou, Binbin; Lu, Lu

Abstract

This paper introduces an efficient and flexible 3D FFT framework for state-of-the-art multi-GPU distributed-memory systems. In contrast to the traditional pure-MPI implementation, the framework exploits these systems with a hybrid multi-GPU programming model that combines MPI with OpenMP to achieve effective communication. An asynchronous strategy that creates multiple streams and threads to reduce blocking time is adopted to accelerate intra-node communication. Furthermore, we combine our scheme with a GPU-aware MPI implementation to perform GPU-to-GPU data transfers without CPU involvement. We also optimize the local FFT and transpose by creating fast parallel kernels to accelerate the total transform. Results show that our framework outperforms the state-of-the-art distributed 3D FFT library, running up to 2× faster on a single node and 1.65× faster on two nodes.
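
The abstract names two building blocks, the local FFT and the transpose, without giving the algorithm. As a rough illustration of how such pieces usually fit together, the sketch below computes a 3D transform as three rounds of "1D transform along the last axis, then rotate the axes". It is a generic, single-process sketch of the transpose-based decomposition, not the authors' framework: the function names are hypothetical, a naive O(n²) DFT stands in for an optimized local FFT kernel, and the in-memory axis rotation stands in for the step that would need inter-GPU or inter-node communication.

```python
import cmath


def dft1d(x):
    """Naive O(n^2) 1-D DFT; a stand-in for a tuned local FFT kernel."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def transform_last_axis(a):
    """Apply the 1-D transform to every row along the last axis."""
    return [[dft1d(row) for row in plane] for plane in a]


def rotate_axes(a):
    """Cyclic transpose: out[k][i][j] = a[i][j][k].

    In a distributed setting this data reshuffle is where all-to-all
    communication would occur; here it is a plain in-memory copy.
    """
    n0, n1, n2 = len(a), len(a[0]), len(a[0][0])
    return [
        [[a[i][j][k] for j in range(n1)] for i in range(n0)]
        for k in range(n2)
    ]


def fft3d(a):
    """3-D DFT of a nested list a[i][j][k].

    Three rounds of (transform last axis, rotate axes) transform every
    axis once and return the data to its original index order.
    """
    for _ in range(3):
        a = rotate_axes(transform_last_axis(a))
    return a
```

Because each round only ever transforms the contiguous last axis, the 1D kernel and the transpose can be tuned separately, which is consistent with the abstract treating them as distinct optimization targets.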

Subjects

FAST Fourier transforms; GRAPHICS processing units

Publication

Journal of Supercomputing, 2022, Vol 78, Issue 15, p17055

ISSN

0920-8542

Publication type

Academic Journal

DOI

10.1007/s11227-022-04491-7
