Document Type
Conference Paper
Publication Date
2023
DOI
10.1109/HiPC58850.2023.00055
Pages
377-386
Conference Name
IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC), December 18-21, 2023, Goa, India
Abstract
Many scientific and engineering applications require repeated calculations of derivatives of output functions with respect to input parameters. Automatic Differentiation (AD) is a method that automates derivative calculations and can significantly speed up code development. In Computational Fluid Dynamics (CFD), derivatives of flux functions with respect to state variables (Jacobian) are needed for efficient solutions of the nonlinear governing equations. AD of flux functions on graphics processing units (GPUs) is challenging as flux computations involve many intermediate variables that create high register pressure and require significant memory traffic because of the need to store the derivatives. This paper presents a forward-mode AD method based on multivariate dual numbers that addresses these challenges and simultaneously reduces the floating-point operation count. The dimension of the multivariate dual numbers is optimized for performance. The flux computations are restructured to minimize the number of temporary variables and reduce register pressure. For effective utilization of memory bandwidth, shared memory is used to store the local flux Jacobian. This AD implementation is compared with several other Jacobian implementations on an NVIDIA V100 GPU (V100). For three-dimensional perfect-gas compressible-flow equations implemented in a practical CFD code, the AD implementation of a flux Jacobian based on multivariate dual numbers of dimension 5 outperforms all other GPU AD implementations on V100. Its performance is comparable with the optimized hand-differentiated version. The implementation achieves 75% of the peak floating-point throughput and 61% of the peak global device memory bandwidth usage.
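For readers unfamiliar with the term, the idea of forward-mode AD with multivariate dual numbers can be illustrated with a small CPU-side sketch. This is not the paper's GPU implementation. The names Dual, seed, and pressure are hypothetical, and the perfect-gas pressure relation with gamma = 1.4 over five conserved variables (density, three momentum components, total energy) is an assumed toy function standing in for a real flux routine. Each Dual carries a value together with a vector of five partial derivatives, so a single evaluation yields one full row of a Jacobian.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Dual:
    """Hypothetical multivariate dual number: a value plus N partial derivatives."""
    val: float
    grad: Tuple[float, ...]

    @staticmethod
    def _lift(x, n):
        # Promote a plain number to a Dual with zero derivatives.
        return x if isinstance(x, Dual) else Dual(float(x), (0.0,) * n)

    def __add__(self, other):
        o = Dual._lift(other, len(self.grad))
        return Dual(self.val + o.val,
                    tuple(a + b for a, b in zip(self.grad, o.grad)))
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual._lift(other, len(self.grad))
        return Dual(self.val - o.val,
                    tuple(a - b for a, b in zip(self.grad, o.grad)))

    def __rsub__(self, other):
        return Dual._lift(other, len(self.grad)) - self

    def __mul__(self, other):
        # Product rule applied to every derivative component.
        o = Dual._lift(other, len(self.grad))
        return Dual(self.val * o.val,
                    tuple(self.val * b + o.val * a
                          for a, b in zip(self.grad, o.grad)))
    __rmul__ = __mul__

    def __truediv__(self, other):
        # Quotient rule: d(u/v) = (du - (u/v) dv) / v.
        o = Dual._lift(other, len(self.grad))
        q = self.val / o.val
        return Dual(q, tuple((a - q * b) / o.val
                             for a, b in zip(self.grad, o.grad)))


def seed(values):
    """Turn N inputs into Duals whose derivative vectors are unit vectors."""
    n = len(values)
    return [Dual(float(v), tuple(1.0 if i == j else 0.0 for j in range(n)))
            for i, v in enumerate(values)]


GAMMA = 1.4  # assumed ratio of specific heats for the toy example


def pressure(rho, mx, my, mz, energy):
    """Perfect-gas pressure from conserved variables (toy stand-in for a flux)."""
    kinetic = 0.5 * (mx * mx + my * my + mz * mz) / rho
    return (GAMMA - 1.0) * (energy - kinetic)


# Dimension-5 duals: one pass gives the value and all five partial derivatives.
state = [1.0, 0.5, 0.0, 0.0, 2.5]
p = pressure(*seed(state))
print(p.val, p.grad)
```

Because the same pressure function also accepts plain floats, the differentiated and undifferentiated code paths share one source, which is the development-time benefit the abstract attributes to AD. The register-pressure and shared-memory concerns described above arise on a GPU because every intermediate such as kinetic carries its full derivative vector; this sketch does not address them.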
Rights
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Original Publication Citation
Zubair, M., Ranjan, D., Walden, A., Nastac, G., Nielsen, E., Diskin, B., Paterno, M., Jung, S., & Davis, J. H. (2023). Efficient GPU implementation of automatic differentiation for computational fluid dynamics. Paper presented at the IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC), Goa, India, pp. 377-386. doi: 10.1109/HiPC58850.2023.00055
Repository Citation
Zubair, M., Ranjan, D., Walden, A., Nastac, G., Nielsen, E., Diskin, B., Paterno, M., Jung, S., & Davis, J. H. (2023). Efficient GPU implementation of automatic differentiation for computational fluid dynamics. Paper presented at the IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC), Goa, India, pp. 377-386. doi: 10.1109/HiPC58850.2023.00055
ORCID
0000-0002-5449-1779 (Zubair)
Included in
Numerical Analysis and Scientific Computing Commons, Other Computer Engineering Commons, Systems Architecture Commons
Comments
This is the accepted version of an IEEE-copyrighted work.