RT-TDDFT GPU Acceleration: RT-TD now has preliminary support for GPU computation #5773
base: develop
Conversation
The current program has some bugs that cause the data in `Tensor` …
…assignment operator overload) instead
Adding needed BLAS and LAPACK support for `Tensor` on CPU and refactoring linear algebra operations in TDDFT
LGTM 👍, a good example showing the possibility of using `Tensor`.
Oops, I found that cuSOLVER does not provide LU-based matrix inversion.
The initial implementation of the GPU version of RT-TDDFT has been completed and is functioning perfectly! Currently awaiting the merge of @Critsium-xy's PR #5862 (removing the …). Efficiency testing will follow in the near future.
Phase 1: Rewriting existing code using `Tensor` (complete)

This is merely a draft and does not represent the final code. Since `Tensor` can effectively support heterogeneous computing, the goal of the first phase is to rewrite the existing algorithms using `Tensor`. Currently, all memory is still explicitly allocated on the CPU (the device parameter of the `Tensor` constructor is `container::DeviceType::CpuDevice`), as in the sketch below.
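A minimal sketch of this CPU-only allocation (only `container::DeviceType::CpuDevice` is taken from the description above; the header path, `DataType::DT_COMPLEX_DOUBLE`, and `TensorShape` are assumptions about the ATen container API):

```cpp
#include "module_base/module_container/ATen/core/tensor.h" // assumed header path

// Allocate an n x n complex matrix explicitly on the CPU. In later phases,
// only the device argument needs to change to place the tensor on the GPU.
container::Tensor make_cpu_matrix(const int n)
{
    return container::Tensor(container::DataType::DT_COMPLEX_DOUBLE, // assumed enum name
                             container::DeviceType::CpuDevice,       // as stated above
                             container::TensorShape({n, n}));        // assumed shape type
}
```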
Phase 2: Adding needed BLAS and LAPACK support for `Tensor` on CPU and refactoring linear algebra operations in TDDFT (complete)

Key Changes:

- Added `lapack_getrf` and `lapack_getri` in `module_base/module_container/ATen/kernels/lapack.h` to support matrix LU factorization (`getrf`) and matrix inversion (`getri`) operations for `Tensor` objects (a standalone sketch of this pattern follows the list).
- Updated the `zgetrf_` and `zgetri_` declarations in `module_base/lapack_connector.h` to comply with standard conventions.
- Refactored the linear algebra operations in TDDFT to use `Tensor` operations. These operations, provided by the `container::kernels` module from `module_base/module_container/ATen`, include a `Device` parameter, enabling seamless support for heterogeneous computing (GPU acceleration in future phases).
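The LU-factorize-then-invert pattern that `lapack_getrf`/`lapack_getri` expose for `Tensor` objects corresponds to the standard LAPACK sequence below; this self-contained sketch declares the `zgetrf_`/`zgetri_` symbols directly, whereas the actual code takes them from `module_base/lapack_connector.h`:

```cpp
#include <complex>
#include <vector>

// Standard LAPACK entry points for complex-double LU factorization (getrf)
// and LU-based matrix inversion (getri).
extern "C" {
void zgetrf_(const int* m, const int* n, std::complex<double>* a,
             const int* lda, int* ipiv, int* info);
void zgetri_(const int* n, std::complex<double>* a, const int* lda,
             const int* ipiv, std::complex<double>* work,
             const int* lwork, int* info);
}

// Invert an n x n column-major matrix in place: A -> A^{-1}.
// Returns false if A is singular.
bool lu_invert(std::complex<double>* A, int n)
{
    std::vector<int> ipiv(n);
    int info = 0;

    // Step 1: LU factorization with partial pivoting, A = P * L * U.
    zgetrf_(&n, &n, A, &n, ipiv.data(), &info);
    if (info != 0) return false;

    // Step 2: workspace query (lwork = -1 returns the optimal size in work[0]),
    // then reconstruct A^{-1} from the LU factors.
    std::complex<double> wkopt;
    int lwork = -1;
    zgetri_(&n, A, &n, ipiv.data(), &wkopt, &lwork, &info);
    lwork = static_cast<int>(wkopt.real());
    std::vector<std::complex<double>> work(lwork);
    zgetri_(&n, A, &n, ipiv.data(), work.data(), &lwork, &info);
    return info == 0;
}
```

As the review comment above notes, cuSOLVER offers no `getri`, so this explicit inversion exists only on the CPU path; the GPU path in phase 3 solves with the LU factors via `getrs` instead.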
Phase 3: RT-TDDFT GPU acceleration (complete)
Added linear solver interfaces:

- Linear solver (`getrs`) using LAPACK.
- LU factorization (`getrf`) and linear solver (`getrs`) using cuSOLVER (a minimal sketch follows).
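A minimal sketch of the cuSOLVER `getrf` + `getrs` sequence (error checking omitted; the wrapper function and its setup are illustrative, not the PR's actual interface). Solving with the LU factors computes `A⁻¹B` without ever forming `A⁻¹` explicitly:

```cpp
#include <cuda_runtime.h>
#include <cusolverDn.h>

// Solve A * X = B for an n x n complex matrix A and n x nrhs right-hand
// sides B, both column-major and already resident in device memory.
// A is overwritten with its LU factors; B is overwritten with the solution X.
void cusolver_lu_solve(cusolverDnHandle_t handle,
                       cuDoubleComplex* d_A, cuDoubleComplex* d_B,
                       const int n, const int nrhs)
{
    // Query the workspace size required by the factorization.
    int lwork = 0;
    cusolverDnZgetrf_bufferSize(handle, n, n, d_A, n, &lwork);

    cuDoubleComplex* d_work = nullptr;
    int* d_ipiv = nullptr; // pivot indices
    int* d_info = nullptr; // status flag (0 on success)
    cudaMalloc(&d_work, sizeof(cuDoubleComplex) * lwork);
    cudaMalloc(&d_ipiv, sizeof(int) * n);
    cudaMalloc(&d_info, sizeof(int));

    // LU factorization with partial pivoting: A = P * L * U.
    cusolverDnZgetrf(handle, n, n, d_A, n, d_work, d_ipiv, d_info);

    // Back-substitution with the LU factors: B <- A^{-1} * B.
    cusolverDnZgetrs(handle, CUBLAS_OP_N, n, nrhs, d_A, n, d_ipiv, d_B, n, d_info);

    cudaFree(d_work);
    cudaFree(d_ipiv);
    cudaFree(d_info);
}
```

Besides working around the missing `getri`, solving the system directly is also the numerically preferable way to apply an inverse.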
Refactored RT-TDDFT I/O and parameters:
- Removed the parameters `td_force_dt`, `td_vext`, `td_vext_dire_case`, `out_dipole`, and `out_efield` from the `Evolve_elec` class.
- Used the `PARAM.inp` input interface to simplify template class usage with the `Device` parameter.
Heterogeneous computing support:

- Added the `Device` template parameter to RT-TDDFT core algorithm classes and functions.
- Used memory synchronization operations (`base_device::memory::synchronize_memory_op`) to ensure proper data handling across devices.
- Replaced `BlasConnector::copy` operations with memory synchronization functions (a pattern sketch follows this list).
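To illustrate how the `Device` template parameter and the memory synchronization op fit together, here is a pattern sketch; the class and method are hypothetical, and the exact template/argument order of `base_device::memory::synchronize_memory_op` is an assumption:

```cpp
#include <complex>
#include <cstddef>
// #include "module_base/module_device/memory_op.h" // assumed header for synchronize_memory_op

// Hypothetical RT-TDDFT step templated on Device, so a single code path
// serves both CPU and GPU builds.
template <typename Device>
class EvolveStep
{
  public:
    // Stage the Hamiltonian on the target device before propagation.
    // synchronize_memory_op abstracts the host-to-device (or host-to-host)
    // copy that previously went through BlasConnector::copy.
    void load_hamiltonian(std::complex<double>* h_dev,
                          const std::complex<double>* h_host,
                          const std::size_t size)
    {
        // Assumed form: synchronize_memory_op<value type, output device, input device>
        using sync_op = base_device::memory::synchronize_memory_op<
            std::complex<double>, Device, base_device::DEVICE_CPU>;
        sync_op()(h_dev, h_host, size); // assumed (dst, src, count) signature
    }
};
```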
GPU acceleration for RT-TDDFT: …
Disclaimer

The current GPU version of the RT-TDDFT algorithm can only run with the number of MPI processes set to 1 (no additional handling for the 2D block-cyclic distribution is implemented, and matrix partitioning will lead to incorrect results). As a result, this feature is not yet available to general users and can only be enabled by modifying the source code. Further refactoring will be carried out in the future to make it accessible to users.