Ahmad Abdelfattah
Research Scientist, Innovative Computing Laboratory, University of Tennessee
Verified email at icl.utk.edu
Title
Cited by
Year
Performance, design, and autotuning of batched GEMM for GPUs
A Abdelfattah, A Haidar, S Tomov, J Dongarra
International Conference on High Performance Computing, 21-38, 2016
Cited by: 110; Year: 2016
High-performance tensor contractions for GPUs
A Abdelfattah, M Baboulin, V Dobrev, J Dongarra, C Earl, J Falcou, ...
Procedia Computer Science 80, 108-118, 2016
Cited by: 64; Year: 2016
Parallel programming models for dense linear algebra on heterogeneous systems
J Dongarra, M Abalenkovs, A Abdelfattah, M Gates, A Haidar, J Kurzak, ...
Supercomputing frontiers and innovations 2 (4), 67-86, 2015
Cited by: 58; Year: 2015
High-performance matrix-matrix multiplications of very small matrices
I Masliah, A Abdelfattah, A Haidar, S Tomov, M Baboulin, J Falcou, ...
European Conference on Parallel Processing, 659-671, 2016
Cited by: 56; Year: 2016
KBLAS: An optimized library for dense matrix-vector multiplication on GPU accelerators
A Abdelfattah, D Keyes, H Ltaief
ACM Transactions on Mathematical Software (TOMS) 42 (3), 1-31, 2016
Cited by: 43; Year: 2016
The design of fast and energy-efficient linear solvers: On the potential of half-precision arithmetic and iterative refinement techniques
A Haidar, A Abdelfattah, M Zounon, P Wu, S Pranesh, S Tomov, ...
International Conference on Computational Science, 586-600, 2018
Cited by: 42; Year: 2018
With extreme computing, the rules have changed
J Dongarra, S Tomov, P Luszczek, J Kurzak, M Gates, I Yamazaki, H Anzt, ...
Computing in Science & Engineering 19 (3), 52-62, 2017
Cited by: 39; Year: 2017
A survey of numerical linear algebra methods utilizing mixed-precision arithmetic
A Abdelfattah, H Anzt, EG Boman, E Carson, T Cojean, J Dongarra, A Fox, ...
The International Journal of High Performance Computing Applications 35 (4 …, 2021
Cited by: 35; Year: 2021
Fast batched matrix multiplication for small sizes using half-precision arithmetic on GPUs
A Abdelfattah, S Tomov, J Dongarra
2019 IEEE international parallel and distributed processing symposium (IPDPS …, 2019
Cited by: 35; Year: 2019
A novel fast and accurate pseudo-analytical simulation approach for MOAO
E Gendron, A Charara, A Abdelfattah, D Gratadour, D Keyes, H Ltaief, ...
Adaptive Optics Systems IV 9148, 2148-2160, 2014
Cited by: 35; Year: 2014
A survey of numerical methods utilizing mixed precision arithmetic
A Abdelfattah, H Anzt, EG Boman, E Carson, T Cojean, J Dongarra, ...
arXiv preprint arXiv:2007.06674, 2020
Cited by: 34; Year: 2020
A guide for achieving high performance with very small matrices on GPU: a case study of batched LU and Cholesky factorizations
A Haidar, A Abdelfattah, M Zounon, S Tomov, J Dongarra
IEEE Transactions on Parallel and Distributed Systems 29 (5), 973-984, 2017
Cited by: 23; Year: 2017
C++ API for BLAS and LAPACK
M Gates, P Luszczek, A Abdelfattah, J Kurzak, J Dongarra, K Arturov, ...
SLATE Working Notes, 2017
Cited by: 22*; Year: 2017
Fast Cholesky factorization on GPUs for batch and native modes in MAGMA
A Abdelfattah, A Haidar, S Tomov, J Dongarra
Journal of Computational Science 20, 85-93, 2017
Cited by: 19; Year: 2017
Pipelining computational stages of the tomographic reconstructor for multi-object adaptive optics on a multi-GPU system
A Charara, H Ltaief, D Gratadour, D Keyes, A Sevin, A Abdelfattah, ...
SC'14: Proceedings of the International Conference for High Performance …, 2014
Cited by: 19; Year: 2014
Optimizing memory-bound SYMV kernel on GPU hardware accelerators
A Abdelfattah, J Dongarra, D Keyes, H Ltaief
International Conference on High Performance Computing for Computational …, 2012
Cited by: 19; Year: 2012
Novel HPC techniques to batch execution of many variable size BLAS computations on GPUs
A Abdelfattah, A Haidar, S Tomov, J Dongarra
Proceedings of the International Conference on Supercomputing, 1-10, 2017
Cited by: 18; Year: 2017
Algorithms and optimization techniques for high-performance matrix-matrix multiplications of very small matrices
I Masliah, A Abdelfattah, A Haidar, S Tomov, M Baboulin, J Falcou, ...
Parallel Computing 81, 1-21, 2019
Cited by: 17; Year: 2019
Towards half-precision computation for complex matrices: A case study for mixed precision solvers on GPUs
A Abdelfattah, S Tomov, J Dongarra
2019 IEEE/ACM 10th Workshop on Latest Advances in Scalable Algorithms for …, 2019
Cited by: 16; Year: 2019
Linear algebra software for large-scale accelerated multicore computing
A Abdelfattah, H Anzt, J Dongarra, M Gates, A Haidar, J Kurzak, ...
Acta Numerica 25, 1-160, 2016
Cited by: 16; Year: 2016