Mostofa Patwary
Director, Applied Deep Learning Research, NVIDIA
Verified email at nvidia.com
Title
Cited by
Year
Megatron-LM: Training Multi-Billion Parameter Language Models Using GPU Model Parallelism
M Shoeybi, M Patwary, R Puri, P LeGresley, J Casper, B Catanzaro
arXiv preprint arXiv:1909.08053, 2019
Cited by: 1891; Year: 2019
Scalable Bayesian Optimization Using Deep Neural Networks
J Snoek, O Rippel, K Swersky, R Kiros, N Satish, N Sundaram, M Patwary, ...
arXiv preprint arXiv:1502.05700, 2015
Cited by: 1363; Year: 2015
Deep learning scaling is predictable, empirically
J Hestness, S Narang, N Ardalani, G Diamos, H Jun, H Kianinejad, ...
arXiv preprint arXiv:1712.00409, 2017
Cited by: 819; Year: 2017
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
arXiv preprint arXiv:2201.11990, 2022
Cited by: 806*; Year: 2022
Efficient large-scale language model training on GPU clusters using megatron-LM
D Narayanan, M Shoeybi, J Casper, P LeGresley, M Patwary, ...
Proceedings of the International Conference for High Performance Computing …, 2021
Cited by: 704; Year: 2021
Twitter trending topic classification
K Lee, D Palsetia, R Narayanan, MMA Patwary, A Agrawal, A Choudhary
Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on …, 2011
Cited by: 484; Year: 2011
GraphMat: High performance graph analytics made productive
N Sundaram, N Satish, MMA Patwary, SR Dulloor, MJ Anderson, ...
Proceedings of the VLDB Endowment 8 (11), 1214-1225, 2015
Cited by: 411; Year: 2015
Navigating the maze of graph analytics frameworks using massive graph datasets
N Satish, N Sundaram, MMA Patwary, J Seo, J Park, MA Hassaan, ...
Proceedings of the 2014 ACM SIGMOD international conference on Management of …, 2014
Cited by: 249; Year: 2014
A new scalable parallel DBSCAN algorithm using the disjoint-set data structure
MMA Patwary, D Palsetia, A Agrawal, W Liao, F Manne, A Choudhary
SC'12: Proceedings of the International Conference on High Performance …, 2012
Cited by: 240; Year: 2012
Factuality enhanced language models for open-ended text generation
N Lee, W Ping, P Xu, M Patwary, PN Fung, M Shoeybi, B Catanzaro
Advances in Neural Information Processing Systems 35, 34586-34599, 2022
Cited by: 181; Year: 2022
StarCoder 2 and The Stack v2: The Next Generation
A Lozhkov, R Li, LB Allal, F Cassano, J Lamy-Poirier, N Tazi, A Tang, ...
arXiv preprint arXiv:2402.19173, 2024
Cited by: 179; Year: 2024
Training Question Answering Models From Synthetic Data
R Puri, R Spring, M Patwary, M Shoeybi, B Catanzaro
arXiv preprint arXiv:2002.09599, 2020
Cited by: 177; Year: 2020
Controllable Story Generation with External Knowledge Using Large-Scale Language Models
P Xu, M Patwary, M Shoeybi, R Puri, P Fung, A Anandkumar, B Catanzaro
Proceedings of the 2020 Conference on Empirical Methods in Natural Language …, 2020
Cited by: 159*; Year: 2020
BioMegatron: Larger Biomedical Domain Language Model
HC Shin, Y Zhang, E Bakhturina, R Puri, M Patwary, M Shoeybi, R Mani
Proceedings of the 2020 Conference on Empirical Methods in Natural Language …, 2020
Cited by: 152; Year: 2020
Fast maximum clique algorithms for large graphs
RA Rossi, DF Gleich, AH Gebremedhin, MMA Patwary
Proceedings of the companion publication of the 23rd international …, 2014
Cited by: 120; Year: 2014
Fast Algorithms for the Maximum Clique Problem on Massive Sparse Graphs
B Pattabiraman, M Patwary, M Ali, AH Gebremedhin, W Liao, ...
arXiv preprint arXiv:1209.5818, 2012
Cited by: 115; Year: 2012
ColPack: Software for graph coloring and related problems in scientific computing
AH Gebremedhin, D Nguyen, MMA Patwary, A Pothen
ACM Transactions on Mathematical Software (TOMS) 40 (1), 1-31, 2013
Cited by: 103; Year: 2013
End-to-End Training of Neural Retrievers for Open-Domain Question Answering
DS Sachan, M Patwary, M Shoeybi, N Kant, W Ping, WL Hamilton, ...
arXiv preprint arXiv:2101.00408, 2021
Cited by: 100; Year: 2021
Deep learning at 15PF: supervised and semi-supervised classification for scientific data
T Kurth, J Zhang, N Satish, E Racah, I Mitliagkas, MMA Patwary, T Malas, ...
Proceedings of the International Conference for High Performance Computing …, 2017
Cited by: 97; Year: 2017
BD-CATS: big data clustering at trillion particle scale
MMA Patwary, S Byna, NR Satish, N Sundaram, Z Lukić, V Roytershteyn, ...
SC'15: Proceedings of the International Conference for High Performance …, 2015
Cited by: 85; Year: 2015