Can FPGAs beat GPUs in accelerating next-generation deep neural networks? E Nurvitadhi, G Venkatesh, J Sim, D Marr, R Huang, J Ong Gee Hock, ... Proceedings of the 2017 ACM/SIGDA International Symposium on Field …, 2017 | 616 | 2017 |
Accelerating binarized neural networks: Comparison of FPGA, CPU, GPU, and ASIC E Nurvitadhi, D Sheffield, J Sim, A Mishra, G Venkatesh, D Marr 2016 International Conference on Field-Programmable Technology (FPT), 77-84, 2016 | 429 | 2016 |
WRPN: Wide reduced-precision networks A Mishra, E Nurvitadhi, JJ Cook, D Marr arXiv preprint arXiv:1709.01134, 2017 | 376 | 2017 |
Accelerating recurrent neural networks in analytics servers: Comparison of FPGA, CPU, GPU, and ASIC E Nurvitadhi, J Sim, D Sheffield, A Mishra, S Krishnan, D Marr 2016 26th International Conference on Field Programmable Logic and …, 2016 | 247 | 2016 |
Accelerating deep convolutional networks using low-precision and sparsity G Venkatesh, E Nurvitadhi, D Marr 2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017 | 173 | 2017 |
GraphGen: An FPGA framework for vertex-centric graph computation E Nurvitadhi, G Weisz, Y Wang, S Hurkat, M Nguyen, JC Hoe, JF Martínez, ... 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom …, 2014 | 165 | 2014 |
Programmable coarse grained and sparse matrix compute hardware with advanced scheduling E Nurvitadhi, B Vembu, NCG Von Borries, R Barik, TH Lin, K Sinha, ... US Patent 10,186,011, 2019 | 132 | 2019 |
ProtoFlex: Towards scalable, full-system multiprocessor simulations using FPGAs ES Chung, MK Papamichael, E Nurvitadhi, JC Hoe, K Mai, B Falsafi ACM Transactions on Reconfigurable Technology and Systems (TRETS) 2 (2), 1-32, 2009 | 121 | 2009 |
A customizable matrix multiplication framework for the Intel HARPv2 Xeon+FPGA platform: A deep learning case study DJM Moss, S Krishnan, E Nurvitadhi, P Ratuszniak, C Johnson, J Sim, ... Proceedings of the 2018 ACM/SIGDA International Symposium on Field …, 2018 | 105 | 2018 |
Machine learning accelerator mechanism A Bleiweiss, A Ramesh, A Mishra, D Marr, J Cook, S Sridharan, ... US Patent 11,373,088, 2022 | 102 | 2022 |
Compute optimizations for neural networks K Nealis, A Yao, X Chen, E Ould-Ahmed-Vall, SS Baghsorkhi, ... US Patent 10,410,098, 2019 | 101 | 2019 |
High performance binary neural networks on the Xeon+FPGA™ platform DJM Moss, E Nurvitadhi, J Sim, A Mishra, D Marr, S Subhaschandra, ... 2017 27th International Conference on Field Programmable Logic and …, 2017 | 99 | 2017 |
Beyond peak performance: Comparing the real performance of AI-optimized FPGAs and GPUs A Boutros, E Nurvitadhi, R Ma, S Gribok, Z Zhao, JC Hoe, V Betz, ... 2020 International Conference on Field-Programmable Technology (ICFPT), 10-19, 2020 | 95 | 2020 |
Exploration of low numeric precision deep learning inference using Intel® FPGAs P Colangelo, N Nasiri, E Nurvitadhi, A Mishra, M Margala, K Nealis 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom …, 2018 | 91 | 2018 |
Machine learning sparse computation mechanism E Nurvitadhi, B Vembu, TH Lin, K Sinha, R Barik, NCG Von Borries US Patent 10,346,944, 2019 | 81 | 2019 |
Specialized fixed function hardware for efficient convolution R Barik, E Ould-Ahmed-Vall, X Chen, D Srivastava, A Yao, K Nealis, ... US Patent 10,824,938, 2020 | 77 | 2020 |
Why compete when you can work together: FPGA-ASIC integration for persistent RNNs E Nurvitadhi, D Kwon, A Jafari, A Boutros, J Sim, P Tomson, H Sumbul, ... 2019 IEEE 27th Annual International Symposium on Field-Programmable Custom …, 2019 | 73 | 2019 |
A complexity-effective architecture for accelerating full-system multiprocessor simulations using FPGAs ES Chung, E Nurvitadhi, JC Hoe, B Falsafi, K Mai Proceedings of the 16th International ACM/SIGDA Symposium on Field …, 2008 | 72 | 2008 |
Machine learning sparse computation mechanism for arbitrary neural networks, arithmetic compute microarchitecture, and sparsity for training mechanism E Nurvitadhi, A Bleiweiss, D Marr, E Wang, S Dwarakapuram, ... US Patent 11,636,327, 2023 | 69 | 2023 |
Hardware accelerator architecture and template for web-scale k-means clustering E Nurvitadhi, G Venkatesh, S Krishnan, S Subhaschandra, D Marr US Patent App. 15/396,515, 2018 | 68 | 2018 |