Understanding soft error sensitivity of deep learning models and frameworks through checkpoint alteration. E Rojas, D Pérez, JC Calhoun, LB Gomez, T Jones, E Meneses. 2021 IEEE International Conference on Cluster Computing (CLUSTER), 492-503, 2021 | 9 | 2021 |
Exploring the effects of silent data corruption in distributed deep learning training. E Rojas, D Pérez, E Meneses. 2022 IEEE 34th International Symposium on Computer Architecture and High …, 2022 | 2 | 2022 |
A characterization of soft-error sensitivity in data-parallel and model-parallel distributed deep learning. E Rojas, D Pérez, E Meneses. Journal of Parallel and Distributed Computing 190, 104879, 2024 | | 2024 |
On the Detection of Silent Data Corruptions in HPC Applications Using Redundant Multi-threading. D Pérez, T Ropars, E Meneses. European Conference on Parallel Processing, 290-302, 2020 | | 2020 |
Improving redundant multithreading performance for soft-error detection in HPC applications. DS Pérez-Arroyo. Instituto Tecnológico de Costa Rica, 2018 | | 2018 |
Leveraging Modern Multi-core Processor Features to Efficiently Deal with Silent Errors. D Pérez, T Ropars, E Meneses. Memorias de eventos académicos TEC, 2017 | | 2017 |
Leveraging modern multi-core processors features to efficiently deal with silent errors. D Pérez Arroyo, E Meneses Rojas, C Garita Rodríguez. San José, Costa Rica: Tecnológico de Costa Rica, 2017 | | 2017 |