The alignment problem from a deep learning perspective | R Ngo, L Chan, S Mindermann | arXiv preprint arXiv:2209.00626, 2022 | 100 | 2022 |
Avoiding side effects by considering future tasks | V Krakovna, L Orseau, R Ngo, M Martic, S Legg | Advances in Neural Information Processing Systems 33, 19064-19074, 2020 | 41 | 2020 |
REALab: An embedded perspective on tampering | R Kumar, J Uesato, R Ngo, T Everitt, V Krakovna, S Legg | arXiv preprint arXiv:2011.08820, 2020 | 12 | 2020 |
Avoiding tampering incentives in deep RL via decoupled approval | J Uesato, R Kumar, V Krakovna, T Everitt, R Ngo, S Legg | arXiv preprint arXiv:2011.08827, 2020 | 7 | 2020 |
Computing Power and the Governance of Artificial Intelligence | G Sastry, L Heim, H Belfield, M Anderljung, M Brundage, J Hazell, ... | arXiv preprint arXiv:2402.08797, 2024 | 3 | 2024 |
Automating Supervision of AI Delegates | R Ngo, J Tallinn | Cambridge Handbook of Responsible Artificial Intelligence, 2022 | | 2022 |