*Result*: Proton evaluation of single event effects in the NVIDIA GPU Orin SoM: Understanding radiation vulnerabilities beyond the SoC

Title:
Proton evaluation of single event effects in the NVIDIA GPU Orin SoM: Understanding radiation vulnerabilities beyond the SoC
Contributors:
Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors, Barcelona Supercomputing Center
Publisher Information:
Institute of Electrical and Electronics Engineers (IEEE)
Publication Year:
2024
Collection:
Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge
Document Type:
Conference object
File Description:
7 p.; application/pdf
Language:
English
Relation:
https://ieeexplore.ieee.org/document/10616076; info:eu-repo/grantAgreement/EC/H2020/101008126/EU/RADiation facility Network for the EXploration of effects for indusTry and research/RADNEXT; info:eu-repo/grantAgreement/EC/HE/101082622/EU/MODULAR MODEL-BASED DESIGN AND TESTING FOR APPLICATIONS IN SATELLITES/METASAT; info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-107255GB-C21/ES/BSC - COMPUTACION DE ALTAS PRESTACIONES VIII/; info:eu-repo/grantAgreement/AEI//IJC2020-045931-I; https://hdl.handle.net/2117/426407
DOI:
10.1109/IOLTS60994.2024.10616076
Rights:
Open Access
Accession Number:
edsbas.F15F035
Database:
BASE

*Further Information*

*In this paper, we investigate the single event effects under proton irradiation for the state-of-the-art embedded GPU NVIDIA Jetson Orin NX System-on-Module (SoM). Designed for deployment across safety-critical domains, in particular automotive, this system represents a cutting-edge advancement in high-performance embedded computing with functional safety features, which makes it an ideal candidate for use in space systems. Our study evaluates the Single-Event Effects (SEE) manifested within the SoM’s central processing unit (CPU), graphics processing unit (GPU), and associated peripherals. Through our analysis, we aim to delineate the origins of Single Event Functional Interrupts (SEFI) occurring at the SoM level. Furthermore, we provide a detailed exposition of the errors observed within the GPU complex, elucidating the conditions required for them to manifest. Unlike previous works, which treat embedded GPUs under irradiation as a black box, we are able to identify the source of SEEs through ARM’s RAS subsystem, and observe GPU SEEs for the first time in the literature. Our investigation culminates in a comprehensive assessment of the SoM’s susceptibility, identifying particularly sensitive components. ; This work has received funding from the European Union’s H2020 research and innovation programme under grant agreement No 101008126, corresponding to the RADNEXT project. It was also supported by ESA through the 4000136514/21/NL/GLC/my co-funded PhD activity “Mixed Software/Hardware-based Fault-tolerance Techniques for Complex COTS System-on-Chip in Radiation Environments”. Moreover, it was partially supported by the European Community’s Horizon Europe programme under the METASAT project (grant agreement 101082622), the Spanish Ministry of Economy and Competitiveness under grants PID2019-107255GB-C21 and IJC2020-045931-I (Spanish State Research Agency / http://dx.doi.org/10.13039/501100011033) and the HiPEAC Network of Excellence. 
TRIUMF receives federal funding via a contribution agreement with the ...*