Approaching System Reliability in the AI Era
Virtual: https://events.vtools.ieee.org/m/485845
Ensuring hardware system reliability is increasingly critical in the evolving AI landscape, particularly within data centers. Drawing upon extensive experience leading reliability initiatives for cutting-edge hardware, this presentation will outline a general methodology for designing reliable complex AI systems. It will emphasize the necessity of a multidisciplinary approach, integrating model-based system engineering, rigorous reliability testing, […]