Abstract:
A system and method for determining reliability-aware runtime optimal processor configuration can integrate soft and hard error data into a single metric, referred to as the balanced reliability metric (BRM), by using statistical dimensionality reduction techniques. The BRM can be used to not only adjust processor voltage to optimize overall reliability but also to adjust the number of on-cores to further optimize overall processor reliability. In some implementations, both coarse-grained actuations, based on optimal core count, and fine-grained actuations, based on optimal processor voltage (V), may be used, where feedback control can recursively re-compute soft and hard error data based on a new configuration, until convergence at an optimal configuration.