Frequency Throttling Side Channel Software Guidance for Cryptography Implementations Appunti dalla rete

In modern processors, the time to execute a given set of instructions may vary depending on many factors. Two important factors in these variations are the number of cycles and the CPU frequency.

For developers implementing cryptographic algorithms, Intel recommends selecting instructions whose execution time is data-independent in order to mitigate timing side channels due to cycle differences. Intel has provided guidance for developing constant time/ constant cycle code in Intel’s Guidelines for Mitigating Timing Side Channels Against Cryptographic Implementations.

This document provides software guidance for mitigating timing side channels due to CPU frequency behavior. Power management algorithms in most modern processors, including Intel® processors, provide mechanisms to enforce electrical parameters (such as power and current) to remain below specified limits. CPU frequency throttling is triggered when one of these limits is reached, which results in CPU frequency changes regardless of whether Intel® Turbo Boost Technology is enabled or not. Such frequency change and derived behavior may be correlated with information being processed by the CPU and it may be possible to infer parts of the information through sophisticated analysis of the frequency change behavior. The guidance in this document is based on Intel’s information and understanding at the time of writing. However as with other security guidance, Intel’s guidance is subject to change as the threat landscape evolves or new information becomes available.

Frequency Throttling Side Channel

When CPUs process data, transistors are switched on and off depending on the data being processed. Switching transistors uses energy. Consequently, running the same workload with different data may change the CPU’s power consumption. This physical property may lead to malicious actors correlating the system’s reported power consumption with possible secret data being processed on the system. Refer to Intel’s Running Average Power Limit Energy Reporting technical article for more details on Intel’s mitigations.

The CPU power management unit routinely calculates the running averaged electrical parameters during the past time window and compares it against the power management reactive limits. If any of the limits are exceeded, the power management algorithm will trigger CPU throttling and adjust the maximal allowed frequency accordingly. As a result, there is an inverse correlation between the average throttling frequency and the power consumption¹ before frequency throttling: a workload with higher power consumption before throttling tends to run at lower average throttled frequency, and vice versa. Furthermore, since the power consumption of a workload may be correlated with the data being processed, the throttling frequency may also be correlated with the data, which becomes a frequency side channel. The CPU frequency change also causes a difference in the execution time of the workload and results in a timing side channel.

Figure 1: Power management reactive limits throttling converts power differences to frequency/timing differences

Figure 1 explains the side channel using an illustrative example. Figure 1(a) shows the same program executed twice with the input data 1 (in blue) and data 2 (in orange), respectively. Assuming the program is a constant-cycle implementation, the number of cycles with data 1 and data 2 are the same (c1=c2). On the other hand, power consumption of processing data 1 and data 2 might be different. Without loss of generality, we assume processing data 1 consumes higher power (p1) compared to that of data 2 (p2). When neither p1 nor p2 exceeds the default power limit (or any other reactive limit), there is no throttling, and the frequency stays at f_default when the program is running. As a result, the execution time of the program is the same with either data 1 or data 2.

Assuming the total power consumption exceeds the power limit due to increased system power consumption (such as when stressor code starts to run in parallel with the function) or a reduction of the power limit, frequency throttling occurs. As shown in Figure 1(c), since processing data 1 consumes more power than data 2, the averaged throttled frequency of data 1 (freq₁) will become lower than that of data 2 (freq₂) to ensure the power limit is satisfied. Of course, both throttling frequencies are lower than f_default. Therefore, even if the number of execution cycles of the program is still data-invariant, the throttling frequency, and hence execution time, becomes data-variant. An attacker may utilize this side channel to extract secret data (such as cryptographic keys) from a constant-time cryptography implementation, since a typical constant-time implementation ensures only constant-cycle execution, and data-dependent variations in CPU frequency will result in data-dependent code execution time.

Power Management Reactive Limits

Intel® processors have several reactive limits related to power management, such as Running Average Power Limit (RAPL) and Voltage Regulator Thermal Design Current Limit (VR-TDC).

Running Average Power Limit (RAPL)

RAPL is a feature supported by Intel power management architecture to cap the power consumption on the system. When the configured power limit is exceeded, the CPU will be forced to run at a lower frequency to maximize performance while meeting the power limit requirement. Intel currently provides multiple power limit capabilities, including package-level power limits and platform-level power limits. Ring 0 software can configure both the running average window and the power limit of each capability through model specific registers (MSRs) such as, MSR_PKG_POWER_LIMIT for package-level power limit. Refer to Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3 Section 14.10 “Platform Specific Power Management Support” for more details.

Voltage Regulator Thermal Design Current Limit (VR-TDC)

VR-TDC is a power management feature supported by Intel power management architecture. The feature introduces a current limit, naturally specified in amperes, that is to be maintained to preserve the electrical constraints on properties of the voltage regulator (VR). Generally, the control algorithm monitors the Exponential Moving Average (EMA) current, which is also measured in amperes, by reading current measurements from the VR. As with other control algorithms, this algorithm controls the power budget based on a given time window. If the limit is hit, the processor will reduce the CPU frequency (frequency throttling) to ensure the current remains below this limit.

Related Issues

Intel released microcode updates (MCUs) in IPU 2020.2 and IPU 2021.2 responding to Running Average Power Limit energy reporting vulnerabilities (CVE-2020-8694 and CVE-2020-8695). None of the MCUs are meant to mitigate the frequency throttling side channel vulnerability. In the Software Guidance for Cryptographic Implementations section of this article, Intel provides software guidance for mitigating frequency throttling side channels against cryptographic implementations. Intel recommends cryptographic library and application developers refer to the suggested methods in this article to assess and harden their code against the frequency throttling side channel (also known as “Hertzbleed”).

Software Guidance for Cryptographic Implementations

This section provides cryptographic application and library developers² with guidance to assess the risk and reduce the impact of frequency throttling side channels on cryptographic implementations. Remember that the root cause of the frequency throttling side channel is the power side channel, and the mitigation of it has been researched exhaustively. This document is not intended to provide comprehensive solutions to mitigate the frequency throttling side channel for all cryptography implementations, but instead provides recommendations to enable cryptography developers to evaluate the risks and to help harden their software implementations against this side channel. Intel recommends that cryptographic implementations follow existing guidance for developing constant-cycle code, as described in Intel’s Guidelines for Mitigating Timing Side Channels Against Cryptographic Implementations, and the Data Operand Independent Timing Instruction Set Architecture Guidance.

Conditions of the Attack

Cryptographic implementations may be vulnerable to frequency throttling side channels when all the following conditions are met. If one or more of these listed prerequisites is not satisfied, the cryptography implementation should not be impacted by this type of side channel.

Check the list of conditions below to assess your implementation’s risk based on the nature of the implementation and your threat model.

The Cryptographic Implementation is Vulnerable to Power Side Channel Attack

The power side channel is the fundamental root cause of the frequency throttling side channel. An implementation that is vulnerable to the frequency throttling side channel needs to satisfy all the prerequisites of physical power side channel attacks, except for the physical access capability to measure power. The prerequisites for a malicious actor to exploit the physical power side channel include, but are not limited to:

The ability to repeatedly initiate cryptographic operations with the same secret key to collect enough data.
For block ciphers, the ability to read input/output or inter-round state of the block cipher primitives. Note that the input/output is not necessarily the plaintext or ciphertext. One example is the counter (CTR) mode of block cipher, where the input to the block cipher is the concatenation of the nonce and counter.

Ability to Ensure Victim Execution hits the CPU’s Reactive Limits

For the power frequency to be correlated with secret data, a reactive limit needs to be hit while the victim workload is running. There are several techniques that an attacker may take to meet this necessary condition.

An attacker may run multiple instances of the same victim workload (using the same secret data) on multiple cores to increase power consumption of the package (to hit the limit) and increase Signal-to-Noise Ratio (SNR)³.
An attacker may run stressor workloads in parallel with the victim workload to increase package power consumption.
An attacker with ring 0 privilege can reduce the limits through reactive limit configuration interfaces (for example, MSRs) to ensure the victim workload hits the limit.

Ability to Monitor Frequency Change or Related CPU Behaviors with Sufficient Resolution

The attacker needs to be able to sample CPU frequency while the victim workload is running, or else observe the execution time of the victim workload with sufficient resolution to identify data-dependent differences in the measured information.

Software Implementation

The following guidance can help developers mitigate the frequency throttling side channel. Additionally, this guidance can be used as a defense-in-depth mechanism even when not all conditions for the frequency throttling side channel are met.

Applying Effective Solutions Against Power Side Channel

Most of the proven software-based countermeasures against power side channels on cryptographic primitives will also be effective against the frequency throttling side channel. For example, mitigations that reduce power SNR of single instructions are effective for both the physical power side channel and the frequency throttling side channel, since the SNR of averaged power consumption (and timing due to throttling) during the running average window is also reduced. Other techniques, such as software-based masking [1] [2], should be also effective against frequency throttling side channels.

Note that countermeasures that randomize the execution order of instructions, while being effective in making trace alignment and identification of point of interest harder for physical power side channel, are less effective at mitigating the frequency throttling side channel. This is because reordering instructions at the cycle granularity is less likely to impact averaged power consumption during the averaging time window of milliseconds or longer.

For cryptographic applications, an example of a generic countermeasure against power side channel is key refresh. One of the necessary conditions for power side channel is the ability to repeatedly kick off cryptographic operations with the same sensitive key. If the secret key is refreshed before enough traces can be collected, it will be harder for the attacker to fully deduce the secret. Key refresh frequency may be based on timing (for example, refresh per several hours) or data volume (for example, the volume of data being encrypted with the same key). If the implementor is uncertain of which threshold to use, the lowest threshold that meets performance/design requirements should be selected. It should be noted that the practicality of key refresh depends on the specific cryptographic use case (for example, key refresh is typically not applicable to disk encryption).

Avoiding Unnecessary Exposure of Reactive Limit Configuration Interfaces

As stated in the Conditions of the Attack section, if an adversary has access to certain hardware interfaces (for example, MSR_PKG_POWER_LIMIT), they may configure and reduce the reactive limits to make it easier for the victim workload to trigger frequency throttling. To reduce the attack surface, privileged software (for example, hypervisor or ring-0 software) that has access to these interfaces should avoid unnecessarily exposing those interfaces to untrusted entities (for example guest VMs or ring-3 software). If there is a business need to expose these interfaces, the designers of the privileged software should be aware of the potential security implication.

Restricting Correlation of Frequency Change or Related Behaviors

Another common countermeasure against side channel attacks is to jam the channel with noise to deter the attacker from deducing the secret. As the side channel in this attack is frequency change or derived behaviors, the noise can be injected into the frequency transition or timing information.

One method to do this is to leverage the inherent noise during cryptographic application calls. The cryptographic library provider or cryptographic application provider may restrict the maximum size of the plaintext/ciphertext allowed per API invocation, so that more invocations of the API are needed to process the same amount of data, which introduces more inherent noise.

Besides that, a cryptography implementor may proactively inject random noise into the cryptographic operations to increase timing variation. To implement this countermeasure, the developer may add dummy instructions that introduce sufficient power or latency variation. The dummy instructions should be independent from the secret data using cryptographic functions. For example, timing variations can be introduced using a loop of instructions with random iterations. In addition to that, any power variation induced by the dummy instructions may also increase the entropy of the frequency transition. To help ensure every frequency change is impacted by noise, Intel recommends injecting some noise during the running time window of the reactive limits that may be leveraged by an attacker. One possible way to balance the trade-offs between security and performance is to combine this scheme with the key refresh countermeasure described in Applying Effective Solutions Against Power Side Channel section to increase the time needed to perform a successful attack to a key lifetime that is acceptable for your implementation.

Steps to Take to Protect Your Code

Cryptographic library application providers are advised to take the following steps to assess and protect their code:

Assess whether the implementation is impacted based on the threat model and the necessary conditions of the attack.
If the cryptographic implementation is impacted and mitigation is needed:
1. Apply generic countermeasures against the power side channel on the cryptography primitive level (for example, masking) or cryptography application level (for example, key refresh).
2. Restrict correlation of frequency change or related behaviors. Examples include restricting maximum input data size per invocation and injecting random delay noise.
For privileged software or hypervisors, avoid unnecessary exposure of reactive limit configuration interfaces to untrusted entities.

References

S. Mangard, E. Oswald and T. Popp, Power Analysis Attacks: Revealing the Secrets of Smart Cards.
E. Prouff and M. Rivain, “Masking against Side-Channel Attacks: a Formal Security Proof,” in Annual International Conference on the Theory and Applications of Cryptographic Techniques, 2013.

Footnotes

For simplicity, in this document, we use power to indicate any electrical parameter that exhibits similar behavior.
The definition of application is the software that utilizes cryptographic library primitives and owns/manages cryptography keys. The definition of a library is the software that provides cryptography primitives but does not own the key. Instead, the cryptographic library relies on the application to provide the cryptography key to use.
Here, signal is the secret-correlated power consumption, and noise is the secret-independent power consumption.

Source :
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/frequency-throttling-side-channel-guidance.html