ACPICPU(4) BSD Kernel Interfaces Manual ACPICPU(4)
NAME
acpicpu -- ACPI CPU
SYNOPSIS
acpicpu* at cpu?
DESCRIPTION
The acpicpu device driver supports certain processor features that are
either only available via ACPI or that require ACPI to function properly.
Typically the ACPI processor functionality is grouped into so-called C-,
P-, and T-states.
C-states
The processor power states, or C-states, are low-power modes that can be
used when the CPU is idle. The idea is not new: already in the 80486
processor a specific instruction (HLT) was used for this purpose. This
was later accompanied by a pair of other instructions (MONITOR, MWAIT).
By default, NetBSD may use either one; see the machdep.idle-mechanism
sysctl(8) variable. The C-states defined by ACPI are the most recent
addition to these mechanisms.
The following C-states are typically available. Additional processor or
vendor specific states (C4, ..., Cn) are handled internally by acpicpu.
C0    This is the normal state of a processor; the CPU is busy
      executing instructions.
C1    This is the state that is typically reached via the mentioned
      x86 instructions. On a typical processor, C1 turns off the
      main internal CPU clock, leaving the APIC running at full
      speed. The CPU is free to temporarily leave the state to deal
      with important requests.
C2    The main difference between C1 and C2 lies in the internal
      hardware entry method of the processor. While less power is
      expected to be consumed than in C1, the bus interface unit is
      still running. Depending on the processor, however, the local
      APIC timer may be stopped. As with C1, entering and exiting
      the state are expected to be fast operations.
C3    This is the deepest conventional state. Parts of the CPU are
      actively powered down. The internal CPU clock is stopped. The
      local APIC timer is stopped. Depending on the processor,
      additional timers such as x86/tsc(9) may be stopped. Processor
      caches may be flushed. Entry and exit latencies are expected
      to be high; the CPU can no longer “quickly” respond to bus
      activity or other interruptions.
Each state has a latency associated with entry and exit. The higher the
number of the state, the lower the power consumption and the higher the
potential performance cost.
The acpicpu driver tries to balance the latency constraints when choosing
the appropriate state. One of the checks involves bus master activity;
if such activity is detected, a shallower (lower-numbered) state is used.
In particular, usb(4) is known to cause high activity even when not in
use. If maximum power savings are desired, it may be necessary to use a
custom kernel without USB support. More generally, to save power with
C-states, one should avoid polling, both in userland and in the kernel.
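The selection logic described above can be illustrated with a minimal
sketch. It is not the code used by acpicpu; the CState record, the
bus_master_safe flag, the latency values, and the choose_cstate() function
are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CState:
    name: str
    latency_us: int        # illustrative worst-case entry plus exit latency
    bus_master_safe: bool  # hypothetical: usable while bus masters are active


# Ordered from shallowest to deepest. The values are made up; real
# latencies are reported by the firmware.
STATES = [
    CState("C1", 1, True),
    CState("C2", 50, True),
    CState("C3", 500, False),
]


def choose_cstate(states, latency_budget_us, bus_master_active):
    """Return the deepest state that fits the latency budget.

    If bus master activity is detected, states that are not marked as
    safe for it are skipped, so a shallower state is used instead.
    """
    chosen = states[0]  # the shallowest state is the fallback
    for state in states:
        if state.latency_us > latency_budget_us:
            continue
        if bus_master_active and not state.bus_master_safe:
            continue
        chosen = state
    return chosen
```

With these made-up values, an idle system with a generous latency budget
ends up in C3, while detected bus master activity limits the choice to C2.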
P-states
The processor performance states, or P-states, are used to control the
clock frequencies and voltages of a CPU. Underneath the abstractions of
ACPI, P-states are associated with such technologies as “SpeedStep”
(Intel), “PowerNow!” (AMD), and “PowerSaver” (VIA).
The P0-state is always the highest operating frequency supported by the
processor. The number of additional P-states may vary across processors
and vendors. Each higher-numbered P-state represents a lower clock
frequency and hence lower power consumption. Note that while acpicpu
always uses the exact frequencies internally, the user-visible values
reported by ACPI may be rounded or approximated by the vendor.
Unlike conventional CPU frequency management, ACPI provides support for
Dynamic Voltage and Frequency Scaling (DVFS). Among other things, this
means that the firmware may request the implementation to dynamically
scale the presently supported maximum or minimum clock frequency. For
example, if acpiacad(4) is disconnected, the maximum available frequency
may be lowered. By default, the NetBSD implementation may manipulate the
frequencies according to the notifications from the firmware.
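The effect of such firmware notifications can be sketched as a clamp on
the range of usable P-states. The following is an illustration only, not
the acpicpu implementation; the frequency table and the clamp_pstate()
function with its parameters are hypothetical.

```python
# Illustrative table: index 0 is P0, the highest frequency (in MHz).
PSTATE_MHZ = [2400, 2000, 1600, 1200, 800]


def clamp_pstate(requested, fastest_allowed, slowest_allowed):
    """Clamp a requested P-state index into the currently allowed window.

    fastest_allowed is the lowest-numbered P-state the firmware permits
    right now (the present maximum frequency); slowest_allowed is the
    highest-numbered one (the present minimum frequency).
    """
    if fastest_allowed > slowest_allowed:
        raise ValueError("empty P-state window")
    return max(fastest_allowed, min(requested, slowest_allowed))
```

For example, if the firmware lowers the maximum to P2 after the AC adapter
is disconnected, a request for P0 is served as P2, which corresponds to
1600 MHz in this made-up table.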
T-states
Processor T-states, or “throttling states”, can be used to actively modu‐
late the time a processor is allowed to execute. Outside the ACPI nomen‐
clature, throttling and T-states may be known as “on-demand clock
modulation” (ODCM).
The concept of “duty cycle” is relevant to T-states. It is generally
defined to be a fraction of time that a system is in an “active” state.
The T0-state always has a duty cycle of 100 %, and thus, comparable to
the C0-state, the processor is fully active. Each additional
higher-numbered T-state indicates a lower duty cycle. At most eight
T-states may be available, although T-states also use DVFS.
The duty cycle does not refer to the actual clock signal, but to the time
period in which the clock signal is allowed to drive the processor chip.
For instance, if a T-state has a duty cycle of 75 %, the CPU runs at the
same clock frequency and uses the same voltage, but 25 % of the time the
CPU is forced to idle. Because of this, the use of T-states may severely
affect system performance.
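The arithmetic behind the duty cycle can be sketched as follows. The
sketch assumes that the duty cycles of the T-states are evenly spaced,
which need not hold on a given processor; both function names are
hypothetical.

```python
def tstate_duty_cycles(count=8):
    """Duty cycles for T0 .. T(count-1), assuming evenly spaced steps."""
    return [(count - i) / count for i in range(count)]


def effective_mhz(clock_mhz, duty_cycle):
    """Average rate of useful clock cycles under throttling.

    The clock frequency and voltage stay the same; only the fraction of
    time during which the clock drives the processor changes.
    """
    return clock_mhz * duty_cycle
```

Under this assumption, T2 out of eight states has a duty cycle of 0.75, so
a 2000 MHz processor performs on average like a 1500 MHz one while still
running at the voltage required for 2000 MHz.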
There are two typical situations for throttling: power management and
thermal control. As a technique to save power, T-states are largely an
artifact from the past. There was a short period in the x86 lineage when
P-states were not yet available and throttling was considered an option
to modulate the processor power consumption. The approach was, however,
quickly abandoned. In modern x86 systems P-states should be preferred in
all circumstances. It is also more beneficial to move from the
C0-state to deeper C-states than it is to actively force down the duty
cycle of a processor.
But T-states have retained their use as a last line of defense against
critical thermal conditions. Many x86 processors include a catastrophic
shutdown detector. When the processor core temperature reaches this fac‐
tory defined trip-point, the processor execution is halted without any
software control. Before this fatal condition, it is possible to use
throttling for a short period of time in order to force the temperatures
to lower levels. The thermal control modulation is typically started
only when the system is in the highest-power P-state and a high tempera‐
ture situation exists. After the temperatures have returned to non-crit‐
ical levels, the modulation ceases.
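The thermal use of throttling described above amounts to a simple policy
with hysteresis. The following sketch only illustrates the idea; the
thresholds, the next_tstate() function, and its parameters are
hypothetical and do not describe any particular processor or the acpicpu
driver.

```python
def next_tstate(current_t, pstate, temp_c, hot_c=95.0, safe_c=85.0, max_t=7):
    """Return the T-state to use after one policy step.

    Throttling deepens only when the processor is already in P0, the
    highest-power P-state, and the temperature is high. Once the
    temperature is back at a non-critical level, modulation ceases.
    """
    if temp_c >= hot_c and pstate == 0:
        return min(current_t + 1, max_t)  # throttle one step harder
    if temp_c <= safe_c:
        return 0                          # back to a 100 % duty cycle
    return current_t                      # between thresholds: hold
```

The gap between the two thresholds keeps the policy from toggling between
states when the temperature hovers around a single trip-point.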
System Control Variables
The acpicpu driver uses the same sysctl(8) controls for P-states as the
ones provided by est(4) and powernow(4). Depending on the processor, the
second-level node is either machdep.est or machdep.powernow. Note,
however, that future versions of acpicpu may remove these system control
variables without further notice.
In addition, the following two variables are available.
hw.acpi.cpu.dynamic    A boolean that controls whether the states are
                       allowed to change dynamically. When enabled,
                       C-, P-, and T-states may all change at runtime,
                       and acpicpu may also take actions based on
                       requests from the firmware.
hw.acpi.cpu.passive    A boolean that enables or disables automatic
                       processor thermal management via acpitz(4).
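The two variables can be read from a script by invoking sysctl(8) with the
-n flag, which prints only the value. The sketch below is an illustration
under assumptions: the helper names are hypothetical, and it assumes that
the boolean values are printed as 0 or 1.

```python
import subprocess


def sysctl_command(name):
    """Build the argument list for reading one variable's value."""
    return ["sysctl", "-n", name]


def parse_boolean(text):
    """Parse a boolean value, assuming it is printed as 0 or 1."""
    value = text.strip()
    if value not in ("0", "1"):
        raise ValueError(f"unexpected boolean value: {value!r}")
    return value == "1"


def read_boolean(name):
    """Read a boolean variable; needs a system with acpicpu attached."""
    result = subprocess.run(
        sysctl_command(name), capture_output=True, text=True, check=True
    )
    return parse_boolean(result.stdout)
```

On a system where the driver is attached, read_boolean("hw.acpi.cpu.dynamic")
would report whether dynamic state changes are enabled; elsewhere the call
raises an error because the variable does not exist.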
Statistics
The acpicpu driver uses event counters to track the times a processor has
entered a given state. It is possible to view the statistics by using
vmstat(1) (with the -e flag).
SEE ALSO
acpi(4), acpitz(4), est(4), odcm(4), powernow(4), cpu_idle(9)
Etienne Le Sueur and Gernot Heiser, Dynamic Voltage and Frequency
Scaling: The Laws of Diminishing Returns,
http://www.ertos.nicta.com.au/publications/papers/LeSueur_Heiser_10.pdf,
October, 2010, Proceedings of the 2010 Workshop on Power Aware Computing
and Systems (HotPower'10).
David C. Snowdon, Operating System Directed Power Management, School of
Computer Science and Engineering, University of New South Wales,
http://ertos.nicta.com.au/publications/papers/Snowdon:phd.pdf, March,
2010, PhD Thesis.
Microsoft Corporation, Windows Native Processor Performance Control,
Version 1.1a, http://msdn.microsoft.com/en-us/windows/hardware/gg463343,
November, 2002.
Venkatesh Pallipadi and Alexey Starikovskiy, The Ondemand Governor. Past,
Present, and Future, Intel Open Source Technology Center,
http://www.kernel.org/doc/ols/2006/ols2006v2-pages-223-238.pdf, July,
2006, Proceedings of the Linux Symposium.
HISTORY
The acpicpu device driver appeared in NetBSD 6.0.
AUTHORS
Jukka Ruohonen ⟨jruohonen@iki.fi⟩
CAVEATS
At least the following caveats can be mentioned.
· It is currently only safe to use C1 on NetBSD. All other C-states
are disabled by default.
· Processor thermal control (see acpitz(4)) is not yet supported.
· Depending on the processor, changes in C-, P-, and T-states may all
skew timers and counters such as x86/tsc(9). This is handled neither
by acpicpu nor by est(4) or powernow(4).
· There is currently neither a well-defined, machine-independent API
for processor performance management nor a “governor” for different
policies. It is only possible to control the CPU frequencies from
userland.
BSD August 6, 2011 BSD