1 Active/Idle Toggling with 0BASE0BASE-x for Energy Efficient Ethernet November 2007 IEEE 802.3az Task Force Presenter: Robert Hays Intel Corporation Contributors: Dave Chalupsky, Eric Mann, James Tsai, Aviad Wertheimer 2 Agenda • • • • • • • GbE Controller Power & Energy Consumption Active/Idle Toggling with 0BASE-x Proposal Average Power Curves State Transitions Implications to 802.3 Specifications PHY Considerations Recommendations 3 Power & Energy Glossary Operating Power – (Watts) The rate at which electrical energy is delivered to a circuit or system – – • Peak Power – (Watts) Maximum operating power of a system Idle Power – (Watts) Operating power of a system at rest Energy Consumption – (Joules) Aggregate power consumed by a system over a period of time. – – Energy Efficiency – (Joules/bit) Energy required to complete a unit of work. E.g. energy required to transmit/receive each bit of data. Average Power – (Watts) Energy consumed divided by the period measured Peak Power Operating Power power • Idle Power t0 Energy Consumption = t1 t1 ∫ power(t) dt t0 Average Power = Energy Consumption / (t1-t0) 4 GbE Controller Power Measurements Single-Port PCI-e 10/100/1000 Controller (MAC+PHY) Power Measurements Average Power (mW) 1400 Test Information: • Intel® 82573L Gigabit Ethernet controller, including: – 1200 1000 – – 800 1217 600 1010 400 200 504 53 194 483 • • • 10/100/1000 PHY, MAC, buffers, PCI-Express x1 host interface Typical client PC NIC device .13µm fab process Idle = No traffic, 0 Mbps Active = Line-rate bi-directional traffic No Link = Cable removed (D0 state) 314 0 No Link Idle 10 100 1000 Link Rate (Mbps) Active Source: Intel labs Observations: • 100M power is ~40% of 1000M power at 10% performance • 100M & 1000M Idle power savings (vs. Active) is 17-35% – • 10M Idle power savings is 60% – • Savings come from PCIe L0s/L1 idle-state usage and turning off some circuits Additional savings from 10BASE-T idle signaling “No Link” is the lowest-power mode with the device still “on” (D0 state) – Savings come from turning off all unessential digital & analog circuits GbE Controller Energy Consumption 9.00 8.00 7.00 6.00 5.00 4.00 3.00 2.00 1.00 0.00 9.00 8.00 7.00 6.00 5.00 4.00 3.00 2.00 1.00 0.00 10 100 Energy Consumption (J) Energy Required to Transmit 10MB * Transfer Time (s) 5 1000 Link Rate (Mbps) Transfer Time (s) Energy Consumption (J) Source: Intel labs Observations: • 1000M transmission is 4x more energy efficient (J/bit) than 100M because it sends the same amount of data in 1/10th the time • 1000M is 40x more energy efficient than 10M • An “ideal” EEE solution would transmit data with minimal energy and return to low-power idle between packet bursts – An Idle state would need to be defined by 802.3 for higher-speed PHYs * Note: Bidirectional traffic. 10MB transferred in each direction. 6 Proposed “0BASE0BASE-x” Idle 0BASE-x is a “quiet” line idle that consumes minimum power – • 0BASE-x idle power expectations for a GbE device: – – • • ‘0’ = zero data rate, ‘x’ = Variable for any supported PHY type “No Link” ≤ 0BASE-x ≤ 10BASE-T Idle (e.g. 53mW ≤ 0BASE-x ≤ 194mW) Est. it to be closer to “No Link” Single-Port PCI-e 10/100/1000 Controller (MAC+PHY) Power Measurements 0BASE-x could take form of: 1400 – A newly defined idle signal or… 1200 – Reduced-voltage 10BASE-T idle 0BASE-x variations are likely – – Support for varying PHY types with unique timing requirements Possibly multiple sleep levels with differing resume latencies Average Power (mW) • 1000 800 1217 600 1010 400 200 504 53 194 483 314 0 No Link Idle Active Source: Intel labs 10 100 Link Rate (Mbps) 1000 7 Active/Idle Toggling with 0BASE0BASE-x Concept Active Idle • Principle: Transmit data at fastest rate then return to idle – • Active/Idle toggling could be used instead of PHY rate shifting – – – • Energy savings come from power cycling between active/idle states Offers the best energy efficiency on links with lower utilization Integrates well with existing PC power management schemes (e.g. ACPI) Clock & power gating (on/off) is easier than rate shifting Asymmetrical operation would provide even better energy efficiency – – – Each direction could enter active & idle states independently Most end-node traffic is heavily weighted toward either send or receive Tx & Rx data paths already operate independently above the PHY 8 Behavioral Comparison Off (No Link) Packets time IPG GbE 100Mb power PHY Rate Shifting: time Down-Shift Delay Up-Shift Delay GbE Active 0BASE-x Idle power Active/Idle Toggling with 0BASE0BASE-x: time Idle Interval Active Idle Delay Delay Lower Energy Consumption On (GbE) power Typical NIC Today (On or Off): 9 Conceptual Average Power vs. BW Utilization 6 Average Power (W) 5 4 Fast-Start 1G Subset-PHY 10G Subset-PHY Active/Idle Toggling 3 2 1 10 100 1000 10,000 Bandwidth Utilization (Mbps) • FastFast-Start offers course-grain power regulation via 10x PHY choices • SubsetSubset-PHY allows finer-grain power steps utilizing new PHY modes • Active/Idle Toggling with 0BASE-x allows smooth power averaging and lowest energy consumption on underutilized connections Average Power vs. BW Utilization Simulation Avg Power vs. Bandwidth Utilization Using Active/Idle Toggling 1300 1200 1100 1000 900 800 700 600 500 400 300 200 100 0 Average Power (mW) 10 Idle (65mw) 0.01% 0.10% 1.00% 10.00% 100.00% (100kb/s) (1Mb/s) (10Mb/s) (100Mb/s) (1Gb/s) Bandwidth Utilization Input Assumptions: •Traffic Input = Trace_VOIP_*.txt •1000Mbps Active Power = 1217mW •Idle Power = 65mW Source: Intel labs •Active (Resume) Delay = 10us •Idle (Sleep) Wait Period = 10us •Idle (Sleep) Delay = 1us 1000Mbps (1217mw) Comparison to Existing GbE Controller Power Avg Power vs. Bandwidth Utilization Using Active/Idle Toggling 1300 1200 1100 1000 900 800 700 600 500 400 300 200 100 0 Average Power (mW) 11 0.01% 0.10% 1.00% 10.00% 100.00% (100kb/s) (1Mb/s) (10Mb/s) (100Mb/s) (1Gb/s) Bandwidth Utilization Source: Intel labs Key (Active / Idle Power Assumptions): Active/Idle Toggling Simulation (1217mW / 65mW) 1000BASE-T Measurements (1217mW / 1010mW) 100BASE-TX Measurements (483mW / 314mW) 10BASE-T Measurements (504mW / 194mW) 12 Simulation Model Contribution • Intel will contribute the ‘Average Power vs. Bandwidth Utilization’ simulation model to the IEEE 802.3az Task Force • The “C” program source code and sample traffic pattern trace files will be posted on the EEE Tools web page: – http://grouper.ieee.org/groups/802/3/az/public/tools/index.html 13 Active/Idle State Transitions Tx_Idle Rx_Active Tx_Active Rx_Idle Transition Description Transition Initiator Tx_Active Transmit data path resumes to Active when the system wants to send data System Policy Manager (e.g. LAN Driver) Tx_Idle Transmit data path goes to Idle when there System Policy Manager is no data to send (e.g. LAN Driver) Rx_Active Receive data path resumes to Active when link partner wants to send data Link Partner Rx_Idle Receive data path goes to Idle when link partner has completed sending data Link Partner 14 Computer Networking System Block Diagram OS Kernel LAN Driver Software CPU Core(s) Processor CPU Interface (e.g. FSB) Memory Controller I/O Memory Buffer System Memory Memory Bus (e.g. DDR2) I/O Bus Chipset Host Interface (e.g. PCI-Express) I/O Bus Data Buffers Filters, Queues, & Protocol Offloads Ethernet MAC MAC/PHY Interface Controller MAC/PHY Interface (e.g SGMII, XAUI) MAC/PHY Interface DSP Tx Rx Hybrid PHY Media-Dependant Interface (e.g. Cat6 cable) 15 Tx_Active Transition 1. LAN Driver manages Tx toggling policy. Example: It monitors I/O Memory Buffer for data to reach a fill threshold, then initiates Tx_Active OS Kernel LAN Driver Software 2. LAN Driver tells Controller when to enter Tx_Active 3. Controller enters Tx_Active and begins buffering data CPU Core(s) Processor CPU Interface (e.g. FSB) Memory Controller I/O Memory Buffer System Memory Memory Bus (e.g. DDR2) I/O Bus Chipset Host Interface (e.g. PCI-Express) I/O Bus Data Buffers 4. Controller tells PHY to resume Tx_Active and then sends data to PHY 5. Local PHY enters Tx_Active 6. Local PHY sends it’s link partner a “start-transmit” frame and begins transmitting data after a specified delay Filters, Queues, & Protocol Offloads Ethernet MAC MAC/PHY Interface Controller MAC/PHY Interface (e.g SGMII, XAUI) MAC/PHY Interface DSP Tx Rx Hybrid PHY Media-Dependant Interface (e.g. Cat6 cable) Key: Vendor Dependent 802.3az Specified 16 Tx_Idle Transition 1. LAN Driver manages the Tx toggling policy. Example: It monitors I/O Memory Buffer for it to empty, then initiates Tx_Idle OS Kernel LAN Driver Software 2. LAN Driver tells Controller when to enter Tx_Idle CPU Core(s) Processor CPU Interface (e.g. FSB) Memory Controller I/O Memory Buffer System Memory Memory Bus (e.g. DDR2) I/O Bus Chipset Host Interface (e.g. PCI-Express) I/O Bus Data Buffers Filters, Queues, & Protocol Offloads Ethernet MAC 3. Controller tells PHY to enter Tx_Idle MAC/PHY Interface Controller 4. Controller enters Tx_Idle 5. Local PHY stops transmitting and enters Tx_Idle MAC/PHY Interface (e.g SGMII, XAUI) MAC/PHY Interface DSP Tx Rx Hybrid PHY Media-Dependant Interface (e.g. Cat6 cable) Key: Vendor Dependent 802.3az Specified 17 Rx_Active Transition OS Kernel LAN Driver Software CPU Core(s) Processor CPU Interface (e.g. FSB) Memory Controller I/O Memory Buffer System Memory Memory Bus (e.g. DDR2) I/O Bus Chipset Host Interface (e.g. PCI-Express) 5. Controller places data into I/O Memory Buffer I/O Bus Data Buffers 1. Local PHY receives “starttransmit” frame from it’s link partner Filters, Queues, & Protocol Offloads Ethernet MAC MAC/PHY Interface Key: Vendor Dependent 802.3az Specified Controller 2. PHY tells controller to enter MAC/PHY Interface Rx_Active (e.g SGMII, XAUI) MAC/PHY Interface DSP Tx 3. PHY enters Rx_Active and begins receiving data 4. Controller enters Rx_Active and receives data Rx Hybrid PHY Media-Dependant Interface (e.g. Cat6 cable) 18 Rx_Idle Transition OS Kernel LAN Driver Software CPU Core(s) Processor CPU Interface (e.g. FSB) Memory Controller I/O Memory Buffer System Memory Memory Bus (e.g. DDR2) I/O Bus Chipset Host Interface (e.g. PCI-Express) I/O Bus Data Buffers Filters, Queues, & Protocol Offloads Ethernet MAC 1. Local PHY detects Idle signal and enters Rx_Idle MAC/PHY Interface Key: Vendor Dependent 802.3az Specified Controller MAC/PHY Interface (e.g SGMII, XAUI) MAC/PHY Interface 3. Controller enters Rx_Idle DSP Tx 2. PHY tells controller to enter Rx_Idle Rx Hybrid PHY Media-Dependant Interface (e.g. Cat6 cable) 19 Implications to IEEE 802.3 Specifications OS Kernel LAN Driver Software CPU Core(s) Processor CPU Interface (e.g. FSB) Memory Controller I/O Memory Buffer System Memory Memory Bus (e.g. DDR2) I/O Bus Chipset Host Interface (e.g. PCI-Express) Vendor dependent (unspecified): •Tx state transition control & policy •System interface & data movement I/O Bus Data Buffers Filters, Queues, & Protocol Offloads Ethernet MAC MAC/PHY Interface Controller MAC/PHY Interface (e.g SGMII, XAUI) MAC/PHY Interface DSP Tx Rx Hybrid PHY Media-Dependant Interface (e.g. Cat6 cable) 802.3 specifications: •0BASE-x signaling for each PHY type •Active/idle transition signals •Transition timing requirements •MAC/PHY interface control •Control/status registers for active & idle states •Auto-negotiation capability registers •Clause 30 MIB implications •Error detection & recovery 20 0BASE0BASE-x PHY Considerations* 1. Idle power consumption – How low can Idle get for each PHY type? 2. Transition delays – How quickly can the PHY resume active operation? 3. Timing recovery – How to maintain link status and clock sync? 4. Asymmetric operation – How to support transmission one way while the other is idle? 5. Implementation cost & complexity – How would 0BASE-x compare to Fast-Start or Subset-PHY? *Note: These considerations are addressed in zimmerman_01_1107.pdf 21 Recommendations to 802.3az Task Force • Define a 0BASE-x Idle state for each supported PHY type in the IEEE 802.3az Objectives • Consider Active/Idle Toggling with 0BASE-x as an alternative to PHY Rate Shifting for Energy Efficient Ethernet 22
© Copyright 2026 Paperzz