Designing for Stratix 10 Devices
with Power in Mind
AN-767
2016.06.14
Contents
1 Designing for Stratix 10 Devices with Power in Mind
1.1 Power Optimization Techniques and Recommendations
1.2 Power Estimation with Resource Utilization
Document Revision History
1 Designing for Stratix 10 Devices with Power in Mind
Stratix® 10 devices offer advanced features and capabilities for power reduction such
as SmartVID, power gating of unused blocks, low-power transceivers, low-voltage
devices, and low-static power devices.
Additionally, Stratix 10 devices are the only high-performance FPGAs and
programmable SoCs developed on Intel’s industry-leading 14 nm Tri-Gate processes,
offering up to 70% lower power compared to the previous generation.
This application note highlights the power optimization strategies that you can
implement when you design with Stratix 10 FPGAs. It also presents power
consumption estimates for the resources used in various design scenarios.
1.1 Power Optimization Techniques and Recommendations
Device Availability with Power Option
The suffix after the speed grade in the part number denotes the power option offered in
Stratix 10 devices:
• V: SmartVID
• L: Low Power (fixed voltage)
• X: Extreme Low Power (fixed voltage)
L devices have a fixed voltage of 0.85V and are binned for low static power. These are
speed grade 2 devices.
X devices have a fixed voltage of 0.8V and are binned for the lowest static power. These
are speed grade 3 devices.
SmartVID devices have “standard” static power. These are speed grade 1, 2, and 3
devices.
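The mapping from suffix to power option can be captured in a small lookup table. The following Python sketch is illustrative only: the names `PowerOption`, `POWER_OPTIONS`, and `lookup_power_option` are hypothetical, and the sketch takes just the one-letter suffix because this application note does not define the full part-number format.

```python
from typing import NamedTuple, Optional, Tuple


class PowerOption(NamedTuple):
    name: str
    fixed_vcc_volts: Optional[float]  # None: voltage is set through SmartVID
    speed_grades: Tuple[int, ...]


# Values copied from the text above; the structure itself is an assumption.
POWER_OPTIONS = {
    "V": PowerOption("SmartVID", None, (1, 2, 3)),
    "L": PowerOption("Low Power", 0.85, (2,)),
    "X": PowerOption("Extreme Low Power", 0.80, (3,)),
}


def lookup_power_option(suffix: str) -> PowerOption:
    """Return the power option for a one-letter suffix (hypothetical helper)."""
    try:
        return POWER_OPTIONS[suffix.upper()]
    except KeyError:
        raise ValueError(f"unknown power option suffix: {suffix!r}") from None


if __name__ == "__main__":
    for letter in ("V", "L", "X"):
        print(letter, lookup_power_option(letter))
```

A lookup of this kind can be useful in a board-level script that has to pick a regulator setpoint: a `None` voltage signals that the setpoint comes from SmartVID rather than from a fixed value.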
SmartVID
The SmartVID feature compensates for process variation by narrowing the process
distribution using voltage adaptation. Instead of running at a constant voltage,
SmartVID-enabled devices opportunistically adjust the device voltage for optimal
power while still meeting their performance goals. To save power, the voltage
is reduced on devices whose performance exceeds what is required to meet
specifications.
SmartVID allows a power regulator to provide the Stratix 10 devices with lower VCC
and VCCP voltage levels while maintaining the performance of the specific device speed
grade. When SmartVID is used, Stratix 10 devices must be powered up to a default
voltage level for both VCC and VCCP. After the VID value in the Stratix 10 device is
determined and propagated to the external voltage regulator, both the VCC and VCCP
voltages are regulated based on the VID value. SmartVID voltages can vary between
0.8V and 0.94V, in 10mV increments. For more information, refer to the Stratix 10
Power Management User Guide.
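As a rough illustration of the voltage range quoted above, the following Python sketch lists the SmartVID levels between 0.8V and 0.94V in 10mV steps. The function names are hypothetical. The second function relies on an assumption that does not come from this application note: the common first-order CMOS approximation that dynamic power scales with the square of the supply voltage at a fixed frequency and switched capacitance. Use the Early Power Estimator for real numbers.

```python
from typing import List


def smartvid_levels_mv(low_mv: int = 800, high_mv: int = 940,
                       step_mv: int = 10) -> List[int]:
    """List every SmartVID voltage level in millivolts, endpoints included."""
    return list(range(low_mv, high_mv + 1, step_mv))


def relative_dynamic_power(v_mv: int, v_ref_mv: int) -> float:
    """ASSUMPTION (not from the source): dynamic power scales with V squared."""
    return (v_mv / v_ref_mv) ** 2


if __name__ == "__main__":
    levels = smartvid_levels_mv()
    print(len(levels), levels[0], levels[-1])  # 15 800 940
    # Ratio between the two ends of the range under the V-squared assumption.
    print(f"{relative_dynamic_power(800, 940):.3f}")  # 0.724
```

Working in integer millivolts avoids floating-point rounding when stepping through the range.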
DSP Power Gating
Stratix 10 devices support static power gating for the DSP blocks, which eliminates
their static power consumption when they are not used. The Quartus® Prime software
automatically applies static power gating to DSP blocks that are not used. Power
gating of the DSP blocks is enabled through the Configuration RAM (CRAM) bits.
Stratix 10 devices also support DSP partial reconfiguration. The Quartus Prime
software generates a bitstream that powers up DSP blocks as required during partial
reconfiguration.
Intel recommends using the built-in DSP registers whenever possible for optimal power
savings. A study comparing designs that use the built-in DSP registers with designs that
use none showed up to a 50% decrease in power consumption.
M20K Power Gating
The Stratix 10 M20K memory block also supports static power gating. Each half of the
memory array can be powered down through the PMOS sleep devices that supply it. The
Quartus Prime software uses this feature to shut down the power supply for the parts
of a memory array that are not used.
The Quartus Prime software generates a bitstream that powers up M20K memory
blocks as required during partial reconfiguration.
The mode of an M20K block can influence its power consumption. As Figure 1 shows, for
the same number of memory blocks (8500 M20K blocks) and toggle rate (40%),
power consumption depends on the memory type.
Figure 1. M20K Power Consumption Comparison For Different Configurations
[Bar chart: static power and dynamic power, in watts, for the true dual port, simple dual port, and single port M20K modes.]
Clock Gating
Clock gating can reduce dynamic power consumption. When an application is idle, its
clock can be gated temporarily and ungated based on wake-up events. You can
achieve dynamic power reduction by gating the clock signals to any circuitry that is
determined to be inactive as per your design requirements. Clock gating can be
performed at the following levels:
• Root Clock Gate
There is one root clock gate per I/O bank and transceiver bank. This gate is part
of the periphery DCM (Distributed Clock Multiplexer) and is located close to the
clock buffer. The Stratix 10 root clock gate is intended for limited clock gating
scenarios where high insertion delays can be tolerated. When you enable a root
clock gate, expect a delay of several clock cycles between the assertion of the
clock gate and the corresponding change on the output clock signal. For a
high-frequency clock, use sector clock (SCLK) gating instead. For more information,
refer to the Stratix 10 Clocking and PLL User Guide.
• Sector Clock Gate
Every Stratix 10 FPGA is divided into sectors. Each sector has its own clock
network, which provides more flexibility. Sector clock gating is done at the
SCLK multiplexer level. There are 32 SCLKs in each sector of the device. Each
SCLK has a clock gate and a bypassable clock gate path. The SCLK gates are
controlled by clock enable inputs from the core logic. The Quartus Prime software
can route up to eight different clock enable signals to the 32 SCLKs in a sector.
The clock signal going into the SCLK network in a sector can only reach the core
logic in that sector.
When you instantiate an SCLK gate in your design, the Quartus Prime software
automatically duplicates the SCLK gate to create a clock gate in every sector to
which the clock signal is routed. The SCLK gate is suitable for cycle-specific clock
gating for high-frequency clocks. The Quartus Prime software analyzes the timing of
the path to the SCLK gate.
• I/O PLL Clock Gate
Each output counter of the Stratix 10 I/O PLL can be dynamically gated. This
provides a useful alternative to the root clock gate, because the root clock gate can
gate only one of the nine output counters.
However, the I/O PLL clock gate is not cycle-specific. While using the I/O PLL clock
gate, expect a delay of several clock cycles between the assertion or deassertion
of the clock gate and the corresponding change to the clock signal. The number of
delay cycles is non-deterministic because the enable signal must be synchronized
into the clock domain of the output clock. This ensures a glitch-free gate. For
more information, refer to the Stratix 10 Clocking and PLL User Guide.
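The sector-level limits described above (32 SCLKs per sector, with up to eight different clock enable signals routed to them) can be checked early in a clock plan. The following Python sketch is a hypothetical planning aid, not a Quartus Prime feature: the function name and the dictionary-based clock plan are assumptions made for illustration.

```python
from typing import Dict, List, Optional

MAX_SCLKS_PER_SECTOR = 32          # from the text: 32 SCLKs in each sector
MAX_CLOCK_ENABLES_PER_SECTOR = 8   # from the text: up to eight enable signals


def check_sector_clock_gating(sclk_enables: Dict[str, Optional[str]]) -> List[str]:
    """Check one sector's clock plan against the limits quoted in the text.

    sclk_enables maps an SCLK name to the clock enable signal that gates it,
    or to None when the SCLK is not gated. Returns a list of problems found.
    """
    problems: List[str] = []
    if len(sclk_enables) > MAX_SCLKS_PER_SECTOR:
        problems.append(
            f"{len(sclk_enables)} SCLKs requested, limit is {MAX_SCLKS_PER_SECTOR}")
    distinct_enables = {en for en in sclk_enables.values() if en is not None}
    if len(distinct_enables) > MAX_CLOCK_ENABLES_PER_SECTOR:
        problems.append(
            f"{len(distinct_enables)} clock enables requested, "
            f"limit is {MAX_CLOCK_ENABLES_PER_SECTOR}")
    return problems


if __name__ == "__main__":
    # 32 SCLKs sharing 8 enable signals: within both limits.
    plan = {f"sclk{i}": f"en{i % 8}" for i in range(32)}
    print(check_sector_clock_gating(plan))  # []
    # A ninth distinct enable signal exceeds the routing limit.
    plan["sclk0"] = "en8"
    print(check_sector_clock_gating(plan))
```

A check like this only flags over-subscription; the Quartus Prime software remains the authority on what can actually be routed.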
Power Savings while Using Transceivers
Stratix 10 devices feature power-efficient, high-bandwidth, and low-latency
transceivers. For optimal static and dynamic power savings, Intel recommends using
the lowest transceiver voltage (VCCR / T_GXB) that supports your data rate
and protocol requirements.
1.2 Power Estimation with Resource Utilization
The following estimates are based on hypothetical designs (referenced from previous
generation devices) for Stratix 10 devices. Both static and dynamic power
consumption values are derived using Intel’s PowerPlay® Early Power Estimator.
• Device: SG280 with F43 package
• Device type: X
• Junction temperature: 100°C

Table 1. Power Estimation for Core Logic/FPGA Fabric

Resource Configuration                      Static        Dynamic Power,            Dynamic Power,
                                            Power (1)     low utilization ~50%      high utilization ~90%
                                                          (800K half-ALMs)          (1.7M half-ALMs)
Low speed configuration:                    NA            18 W                      40 W
• 500 MHz (Max. CLK)
• 312 MHz (Avg. weighted CLK)
High speed configuration:                   NA            27 W                      56 W
• 750 MHz (Max. CLK)
• 468 MHz (Avg. weighted CLK)

(1) Independent of logic usage.

Table 2. Power Estimation for M20K—20Kb Internal Memory Blocks

Resource Configuration                      Static        Dynamic Power,            Dynamic Power,
                                            Power         low utilization ~40%      high utilization ~70%
                                                          (4600 memory blocks)      (8500 memory blocks)
Low speed configuration:                    2 W           3.7 W                     7 W
• Single port
• 500 MHz, 40% toggle
• 70% R/W, 70% enable
• 5 bits wide, 4096 bits deep
High speed configuration:                   2 W           16.5 W                    30 W
• True dual port
• 800 MHz, 40% toggle
• 70% R/W, 70% enable
• 20 bits wide, 1024 bits deep

Table 3. Power Estimation for DSP Block

Resource Configuration                      Static        Dynamic Power,            Dynamic Power,
                                            Power         low utilization ~40%      high utilization ~70%
                                                          (2300 DSP blocks)         (4000 DSP blocks)
Low speed configuration:                    2.5/4 W       5 W                       8.6 W
• 500 MHz, 15% toggle
• 3 registered stages
• Without pre-adder
• With coefficient
High speed configuration:                   2.5/4 W       40 W                      66 W
• 800 MHz, 15% toggle
• 0 registered stages
• With pre-adder
• Without coefficient

Table 4. Power Estimation for Transceivers

Resource Configuration                      Static        Dynamic Power,            Dynamic Power,
                                            Power         low utilization           high utilization
                                                          (16 channels)             (96 channels)
Low speed configuration:                    2 W           5 W                       40 W
• 16 channels
• 16 channels (PCIe Gen3)
• 40 channels of 10G Ethernet with 1588
• 24 channels @ 17.4 Gbps
High speed configuration:                   2 W           7 W                       45 W
• 16 channels (PCIe Gen3)
• 80 channels @ 17.4 Gbps
Table 5. Power Estimation for Clocks

Resource Configuration                      Static        Dynamic Power,            Dynamic Power,
                                            Power         low utilization ~50%      high utilization ~90%
Low speed configuration:                    1 W           3 W                       6 W
• 254 MHz Avg. weighted CLK
• 75% global + local enabled (with some CLK gating)
High speed configuration:                   1 W           7 W                       13.5 W
• 364 MHz Avg. weighted CLK
• 100% global + local enabled (without CLK gating)
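To get a feel for how the per-resource estimates combine, the following Python sketch adds up the dynamic power figures from Tables 1 through 5 for each speed and utilization pairing. It is an illustration only. It reads the DSP rows as 5 W and 8.6 W for the low speed configuration and 40 W and 66 W for the high speed configuration. It also treats the "low" and "high" utilization columns as comparable across tables, even though the percentages differ from table to table. The names `DYNAMIC_POWER_W` and `total_dynamic_power` are hypothetical, and static power is left out because Table 1 lists it as NA.

```python
# Dynamic power estimates in watts, copied from Tables 1 through 5.
# Inner key: (speed configuration, utilization level).
DYNAMIC_POWER_W = {
    "core logic": {("low", "low"): 18.0, ("low", "high"): 40.0,
                   ("high", "low"): 27.0, ("high", "high"): 56.0},
    "M20K": {("low", "low"): 3.7, ("low", "high"): 7.0,
             ("high", "low"): 16.5, ("high", "high"): 30.0},
    "DSP": {("low", "low"): 5.0, ("low", "high"): 8.6,
            ("high", "low"): 40.0, ("high", "high"): 66.0},
    "transceivers": {("low", "low"): 5.0, ("low", "high"): 40.0,
                     ("high", "low"): 7.0, ("high", "high"): 45.0},
    "clocks": {("low", "low"): 3.0, ("low", "high"): 6.0,
               ("high", "low"): 7.0, ("high", "high"): 13.5},
}


def total_dynamic_power(speed: str, utilization: str) -> float:
    """Sum the per-resource dynamic power for one speed/utilization scenario."""
    return sum(table[(speed, utilization)] for table in DYNAMIC_POWER_W.values())


if __name__ == "__main__":
    for speed in ("low", "high"):
        for utilization in ("low", "high"):
            watts = total_dynamic_power(speed, utilization)
            print(f"{speed} speed, {utilization} utilization: {watts:.1f} W")
```

Under these assumptions the totals range from 34.7 W (low speed, low utilization) to 210.5 W (high speed, high utilization), which shows why thermal planning matters for large, fast designs. A real design should be estimated with the Early Power Estimator rather than by adding table entries.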
Stratix 10 devices offer significantly higher density and
performance than previous generations of FPGA devices. Correspondingly, power
consumption increases even with the improvement in power efficiency. Therefore,
you should leverage the power reduction capabilities discussed in this application note
and ensure that you plan for the thermal implications of the power consumption in
your Stratix 10 FPGA designs. For more information on thermal solutions for
Stratix 10 device designs, consult your Intel support team.
Related Links
Stratix 10 Device Overview
Provides more information about Stratix 10 devices.
Document Revision History
Table 6. Revision History

Date          Changes
2016.06.14    Initial release.
© 2016 Intel Corporation. All rights reserved. Intel, the Intel logo, Altera, Arria, Cyclone, Enpirion, MAX,
Megacore, NIOS, Quartus and Stratix words and logos are trademarks of Intel Corporation in the US and/or
other countries. Other marks and brands may be claimed as the property of others. Intel warrants performance
of its FPGA and semiconductor products to current specifications in accordance with Intel's standard warranty,
but reserves the right to make changes to any products and services at any time without notice. Intel assumes
no responsibility or liability arising out of the application or use of any information, product, or service
described herein except as expressly agreed to in writing by Intel. Intel customers are advised to obtain the
latest version of device specifications before relying on any published information and before placing orders for
products or services.
ISO 9001:2008 Registered