GR740 Quad-Processor LEON4FT System-on-Chip

GR740 Quad-Processor LEON4FT System-on-Chip Overview
Cobham Gaisler
Date: 2017-April-12
Outline
•  Processor evolution European/ESA/Cobham perspective
•  GR740 overview
– Highlights, Development board, Benchmark results, Power
consumption, Radiation tolerance
•  GR740 schedule and product table
•  Features summary and architecture overview
•  How to use GR740 / New features
•  Key Performances
•  GR740 vs. UT699/UT699E/UT700, GR740 vs. GR712RC
•  Conclusion
Cobham Proprietary
Use or disclosure of this information is subject to the restrictions on the title page of this document
Friendly Reminder
Is there a later version?
Please see http://www.gaisler.com/GR740 for the latest
version of this overview presentation
Processor Evolution – ESA Perspective
•  1989: MDC281 (MIL-STD-1750), 2.5 um CMOS/SOS, 0.5 MIPS
•  1991: MA31750 (MIL-STD-1750), 1.5 um CMOS/SOS, 2 MIPS
•  1990-92: SPARC V7 architecture selected as ESA baseline
•  1995: Three-chip ERC32 (TSC691/2/3), 10 MIPS
•  1997: Single-chip ERC32 (TSC695), 15 MIPS
•  1997: SPARC V8 LEON VHDL development begins
•  2000: First LEON1FT, 0.35 um, 50 MIPS
•  2002: First LEON2FT, 0.18 um, 100 MIPS
•  2004: First LEON3FT, 0.20 um, 125 MIPS
•  2009: First dual-core LEON3FT, 180 nm, 250 MIPS (GR712RC)
•  2010: First LEON4 (LEON4-DEMO / GR-LEON4-ITX)
•  2012: First quad-core LEON4FT, 45 nm (LEON4-N2X)
•  2016: First quad-core LEON4FT rad-hard silicon
LEON Space Processor Evolution
Timeline | Description | Standard Product(s)
Background:
1995-1997 | ERC32 (single & multichip); SPARC V8 LEON development (32-bit) | -
2000-2002 | LEON1FT, LEON2FT | -
2004 | LEON3FT (deeper pipeline, AMBA 2.0 bus) | -
2007 | First LEON3FT by Cobham Aeroflex, UT699 (250 nm) | UT699 (QML-V)
2009 | First dual-core LEON3FT (180 nm) | GR712RC (Class S-like)
2014 | Enhanced (faster, lower power) LEON3FTs (130 nm) | UT699E, UT700 (QML Q/Q+)
2016 | First quad-core standard product LEON4FT (65 nm) | GR740 (prototype)
GR740 Quad-Core LEON4FT History
ESA Next Generation Microprocessor
•  Part of ESA’s Next Generation Microprocessor (NGMP) activity
–  Meet demands for increased processing power
–  Roadmap for standard Microprocessor components
–  Target technology: European 65nm (STMicroelectronics)
Timeline | Milestone | Part or Board
2010 | Development board with non-FT quad LEON4 | GR-LEON4-ITX
2012 | First quad-core LEON4, 45 nm | LEON4-N2X
2014 | Aeroflex implementation, quad LEON4FT; SEC for Boeing 90 nm RHBD library qual | UT840
2016 | Quad-core LEON4FT, 65 nm first silicon and validation & radiation test campaign | GR740, GR-CPCI-GR740
2017 | GR740 second silicon revision | GR740
2018 | QML qualification | GR740
Highlights of LEON3FT vs LEON4FT
High Level Architectural Improvements
GR712RC LEON3FT
•  Dual core
•  Shared AMBA AHB (32 bit) for CPUs, Memory, Debug, etc.
GR740 LEON4FT
•  Quad Core
•  Wider (128 bit) CPU/Memory bus
•  Dedicated Debug bus
•  Addition of L2-Cache
•  Hardware Memory Scrubber
•  Improved partitioning
•  Integrated SpW router (8 port)
•  Performance counters
•  Supports AMP & SMP
•  Improved support for boot over PCI/RMAP
GR740 Quad-Core LEON4FT
Highlights
•  Quad-core LEON4FT rad-tolerant SoC device
–  4x LEON4FT with dedicated FPU and MMU
–  128 KiB L1 caches connected to 128-bit bus
–  2 MiB L2 cache, 256-bit cache line, 4-ways
–  64-bit SDRAM memory I/F (+32 check bits)
–  8-port SpaceWire router with +4 internal ports
–  32-bit 33 MHz PCI interface
–  2x 10/100/1000 Mbit Ethernet
–  Debug links: Ethernet, JTAG, SpaceWire
–  MIL-STD-1553B, CAN 2.0B, 2 x UART
–  SPI master/slave, GPIO, Timers & Watchdog
•  625 CLGA ceramic package, wirebond, ~30 mm x 30 mm
•  Supply voltage: core 1.2 V, I/O 2.5 V/3.3 V
•  Supported OS: RTEMS, VxWorks, Linux 3.10+
•  Prototype devices & development board available now
•  Documentation and tools available now
•  Preliminary validation/radiation test results available
Development Board and Development SW
•  Development board with GR740 device.
–  6U CPCI format (Double Eurocard).
–  Boxed version for bench top development.
–  http://www.gaisler.com/gr-cpci-gr740
–  256 MiB SDRAM, 8 MiB NOR Flash
–  2x Ethernet, 8x SpaceWire, PCI, UART, CAN, 1553, PROM/IO, GPIO and debug
interfaces available
•  Operating systems
–  RTEMS
–  WindRiver VxWorks
–  Linux 3.10+
•  Other OSs/environments already ported to LEON include:
–  Bare C, ThreadX, PikeOS, XtratuM, ..
•  Supported by GRMON2 debug monitor
•  Board user manual available at: http://gaisler.com/gr-cpci-gr740
GR740 Quad-Core LEON4FT
Package
•  GR740 uses a CLGA625 package
– Hermetic ceramic package
– Wire-bond connection to ASIC
– Approx 30 mm x 30 mm
12 april 2017
Cobham plc
GR740 Quad-Core LEON4FT
Advantages and Highlights on Performance / Functional Validation
•  Advantages:
–  More Processing power with Higher integration, addressing Size Weight and Power
–  Software reuse from prior LEON designs
–  Supported by Gaisler Support tools & Industry standard OSs
•  Worst-case frequency of 250 MHz in production tests, over full temp range (-40 to +125°C)
–  >380 MHz operation over temperature range tested
•  Benchmarking Results (~1000 MIPS, 1700 DMIPS)
–  Shows approaching 4X performance with quad core vs single core
•  Power Consumption Results (including some I/O) at 25°C:
–  4x CPU: 1.8 W
•  Various Radiation Testing Validation
–  Target: TID: 300 krad (Si)
–  Target: Single-Event Latch-Up (SEL) immunity to an effective LET threshold (LETth) of 60 MeV-cm2/mg
–  Single Event Effects Test Results to be shared
GR740 Benchmark Results
Single Core Results (@ 250 MHz) and Comparison to LEON3FT devices:
Benchmark | Value (Single Core) | Description
Dhrystone | 459 DMIPS (1.84 DMIPS/MHz) | Classic integer benchmark, normalized to a historic processor architecture
Whetstone | 200 MWIPS (0.8 MWIPS/MHz) | Classic floating-point benchmark
Linpack | 23 MFLOPS | Floating-point benchmark based on matrix ops
EEMBC CoreMark | 512 CoreMarks | Improved integer benchmark
EEMBC CoreMarkPro | 73 CoreMarks | Improved integer and FP benchmark for high-performance architectures
EEMBC AutoBench | 112 AutoMarks | Comprehensive suite of 16 benchmark kernels
EEMBC FPMark | 189 FPMarks | Improved floating-point benchmark
EEMBC: Embedded Microprocessor Benchmark Consortium
Speedup: ratio of the time each part takes to do the same amount of work:
•  Old execution time / New execution time
•  GR740 is ~5X better than UT699
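The speedup definition above can be sketched in a few lines. The function name and the sample timings are illustrative assumptions, not values from the benchmark report:

```python
def speedup(old_time_s: float, new_time_s: float) -> float:
    """Speedup = old execution time / new execution time for the same work."""
    if old_time_s <= 0 or new_time_s <= 0:
        raise ValueError("execution times must be positive")
    return old_time_s / new_time_s


# Hypothetical timings: the same job takes 10 s on the old part, 2 s on the new one.
print(speedup(10.0, 2.0))  # 5.0
```

A value above 1 means the new part finishes the same work faster.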
GR740 Benchmark Results
Multicore Results using PARSEC 2.1 Benchmarks
•  Multi-threaded benchmark representative of shared-memory programs for multicore processors
•  The suite used contains 10 programs drawn from real-world applications such as computer
vision, video encoding and financial analytics (some are parallel and make good use of multiple CPUs)
•  For some programs, four cores show a ~3.5-3.8x improvement over a single core
GR740 Technical Note on Benchmarking and Validation:
http://www.gaisler.com/doc/gr740/GR740-VALT-0010.pdf
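One way to read the ~3.5-3.8x figures is as parallel efficiency on four cores. The sketch below also back-computes the serial fraction that Amdahl's law would imply; Amdahl's law is a model brought in here for illustration and is not used in the slides, and the function names are assumptions:

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of the ideal (linear) speedup that was achieved."""
    return speedup / cores


def amdahl_serial_fraction(speedup: float, cores: int) -> float:
    """Serial fraction s implied by Amdahl's law: speedup = 1 / (s + (1 - s) / cores)."""
    if cores < 2:
        raise ValueError("need at least two cores")
    return (cores / speedup - 1.0) / (cores - 1.0)


for measured in (3.5, 3.8):  # range quoted for four cores
    eff = parallel_efficiency(measured, 4)
    serial = amdahl_serial_fraction(measured, 4)
    print(f"{measured}x: efficiency {eff:.3f}, implied serial fraction {serial:.3f}")
```

Under this model a 3.5x speedup on four cores corresponds to roughly 5% of the work being serial; real programs also lose time to shared-bus contention and synchronization, which the model lumps into that fraction.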
Micro-benchmark example
Results from one of the custom benchmarks
•  Non-blocking L2 cache using SPLIT protocol
– New feature for L2 cache in GR740 (improvement from functional
prototype NGFP/LEON4-N2X)!
– A CPU waiting on an L2 cache miss does not block the remaining CPUs
from being served.
– Reduces execution time interference between processors.
•  Micro-benchmark from BSC showing 3.3x improvement in an
extreme scenario (one CPU hits always, others miss always)
GR740 Power Consumption Results
Function of Workload, CPU & Memory speed + I/O
Workload | CPU/SDRAM Freq | Core power | Core + I/O power
Reset | 250 MHz / 100 MHz | 41 mW | 621 mW
Idle | 250 MHz / 100 MHz | 306 mW | 912 mW
Single core SPECCPU2000 | 250 MHz / 100 MHz | 645 mW | 1213 mW
Quad core SPECCPU2000 | 250 MHz / 100 MHz | 1161 mW | 1804 mW
Single core SPECCPU2000 | 50 MHz / 50 MHz | 156 mW | 645 mW
Quad core SPECCPU2000 | 50 MHz / 50 MHz | 292 mW | 765 mW
•  SPEC CPU2000 is a software benchmark product produced by the Standard Performance
Evaluation Corp. (SPEC)
–  Designed to provide performance measurements that can be used to compare
compute-intensive workloads (real user applications)
•  Measurements at room temperature, nominal voltage
•  I/O power includes digital and LVDS I/Os
•  Comparison to UT700 single LEON3FT:
CPU Speed | Core power
50 MHz | 380 mW
100 MHz | 1041 mW
Power Consumption
•  Power consumption measurement results: 1.8 W with all 4 cores at full
speed and LVDS enabled (room temperature)
•  Power consumption described in
http://gaisler.com/doc/gr740/GR740-VALT-0010.pdf
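A small sketch of how the measured figures can be combined. The dictionary layout and function names are assumptions; the attribution of the 1161 mW / 1804 mW row to the quad-core workload is inferred from the 1.8 W statement above:

```python
# (workload, CPU MHz, SDRAM MHz) -> (core power in mW, core + I/O power in mW)
MEASUREMENTS = {
    ("reset", 250, 100): (41, 621),
    ("idle", 250, 100): (306, 912),
    ("single", 250, 100): (645, 1213),
    ("quad", 250, 100): (1161, 1804),  # row label inferred from "1.8 W with all 4 cores"
    ("single", 50, 50): (156, 645),
    ("quad", 50, 50): (292, 765),
}


def io_power_mw(workload: str, cpu_mhz: int, sdram_mhz: int) -> int:
    """I/O contribution: total power minus core power."""
    core, total = MEASUREMENTS[(workload, cpu_mhz, sdram_mhz)]
    return total - core


def power_per_extra_core_mw(cpu_mhz: int, sdram_mhz: int) -> float:
    """Average core power added by each of the three additional busy cores."""
    single = MEASUREMENTS[("single", cpu_mhz, sdram_mhz)][0]
    quad = MEASUREMENTS[("quad", cpu_mhz, sdram_mhz)][0]
    return (quad - single) / 3


print(io_power_mw("quad", 250, 100))      # 643
print(power_per_extra_core_mw(250, 100))  # 172.0
```

This is plain arithmetic on the published numbers; it says nothing about how power scales for other workloads or temperatures.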
GR740 Radiation Tolerance Features
Summary
•  GR740 implemented on ST 65nm bulk CMOS using the C65SPACE cell libraries
–  Library contains radiation-hardened variants of the gates and SRAM blocks, plus
rad-hardened PLL and IO buffers
–  Previous test results on this technology: SEL immune to an effective LET threshold
(LETth) of 60 MeV-cm2/mg at 125°C Tj and Vdd max; TID up to 300 krad(Si)
–  Layout and back end design used specific rules and techniques for Space
applications
•  Various trade offs performed between hardness level and functional
performance on key aspects of the product implementation
–  Mix of radiation hardened library cells and commercial cells in TMR to
achieve power efficient and performant design
–  Hardened PLLs and Hardened Clock tree
–  CPU integer and FPU register file built out of flip flops using triplication of a
whole register file
GR740 Radiation Tolerance Features
Summary (Continued) - Memory Aspects
•  All on-chip memories use single-bit error mitigation (methods vary) to
handle memory SEUs
– Memory blocks built with sufficiently high spatial distribution (mux
factor) of adjacent bits to avoid multi-bit upsets in the same memory
word
– L1 Cache uses parity protection
– L2 Cache: uses single error correct, double error detect (SECDED)
protection via BCH, along with periodic (programmable) scrubbing
– Other memories in the design use error correction
•  Duplication with parity
OR
•  Triplication and majority voting of the data on each memory
address
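Two of the mechanisms named above, parity and triplication with majority voting, can be illustrated in software. This is a conceptual sketch only, not a description of the GR740 hardware; the function names and the sample word are assumptions:

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 when the word has an odd number of set bits."""
    return bin(word).count("1") & 1


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 vote across three copies of the same word."""
    return (a & b) | (a & c) | (b & c)


stored = 0xCAFE
upset = stored ^ (1 << 3)  # simulate a single-event upset flipping bit 3

# Parity detects the single-bit flip (it cannot correct it).
print(parity_bit(stored) != parity_bit(upset))            # True
# Triplication corrects it: the two intact copies outvote the upset one.
print(majority_vote(stored, upset, stored) == stored)     # True
```

Parity alone only detects an odd number of flipped bits, which is why it is paired with duplication (a second copy to recover from) in the scheme listed above.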
GR740 Radiation Test Campaign
First Silicon Testing in 2016
•  Device operation in a radiation environment has been verified through a series of Single Event
Effects test campaigns with both heavy ions and protons in 2016
– Focus: SEU testing on 1st silicon
Month | Test | Facility/Location
April | 1st Heavy Ion | RADEF / Finland
June | 2nd Heavy Ion | UCL (HIF) / Belgium
Sept./Oct. | 3rd Heavy Ion | Lab / Sweden
Nov. | 4th Heavy Ion | UCL (HIF) / Belgium
Nov. | Medium energy Proton tests | UCL (LIF) / Belgium
•  Errata encountered with L2 Cache
– Handling of correctable tag errors and uncorrectable errors
– Reproduced with error injection. Respin to fix issue in final silicon
(ongoing)
– Software workarounds proposed and prototyped
GR740 Radiation Test Results
First Silicon Testing in 2016
•  Excluding the L2 cache issue above, worst-case results for a LEO orbit with polar
inclination are as follows:
– 1E-4 events/device-day
– This results in a mean time between events (i.e. between two software
crashes due to uncorrectable errors) of over 27 years in actual
operation
– All errors are recoverable by resetting the device
Results | Orbit
1E-4 events/device-day | LEO-Polar: 850 km, 98.7 degree inclination, AP8 min, Solar minimum, Z=1-92, Al equivalent shielding = 1 g/cm2
– For GEO & LEO orbits (lower angle of inclinations): error rates will be
lower
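The "over 27 years" figure follows directly from the event rate. The sketch below reproduces it and, as an extra step not in the slides, estimates the chance of an event-free mission under the assumption that events follow a Poisson process; function names and the 5-year mission length are illustrative:

```python
import math

DAYS_PER_YEAR = 365.25


def mean_time_between_events_years(events_per_device_day: float) -> float:
    """Mean time between events is the reciprocal of the event rate."""
    return 1.0 / events_per_device_day / DAYS_PER_YEAR


def prob_no_event(events_per_device_day: float, mission_years: float) -> float:
    """Probability of zero events, ASSUMING a Poisson process (not stated in the slides)."""
    return math.exp(-events_per_device_day * mission_years * DAYS_PER_YEAR)


rate = 1e-4  # events per device-day, worst case quoted for LEO-Polar
print(f"{mean_time_between_events_years(rate):.1f} years")       # 27.4 years
print(f"P(no event in 5 years) = {prob_no_event(rate, 5.0):.2f}")
```

So 1E-4 events per device-day is 10,000 days between events on average, which is about 27.4 years.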
GR740 Radiation Tolerance
Results of radiation tests of first silicon
•  Device operation in a radiation environment has been verified through a series of Single
Event Effects test campaigns with both heavy ions and protons. As part of these tests,
errata have been found in the way the Level-2 cache handles correctable tag errors and
uncorrectable errors. These corner cases require additional mitigation at software level if
the device is to be exposed to either heavy ions or protons. The issues found in this first
silicon have been analysed and reproduced with error injection simulations that targeted
the Level-2 cache and/or the external SDRAM. A solution has been developed, possibly
requiring a software reset, and the issues are planned to be corrected in a future revision
of the device.
•  Excluding the effect of the errata mentioned above, the worst-case Single Event Effect
rate estimated for a LEO-Polar orbit environment (850 km, 98.7 degree inclination, AP8
min, Solar minimum, Z=1-92, Al equivalent shielding = 1 g/cm2) is 1E-4 events per device
and day. This results in a mean time between events (i.e. between two software crashes due to
uncorrectable errors) of over 27 years in actual operation. All errors are recoverable by
resetting the device. In GEO and for LEO orbits with lower inclinations than polar ones
the rates will be lower.
•  The C65SPACE technology used for this device has been experimentally confirmed to be
SEL-free up to LET = 60 MeV-cm2/mg at 125°C Tj and Vdd max. TID robustness up to
300 krad (Si) has also been confirmed using test vehicles. Both SEL immunity and TID
robustness are planned to be validated for the GR740 product during radiation tests of
the next silicon revision.
GR740 Schedule
Availability and Qual Schedule
•  First silicon prototype devices & Development board: Available Now
•  Supported by Cobham Gaisler software packages and development tools
•  Schedule:
Description | Time line | Comment
Tape out of final silicon | 17Q3 | Minor respin to address L2 cache issue
New silicon prototypes | 17Q4 / 18Q1 | Tested prototypes available Q1 2018
Full radiation testing of new silicon | Q1 2018 |
QML Qual start | Q1 2018 | Qual completion anticipated by Q4 2018
•  Daisy-chain package lead time: 5 months (need to manufacture substrate)
•  The Qualification is Funded via ESA and Cobham IRAD
–  Swedish Funding via ESA (Swedish National Space Board)
•  Latest news: http://www.gaisler.com/GR740
GR740 Product Table
Available models
GR740 Product Table
Ordering legend
Features Summary
Core components
•  System-on-chip
– 4 x LEON4 fault-tolerant CPUs with L1 cache, MMU and FPU
– 2 MiB Level-2 cache
– 96/48-bit SDRAM controller with EDAC and scrubber
– 8/16-bit PROM/IO controller with EDAC
– 5 x Timer, 5 x IRQ controller
– On-chip AHB bus infrastructure
– IOMMU for peripheral DMA
– PLLs for clock generation
– Communication interfaces
Features Summary
Block diagram
•  Architecture block diagram (simplified)
Features Summary
Interfaces
•  Interfaces
– 8-port SpaceWire router with on-chip LVDS
– 2 x 1Gbit/100Mbit Ethernet MAC (MII/GMII)
– PCI master/target with DMA, 33 MHz
– Dual-redundant CAN
– MIL-STD-1553B interface (bus A/B)
– 2 x UART
– 16 x GPIO
•  Debug interfaces (for GRMON connection)
– Ethernet EDCL (using either of the two MACs above)
– JTAG
– SpaceWire RMAP (using separate GRSPW2 for debug only)
Features Summary
Interface restrictions
•  Some functions have been multiplexed onto the same pins to fit
into package pin count
•  PCI or the second Ethernet interface (not both) can be enabled, and only
when SDRAM is in 48-bit mode.
– Configured “hard” via bootstrap signals.
– Selection of
(1) 96-bit SDRAM + 1xETH
(2) 48-bit SDRAM + 2xETH
(3) 48-bit SDRAM + 1xETH + PCI
•  CAN, 1553, UART and SpW debug pins are shared with the PROM top address
bits and with the part of the 16-bit PROM data bus that is unused in 8-bit mode.
– Configurable pin-by-pin between PROM or peripheral function.
– Pins that are not used for either function can be used as additional
GPIO.
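The SDRAM/Ethernet/PCI restriction above can be expressed as a small validity check. This is a sketch of the stated rule only: the function and parameter names are hypothetical, the real selection is made by bootstrap signals, and the rule as written also accepts 48-bit SDRAM with one Ethernet port and no PCI, a combination the slide does not list:

```python
def pin_config_is_valid(sdram_bits: int, second_ethernet: bool, pci: bool) -> bool:
    """Check a requested interface mix against the pin-sharing rule on this slide."""
    if sdram_bits not in (96, 48):
        return False  # only the two SDRAM widths named on the slide
    if second_ethernet and pci:
        return False  # PCI and the second Ethernet port share pins
    if (second_ethernet or pci) and sdram_bits != 48:
        return False  # either one needs pins freed by 48-bit SDRAM mode
    return True


# The three selections listed on the slide:
print(pin_config_is_valid(96, second_ethernet=False, pci=False))  # True
print(pin_config_is_valid(48, second_ethernet=True, pci=False))   # True
print(pin_config_is_valid(48, second_ethernet=False, pci=True))   # True
# A combination the rule forbids:
print(pin_config_is_valid(96, second_ethernet=True, pci=False))   # False
```

A check like this is useful early in board design, before the bootstrap strapping is fixed.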
GR740 Device Overview
Block Diagram
•  Bus topology of five AMBA AHB buses: Processor, Memory,
Master IO, Slave IO and Debug. Low-speed peripherals via APB
•  Debug AHB bus and corresponding core are gated-off in flight.
GR740 Device Overview
LEON4FT and GRFPU
•  LEON4FT – IEEE1754 SPARC V8 compliant 32-bit processor
– 7-stage pipeline, multiprocessor support
– 128-bit AHB bus interface
– Compare-and-swap (CASA) instruction support (from SPARCv9)
– 1.7 DMIPS/MHz, 0.6 Whetstone MFLOPS/MHz
– Estimated 0.35 SPECINT/MHz, 0.25 SPECFP/MHz (CPU2000)
•  GRFPU
– High-performance FPU integrated into LEON4 pipeline
– Hardware DIV and SQRT
– Floating-point controller (FPC) decouples FP operations from
pipeline, allowing CPU and FPU to work in parallel
GR740 Device Overview
Caches
•  Level-1 cache
– Separate L1 integrated into each LEON4 core
– Multi-set with configurable LRU/LRR/RND policy
– Write-through operation
– Bus snooping with physical tags to maintain coherency
•  Level-2 cache
– Designed as a bridge in the bus topology
– Highly configurable in caching behaviour
– Supports copy-back operation
– Locked ways, allowing part or whole to be used as on-chip RAM
– Can be partitioned based on bus master indexes
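Using the Level-2 figures quoted in the highlights (2 MiB, 4 ways, 256-bit cache line), the cache geometry can be worked out as below. The sketch assumes a conventional set-associative organisation, which the slides do not spell out, and the names are illustrative:

```python
L2_BYTES = 2 * 1024 * 1024  # 2 MiB
WAYS = 4
LINE_BYTES = 256 // 8       # 256-bit cache line


def l2_geometry(total_bytes: int = L2_BYTES, ways: int = WAYS,
                line_bytes: int = LINE_BYTES) -> dict:
    """Derive way size, set count and address field widths (powers of two assumed)."""
    way_bytes = total_bytes // ways
    sets = way_bytes // line_bytes
    return {
        "way_bytes": way_bytes,
        "sets": sets,
        "offset_bits": line_bytes.bit_length() - 1,
        "index_bits": sets.bit_length() - 1,
    }


print(l2_geometry())
```

Each way is 512 KiB, which is also the amount of cache (or on-chip RAM, when a way is locked) that one way provides when the cache is partitioned per bus master.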
GR740 Device Overview
Memory Subsystem
•  Memory controller
– PC100 SDRAM with 64/32 data bits and 32/16 check bits
– Full width or half-word selected via bootstrap signals
– Powerful interleaved 16/32+8bit ECC giving 32 or 16 checkbits
•  Scrubber
– Fast initialization of memory and checkbits
– Background scrubbing
– Error reporting to CPU and statistics collection
•  Memory error handling (memory controller, scrubber, CPU)
– Rapid regeneration of contents after SEFI
– Graceful degradation of failed byte lane
– Example code available for RTEMS
•  Boot memory provided via PROM/IO interface (same controller as UT699,
GR712RC)
GR740 Device Overview
I/O Interfaces
•  Large number of I/O interfaces
– SpaceWire router
– PCI master/target with DMA
– Gbit Ethernet
– MIL-STD-1553B
– CAN 2.0B
– UART, SPI, GPIO
•  Debug interfaces
– Ethernet
– SpaceWire (RMAP)
– JTAG
GR740 Device Overview
I/O interfaces – connected through IOMMU
•  IOMMU
– Connects all DMA capable I/O masters through one interface to
the Processor bus
– Performs pre-fetching and read/write combining
– Provides address translation and access restriction
– Uses separate page tables from processor
– Masters can be placed in groups where each group has its own set
of page tables
– Master traffic can also be routed directly to Memory bus,
bypassing Level-2 cache.
GR740 Device Overview
SpaceWire Router
•  SpaceWire router
– Four internal AMBA ports, compatible with GRSPW2
– Eight external SpaceWire ports
– Same IP core as used in GR718 SpaceWire router ASIC device
– SpaceWire link speed: 300 Mbit/s
GR740 Device Overview
PCI master(initiator)/target with DMA
•  Provides PCI master/target interface
– Provided by GRPCI2 core (vs. GRPCI for UT699/UT699E/UT700)
– 32-bit interface supporting 33 MHz operation
– Not fully compliant with PCI 2.3 due to lack of suitable pads and pin
multiplexing.
– Target has three configurable PCI BARs. BAR0 and BAR1 default to
prefetchable 128 MiB BARs and BAR2 defaults to a nonprefetchable 8 MiB BAR.
•  Pins shared with SDRAM interface: if PCI is enabled, the
data width of the SDRAM interface is reduced to 32 bits. Pins are
also shared with the second Gigabit Ethernet interface.
GR740 Device Overview
Gigabit Ethernet
•  Gigabit Ethernet interfaces
– 2x Ethernet interfaces
– Supports 10/100/1000 Mbit in both full- and half-duplex
– DMA engine for both receiver and transmitter
– Internal buffer allows core to buffer complete packet
– Supports MII and GMII interface to external transceiver
– Supports scatter/gather IO and IPv4 checksum offloading
– Provides Ethernet Debug communication link
– EDCL can also be connected to Debug bus
•  Pins of the second Ethernet interface are shared with the SDRAM
interface: if the second Ethernet interface is used, the
data width of the SDRAM interface is reduced to 32 bits. Pins are also shared
with the PCI interface (second Ethernet interface only).
GR740 Device Overview
MIL-STD-1553B, CAN 2.0B, UART, SPI, GPIO
•  MIL-STD-1553B controller provides BM/BC/RT functionality with
dual redundant buses.
– Has internal DMA engine.
•  CAN 2.0B controller with internal DMA engine
•  Two 8-bit UARTs with 16 byte FIFOs
•  SPI master/slave controller
– Configurable word length (3-32 bits)
•  Two general purpose I/O ports
GR740 Device Overview
Debug interfaces and Debug bus
•  Debug bus
– Debug support unit
– PCI trace buffer
– AHB trace buffer, monitoring Master IO bus
– APB bridge allows direct access to performance counters
•  Debug links
– JTAG Debug Communication Link
•  Bandwidth: 500 kb/s
– RMAP target
•  Bandwidth 20 Mb/s
– Ethernet Debug links
•  Bandwidth: >100 Mb/s
•  Can optionally be connected to Master
IO bus
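To put the debug link bandwidths in perspective, the sketch below estimates how long a software image would take to upload over each link. It assumes "kb/s" and "Mb/s" mean bits per second, ignores protocol overhead, and uses a hypothetical 8 MiB image; the Ethernet figure is a lower bound since the slide quotes ">100 Mb/s":

```python
DEBUG_LINK_BITS_PER_S = {
    "JTAG": 500e3,
    "SpaceWire RMAP": 20e6,
    "Ethernet (EDCL)": 100e6,  # slide says ">100 Mb/s"; treated as a lower bound
}


def transfer_time_s(image_bytes: int, bits_per_s: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and latency."""
    return image_bytes * 8 / bits_per_s


IMAGE_BYTES = 8 * 1024 * 1024  # hypothetical 8 MiB image

for link, rate in DEBUG_LINK_BITS_PER_S.items():
    print(f"{link}: {transfer_time_s(IMAGE_BYTES, rate):.1f} s")
```

The two-orders-of-magnitude gap between JTAG and Ethernet is the practical reason the faster links matter when loading large images or reading back trace buffers.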
GR740 Device Overview
Improved Debug Support
•  Debug support improved compared to earlier LEON devices
– High-speed debug interfaces
– Non-intrusive debugging through dedicated Debug bus
– AHB trace buffer with filtering
– Instruction trace buffer with filtering – can be read during
execution
– Hardware data watchpoints, Data area monitoring
•  Improved profiling support – with support for filtering
– I/D cache/TLB miss/hold
– Data write buffer hold, Branch prediction miss
– Total/Integer/FP instruction count
– Total execution count
– L2 accesses, misses
– AHB bus statistics
– Interrupt time stamping
GR740 Device Overview
Resource partitioning
•  Resource partitioning allows running separated software
instances
– The architecture has been designed to support SMP, AMP and
mixtures of the two (example: 3 CPUs running Linux or VxWorks SMP and
one running RTEMS)
– The L2 cache can be set to 1 way/CPU mode. Cache has fence
registers that can be used to protect software.
– IRQs can be masked/routed separately to each CPU
– The I/O peripherals’ register interfaces are located on separate 4 KiB
pages to allow (via MMU) restricting user-level software from
accessing the “wrong” peripheral
– IOMMU allows placing DMA peripherals
into groups and offers modes with
protection and address translation
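The point about separate 4 KiB pages can be illustrated with a check that no two peripherals share an MMU page, since a shared page could not be granted to one partition without exposing the other peripheral. The addresses and device names below are made up for illustration and are not GR740 register addresses:

```python
PAGE_SIZE = 4096  # 4 KiB MMU page


def page_of(address: int) -> int:
    """Index of the 4 KiB page containing the address."""
    return address // PAGE_SIZE


def shared_pages(register_bases: dict) -> list:
    """Return pairs of peripherals whose register bases fall in the same page."""
    first_in_page = {}
    clashes = []
    for name, base in register_bases.items():
        page = page_of(base)
        if page in first_in_page:
            clashes.append((first_in_page[page], name))
        else:
            first_in_page[page] = name
    return clashes


# Hypothetical layout: dev_c sits in the same page as dev_b.
layout = {"dev_a": 0x1000, "dev_b": 0x2000, "dev_c": 0x2800}
print(shared_pages(layout))  # [('dev_b', 'dev_c')]
```

With every peripheral on its own page, as described above, the clash list is empty and each page can be mapped into exactly one software instance.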
How to use GR740
Taking advantage of the four LEON4FT
•  Advantage: More processing power, more functions on one chip
•  Design goal of maximum average performance has a cost in
jitter/predictability
•  Linux/VxWorks/eCos have SMP support.
– Developers hesitant to trust SMP kernel
– RTEMS SMP development ongoing
•  UP instances of RTEMS/VxWorks/eCos/Bare-C/Other can be used
by linking images to separate memory areas
– Booting multiple images is supported by MKPROM2
– May need static MMU tables to enforce (space) separation
– Developer needs to assign HW resources
– Apart from added set-up work, nothing new
•  More functions on one chip
•  Cost is added jitter
How to use GR740
PROM-less / SpW applications
•  PROM-less booting possible via SpaceWire
– Connect via RMAP
– Configure main memory controller
– Use HW memory scrubber to initialize memory
– Enable L2 cache
– Upload software
– Assign processor start address(es)
– Start processor(s)
•  SpaceWire router, with eight external ports, is fully functional
without processor intervention.
•  Device can also act as a software/processor-free bridge between
SpaceWire and PCI/SPI/1553 etc.
– IOMMU can be used to restrict RMAP
access.
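The PROM-less boot steps listed above are order-dependent (for example, memory must be configured before software is uploaded). The sketch below only captures that ordering: the step names paraphrase the slide, and the `actions` callables are hypothetical stand-ins for whatever RMAP tooling actually performs each step; no GR740 register accesses are shown:

```python
from typing import Callable, Dict, List

BOOT_SEQUENCE: List[str] = [
    "connect via RMAP",
    "configure main memory controller",
    "initialize memory with HW scrubber",
    "enable L2 cache",
    "upload software",
    "assign processor start address(es)",
    "start processor(s)",
]


def run_boot_sequence(actions: Dict[str, Callable[[], None]]) -> List[str]:
    """Run one user-supplied action per step, strictly in slide order."""
    missing = [step for step in BOOT_SEQUENCE if step not in actions]
    if missing:
        raise KeyError(f"no action supplied for: {missing}")
    completed = []
    for step in BOOT_SEQUENCE:
        actions[step]()  # hypothetical hook, e.g. a series of RMAP writes
        completed.append(step)
    return completed


# Dry run with placeholder actions that just log the step name.
log: List[str] = []
dry_run = {step: (lambda s=step: log.append(s)) for step in BOOT_SEQUENCE}
run_boot_sequence(dry_run)
print(log == BOOT_SEQUENCE)  # True
```

Checking for missing actions before running anything avoids leaving the target half-configured.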
How to use GR740
Clock gating
•  Clock gating is controlled via clock gating unit
– Automatic clock gating of processor cores that are in idle mode
– Separate gating of floating-point units. FPU is gated-off when it is
disabled.
– Clock gating unit also controls clock and reset for the following
peripherals:
•  Ethernet controllers
•  SpaceWire router
•  PCI target/initiator with DMA unit
•  MIL-STD-1553B controller
•  CAN 2.0B controller
•  UARTs
•  SPI controller
•  PROM/IO memory controller
•  Debug bus is gated-off when DSU is
disabled.
GR740 New Features
Summary of (some of the) new features
•  Features in GR740 not found in most present day LEON/LEONMP architectures:
– Quad-core LEON4FT
– L2 cache with locking
– Wide AMBA buses
– Improved support for partitioning
•  IOMMU
•  Per-processor timers and interrupt controllers
– Improved debug support (#links, filters, performance counters)
– Improved support for AMP (address mapping, number of cores)
– Boot options (PROM, RMAP, PCI)
– Interrupt time stamping
– Hardware memory scrubber
Key Performances
Clock frequencies
•  System clock (CPUs, L2 cache, on-chip buses)
– Nominal frequency is 250 MHz, generated by PLL from external 50
MHz clock (STA and prod. test)
– Full temp range (-40 to +125 °C Tj) with margins for aging and clock
jitter
– 4 CPUs x 250 MHz x 1.7 DMIPS/MHz = 1700 DMIPS
•  Memory clock
– 100 MHz supported internally and achieved on evaluation board
(using commercial SDRAMs and external clock buffer).
– Achievable clock frequency on space-grade board will depend on
I/O timing and clocking scheme.
– Some mitigation techniques have been implemented to support
high-load scenarios (2T command signalling, duplicated CS# lines)
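The aggregate DMIPS figure and the PLL ratio above are simple products and quotients. A minimal sketch, with illustrative function names, using only numbers quoted on this slide:

```python
def peak_dmips(cores: int, clock_mhz: float, dmips_per_mhz: float) -> float:
    """Aggregate Dhrystone throughput, assuming perfect scaling across cores."""
    return cores * clock_mhz * dmips_per_mhz


def pll_multiplier(system_mhz: float, reference_mhz: float) -> float:
    """Ratio between the generated system clock and the external reference."""
    return system_mhz / reference_mhz


print(peak_dmips(4, 250, 1.7))   # about 1700 DMIPS
print(pll_multiplier(250, 50))   # 5.0
```

The 1700 DMIPS value is a peak figure: it assumes all four cores scale perfectly, which the multicore benchmark results show is only approached, not reached.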
Key Performances
Clock frequencies
•  SpaceWire PHY: 400 MHz
– Generated by separate PLL from external clock input (50 MHz
nom)
– Receiver is sampling with DDR
– Successfully received packets at over 600 Mbit/s in tester
– Point-to-point transfer at 400 Mbit/s verified on prototype.
•  Gigabit Ethernet
– Fixed 125 MHz input clock for Gbit RX and TX
GR740 vs. Existing Cobham Processors
Aeroflex Colorado Springs/Gaisler
Processor | DUAL LEON3FT | LEON3FT | LEON3FT | LEON3FT | QUAD LEON4FT
Identifier | GR712RC | UT699 | UT699E | UT700 | GR740
Foundry | Tower | TSMC | TSMC | TSMC | ST
Clock Frequency (MHz) | 100 | 66 | 100 | 166 | 250
DMIPS/Core | 140 | 92 | 140 | 233 | 425
Cache I/D (KiB) | 16/16 | 8/8 | 16/16 | 16/16 | 16/16
MMU | Yes | Yes | Yes | Yes | Yes
SpaceWire | Up to 6 x 200 Mb/s DMA/RMAP | 2 x DMA, 2 x DMA/RMAP | 4 x DMA/RMAP | 4 x DMA/RMAP | 4 x DMA/RMAP, router with 8x SpW ports
CAN | 2 | 2 | 2 | 2 | 2
PCI | No | 1 | 1 | 1 | 1
1553 | 1 | No | No | 1 | 1
Eval board | Available | Available | LEAP | LEAP | Available
GR740 vs UT699/UT699E/UT700
•  LEON4 in GR740 improves performance (1.7 DMIPS/MHz vs. 1.4
DMIPS/MHz).
•  Maximum frequency increase: > 250 MHz for GR740
•  Quad-processor system provides additional performance
improvement. Up to a speed-up of four but in reality lower due
to shared bus and SW synchronization requirements.
•  UT* has 10/100 Mbit Ethernet. GR740 has 10/100/1000 Mbit.
•  UT699/UT699E lacks MIL-STD-1553B. Present in GR740 and
UT700.
•  GR740 provides four AMBA ports and eight SpaceWire ports with
a router. UT* has four SpaceWire interfaces.
GR740 vs GR712RC
•  LEON4 performance improvement over LEON3FT
•  250 MHz GR740 vs 100 MHz GR712RC
•  Quad-core system with Level-2 cache vs. dual-core system with
shared memory controller.
•  Level-2 cache reduces impact of shared memory.
•  GR712RC has shared resources for memory controller, timer unit.
GR740 improves HW support for partitioning by mapping
addresses on 4k boundaries and including additional HW units.
•  Timing / interference analysis possible for dual-core GR712RC
system as demonstrated by CNES. Shared L2 cache more difficult
to analyze but this is mitigated by inclusion of performance counters to count
accesses to shared resources and L2 partitioning.
Conclusion
•  Development board and prototype devices are available
•  FM qualification planned for 2018
•  Quad-processor system @ 250 MHz (125 deg C, 20y lifetime)
•  GR740 is immediately supported by Cobham Gaisler software
packages and development tools.
•  Latest news: http://www.gaisler.com/GR740