SCALABLE MULTICORE DESIGN FOR TEST INTERFACE DESIGN
CHONG SHI HOU
A project report submitted in partial fulfillment of the
requirements for the award of the degree of
Master of Engineering (Computer and Microelectronics System)
Faculty of Electrical Engineering
Universiti Teknologi Malaysia
NOVEMBER 2007
To my beloved family
ACKNOWLEDGEMENTS
Throughout this project, I received a great deal of encouragement and assistance from my project supervisor, Professor Dr. Abu Khari b. A’ain. I would like to express my most sincere gratitude to him for his guidance, support, motivation and help.
I would also like to express my heartfelt appreciation to my beloved mother and sister for their encouragement, love, understanding, care and ceaseless support in all my endeavors.
Thirdly, I am very thankful to my department for sponsoring my master's program, and to my managers, Patrick Tan Teck Wee and Chew Huat Chin in particular, for their understanding and support as I pursued this part-time Master's study.
Finally, I would like to thank all my friends for their help, with special thanks to the Universiti Teknologi Malaysia lecturers for sacrificing their weekends to travel to Penang to lecture.
ABSTRACT
As the industry moves towards converged core designs and multicore processor products, DFx design and testing strategies must keep pace with shorter product development cycles, growth in test content and test time, and the reuse of converged core designs across product proliferations. Core-level test content must therefore be reusable, multiple cores must be testable concurrently while still allowing individual cores to be isolated, and the multicore DFx interface must be scalable to facilitate product proliferation. The main objectives of this project are to investigate the major problems of multicore Design For Testability (DFT) interface design, to analyze the pros and cons of several industrial multicore DFT interface designs, and to propose and implement a novel design that tackles three major problems of multicore DFT interface design: scalability, concurrent testability and trace reusability.
ABSTRAK
As industry today moves towards converged-core designs and multicore processor products, DFx, as one of the design and test strategies for semiconductor products, is becoming increasingly important. DFx functions must now keep pace with ever shorter product development cycles, accommodate the growth in test time and test content, and integrate with the reuse of converged-core designs. The main objectives of this project are to study and investigate the problems that exist in interface design for multicore Design For Testability (DFT), and to analyze and understand the advantages and disadvantages of the various multicore DFT designs found in industry today. In addition, the outcome of this project includes the proposal and implementation of a new design intended to overcome three main problems in multicore DFT design, namely scalability, concurrent testability and trace reusability.
TABLE OF CONTENTS
DECLARATION
DEDICATION
ACKNOWLEDGEMENTS
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
LIST OF APPENDICES
CHAPTER 1 PROJECT OVERVIEW
1.1 Background and Research Motivation
1.2 Problem Statements
1.3 Objectives
1.4 Scopes of Works
1.5 Constraints and Assumptions
1.6 Significance of Work and Project Contributions
1.7 Research Methodology, Techniques and Tools
1.8 Organization of Project Report
CHAPTER 2 BACKGROUND AND LITERATURE REVIEWS
2.1 Background of Various Multicore and DFX Interface Designs
2.2 Literature Reviews of Various Industry Multicore DFX Interface Architectures
2.2.1 IEEE P1500 Standard for Embedded Core Test
2.2.2 Whetsel’s Multiple TAP Architecture
2.2.3 Oakland’s Multiple TAP Architecture
2.2.4 Parulkar et al.’s Multiple TAP Architecture
2.3 Analysis of Industry Multicore DFX Architecture
2.4 Requirement of A Better Multicore DFX Architecture and Design
CHAPTER 3 MULTICORE DFX INTERFACE DESIGN
3.1 Modular Approach of The Design
3.2 Requirements, Challenges and Design
3.2.1 Serial Mode
3.2.2 Concurrent Mode
3.2.3 Trace Reusability
3.2.4 Scalability
3.3 Overall High Level Diagram of The Proposed Multicore DFX
CHAPTER 4 DESIGN IMPLEMENTATION AND RTL CODING
4.1 Choice of Language and Simulator
4.2 High Level Modular Block Design
4.3 TAP FSM Design and System Verilog Coding
4.4 Design and RTL Coding of IEEE 1149.1 Standard
4.4.1 The Use of System Verilog Macros
4.4.2 Instruction Register
4.4.3 Instruction Decode Control Logic
4.4.4 Bypass Register
4.4.5 Device Identification Register (IDCODE)
4.4.6 IEEE 1149.1 Compliant Instructions
4.4.7 TDO MUX
4.5 Design and RTL Coding of Multicore DFX Logics
4.6 Design and RTL Coding of MISR
CHAPTER 5 SIMULATION, VERIFICATION AND ANALYSIS
5.1 Simulation, Verification and Analysis for Serial Mode
5.2 Simulation, Verification and Analysis for Parallel Mode
5.3 Challenges and Solutions
5.4 Discussion
CHAPTER 6 PROPOSAL FOR PRE-SILICON VALIDATION METHODOLOGIES AND TOOLS
6.1 Purpose and Importance of Pre-Silicon Validation for DFX Design
6.2 Proposal for Pre-Silicon Validation Methodologies, Flow and Tool
CHAPTER 7 CONCURRENT TESTABILITY AND TEST CONTENT REUSABILITY MODEL
7.1 Concurrent Testability
7.2 Test Content Reusability
CHAPTER 8 SUMMARY AND FUTURE WORK
8.1 Summary
8.2 Future Works
REFERENCES
Appendices A - B
LIST OF TABLES
TABLE NO. TITLE
2.1 All Possible Scan Chain Mode Configurations
2.2 Comparison of Multicore TAP Architectures
3.1 Serial Mode Configurations
3.2 Concurrent Mode Configurations
4.1 Comparison of various RTL Modeling Languages
4.2 Comparison of various RTL Modeling Languages
4.3 Table of Bits Representing Each of the States for the TAP FSM
4.4 Dedicated TAP Pins
5.1 Enabling Items for Simulation and Verification
5.2 Comparisons to Illustrate the Advantage of My Design
LIST OF FIGURES
FIGURE NO. TITLE
2.1 System Chip with P1500 Wrapped Cores
2.2 Whetsel’s TAP Architecture
2.3 Circuit Implementation of TLM
2.4 Oakland’s TAP Architecture
2.5 Parulkar’s Parallel Test Data Architecture
3.1 Full Daisy-Chain Connectivity in Serial Mode
3.2 Core2 TAP bypassed in Serial Mode
3.3 Proposed Serial Mode Interface Logics
3.4 Proposed Concurrent Mode Interface Logics
3.5 Upward Lateral Scalability for Serial Mode Logics
3.6 Upward Hierarchical Scalability for Concurrent Mode Logics
3.7 Proposed Multicore DFx Interface
4.1 High-Level Modular Block Diagram
4.2 TAP Controller Finite State Machine
4.3 TAP FSM Interface Signals
4.4 Simplified Block Diagram of TAP
4.5 TAP Instruction Register and Shift Register
4.6 Operation of the TAP Instruction Register
4.7 Decoding of TAP Instructions
4.8 Structure of the Device Identification Register
4.9 A 16-bit MISR Design
5.1 Dualcore Serial Mode Simulation Results
5.2 Quadcore Serial Mode Simulation Results
5.3 Dualcore Parallel Mode Simulation Results
5.4 Quadcore Parallel Mode Simulation Results
5.5 Mace Environment Diagram
5.6 Contemporary Design #1 on an industry Multi-core CPU (2003)
5.7 Contemporary Design #2 on an industry Multi-core CPU (2004)
6.1 Proposed Pre-Silicon Validation Flows
7.1 Traditional Testing with Most Functionality on the Tester
7.2 Multicore Testing with Most Functionality on the Chip
7.3 Test Flow Routine demonstrating on-die comparison
7.4 Test Flow Routine demonstrating on-tester comparison
7.5 Sample assembly code test content reusable for any core
7.6 Sample test routine code showing test content at chip-level reuse
7.7 Sample test routine code showing test content at core-level reuse
LIST OF ABBREVIATIONS
API - Application Programming Interface
ATPG - Automated Test Pattern Generation
BIST - Built-in Self Test
DFT - Design For Test
DFX - Collective term for Design for Test (DFT), Design for Debug (DFD) and Design for Manufacturing (DFM)
DR - Data Register
FRC - Functional Redundancy Check
FSM - Finite State Machine
HVM - High Volume Manufacturing
IP - Intellectual Property
IR - Instruction Register
IDCODE - Identification Code
MISR - Multiple Input Signature Register
MUX - Multiplexer
RTL - Register Transfer Level
SECT - Standard for Embedded Core Test
SoC - System-on-Chip
TAP - Test Access Port
TLM - TAP Linking Module
TTM - Time to Market
LIST OF APPENDICES
APPENDIX TITLE
A RTL coding for design implementation
B MACE coding for test writing for verification
CHAPTER 1
INTRODUCTION
This project report proposes a scalable multicore DFx interface design. The purpose of the design is to tackle various testing issues arising from multicore designs and products. This chapter discusses the issues and problem statements, which provide the framework for the project objectives. It covers the background and research motivation, problem statements, objectives, scope of work, constraints and assumptions, significance of the work, research methodology and, finally, the report organization.
1.1 Background and Research Motivation
Multicore processors are not an invention of the 21st century. There is general consensus that the embedded market has been the leading innovator in architecting single-chip multiprocessor systems. Off-the-shelf multicore CPUs have been on the market since at least 1995, when the Texas Instruments (TI) TMS320C80 video processor was shipped. Even before 1995, companies such as Siemens, Philips, Fujitsu and NEC may have built customized multicore CPU chips.
The exponential growth of cellphones, storage devices, consumer electronics,
general purpose and server computing as well as automotive applications is driving
the demand for multicore processing. Multicore processing is a growing industry
trend as single core processors rapidly reach the physical limits of possible
complexity and speed. Companies that have produced or are working on popular
multicore products include Intel, AMD, ARM, Broadcom, Sun and IBM.
Homogeneous multicore products have been commonplace in server computing, as demonstrated by IBM, Sun, Intel and AMD over the years. In recent years they have also started creeping into the desktop and mobile computing realms. The ARM MPCore processor has also been offered for consumer devices ranging from set-top boxes to cell phones.
Heterogeneous multicore computing is not particularly new either. Such systems, in which a problem's workload is split between a general-purpose processor and one or more specialized, problem-specific processors, have been around since the mid-1980s. Notable historical examples include Floating Point Systems' array processors, the Inmos "Transputer" and the Connection Machine. Today's attached processor systems, besides GPUs, include ClearSpeed's accelerator systems and the Ageia PhysX physics processing unit. In the processor realm, the IBM Cell Broadband Engine (also known as "Cell BE" or simply "Cell") is the best example of an entirely heterogeneous multicore processor. The difference today is packaging: these processor systems are delivered as systems-on-a-chip (SoC). The heterogeneous multicore SoC integration trend is very likely to continue if IBM's Cell, the AMD/ATI merger and Intel's work in the GPGPU domain are indications of commercial trends.
A multicore architecture can increase the efficiency of processing multiple tasks simultaneously and enables designers to optimize computation and data flow with homogeneous or heterogeneous architectures. However, it also gives rise to duplicated front-end design effort in converged core architectures, growing test content in product development, larger product development engineering headcount, growing test time and tester platform costs, and greater complexity in debugging multicore and intercore failures.
Although dual-core and quad-core products have only recently become the norm, the trend of increasing the number of homogeneous or heterogeneous cores in microprocessor and SoC products will not stop here. The research motivation in the area of multicore DFx interface design is therefore clear: to increase converged core design scalability, improve test content reusability, reduce test time, improve debuggability and achieve significant test cost savings.
1.2 Problem Statements
As multicore processor designs grow in complexity, with more architectural features and more transistors per physical core, and as the number of homogeneous and heterogeneous physical cores increases (dual-core, quad-core, eight-core and even SoCs with multiple heterogeneous cores), many multicore-related engineering issues have appeared. These include duplicated design effort, high product development headcount, longer production test time, significantly more production test content, additional tester costs and longer time-to-market.
More specifically, the issues can be categorized into the following problem statements.
(i) The problem of scalability of the multicore DFx interface: Whether the product is a homogeneous multicore, a heterogeneous multicore, a multi-chip package or a multicore SoC, a scalable multicore DFx interface has to be planned and designed upfront in the early stage of the project. Otherwise the design will not be able to scale to more cores or to more hierarchical multi-chip packaging, or to scale down to fewer cores for lower-end product segments. A non-scalable multicore DFx interface design requires a lot of design rework for product proliferations, and the cost of redesign is huge in terms of engineering resources, time and money.
(ii) The problem of trace reusability: Homogeneous and heterogeneous multicore products have their own unique core-level traces. In many practical cases in industry, these cores are not redesigned from scratch but are improved instantiations of previous designs. A high percentage of core-level, and potentially chip-level, trace reuse from their predecessors is therefore expected. A low percentage of trace reuse translates directly into more test content, longer test time, more engineering headcount and more tester equipment, and hence a significant increase in production cost.
(iii) The problem of concurrent testability: In a homogeneous multicore design, the cores are generally logically identical. Without concurrent testing capability, production test time in high volume manufacturing (HVM) becomes a multiple of that of the single-core product. This has a serious impact on both test time and the cost of additional testers or testing platforms. In addition, without a concurrent testing and comparison mode, engineers may need to go through iterations of the pass-fail flow to determine the failing signature.
(iv) The problem of debuggability: Without any multicore DFx interface capability for debug, core failures in the product are tedious to debug, because engineers have no easy-to-use mechanism to quickly determine which core is failing and to isolate it. Without a multicore DFx interface, a more troublesome pass-fail flow has to be used to determine the failing signature.
Given the limited time frame and the relative novelty of this research area, this project is narrowed down to the first three problems above, namely scalability, test content reusability and concurrent testing.
1.3 Objectives
With the problem statements defined, the project objectives are set as follows:
(i) The logical design of the multicore DFx interface must support upward lateral scaling, downward lateral scaling and hierarchical scaling.
(ii) The design must allow reuse of unique core test content as well as chip-level traces to a significant extent, without incurring substantial additional cost in terms of test time, test equipment and engineering resources.
(iii) A concurrent testing mechanism must be supported to allow on-die comparison of test results for parallel testing, with the flexibility of choosing any healthy homogeneous core’s signature as the reference.
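The idea behind objective (iii) can be illustrated with a small behavioral model. The sketch below is written in Python rather than the System Verilog used for the RTL in this project, and the 16-bit width, the feedback taps and all function names are illustrative assumptions, not the implemented design. Each core compresses its response stream into a MISR signature, and the signatures are then compared against a selectable reference core.

```python
MISR_WIDTH = 16
MASK = (1 << MISR_WIDTH) - 1
# Assumed feedback taps (x^16 + x^12 + x^5 + 1); the real MISR may use others.
POLY = 0x1021


def misr_step(state, data_in):
    """Advance the MISR by one clock with a 16-bit parallel input word."""
    feedback = (state >> (MISR_WIDTH - 1)) & 1
    state = (state << 1) & MASK
    if feedback:
        state ^= POLY
    return state ^ (data_in & MASK)


def misr_signature(responses, seed=0):
    """Compress a sequence of response words into a single signature."""
    state = seed
    for word in responses:
        state = misr_step(state, word)
    return state


def compare_on_die(signatures, reference_core):
    """Return the indices of cores whose signature differs from the reference."""
    ref = signatures[reference_core]
    return [i for i, sig in enumerate(signatures) if sig != ref]


# Four homogeneous cores run the same test; core 2 returns one wrong word.
good = [0x1234, 0xABCD, 0x0F0F, 0xFFFF]
bad = [0x1234, 0xABCD, 0x0F0E, 0xFFFF]
sigs = [misr_signature(r) for r in (good, good, bad, good)]
failing = compare_on_die(sigs, reference_core=0)
print(failing)  # -> [2]
```

Because the reference is just an index, any core believed to be healthy can serve as the golden signature, which is the flexibility the objective asks for.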
1.4 Scopes of Work
In line with the above objectives, the scope of work for this project includes:
(i) Performing an architectural analysis of various open industry multicore DFx interface designs, identifying their pros and cons, and deciding on areas of improvement towards a better multicore DFx interface architecture for design implementation.
(ii) Determining a parallel DFx feature (such as a MISR) to be implemented together with the Test Access Port (TAP) and the multicore DFx interface logic.
(iii) Determining a good choice of coding language and design simulator. In this case, System Verilog is chosen over VHDL and Verilog-95, and Synopsys’ VCS is chosen over Altera’s Quartus II and Mentor Graphics’ ModelSim. The reasons are explained later in this report.
(iv) Implementing a prototype of the behavioral and gate-level logic designs of the TAP FSM, performing logic simulation, and verifying the fundamental design correctness for the desired functionality.
(v) Researching and determining a good choice of pre-silicon validation language and tool. In this case, low-level logic validation uses the System Verilog language and the VCS simulator, and high-level usage model validation uses the e language and Specman, in preference to user-defined Perl macros and a user-defined API with VCS. Effort is also spent on proposing an appropriate combination of pre-silicon validation methodologies, flows and tools.
(vi) Researching and proposing test content reuse and concurrent testing strategies for the multicore DFx design.
1.5 Constraints and Assumptions
The field of multicore DFx interface design can be very broad and complicated, especially when tens of serial and parallel DFx features are integrated or when crossings between multiple frequency domains are involved. To keep this project focused on the multicore DFx interface design itself, the following constraints and assumptions are put in place:
(i) This project focuses on multiple homogeneous cores only.
(ii) All multicore DFx logic operates in the TCLK (TAP clock) domain. The TAP clock and core clock are assumed to operate at a safe ratio, such that the distribution delay of TAP signals in the core domain does not contribute to any speed path.
1.6 Significance of Work and Project Contributions
This project contributes directly to the objectives mentioned above. Achieving multicore DFx interface scalability reduces design cost significantly. In recent years, the proliferation of multiple products from a converged core design has become commonplace. Any proliferation of the converged core design, such as increasing the number of cores for the high-end market segment (say from 4 to 8) or reducing it for the low-end segment (say from 2 to 1), requires lateral scalability of the multicore DFx interface without additional design resources or significant redesign. The contribution in this area can therefore be as significant as enabling a very quick time-to-market (TTM) response for many proliferations or product line items, allowing a company to respond quickly to market demand, gain share in various market segments, or respond aggressively to competitors’ products with minimum design cost.
The capability of the multicore DFx interface design to support test content reuse and parallel testing, in turn, significantly reduces the engineering resources needed to regenerate test content for the many product proliferations of the same converged core design. It also allows significant savings in high volume manufacturing (HVM) test time, multi-million-dollar savings in functional, structural and system-level tester platforms, and savings in product engineering headcount.
For high-volume microprocessor products such as general-purpose CPUs and embedded cellphone processors, the savings from such a multicore DFx interface design can range from a few hundred thousand dollars to tens of millions of dollars. Its importance to a company’s operating cost and profit margin is therefore undeniable.
1.7 Research Methodology, Techniques and Tools
For this project to progress smoothly and achieve its objectives, structured and realistic planning must be put in place. All working procedures, workloads and timelines are identified upfront, and timelines are tracked separately using a Gantt chart.
The initial stage of this project focuses on architectural research into multicore DFx interface designs, as well as research into tools and coding languages. This research should not take too long, yet it is very important in laying the right foundation and paving the way for subsequent project stages. It is done by reviewing engineering journals and IEEE standards, as well as different vendor and in-house tools and coding languages for design and validation. Proprietary tools and languages are avoided to prevent unnecessary technical issues, difficulties in getting appropriate engineering support and difficulties in porting the design of this project.
For the architectural research, four industry multicore DFx interface designs are examined: the IEEE P1500 Standard for Embedded Core Test, Whetsel’s Multiple TAP Architecture, Oakland’s Multiple TAP Architecture and Parulkar et al.’s Multiple TAP Architecture. Their designs and implementations are scrutinized and analyzed, and conclusions are drawn on their pros and cons in terms of scalability, test content reusability and concurrent testability.
Next, a modular design approach is used for the behavioral and logical designs. The designs are subdivided into multiple logical blocks so that prototype code can later be written in corresponding design modules. The design logic is modularized in such a way that each module is logically and functionally meaningful (such as the TAP FSM), allows modular or unit-level validation, has minimal interface connections with adjacent modules and has limited dependency on logical changes in other modules. This ensures that design progress is smooth and systematic, minimizes unnecessary changes and makes the design code easier to manage.
The multicore DFx interface logic is designed first, followed by the other logical blocks, such as the Test Access Port (TAP) Finite State Machine (FSM), the Boundary Scan feature, the Instruction Register and the Bypass register. All the behavioral and logical designs are analyzed with respect to their functionality and usage models. Potential design, implementation, simulation, validation and usage model challenges are also discussed, together with proposed mitigation plans.
After the preliminary behavioral and logical design, the research focus shifts to the pre-silicon validation domain, where various industry pre-silicon validation tools, methodologies and flows are reviewed and analyzed. Based on this, a practical set of pre-silicon validation tools, methodologies and flows for this multicore DFx interface design is proposed.
Test content reusability and concurrent testability are also examined from the perspective of production testing. Various testing platforms and usage models are reviewed and discussed, and a practical test content reuse and concurrent testing strategy is then proposed.
With System Verilog chosen as the design coding language and Synopsys’ VCS as the simulator, preliminary prototype System Verilog code is written. Simulations are run, and the results are collected and analyzed. This part of the work can be tedious and many iterations may be needed, especially because initial code written from scratch can be buggy. Cycles of coding, simulation, basic verification, debug and recoding are repeated until a fundamentally functional design implementation is produced. Considerable engineering effort and time are expected for this part of the project. Once a functional prototype design has been produced, its simulation results are discussed, together with mitigation plans proposed for any challenges that arise.
Last but not least, with the time and effort spent up to this stage of the project, many conclusions can be drawn. The technical experience gained and the implementation difficulties encountered are very useful for proposing future work to continue and improve this project.
1.8 Organization of Project Report
This report is organized into eight chapters. The first chapter is the introduction, which covers the background, problem statements, objectives, scope, significance and contributions of the project. The end of the chapter deals with the methodology, tools and techniques employed in this project.
Chapter 2 provides literature reviews of various industry multicore DFx interface architectures. They are analyzed with respect to their scalability, test content reusability and concurrent testability. Detailed requirements for a better multicore DFx architecture design are then laid down, paving the way for the key milestone of this project.
Chapter 3 demonstrates the actual design and implementation of the multicore DFx interface architecture. It starts with the choice of a modular design approach and the selection of a design language and a simulator, followed by the high-level block diagram and the detailed logic implementation. Besides analyzing the design itself, it discusses the challenges encountered and the corresponding mitigation plans.
Chapter 4 shows the design and implementation of the TAP (Test Access Port) Finite State Machine (FSM) and Boundary Scan, as well as their integration with the multicore DFx interface. As in the previous chapter, the design and implementation make use of the modular approach and the same choice of coding language and simulator. Analysis of the design and discussion of the challenges encountered and the corresponding mitigation plans are covered too.
Chapter 5 presents the simulation, verification and analysis of the simulated results. Simulation is done to prove successful implementation of dual-core serial mode, quad-core serial mode, dual-core parallel mode and quad-core parallel mode.
Chapter 6 briefly discusses the importance of pre-silicon validation and gives
a literature review of various industry validation tools and methodologies. It then
discusses the proposal of pre-silicon validation for this multicore DFx design.
Chapter 7 then brings forward the importance of test content reusability and concurrent testability, reviews the literature on their role in multicore testing requirements, and proposes a test content reuse and concurrent testing strategy.
Last but not least, Chapter 8 summarizes the work done and proposes future work for anyone intending to carry on research in a similar scope.
CHAPTER 2
BACKGROUND AND LITERATURE REVIEWS
2.1 Background of Various Multicore and DFx Interface Designs
Chips comprising reusable cores have become an important part of the recent IC-based system design trend. The increasing use of pre-designed IP cores or modules in chips adds to the complexity of design and test. The integration of multiple pre-designed IP modules on a single chip, each having an IEEE 1149.1-compliant debug interface, introduces multiple TAP controllers on the chip. Some IP providers or pre-designed modules already use their own chip-level approach that supports debugging a single IP module via a single chip-level TAP.
To enable concurrent multicore testing and debug, a standard chip-level approach for accessing all TAP controllers is required. To date, there is no proper industry-standard multicore interface design, although multiple proposals have been put forward at IEEE conferences. Four of the popular proposals are investigated here, namely the IEEE P1500 Standard and Whetsel’s, Oakland’s and Parulkar et al.’s Multiple TAP Architectures.
2.2 Literature Reviews of Various Industry Multicore DFx Interface Architectures
2.2.1 IEEE P1500 Standard for Embedded Core Test
As mentioned above, in current multicore and system-on-a-chip development, no standard access mechanism exists for testing embedded logic cores. Each core provider develops its own process for isolating and testing the core. These methods sometimes conflict, making it difficult to use cores from multiple vendors in the same SoC design.
To improve the situation, the IEEE P1500 Standard for Embedded Core Test (SECT) working group was established. This team's goal is to provide an independent, openly defined DFT method for the "automatic identification and configuration of testability features in integrated circuits containing embedded cores." These features will offer a standard method for routing test data and commands from external pins on the device to any selected core within the SoC.
Figure 2.1 System Chip with P1500 Wrapped Cores
The IEEE P1500 Standard for Embedded Core Test is a scalable standard architecture for enabling test reuse and integration for embedded cores and associated circuitry. It forgoes addressing analog circuits and focuses on facilitating efficient test of the digital aspects of systems on chip (SoCs).
It describes a core wrapper architecture with a boundary scan chain that intercepts the core I/O. The wrapper has a serial port that connects to an instruction register, a bypass register, the boundary scan chain or any other shift register inside the core. It also provides a parallel test data port for higher-bandwidth test features built into the core. The core wrapper requires a chip-level TAP controller to operate it and a test access mechanism (TAM) to connect it to the chip pins.
However, it is important to note that the scope of P1500 covers only the standardization of core test mechanisms for core access and isolation, including protocols and test mode control. It does not cover the system chip test access mechanism, which is to be defined by the system chip integrator, nor the core test method (such as scan or BIST), which is defined by the core provider.
As a result, several more papers have been published proposing better system chip test access mechanisms.
2.2.2 Whetsel’s Multiple TAP Architecture [3]
One common method used in industry to enable access to multiple TAP controllers on a single chip is to concatenate the individual test data inputs (TDI) and outputs (TDO) into one long serial chain. The control signals TMS, TCK and TRST are shared by all TAP controllers. However, this type of daisy-chain implementation of TAP controllers on a single chip does not conform to the IEEE 1149.1 standard. According to the 1149.1 standard, when the mandatory BYPASS instruction is selected, a bypass register consisting of a single shift register stage must be selected between the chip’s TDI and TDO pins. Concatenating n TAP controllers results in a non-compliant n-bit shift register.
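The non-compliance can be seen by counting TCK cycles. The following Python sketch is a behavioral illustration only; the function names are hypothetical and it models nothing beyond the one-bit bypass registers. It shifts a single 1 into a chain of n TAPs that are all in BYPASS and reports how many cycles pass before that bit reaches TDO.

```python
def shift_through_bypass_chain(num_taps, tdi_bits):
    """Shift bits through num_taps one-bit BYPASS registers in series.

    Returns the TDO value seen on each TCK cycle, i.e. the content of the
    last register before it is updated on that cycle.
    """
    bypass = [0] * num_taps  # a BYPASS register captures 0 in Capture-DR
    tdo_bits = []
    for bit in tdi_bits:
        tdo_bits.append(bypass[-1])
        bypass = [bit] + bypass[:-1]
    return tdo_bits


def chain_latency(num_taps):
    """Number of TCK cycles before a 1 shifted in at TDI appears at TDO."""
    stimulus = [1] + [0] * (num_taps + 1)
    return shift_through_bypass_chain(num_taps, stimulus).index(1)


print([chain_latency(n) for n in (1, 2, 4)])  # -> [1, 2, 4]
```

A compliant chip must behave like the `num_taps = 1` case regardless of how many TAP controllers it contains, which is what a plain daisy chain cannot provide.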
To resolve the problem, and at the same time provide lateral and hierarchical
scalability, Whetsel [3] proposed an approach based on a TAP Linking Module
(TLM), as shown in the following figure.
Figure 2.2
Whetsel’s TAP Architecture
[Block diagram: TAP 0 to TAP 3, each with TDI, TDO, TCK and TMS ports, connected
through the TAP Linking Module to the chip pins TDI, TDO, TCK, TMS and TRST.]
The architecture shows the connection of multiple TAP controllers on a chip
with a TAP Linking Module (TLM). The TLM is an interconnect layer that allows
one or more of the TAP controllers to be included in a daisy-chain that is connected
to the chip’s TAP pins. As part of the requirement for IEEE 1149.1 compliance, the
TLM will connect the first TAP controller to the TAP pins after power up or TAP
reset, making the chip appear to have a single TAP controller. An instruction in
each TAP controller accesses a shared data register in the TLM which controls the
TAP connectivity, allowing the daisy chain to be reconfigured on the fly.
Figure 2.3
Circuit Implementation of TLM
The TLM circuit includes a TLM TAP controller, decode logic, a shift register,
and a link update register.
This IEEE 1149.1 compliance resolves the problem seen in the previous
daisy-chain-only implementation. In addition, Whetsel’s proposed
architecture is also laterally and hierarchically reusable.
Excluded TAP controllers are forced into the Run-Test/Idle state, allowing
them to run BIST in the background while selected TAP controllers execute other
instructions in the foreground. However, such daisy chains do not support chip-level
trace reuse, because the pattern depends on the sequential depth in the daisy chain of
the core being tested. Furthermore, only BIST-able tests can be run concurrently
across cores. In other words, concurrent core testability is limited to BIST
features.
Given these disadvantages, namely no chip-level trace reusability and poor
concurrent core testability, this architecture is not considered a suitable
multicore DFx interface design.
2.2.3. Oakland’s Multiple TAP Architecture [4]
Oakland proposed a different multiple TAP architecture, which does not use a
TAP Linking Module. Instead, the states of all TAP controllers stay synchronized.
For the last embedded TAP controller, its TDO output is connected both to a TDO
multiplexing logic and to a chip-level instruction register segment (IR0).
In the ShiftIR state, all of their instruction registers are connected. The
instruction shifted into the chip-level TAP controller determines whether a chip-level
data register such as the bypass bit or the boundary scan chain, or a daisy chain of the
data registers selected by the instructions in the core-level TAP controllers, is
connected between the chip TDI and TDO. Each TAP controller independently
executes its own instruction, but all TAP controllers stay in the foreground.
Figure 2.4
Oakland’s TAP Architecture
[Block diagram: chip-level TAP 0 with data registers A and B and instruction register
segment IR 0; embedded TAP 1 to TAP 3, each with TDI, TDO, TRST, TCK and TMS ports;
chip pins TDI, TDO, TRST, TCK and TMS.]
In addition to decoding chip-level instructions, the chip-level instruction
decode logic allows for one or more IR0 codes that select the embedded processors
as a Test Data Register (TDR). The length of the TDR is the sum of the lengths of
the TDRs selected within each processor core. For example, if the BYPASS
instruction has been loaded into each of the four processor cores’ instruction
registers, then the TDR has length four.
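The length rule above is simple arithmetic, sketched below in Python for illustration only. The helper name `chained_tdr_length` is hypothetical; the register lengths used are the one-bit BYPASS register and the 32-bit IDCODE register defined by IEEE 1149.1, and the per-core instruction choices are just examples.

```python
# Data register lengths in bits, as fixed by IEEE 1149.1.
DR_LENGTH = {"BYPASS": 1, "IDCODE": 32}


def chained_tdr_length(core_instructions):
    """Sum the lengths of the data registers selected in each core TAP."""
    return sum(DR_LENGTH[instruction] for instruction in core_instructions)


# All four cores in BYPASS: the combined TDR is four bits long.
print(chained_tdr_length(["BYPASS"] * 4))
# One core selecting IDCODE, three in BYPASS: 32 + 1 + 1 + 1 bits.
print(chained_tdr_length(["IDCODE", "BYPASS", "BYPASS", "BYPASS"]))
```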
An advantage of Oakland’s architecture is that it allows access to all TAP
controllers of embedded processors without the overhead of additional pins. This
architecture also supports lateral and hierarchical scalability.
Nevertheless, this design has a few drawbacks. In this daisy-chain
implementation, concurrent core testability is impossible. In addition, the data
registers of the chip-level TAP controller are connected in parallel with the data
registers in the other TAP controllers, making it impossible to access the chip-level
data registers and the embedded IP cores’ data registers at the same time. As a
result, this approach is likely to cause synchronization problems and instruction
cycle overhead, and it offers no chip-level trace reusability.
2.2.4. Parulkar et al.’s Multiple TAP Architecture [5]
Parulkar et al. apply Oakland’s solution to a multi-core processor and add
parallel test data ports to the cores as provisioned by P1500. Multiplexers on the
parallel test buses provide two options for connecting the ports of the chip and those
of the cores as shown in the following figure.
The first option daisy chains them between cores. The second option
broadcasts the chip input to the core input ports, selects one core’s output port to
connect to the chip outputs, and compares the cores’ outputs and sends a match bit to
a chip output pin. Cores can be disabled to make the chip appear to be a single-core
processor, and functional test patterns can be re-used for each one-core
configuration.
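Figure 2.5
Parulkar’s Parallel Test Data Architecture
[Block diagram: Core 0, Core 1 and the non-core partition placed between SI [1:N] and
SO [1:N] through multiplexers controlled by C0_SEL, C1_SEL and NC_SEL; with LOCKSTEP
set, an XOR compare drives LOCK_RESULT. The figure is drawn with C0_SEL = 1,
C1_SEL = 0, NC_SEL = 0 and LOCKSTEP = 1.]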
It is worth mentioning that for IC designs which adopt Scan ATPG,
Parulkar’s architecture is a good choice for the multicore DFx interface
implementation, as the core0, core1 and non-core architecture itself takes good care
of the necessary scan partitioning and bypass configurations. The following table
shows all the possible configurations of the scan partitions.
Table 2.1
All Possible Scan Chain Mode Configurations
SI to SO Paths              C0_SEL   C1_SEL   NC_SEL
Not used                    0        0        0
Core0                       1        0        0
Core1                       0        1        0
Non-core                    0        0        1
Core0 + Non-core            1        0        1
Core1 + Non-core            0        1        1
Core0 + Core1               1        1        0
Core0 + Core1 + Non-core    1        1        1
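The mapping in Table 2.1 can be restated as a short lookup. The Python sketch below is illustrative only; the function name `scan_path` is hypothetical, and the order in which the partitions are listed simply follows the table rather than any confirmed physical chain order.

```python
from itertools import product


def scan_path(c0_sel, c1_sel, nc_sel):
    """Return the SI-to-SO path selected by the three partition select bits."""
    selected = [
        name
        for name, sel in (("Core0", c0_sel), ("Core1", c1_sel), ("Non-core", nc_sel))
        if sel
    ]
    return " + ".join(selected) if selected else "Not used"


# Enumerate all eight select combinations.
for c0_sel, c1_sel, nc_sel in product((0, 1), repeat=3):
    print(c0_sel, c1_sel, nc_sel, scan_path(c0_sel, c1_sel, nc_sel))
```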
Another scan chain mode that can take advantage of hierarchical ATPG
involves the use of a scan lockstep capability. Lockstep mode is also called
On-die Functional Redundancy Check (FRC) mode. This mode has a one-bit signal
(LOCKSTEP) that allows each core to be tested with one set of ATPG patterns in
parallel with other identical cores. This means that concurrent testability for scan
ATPG is supported, whereby scan chains in core0 and core1 receive the same scan-in
data, and the responses scanned out of these two cores are compared internally. Any
mismatch is reported by a fail pin named LOCK_RESULT.
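The on-die comparison can be sketched behaviourally as follows. This Python model is an illustration, not Parulkar's circuit: the function name `lock_result` is hypothetical, a single scan chain per core is modelled instead of the N-bit SO bus, and LOCK_RESULT is assumed to be 1 on a mismatch (the polarity of an XOR compare).

```python
def lock_result(core0_so, core1_so):
    """Per-cycle LOCK_RESULT for two scan-out bit streams (1 = mismatch)."""
    if len(core0_so) != len(core1_so):
        raise ValueError("both cores must shift out the same number of bits")
    # Both cores received the same scan-in data, so a fault-free pair of
    # identical cores produces identical responses and the XOR stays at 0.
    return [bit0 ^ bit1 for bit0, bit1 in zip(core0_so, core1_so)]


good = [1, 0, 1, 1, 0]
faulty = [1, 0, 0, 1, 0]          # third response bit differs
print(lock_result(good, good))
print(lock_result(good, faulty))
```

Note that a comparison of this kind flags a difference between cores; two cores that fail identically would not be distinguished by it.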
This architecture clearly supports chip-level trace reusability. It also
provides good support for concurrent core testability: any DFT feature (such as
Scan ATPG, BIST, MBIST, Array DFT features, etc.) that can work in lockstep mode
can be tested concurrently, with on-die comparison of the output responses.
Nonetheless, this architecture does not fundamentally support design
scalability. Since this design supports only daisy-chain and two-way broadcast
configuration, any proliferation of the design will require significant re-design of the
multicore interface to allow lateral or hierarchical scalability.
2.3. Analysis of Industry Multicore DFx architecture
From the above literature reviews and design analysis, each of the four
multi-TAP architectures clearly has its advantages and drawbacks.
As mentioned in chapter 1, our focuses for the multi-TAP and multicore DFx
architecture are design scalability (both lateral and hierarchical), concurrent core
testability and chip-level trace reusability. Hence, a scoreboard is tabulated as
follows to summarize their respective grading against our objectives.
Table 2.2
Comparison of Multicore TAP Architectures
Multicore TAP    Chip-Level Trace    Concurrent Core          Scalability (Lateral
Architecture     Reusability         Testability              and Hierarchical)
P1500            No (core-level)     Yes                      Some (Lateral only)
Whetsel          No (core-level)     No (except for BIST)     Yes
Oakland          No (core-level)     No (Daisy-chain)         Yes
Parulkar         Yes                 Some (Lockstep mode)     No (2-way broadcast)
2.4. Requirement of a better Multicore DFx Architecture and Design
From the above table, none of the proposed architectures perfectly meets our
objectives in this project. However, they serve as very good design references, as
they reveal various advantages and disadvantages with respect to our design goal.
Of the four, Parulkar’s architecture appears to be the most successful, due
to its capability to support chip-level reusability and concurrent core testability
well. We will therefore use Parulkar’s architecture as our design starting point
and improve on its drawback in terms of scalability. This can be achieved by adopting
the scalable design from Whetsel and Oakland.
CHAPTER 3
MULTICORE DFx INTERFACE DESIGN
3.1 Modular Approach of the Design
To begin the design of the multicore DFx interface, a modular approach is adopted.
The design will be broken down into multiple modules, where each module can be
designed, coded and simulated independently at the early stage of the design phase,
before being combined into a final design.
Such an approach has become commonplace in digital system design. The benefits
of modular design include allowing the designer to focus attention on a single
module at a time, without being hindered by the complexities of the entire circuit;
allowing initial design from a high-level perspective, in which concepts are
important and low-level details can be temporarily ignored; and allowing similar
design blocks to be instantiated from an already coded module without needing to
recode from scratch.
3.2 Requirements, Problems and Design for Each Goal
3.2.1 Serial Mode
Serial mode is one of the most fundamental requirements for a multicore DFx
interface. Fundamentally, the serial mode can be used to concatenate the individual
test data inputs (TDI) and outputs (TDO) in one long serial chain, where all the IP
cores and possibly the non-core logics are being tested. Serial mode is intended for
use when the JTAG port is the only available tester connectivity – for example, in
burn-in or in a system.
Both Oakland’s and Whetsel’s architectures support serial mode, allowing the
core and non-core TAPs to be concatenated in a daisy-chain configuration.
However, closer scrutiny reveals a problem in the Oakland design. In
Oakland’s multi-TAP architecture, all the core TAPs must be connected in a specific
sequence, without any bypass option. This lack of bypass flexibility rules out any
core-disable feature in the chip design, so a manufacturer cannot simply disable one
or two cores of a multicore chip by fusing in order to target a lower market segment.
In business terms, the company will lose its competitive edge in responding quickly
to market segment demand. The lack of bypass flexibility can also cause a significant
problem during high volume production testing: if one or more cores in the multicore
chip are faulty or dead, the serial mode can no longer be tested at all.
Whetsel’s multi-TAP architecture, on the other hand, does not have this
problem. Whetsel implements multiplexers between the cores’ TDI and TDO
connections, allowing a specific core or a few cores to be bypassed while
maintaining the serial mode daisy-chain connection. Such a design feature is highly
desirable, and it forms our fundamental design reference and requirement for the
serial mode connection.
Generically, the following figures illustrate serial mode connection of cores
and non-core TAP, in both full daisy-chain connectivity, and bypass daisy-chain
connectivity.
Figure 3.1
Full Daisy-Chain Connectivity in Serial Mode
Figure 3.2
Core2 TAP bypassed in Serial Mode
For the serial mode to be functional and compatible with the IEEE 1149.1
standard, more refined requirements are needed. The design shall work in such a way
that all the TAP FSMs always stay in synchronization. In addition, the design must
be pin-compatible, meaning that at the package level it must have only five TAP
pins, namely TDI, TDO, TCK, TMS and TRST.
To meet the three additional requirements mentioned above, all the core and
non-core TAPs must use the same Finite State Machine design, with the same state
transitions. All of the TAPs must also be connected to the same TMS, TCK and TRST
pins, without any gating logic in between, to prevent the state machines of the TAP
controllers from diverging from each other. Broadcasting the same TMS, TCK and TRST
to every TAP is key to maintaining the coherent states expected by many tools.
Oakland’s architecture implements exactly this, but Whetsel’s does not. The problem
lies with the TRST signal: in Whetsel’s design, TRST goes into the TAP Linking
Module (TLM), which then distributes the intermediate control signals to the
multiplexers that control the TAP serial or bypass selection. This may give rise to
a TAP coherency issue if the design is not implemented with great care. Such a logic
implementation is therefore not desirable and shall be avoided.
Another issue seen in the Whetsel’s design is that each embedded TAP
controller needs to be modified to work with the TAP Linking Module (TLM). In a
situation where hard cores are used, such a modification will not be feasible.
Considering all these requirements and problems in the original design of
Oakland’s and Whetsel’s architectures, our design requirements are clear: to leverage
the bypass-enabled serial mode design from Whetsel’s architecture without
implementing the TLM, and at the same time leverage the synchronized TCK, TMS
and TRST implementation from Oakland’s architecture.
As a result, the proposed design on the serial mode portion of our multicore
DFx interface will be as illustrated in Figure 3.3, implementing on a dual-core design
consisting of a non-core TAP and two core-TAPs.
In this proposed implementation on a dual-core product, the chip has only one
5-pin JTAG port. TMS, TRST, and TCK are broadcast to all TAP controllers to keep
them in a coherent state as expected by many test tools. The non-core TAP controller
has additional control logics and registers providing three signals, namely MODE,
CORECONNECT[0] and CORECONNECT[1].
Figure 3.3
Proposed Serial Mode Interface Logics
To enter serial mode, the MODE signal is set to 1. The TAP controllers in the
chip are then configured into a daisy chain between the chip TDI and TDO pins, with
the non-core TAP controller always first. For full daisy-chain connectivity, both
CORECONNECT[0] and CORECONNECT[1] are set to 1. To bypass a core TAP, its
CORECONNECT signal is set to 0. The disconnected TAP controller’s TDI pin
is forced to logic 1 so that it always loads the BYPASS instruction defined by IEEE
Standard 1149.1.
All supported configurations of the serial mode are summarized in the
following table.
Table 3.1
Serial Mode Configurations
MODE   CORECONNECT[0]   CORECONNECT[1]   DESCRIPTION
0      X                X                Not allowed in serial mode
1      1                0                Core1 TAP bypassed
1      0                1                Core0 TAP bypassed
1      1                1                Full daisy-chain
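The serial mode configurations can be sketched as a small behavioural model. The Python code below is an illustration under stated assumptions and is not the project's RTL: the function names `serial_chain` and `core_tdi` are hypothetical, and the cores are assumed to follow the non-core TAP in index order (core0, then core1), which the text does not state explicitly.

```python
def serial_chain(mode, coreconnect):
    """Return the TAPs between chip TDI and TDO, in order, for serial mode."""
    if mode != 1:
        raise ValueError("MODE must be 1 for serial mode")
    chain = ["non-core"]                  # the non-core TAP is always first
    for index, connected in enumerate(coreconnect):
        if connected:                     # CORECONNECT[index] = 1: core in chain
            chain.append(f"core{index}")
    return chain


def core_tdi(connected, upstream_tdo):
    """TDI seen by a core TAP: the upstream TDO when connected, else forced to 1
    so that the disconnected TAP keeps loading the all-ones BYPASS opcode."""
    return upstream_tdo if connected else 1


print(serial_chain(1, [1, 1]))        # full daisy-chain
print(serial_chain(1, [1, 0]))        # Core1 TAP bypassed
print(serial_chain(1, [1, 1, 0, 1]))  # same rule applied to four cores
```

Because the chain is built by iterating over CORECONNECT, the same rule extends to more cores simply by widening that vector.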
3.2.2 Concurrent Mode
Concurrent mode is important to allow concurrent testing for multicore
products. This mode is especially vital for test time reduction in high volume
manufacturing testing in many-core chips.
From the literature review and analysis in chapter 2, it is understood that
neither Whetsel’s nor Oakland’s architecture supports concurrent mode. This is
because their designs are serial-mode centric, and lack the additional logic to
multiplex TDOs in parallel so that they can be compared with each other.
On the other hand, Parulkar’s architecture does provide concurrent mode
capability when the chip enters lockstep mode. Such a design idea is useful, but may
make our design more complicated. To implement concurrent mode logic that is
compatible with our serial-mode design, we propose the design as follows.
Figure 3.4
Proposed Concurrent Mode Interface Logics
For simplicity, the MODE signal is not included in Figure 3.4. However, it is
important to note that in order to enter concurrent mode, the MODE signal must be
set to 0. Serial and concurrent modes cannot be invoked at the same time.
With MODE=0, the logic is configured so that the core TAP controllers are in
parallel, with all of their TDI signals connected to the non-core TAP controller’s
TDO signal. Once again, a core TAP controller with CORECONNECT=0 simply runs the
BYPASS instruction. Core0 is selected as the reference core with CORESELECT=0, and
Core1 with CORESELECT=1. In either setting, the reference core’s TDO is compared
with the TDO of all other cores, and the result appears on COREMATCH. Any core
whose CORECONNECT=0 is masked. TDO_SELECT chooses whether to connect
the chip TDO to the reference core’s TDO, the non-core’s TDO, or COREMATCH.
Table 3.2
Concurrent Mode Configurations
MODE   CORECONNECT[0]   CORECONNECT[1]   CORESELECT   DESCRIPTION
0      X                1                0            Core0 is reference, both comparing results
0      1                X                1            Core1 is reference, both comparing results
1      X                X                X            Not allowed in Concurrent Mode
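The compare-and-select behaviour of concurrent mode can be sketched as follows. This Python model is illustrative only. The function names `core_match` and `chip_tdo` are hypothetical, COREMATCH is assumed to be 1 when every unmasked core agrees with the reference, and TDO_SELECT is modelled with the made-up string values "REFERENCE", "NONCORE" and "COREMATCH" because its real encoding is not given in the text.

```python
def core_match(core_tdo, coreconnect, coreselect):
    """COREMATCH: 1 when every unmasked core's TDO equals the reference TDO."""
    reference = core_tdo[coreselect]
    for index, tdo in enumerate(core_tdo):
        if index == coreselect or not coreconnect[index]:
            continue                  # skip the reference itself and masked cores
        if tdo != reference:
            return 0
    return 1


def chip_tdo(tdo_select, core_tdo, noncore_tdo, coreconnect, coreselect):
    """Chip TDO as chosen by TDO_SELECT (string encoding is an assumption)."""
    if tdo_select == "REFERENCE":
        return core_tdo[coreselect]
    if tdo_select == "NONCORE":
        return noncore_tdo
    if tdo_select == "COREMATCH":
        return core_match(core_tdo, coreconnect, coreselect)
    raise ValueError(f"unknown TDO_SELECT value: {tdo_select}")


# Core0 is the reference; Core1 disagrees on this cycle.
print(core_match([1, 0], [1, 1], coreselect=0))
# Same responses, but Core1 is masked with CORECONNECT[1] = 0.
print(core_match([1, 0], [1, 0], coreselect=0))
```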
3.2.3 Trace Reusability
Trace reusability is one of our key objectives. A multi-TAP architecture with
a proper multicore DFx interface design will allow high trace reusability. Whetsel’s
and Oakland’s architectures are designed only for core-level trace reusability, not
chip-level trace reusability. The reasons lie in their design. Whetsel’s daisy chains
do not support chip-level trace reuse because the pattern depends on the sequential
depth in the daisy chain of the core being tested. Oakland’s architecture does not
have a proper non-core logic implementation or lockstep mechanism to allow
chip-level trace reuse. On the other hand, Parulkar’s architecture comes with
lockstep mode, which allows good trace reusability for functional testing.
In our proposed design, we do not adopt lockstep mode, to avoid making the
design complicated. However, our design makes use of the non-core TAP controller as
the master control for the various multi-TAP configurations. This allows us to
maintain the bus for the same non-core, and the same connections between the
non-core and the cores, therefore minimizing the impact on traces. Our design also
has a symmetric non-core TAP controller and logic, which is very important to ensure
that core traces match for all cores. As a result, our design provides good
chip-level trace reusability.
3.2.4 Scalability
Scalability, as mentioned in chapter 1, is important to allow proliferation of a
converged core design with minimum design cost and fast time to market. Whetsel’s
and Oakland’s architectures support both lateral and hierarchical scalability.
However, Parulkar’s design does not support design scalability, as its daisy-chain
and two-way broadcast configurations would require significant re-design.
Since we adopt the scalable concept from Whetsel’s and Oakland’s designs, our
multi-TAP architecture supports upward and downward lateral scalability, as
well as hierarchical scalability.
Upward lateral scalability for serial mode logics is illustrated in Figure 3.5,
using an example of proliferating a dual-core multi-TAP configuration into a
quad-core multi-TAP configuration. Design cost is minimal in terms of logic and
routing. Similar scaling applies to concurrent mode logics too.
Figure 3.5
Upward Lateral Scalability for Serial Mode Logics
Upward hierarchical scalability for concurrent mode logics is illustrated in
Figure 3.6. Again, design cost is very minimal.
Figure 3.6
Upward Hierarchical Scalability for Concurrent Mode Logics
3.3 Overall High Level Diagram of the Proposed Multicore DFx
By combining the serial mode and concurrent mode logics together in a
high-level diagram, as illustrated in Figure 3.7, the proposed multicore DFx
interface serves our key objectives of trace reusability, concurrent core testability
and scalability, while maintaining pin compatibility as well as IEEE1149.1
compatibility.
Figure 3.7
Proposed Multicore DFx Interface
CHAPTER 4
DESIGN IMPLEMENTATION AND RTL CODINGS
4.1 Choice of Language and Simulator
There are many languages available for RTL modeling. Some are proprietary
modeling languages developed by design companies, and others are publicly adopted.
Among these, the more popular ones are VHDL, Verilog 95 and System Verilog.
When choosing a modeling language, there are a few criteria to consider,
among them the level of adoption in the industry, the portability of the language,
the availability of a decent simulator that works with the language, the level of
abstraction, and the capability of the language to model the required logic.
VHDL, Verilog 95 and System Verilog are all highly portable, as their level
of adoption in the industry is very wide. However, in recent years, as System
Verilog has become more and more powerful, many high tech design companies have
started migrating towards System Verilog. As such, over time, System Verilog will
be a wise choice in terms of design portability, level of support available, and
level of industry adoption.
VHDL and Verilog 95, while still very useful, are somewhat losing out to
System Verilog in terms of the capability to code and model logic at a higher level
of abstraction and in a more object-oriented style. For example, two or three lines
of C-like for-loop syntax in System Verilog can easily and neatly model very large
and complicated array logic. From a validation perspective, if fewer lines are
needed to code the logic, the likelihood of introducing bugs is reduced.
The table below summarizes the comparison among various RTL modeling
languages. Looking at the advantages of System Verilog, it becomes a very obvious
choice for the modeling language to be adopted for this project.
Table 4.1
Comparison of various RTL Modeling Languages
                            Proprietary   VHDL     Verilog 95   System Verilog
Adoption in the industry    Low           Wide     Wide         Very wide
Design Portability          Low           High     High         High
Decent Simulator support    Low           Wide     Wide         Wide
Abstraction Level           Low           Medium   High         Very High
A simulator is a tool or software package used for verification of the logic
model, which behaves or operates as if the model were a functional system when
provided with a set of input stimuli. To choose a suitable simulator for this
project, a few popular choices were considered, such as VCS from Synopsys, Modelsim
from Mentor Graphics and Quartus II from Altera.
While all three simulators are very powerful and popular, licensing, ease of
debug and compatibility with other validation languages become the decisive
factors.
After some investigation and research, it is understood that the older version
of Quartus II, which provides a free license, does not support the System Verilog
language at all, whereas the newer version does support System Verilog but provides
a free license for only 60 days. Moreover, from the past experience of a Master
project student, Quartus II does not provide good debug features. As a result, it
is not considered a good choice.
On the other hand, Modelsim and VCS are equally good candidates, as both
are available in my company, hence licensing is not an issue. In addition, both have
decent support to some popular validation languages, making them more validation
friendly. The tool bundle also has a wide range of debug features, which can be
extremely useful to aid the RTL coder to root-cause modeling issue during the
modeling phase.
However, in recent years there has been a strong trend of the company migrating
towards VCS, and VCS has greatly improved the performance of its simulator, so it
naturally stands out as the better choice. While Modelsim is still in use, VCS
receives a lot more technical support. In the foreseeable future, VCS may become
the only RTL simulator in use in my company.
The table below summarizes the comparison among various simulators.
Looking at the advantages of VCS, it becomes a very obvious choice for the
simulator to be adopted for this project.
Table 4.2
Comparison of various Simulators
                                      Quartus II   Modelsim    VCS
License Availability                  Poor         Very Good   Very Good
Good Debug Features                   Poor         Good        Very Good
Decent Validation Language Support    Unknown      Very Good   Very Good
Company’s Adoption                    No           Good        Very Good
4.2 High Level Modular Block Design
As mentioned before, modular design approach will be used for behavioral
and logical designs. By sub-dividing the design into multiple logical blocks, any
prototyping codes can be written in corresponding design modules at any stage,
without needing major re-work in the overall design. The design logics will be
modularized into logically and functionally meaningful blocks, as well as with
minimal interface connection with adjacent modules.
This dual-core design will be broken down into a few high-level modular
blocks, which include a core0 Test Access Port (TAP), a core1 TAP, a non-core
TAP, a Multicore DFx module, and a MISR in each core. In each TAP, there are also
some IEEE1149.1 compliant registers and muxes such as the Instruction Register (IR),
Instruction Decode Control Logic, Bypass Register, Device ID (IDCODE) Register,
and TDO mux. In addition, each TAP has a standalone TAP Finite State Machine
(FSM) module.
Figure 4.1
High-Level Modular Block Diagram
4.3 TAP FSM Design and System Verilog Coding
TAP Finite State Machine (FSM), which is also called the TAP controller
logic, is the heart of the JTAG logics. According to IEEE1149.1 standard, the TAP
controller is a synchronous finite state machine that responds to changes at the TMS
and TCK signals of the TAP and controls the sequence of operations of the circuitry
defined by this standard.
Figure 4.2
TAP Controller Finite State Machine
This finite state machine, which is shown in Figure 4.2, contains a reset state,
a run-test/idle state, and two major branches. These branches allow access either to
the TAP Instruction Register or to one of the data registers implemented in the TAP
and the core. The TMS pin is used as the controlling input to traverse this finite state
machine. TAP instructions and test data are loaded serially (in the Shift-IR and
Shift-DR states, respectively) using the TDI pin.
Following is a brief description of each of the states of the TAP controller
state machine, summarized from the IEEE1149.1 standard. Refer to the IEEE 1149.1
standard for detailed descriptions of the states and their operation.
Test-Logic-Reset: In this state, the test logic is disabled so that normal operation of
the processor can continue unhindered. In this state, the instruction in the Instruction
Register is forced to IDCODE. No matter what the original state of the controller, the
controller enters Test-Logic-Reset when the TMS input is held high for at least five
TCK clocks. The controller also enters this state immediately when TRST# is pulled
active, and automatically upon power-up of the processor. The TAP controller cannot
leave this state as long as TRST# is held active.
Run-Test/Idle: This is the idle state of the TAP controller. In this state, the contents
of all test data registers retain their previous values.
Select-IR-Scan: This is a temporary controller state. All registers retain their
previous values.
Capture-IR: In this state, the shift register contained in the Instruction Register
loads a fixed value (of which the two least significant bits are “01”) on the rising
edge of TCK. For this design, the fixed value is “0000001”. The parallel, latched
output of the Instruction Register (“current instruction”) does not change in this state.
Shift-IR: The shift register contained in the Instruction Register is connected
between TDI and TDO and is shifted one stage toward its serial output on each
rising edge of TCK. The output arrives at TDO on the falling edge of TCK. The
current instruction does not change in this state.
Exit1-IR: This is a temporary state. The current instruction does not change in this
state.
Pause-IR: Allows shifting of the Instruction Register to be temporarily halted. The
current instruction does not change in this state.
Exit2-IR: This is a temporary state. The current instruction does not change in this
state.
Update-IR: The instruction which has been shifted into the Instruction Register is
latched onto the parallel output of the Instruction Register on the falling edge of
TCK. Once the new instruction has been latched, it remains the current instruction
until the next Update-IR (or until the TAP controller state machine is reset).
Select-DR-Scan: This is a temporary controller state. All registers retain their
previous values.
Capture-DR: In this state, the Data Register selected by the current instruction may
capture data at its parallel inputs.
Shift-DR: The Data Register connected between TDI and TDO as a result of
selection by the current instruction is shifted one stage toward its serial output on
each rising edge of TCK. The output arrives at TDO on the falling edge of TCK.
The parallel, latched output of the selected Data Register does not change while new
data is being shifted in.
Exit1-DR: This is a temporary state. All registers retain their previous values.
Pause-DR: Allows shifting of the selected Data Register to be temporarily halted
without stopping TCK. All registers retain their previous values.
Exit2-DR: This is a temporary state. All registers retain their previous values.
Update-DR: Data from the shift register path is loaded into the latched parallel
outputs of the selected Data Register (if applicable) on the falling edge of TCK. This
(and Test-Logic-Reset) are the only controller states in which the latched parallel
outputs of a data register can change.
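The state descriptions above can be condensed into a next-state table. The Python sketch below is a behavioural illustration only and is not the tap_fsm RTL; the transitions follow the standard IEEE 1149.1 TAP controller state diagram, the state names are written in upper case with underscores, and the helper name `walk` is hypothetical. It also checks the property stated for Test-Logic-Reset: from any state, five TCK clocks with TMS held high return the controller to reset.

```python
# state: (next state when TMS = 0, next state when TMS = 1)
NEXT_STATE = {
    "TEST_LOGIC_RESET": ("RUN_TEST_IDLE", "TEST_LOGIC_RESET"),
    "RUN_TEST_IDLE": ("RUN_TEST_IDLE", "SELECT_DR_SCAN"),
    "SELECT_DR_SCAN": ("CAPTURE_DR", "SELECT_IR_SCAN"),
    "CAPTURE_DR": ("SHIFT_DR", "EXIT1_DR"),
    "SHIFT_DR": ("SHIFT_DR", "EXIT1_DR"),
    "EXIT1_DR": ("PAUSE_DR", "UPDATE_DR"),
    "PAUSE_DR": ("PAUSE_DR", "EXIT2_DR"),
    "EXIT2_DR": ("SHIFT_DR", "UPDATE_DR"),
    "UPDATE_DR": ("RUN_TEST_IDLE", "SELECT_DR_SCAN"),
    "SELECT_IR_SCAN": ("CAPTURE_IR", "TEST_LOGIC_RESET"),
    "CAPTURE_IR": ("SHIFT_IR", "EXIT1_IR"),
    "SHIFT_IR": ("SHIFT_IR", "EXIT1_IR"),
    "EXIT1_IR": ("PAUSE_IR", "UPDATE_IR"),
    "PAUSE_IR": ("PAUSE_IR", "EXIT2_IR"),
    "EXIT2_IR": ("SHIFT_IR", "UPDATE_IR"),
    "UPDATE_IR": ("RUN_TEST_IDLE", "SELECT_DR_SCAN"),
}


def walk(state, tms_bits):
    """Apply one TMS value per rising TCK edge and return the final state."""
    for tms in tms_bits:
        state = NEXT_STATE[state][tms]
    return state


# TMS sequence 0,1,1,0,0 from reset reaches Shift-IR.
print(walk("TEST_LOGIC_RESET", [0, 1, 1, 0, 0]))
# Five clocks with TMS = 1 reach Test-Logic-Reset from every state.
print(all(walk(s, [1] * 5) == "TEST_LOGIC_RESET" for s in NEXT_STATE))
```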
In order to code the RTL for the TAP Finite State Machine, a few items need
to be determined or decided.
Firstly, a number of bits need to be allocated to represent each of the sixteen
states in the FSM. Although denser encoding schemes such as binary code or Gray
code could be used, which would need only four bits for sixteen states, the saving
is not significant for this design. To simplify the design, no such encoding will be
used. Instead, sixteen bits will be used, with each bit mapped directly to one state
of the FSM (one-hot encoding).
Table 4.3
Table of Bits Representing Each of the States for the TAP FSM
Bits Representing Each State    TAP FSM States
16'b0000000000000001            TEST_LOGIC_RESET
16'b0000000000000010            RUN_TEST_IDLE
16'b0000000000000100            SELECT_DR_SCAN
16'b0000000000001000            SELECT_IR_SCAN
16'b0000000000010000            CAPTURE_DR
16'b0000000000100000            SHIFT_DR
16'b0000000001000000            EXIT1_DR
16'b0000000010000000            PAUSE_DR
16'b0000000100000000            EXIT2_DR
16'b0000001000000000            UPDATE_DR
16'b0000010000000000            CAPTURE_IR
16'b0000100000000000            SHIFT_IR
16'b0001000000000000            EXIT1_IR
16'b0010000000000000            PAUSE_IR
16'b0100000000000000            EXIT2_IR
16'b1000000000000000            UPDATE_IR
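The bit assignment in Table 4.3 follows a regular pattern: the state in row i has only bit i set. The Python sketch below regenerates the codes for illustration; it is not the RTL, the helper names `as_literal` and `is_one_hot` are hypothetical, and the first IR exit state is written as EXIT1_IR to mirror EXIT1_DR.

```python
# State names in the row order of Table 4.3 (row 0 = least significant bit).
STATES = [
    "TEST_LOGIC_RESET", "RUN_TEST_IDLE", "SELECT_DR_SCAN", "SELECT_IR_SCAN",
    "CAPTURE_DR", "SHIFT_DR", "EXIT1_DR", "PAUSE_DR",
    "EXIT2_DR", "UPDATE_DR", "CAPTURE_IR", "SHIFT_IR",
    "EXIT1_IR", "PAUSE_IR", "EXIT2_IR", "UPDATE_IR",
]

# One bit per state: the state at position i gets the value 1 << i.
ONE_HOT = {name: 1 << bit for bit, name in enumerate(STATES)}


def as_literal(code):
    """Format a state code as a 16-bit Verilog-style binary literal."""
    return "16'b" + format(code, "016b")


def is_one_hot(code):
    """True when exactly one bit of the code is set."""
    return code != 0 and code & (code - 1) == 0


for name in STATES:
    print(as_literal(ONE_HOT[name]), name)
```

With this encoding, checking whether the FSM is in a given state needs only a single bit test, at the cost of sixteen state bits instead of four.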
Secondly, the key interface signals, which include the input and output signals
of the TAP FSM module, need to be determined. In terms of input signals, the FSM
does not need the TDI signal. However, it requires the TCK, TMS and TRST control
signals. In terms of output signals, the FSM does not generate TDO, but instead
generates a range of TAP control signals, such as shift-DR, capture-IR, etc. The
following block diagram gives a view of all the required interface signals.
Figure 4.3
TAP FSM Interface Signals
With these, the module will be given a name called tap_fsm. The System
Verilog RTL file name will be called tap_fsm.vs, for consistency. Please refer to the
appendices section for the RTL codes.
According to the IEEE 1149.1 specification on the TAP controller operation,
the TAP controller shall change state only in response to the following events:
(i) A rising edge of TCK;
(ii) A transition to logic 0 at the TRST# input (if provided); or
(iii) Power-up
Hence, the last section of the TAP FSM RTL code is written to fulfill this
requirement.
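The state encoding and transition rules described above can be sketched as a small behavioural model. This is only an illustration in Python, not the tap_fsm.vs RTL from the appendices: the state names follow Table 4.3 (with EXIT1_IR for the first IR exit state), the transitions follow the IEEE 1149.1 state diagram, and the helper names `step` and `run` are assumptions made for this sketch.

```python
# Behavioural sketch of the 16-state TAP FSM with one-hot encoding.
# Not the report's RTL; helper names are illustrative assumptions.
STATES = [
    "TEST_LOGIC_RESET", "RUN_TEST_IDLE", "SELECT_DR_SCAN", "SELECT_IR_SCAN",
    "CAPTURE_DR", "SHIFT_DR", "EXIT1_DR", "PAUSE_DR", "EXIT2_DR", "UPDATE_DR",
    "CAPTURE_IR", "SHIFT_IR", "EXIT1_IR", "PAUSE_IR", "EXIT2_IR", "UPDATE_IR",
]

# One-hot encoding: bit i is set when state i is active (as in Table 4.3).
ENCODING = {name: 1 << i for i, name in enumerate(STATES)}

# Next state for (TMS = 0, TMS = 1), per the IEEE 1149.1 state diagram.
NEXT = {
    "TEST_LOGIC_RESET": ("RUN_TEST_IDLE", "TEST_LOGIC_RESET"),
    "RUN_TEST_IDLE": ("RUN_TEST_IDLE", "SELECT_DR_SCAN"),
    "SELECT_DR_SCAN": ("CAPTURE_DR", "SELECT_IR_SCAN"),
    "SELECT_IR_SCAN": ("CAPTURE_IR", "TEST_LOGIC_RESET"),
    "CAPTURE_DR": ("SHIFT_DR", "EXIT1_DR"),
    "SHIFT_DR": ("SHIFT_DR", "EXIT1_DR"),
    "EXIT1_DR": ("PAUSE_DR", "UPDATE_DR"),
    "PAUSE_DR": ("PAUSE_DR", "EXIT2_DR"),
    "EXIT2_DR": ("SHIFT_DR", "UPDATE_DR"),
    "UPDATE_DR": ("RUN_TEST_IDLE", "SELECT_DR_SCAN"),
    "CAPTURE_IR": ("SHIFT_IR", "EXIT1_IR"),
    "SHIFT_IR": ("SHIFT_IR", "EXIT1_IR"),
    "EXIT1_IR": ("PAUSE_IR", "UPDATE_IR"),
    "PAUSE_IR": ("PAUSE_IR", "EXIT2_IR"),
    "EXIT2_IR": ("SHIFT_IR", "UPDATE_IR"),
    "UPDATE_IR": ("RUN_TEST_IDLE", "SELECT_DR_SCAN"),
}


def step(state, tms, trst_n=1):
    """State after one rising TCK edge; TRST# low forces Test-Logic-Reset."""
    if trst_n == 0:
        return "TEST_LOGIC_RESET"
    return NEXT[state][tms]


def run(state, tms_bits):
    """Apply a sequence of TMS values, one per rising TCK edge."""
    for tms in tms_bits:
        state = step(state, tms)
    return state
```

A useful property to check with such a model is that five consecutive TCK edges with TMS held at 1 return the controller to TEST_LOGIC_RESET from any state.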
4.4 Design and System Verilog Coding of IEEE 1149.1 Standard
Besides the finite state machine controller, the TAP logic also consists of a
serially accessible instruction register, instruction decode logic, a TDO mux and
data registers. The set of data registers includes those described in the 1149.1
standard, namely the bypass register, device ID register (IDCODE) and BIST result
register.
The TAP logic and all test data registers are accessed serially through five
dedicated pins:
Table 4.4    TAP pins

Dedicated TAP Pins    Description
TCK                   The TAP clock signal
TMS                   “Test Mode Select,” through which the TAP FSM is controlled
TDI                   “Test Data Input,” through which test instructions and data are entered serially
TRST#                 “Test reset,” for TAP logic reset
TDO                   “Test Data Output,” through which test output is read serially
TMS, TDI and TDO operate synchronously with TCK (which is independent of any
other system clock). TRST# is an asynchronous input signal. This 5-pin interface
operates as defined in the 1149.1 specification. The overall simplified block
diagram of the TAP is illustrated in Figure 4.4.
Figure 4.4    Simplified Block Diagram of TAP
4.4.1 The Use of System Verilog Macros
The IEEE 1149.1 requirements specify the need for state elements such as
flip-flops or latches to properly latch certain TAP signals, in order to prevent
unexpected changes to these signals during certain instructions or operations.
Since latches and flip-flops are used frequently here, a specific macro file
is used to simplify the System Verilog coding. The macro file defines macros for
frequently used state elements, such as a latch, a Master-Slave Flip-Flop with
Enable Signal, a Master-Slave Flip-Flop with Asynchronous Reset, etc.
Any System Verilog file that includes this macro file can use the macros
directly, which greatly simplifies the state-element coding. Please refer to the
appendices section for the macros, coded in a System Verilog file called macro.vs.
4.4.2 Instruction Register
According to the IEEE 1149.1 standard, the TAP Instruction Register (IR)
allows an instruction to be shifted into the design. The instruction is used to
select the test to be performed, the test data register to be accessed, or both.
A number of mandatory and optional instructions are defined by this standard.
Further design-specific instructions can be added to allow the functionality of
the test logic built into a component to be extended.
In this particular design implementation, the size of this register is set to
8 bits and is dictated by the parameter called TA_IR_SIZE. The TAP Instruction
Register is asynchronously reset to the IDCODE instruction by entry into the
TLReset state. The TAP Instruction Register changes synchronously during either
the TLReset or UpdateIR state of the TAP FSM. During TLReset, the IR loads the
IDCODE instruction, and during UpdateIR, the IR loads the contents of the IR
shift register.
Figure 4.5    TAP Instruction Register and Shift Register
(Figure labels: parallel output (MSB to LSB), actual Instruction Register, TDI,
shift register, TDO, fixed capture value.)
The above figure shows the simplified physical implementation of the TAP
instruction register. This register consists of an 8-bit shift register
(connected between TDI and TDO) and the actual instruction register (which is
loaded in parallel from the shift register). The parallel output of the TAP
instruction register goes to the TAP instruction decoder. This architecture
conforms to the 1149.1 specification.
The TAP states which select the instruction register are CaptureIr, ShiftIr
and UpdateIr. During the CaptureIr operation, a fixed value of 8'b00000001 is
loaded into the IR shift register (IrShift). During the ShiftIr operation,
IrShift shifts right by one bit, meaning that the Most Significant Bit (MSB) is
loaded from TDI, whereas the Least Significant Bit (LSB) goes to TDO. During the
UpdateIr operation, IrShift is latched into the Instruction Register.
Figure 4.6    Operation of the TAP Instruction Register: (a) Capture-IR,
(b) Shift-IR, (c) Update-IR
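The capture, shift and update behaviour of the instruction register can be illustrated with a short behavioural model. This is a sketch in Python rather than the report's RTL; the class and method names are assumptions, and `IDCODE_OPCODE` is a hypothetical placeholder because the actual opcode value is not stated in this section.

```python
# Behavioural sketch of the 8-bit IR and its shift register (IrShift).
IR_SIZE = 8                  # corresponds to the TA_IR_SIZE parameter
CAPTURE_VALUE = 0b00000001   # fixed value loaded during Capture-IR
IDCODE_OPCODE = 0x02         # hypothetical opcode, for illustration only


class InstructionRegister:
    def __init__(self):
        self.shift = 0               # IrShift, between TDI and TDO
        self.ir = IDCODE_OPCODE      # actual IR; TLReset selects IDCODE

    def reset(self):
        """TLReset: the IR falls back to the IDCODE instruction."""
        self.ir = IDCODE_OPCODE

    def capture(self):
        """Capture-IR: load the fixed capture value into the shift register."""
        self.shift = CAPTURE_VALUE

    def shift_bit(self, tdi):
        """Shift-IR: shift right by one; TDI enters the MSB, the LSB goes to TDO."""
        tdo = self.shift & 1
        self.shift = (self.shift >> 1) | ((tdi & 1) << (IR_SIZE - 1))
        return tdo

    def update(self):
        """Update-IR: latch the shift register into the actual IR."""
        self.ir = self.shift


def load_instruction(reg, opcode):
    """Capture, shift `opcode` in LSB first, then update; returns the TDO bits."""
    reg.capture()
    tdo_bits = [reg.shift_bit((opcode >> i) & 1) for i in range(IR_SIZE)]
    reg.update()
    return tdo_bits
```

Because the register shifts right, the first bit presented on TDI ends up in the LSB after eight shifts, so an opcode is shifted in LSB first, while the fixed capture value appears on TDO.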
4.4.3 Instruction Decode Control Logic
The Instruction Decode Control Logic receives eight parallel bits from the
Instruction Register and decodes them into the corresponding control signals.
The BYPASS instruction is not decoded because it is the default instruction.
If none of the recognized TAP instructions decodes to true, the TAP acts as if
the BYPASS instruction is loaded.
Figure 4.7    Decoding of TAP Instructions
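The "bypass by default" decoding rule can be sketched as follows. The opcode values and the names iExtest, iSamplePreload and iIdcode below are hypothetical, chosen only for illustration (the report's actual encodings are in its appendices); iClamp and iHighZ follow the naming used in this chapter.

```python
# Sketch of instruction decoding with BYPASS as the implicit default.
# All opcode values here are hypothetical placeholders.
OPCODES = {
    "iExtest": 0x00,
    "iSamplePreload": 0x01,
    "iIdcode": 0x02,
    "iClamp": 0x04,
    "iHighZ": 0x08,
}


def decode(ir_value):
    """Return one flag per instruction, plus a derived iBypass flag."""
    signals = {name: ir_value == code for name, code in OPCODES.items()}
    # BYPASS has no decoder of its own: it is true when nothing else matches.
    signals["iBypass"] = not any(signals.values())
    return signals
```

With this arrangement, any unrecognized 8-bit value, including the all-ones pattern, falls through to bypass behaviour without needing an explicit decoder entry.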
4.4.4 Bypass Register
According to the IEEE 1149.1 standard, the bypass register contains a single
shift-register stage and is used to provide a minimum-length serial path between
the TDI and the TDO pins of a component when no test operation of that component
is required. This allows more rapid movement of test data to and from other
components on a board that are required to perform test operations. When the
BYPASS instruction is selected, the operation of the test logic shall have no
effect on the operation of the on-chip system logic.
These JTAG rules can be easily satisfied by keeping the Bypass Register
enabled for all TAP instructions, and by letting the TDO mux select the Bypass
Register for connection to TDO when no other explicit data register connection
is required.
The size of the Bypass Register is only one bit. During the CaptureDR
operation, the register content is set to 0. During the ShiftDR operation, the
TDI pin value is shifted into the register. During the UpdateDR operation,
nothing is performed.
Instructions which explicitly select this data register include iBypass,
iClamp and iHighZ.
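The one-bit bypass behaviour can be summarized in a few lines. This is a minimal Python sketch with assumed names, not the report's RTL: capture clears the bit, shift passes TDI to TDO with one TCK of delay, and update does nothing.

```python
# Behavioural sketch of the single-bit Bypass Register.
class BypassRegister:
    def __init__(self):
        self.bit = 0

    def capture(self):
        """Capture-DR: the register content is set to 0."""
        self.bit = 0

    def shift_bit(self, tdi):
        """Shift-DR: the stored bit goes to TDO and TDI is shifted in."""
        tdo = self.bit
        self.bit = tdi & 1
        return tdo

    def update(self):
        """Update-DR: nothing is performed."""


def shift_through(reg, tdi_bits):
    """Capture, then shift a bit stream through; returns the TDO stream."""
    reg.capture()
    return [reg.shift_bit(b) for b in tdi_bits]
```

Shifting the stream 1, 0, 1, 1 through the model yields 0, 1, 0, 1 on TDO: the captured 0 comes out first, followed by the input delayed by one clock.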
4.4.5 Device Identification Register (IDCODE)
The Device Identification Register allows a code to be serially read from the
component that shows the manufacturer’s identity, the part number and the version
number for the part. To conform to section 12.1.1 of the IEEE 1149.1 standard,
the register is 32 bits long.
Figure 4.8    Structure of the Device Identification Register
During the CaptureDR operation, the device identification value is loaded
into the Device Identification Register, and during the ShiftDR operation it is
shifted out serially. During the UpdateDR operation, nothing is performed.
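The 32-bit layout defined by IEEE 1149.1 (a 4-bit version, a 16-bit part number, an 11-bit manufacturer identity and a fixed 1 in the least significant bit) can be illustrated with a small packing helper. The function names and the example field values are assumptions for this sketch; they are not the identification values used in this project.

```python
# Sketch of the IEEE 1149.1 device identification word layout:
# [31:28] version, [27:12] part number, [11:1] manufacturer identity, [0] = 1.
def pack_idcode(version, part_number, manufacturer_id):
    if not 0 <= version < (1 << 4):
        raise ValueError("version must fit in 4 bits")
    if not 0 <= part_number < (1 << 16):
        raise ValueError("part number must fit in 16 bits")
    if not 0 <= manufacturer_id < (1 << 11):
        raise ValueError("manufacturer identity must fit in 11 bits")
    return (version << 28) | (part_number << 12) | (manufacturer_id << 1) | 1


def unpack_idcode(word):
    """Split a 32-bit identification word back into its three fields."""
    return (word >> 28) & 0xF, (word >> 12) & 0xFFFF, (word >> 1) & 0x7FF


def shift_out(word, nbits=32):
    """Bits as they would appear on TDO during Shift-DR, LSB first."""
    return [(word >> i) & 1 for i in range(nbits)]


# Hypothetical example values, for illustration only.
EXAMPLE = pack_idcode(version=0x1, part_number=0xABCD, manufacturer_id=0x123)
```

Since the LSB is always 1 and the bypass register captures 0, the first bit shifted out after reset tells a tester whether a device presented its identification register or its bypass register.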
4.4.6 IEEE 1149.1 Compliant Instructions
SAMPLE/PRELOAD and EXTEST are mandatory instructions of the IEEE 1149.1
standard and shall be included in each component that claims conformance to this
standard; HIGHZ and CLAMP are optional instructions defined by the standard. The
Boundary-Scan Register consists of a number of cells equal to the number of
shift-register stages contained in the register. According to IEEE 1149.1 rules,
the instructions that select the Boundary-Scan Data Register for connection
between TDI and TDO include EXTEST and SAMPLE/PRELOAD. The instructions which
cause the Boundary-Scan Register contents to be driven onto the package pads
include EXTEST and CLAMP. The instruction which tri-states the outputs is HIGHZ.
4.4.7 TDO Mux
As suggested by the IEEE 1149.1 standard, all Data Register chains are
multiplexed together, prior to latching for the final data out from the TAP. Please
refer to the appendices section for RTL coding implementations.
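The multiplexing of data register chains onto TDO can be sketched as below. This is an illustrative Python model with assumed names (`tdo_mux`, `TdoStage`, and the chain keys), not the report's RTL; it only shows the selection rule, with the bypass chain as the fallback, followed by a final holding stage.

```python
# Sketch of a TDO mux: choose one chain's serial output, default to bypass.
def tdo_mux(selected, chain_outputs):
    """Return the serial-out bit of the selected chain, or bypass if unknown."""
    return chain_outputs.get(selected, chain_outputs["bypass"])


class TdoStage:
    """Final holding stage after the mux (a latch or flip-flop in hardware)."""

    def __init__(self):
        self.tdo = 0

    def clock(self, selected, chain_outputs):
        """Latch the muxed value; this is what the TDO pin would present."""
        self.tdo = tdo_mux(selected, chain_outputs)
        return self.tdo
```

Keeping the bypass chain as the default entry mirrors the rule that the Bypass Register is connected to TDO whenever no other data register is explicitly selected.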
4.5 Design and RTL Coding of Multicore DFx Logics
The designs of the Multicore DFx logics have been illustrated in figure 3.3
and figure 3.4, for serial and parallel mode respectively. Although the
illustrations describe them as two separate modes, the RTL coding and design
implementation have to implement them together in a dual-core configuration, and
subsequently proliferate them into a quad-core configuration.
The implemented RTL coding needs to meet the supported modes as illustrated
in table 3.1 and table 3.2 respectively. The two modes, namely MODE=1 for serial
mode and MODE=0 for parallel mode, cannot be active at the same time. Please
refer to the appendices section for RTL coding implementations.
4.6 Design and RTL Coding of MISR
Common-usage DFX MISR modules are provided for the unit MISR implementation.
The following figure illustrates the DFX MISR module. The 32-bit MISR is
constructed from two 16-bit MISRs, each implementing the 16-bit polynomial
x^16 + x^5 + x^3 + x^2 + 1. The unit user provides the basic inputs: observation
data-in, observation clock, and system reset. For simplicity, the MISRs are
implemented in the same TCLK domain. For future design expandability and
scalability, any combination of 32/64/128/256-bit MISR modules may be
constructed from the fundamental 16-bit MISR building block, with the addition
of some compression logic.
MISR0 and MISR1, each implemented from the 16-bit MISR macro, have 16-bit
inputs. Generally, each input is used to observe a different internal node of
the Design-Under-Test (DUT), in order to improve the observability of the system
and hence its coverage. Observability is a measure of how easy it is to observe
a fault effect. A MISR is especially useful when some internal faults of a DUT
cannot propagate to the primary outputs (POs). In this scenario, the DUT would
lose coverage if no MISR were implemented. Fault propagation means enabling the
logic so that a fault effect can be seen.
Figure 4.9    A 16-bit MISR Design
Only the lower 16 bits of the MISR inputs are used in this project. The upper
16 bits are coded for redundancy; these unused inputs are tied to zero.
For ease of recognizing the MISR output from each core, specific input bits
are tied in each core in order to produce unique signatures.
• Core0 MISR signature = EFEB
• Core1 MISR signature = EFFD
• Core2 MISR signature = FDFE
• Core3 MISR signature = FEAF
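How tied inputs lead to a per-core signature can be illustrated with a behavioural model of a 16-bit MISR for the polynomial x^16 + x^5 + x^3 + x^2 + 1. The sketch below assumes an internal-feedback (Galois-style) structure, a zero seed and hypothetical tie values, none of which are taken from the report's RTL, so it is not expected to reproduce the EFEB, EFFD, FDFE and FEAF signatures listed above.

```python
# Behavioural sketch of a 16-bit MISR, polynomial x^16 + x^5 + x^3 + x^2 + 1.
# Feedback structure, seed and tie values are assumptions for illustration.
WIDTH = 16
MASK = (1 << WIDTH) - 1
TAPS = (1 << 5) | (1 << 3) | (1 << 2) | 1   # low-order terms of the polynomial


def misr_step(state, data_in):
    """One clock: multiply the state by x modulo the polynomial, XOR in the inputs."""
    feedback = (state >> (WIDTH - 1)) & 1
    state = (state << 1) & MASK
    if feedback:
        state ^= TAPS
    return state ^ (data_in & MASK)


def signature(inputs, seed=0):
    """Compress a sequence of 16-bit input words into a 16-bit residue."""
    state = seed
    for word in inputs:
        state = misr_step(state, word)
    return state


# Hypothetical tied input values per core, clocked for 16 cycles each.
TIED_INPUTS = {0: 0x0001, 1: 0x0002, 2: 0x0004, 3: 0x0008}
SIGNATURES = {core: signature([value] * 16) for core, value in TIED_INPUTS.items()}
```

Because the compression is linear, different tie values clocked for the same number of cycles give different residues in this model, which is the property the per-core signatures rely on.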
CHAPTER 5
SIMULATION, VERIFICATION AND ANALYSIS
After the proposed designs were implemented via RTL coding as elaborated in
chapter 3 and chapter 4, a number of simulations needed to be carried out in
order to prove the functionality of the implementations. Such verification
requires a few components: a stable simulation environment capable of compiling
the RTL code (which is in System Verilog in this project), a properly selected
RTL simulator (VCS in this case), a well-selected verification language and
environment collaterals (MACE in this case), as well as a well-configured API
layer for proper interfacing and hand-shaking between the verification language
and the simulator (the Csim API and SIMIX in this case).
The following table shows the above-mentioned items and their respective
roles in the simulation and verification.
Table 5.1    Enabling Items for Simulation and Verification
5.1 Simulation, Verification and Analysis for Serial Mode
To simulate the design in serial mode configurations, we first begin with the
dual-core design configuration. This allows us to prove the functionality of the
serial mode in a less complicated design. A dual-core CPU configuration takes
less time to compile and simulate than a quad-core configuration. In addition,
should there be any failure, it is less complicated to debug.
When the VCS tool is invoked to compile the RTL code, any syntax or logic
error is flagged. Debugging needs to be carried out and fixes implemented in
order to provide a clean and compilable RTL model. Compilation may take from
minutes to hours on a high-end machine running the Linux operating system,
depending on the complexity of the design. In this case, the compilation took
around 10 minutes, as the design is kept neat and slim.
Once the compilation has completed, a verification test written in MACE code
initiates the simulation. The MACE code works by toggling TCLK and initiating a
sequence of operations, including TAP initialization, shifting TAP instructions
and data in via the IR and DR, driving TMS to step through the states of the TAP
finite state machine (FSM), checking for proper signal activation, and so on.
For the detailed MACE verification code, please refer to the appendices section.
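The kind of stimulus such a test applies can be sketched as a list of (TMS, TDI) values, one pair per TCK cycle. The Python helper below is only a stand-in for illustration and is not MACE code; the function name is an assumption, and the sequence follows the standard IEEE 1149.1 path from Run-Test/Idle through Shift-IR and back.

```python
# Sketch of the per-cycle (TMS, TDI) stimulus for loading one TAP instruction.
def ir_scan_vectors(opcode, ir_size=8):
    """Vectors that load `opcode`, starting and ending in Run-Test/Idle."""
    # TMS = 1, 1, 0, 0: Select-DR-Scan, Select-IR-Scan, Capture-IR, Shift-IR.
    vectors = [(1, 0), (1, 0), (0, 0), (0, 0)]
    for i in range(ir_size):
        last = i == ir_size - 1
        # Shift LSB first; TMS = 1 on the final bit moves to Exit1-IR.
        vectors.append((1 if last else 0, (opcode >> i) & 1))
    # TMS = 1 then 0: Update-IR, then back to Run-Test/Idle.
    vectors += [(1, 0), (0, 0)]
    return vectors
```

For an 8-bit instruction register this gives 14 TCK cycles per instruction load, which is a convenient figure when estimating test time for a sequence of TAP operations.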
In the beginning, various errors were encountered, and a lot of time was
spent on debugging and fixing the tests as well as the simulation environment.
The challenges and debug hardship are elaborated further in section 5.3. Once
the test and simulation environment had been cleaned up, the simulation took
around 10 minutes to complete properly without any error.
When the test exits the simulation gracefully, a file containing internal
signal dumps is generated. This file contains all desired internal signals, with
their states and values in every simulation cycle, from the beginning of the
simulation run until the test completes and exits gracefully. This signal dump
file is very important, as it can be opened in a Graphical User Interface (GUI)
debug tool called RTLWAVE to display the waveform.
Figure 5.1 illustrates the waveform output from the dual-core serial mode
simulation. Some of the important signals have been pulled out for display, to
demonstrate the inter-operational relationship as well as to prove the correct
functionality of the design.
The first red circle shows that in serial mode, with MODE=1, the Core0 TAP
has been successfully bypassed. The 2-bit CoreConnect signal correctly reveals
that Core1 is the current active core. This is further proven by the MISR
signature read-out, which is EFFD. This particular signature is unique to Core1.
Similarly, with MODE=1 for serial mode and the CoreConnect signals configured
to 10, the Core1 TAP has been successfully bypassed. Core0 is proven to be the
active core, with the correct unique EFEB signature being read out.
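The signature-based check described above amounts to a lookup from the read-out value to a core number. The following Python sketch uses the four signatures listed in section 4.6; the function names are assumptions made for illustration and are not part of the MACE tests.

```python
# Sketch of identifying the active core from its MISR signature read-out.
EXPECTED_SIGNATURE = {0: 0xEFEB, 1: 0xEFFD, 2: 0xFDFE, 3: 0xFEAF}


def identify_core(signature):
    """Return the core whose expected signature matches, or None."""
    for core, expected in EXPECTED_SIGNATURE.items():
        if signature == expected:
            return core
    return None


def check_active_core(expected_core, signature):
    """True when the read-out signature belongs to the core expected to be active."""
    return identify_core(signature) == expected_core
```

For example, a read-out of EFFD while Core1 is expected to be active passes the check, whereas the same read-out with Core0 expected would indicate that the wrong TAP is connected.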
Figure 5.1    Dual-core Serial Mode Simulation Results
Figure 5.2    Quad-core Serial Mode Simulation Results
As part of the objective to prove the scalability of the design, the
dual-core CPU design configuration is proliferated to a quad-core configuration
with minimal design changes. Simulation once again proved that the serial mode
implementation still stands.
This is illustrated in figure 5.2. The first red circle highlights that the
Core0, Core2 and Core3 TAPs are successfully bypassed, leaving the right core,
namely Core1, as the active core, and the unique EFFD signature is correctly
read out. Similar proofs are shown for all supported serial mode configurations.
As such, the simulation has achieved its goal for the serial mode
implementation from the lateral scalability perspective, and the design has been
proven functionally correct.
5.2 Simulation, Verification and Analysis for Parallel Mode
Figure 5.3 illustrates the waveform output from the dual-core parallel mode
simulation. Some of the important signals have been pulled out for display, to
demonstrate the inter-operational relationship as well as to prove the correct
functionality of the design.
The first red circle shows that in parallel mode, with MODE=0 and the Core
Selected signal = 1, the Core1 TAP has been successfully activated. This is
further proven by the MISR signature read-out, which is EFFD. This particular
signature is unique to Core1.
Similarly, with MODE=0 for parallel mode and the Core Selected signal = 0,
the Core0 TAP has been successfully activated. Core0 is proven to be the active
core, with the correct unique EFEB signature being read out.
Parallel mode can also be proven from the lateral scalability perspective.
The dual-core CPU design configuration is proliferated to a quad-core
configuration with minimal design changes. Simulation once again proved that the
parallel mode implementation still stands.
This is illustrated in figure 5.4. The first red circle highlights that the
Core1, Core2 and Core3 TAPs are successfully bypassed, leaving the right core,
namely Core0, as the active core, and the unique EFEB signature is correctly
read out. Similar proofs are shown for all supported parallel mode
configurations.
As such, the simulation has achieved its goal for the parallel mode
implementation from the lateral scalability perspective, and the design has been
proven functionally correct.
Figure 5.3    Dual-core Parallel Mode Simulation Results
Figure 5.4    Quad-core Parallel Mode Simulation Results
5.3 Challenges and Solutions
Over the past year of this project, many challenges and difficulties were
faced. These challenges were not limited to the early design phase; a number of
road blocks were encountered during the RTL coding and simulation phases as
well.
Overall, there are three major challenges that are worth highlighting. Each
of them was a major road block and could have hampered the project, had a
solution or workaround not been found.
The first major challenge was encountered during the early design phase,
where the design conditions and requirements for scalability, test concurrency
and content reusability in both serial and parallel modes were very difficult to
meet together. An initial design met the serial mode configuration and was
scalable, but failed to comply with the parallel mode configurations.
A lot of time was spent on cross-referencing various industrial designs such
as those of Whetsel, Oakland and Parulkar. Many trials and errors were carried
out with different logical configurations and implementations. At last, the
efforts and persistence prevailed, as these challenges were finally resolved by
the decision to implement the following design changes:
• TAP coherency
  - Leveraged from Whetsel/Oakland’s synchronized TCK and TMS
• Scalability
  - Leveraged from Oakland’s JTAG and DR design
• Concurrency
  - Leveraged from Parulkar’s additional Core-Select TAP instruction
• Logic Integration
  - Designed additional internal signals for “mode” and “CoreConnect”
The second major challenge was encountered during the implementation phase,
where it was unclear which feature to implement in order to illustrate correct
functionality and to prove the success of scalability from dual-core to
quad-core.
A lot of time was spent on understanding various serial and parallel DFX
features, including BIST, Scan ATPG, Boundary Scan, Memory BIST, Local Direct
Access Testing (LDAT), Multiple-Input Signature Register (MISR), Scanout
Signature Mode, etc. This research incurred a significant and unexpected
overhead of about half a month of effort in the project. Finally, a decision was
reached: to use a parallel MISR in each core.
Built-In Self-Test (BIST) has multiple inputs and parallel testing
capability, which makes it a potential candidate. However, it was not chosen
because it is a DFT feature on the Automated Pattern Generation (APG) side,
which is not able to produce a unique signature to distinguish the output from a
particular core.
Scan ATPG also has multiple inputs and parallel testing capability.
Architectural review shows that it is not a good candidate for this project, as
designing a Scan Controller System, which requires a chip-level central Scan
Controller and multiple core-level Scan hubs, is already too big a task by
itself. In addition, a functional Scan ATPG design also requires scan selection,
scan insertion, scan design-rule-check (DRC), scan clock generation and
balancing, scan stitching, as well as scan routing. Such a task would have gone
beyond the scope of this project.
On the other hand, Boundary Scan is always seen as a serial DFT feature.
Although it can be designed as multiple independent Boundary Scan chains,
Boundary Scan cells are normally located in the PAD and IO area, which in this
case is on the uncore side. As a result, it is very unlikely that Boundary Scan
features would be implemented on CPU cores, which do not require any IO buffers
or Boundary Scan cells.
Memory BIST, or MBIST, is a potential parallel DFT feature capable of
concurrent testing of different array structures in multiple cores. However,
research shows that implementing MBIST would require coding some array
structures, which in turn requires an array access mechanism and a memory
wrapper design for address, data and tag. This would add unnecessary complexity
and overhead to the design, without helping to prove the parallel capability of
this multicore design. Hence MBIST was not chosen as the candidate.
Local Direct Access Testing, or LDAT, is another potential parallel DFT
feature capable of concurrent testing of different array structures in multiple
cores. Preliminary architectural research and analysis shows that a functional
LDAT DFT design requires the design and implementation of not only multiple
array structures and memory wrappers, but also a complicated DAT mode
controller. The DAT mode controller can be either hard-wired and
Finite-State-Machine based, or microcode based. Again, such excessive complexity
makes this feature a poor choice.
Last but not least, a Scanout system implemented together with Signature
Mode is an interesting feature, commonly used to provide serial and parallel
capability for capturing multiple inputs and compressing them into unique
signatures, in order to improve the observability of a CPU core. However,
preliminary architectural research and analysis shows that a functional Scanout
and Signature-Mode design requires not only a scanout controller design, but
also scanout cell design and insertion, and scanout chain connection. This would
require considerable effort. A more in-depth scrutiny reveals that similar
parallel and multiple-input features can be achieved with a simple MISR. As a
result, MISR was finally chosen as the best candidate, without any unnecessary
overhead.
The difficulties did not stop here. While MISR is suitable for configuration
in both serial and parallel mode, something extra needs to be done in order to
prove that the various configuration modes are indeed functioning correctly. In
the end, after investigating various polynomial equations, the final decision
was to tie the MISR inputs in each core to properly selected values in order to
generate a unique signature specific to each core. As such, the unique signature
values read out from each core can be used as a clear indicator that the right
core and the right TAP are activated in a desired mode, in both dual-core and
quad-core configurations.
For the unique MISR calculation, an in-house MACE code was modified to do
the signature calculation, to avoid the tedium of manually calculating as many
as 16 bits. The code is responsible for initializing the MISR, setting up the
XOR structures based on the designated polynomial function, taking in the
customized input values for each core, and looping through 16 cycles of the XOR
operations for the values to propagate and be compressed. The output, which is
the residue, is the unique signature for each core. Please refer to Appendix B,
in the “MACE Codes for MISR Signature Calculation” section, for the calculation
code and subroutines.
The third major challenge arose during the simulation phase, when errors and
incompatibility between the in-house simulator and the test writing environment
became a big hurdle to progress. The seriousness of the problem was realized
when it was found that the System Verilog language is used in-house for
cluster-level validation, but not for full-chip usage-model-level validation.
Hence System Verilog could not be used to demonstrate the TAP instructions for
the multi-core modes.
On the other hand, Mace is used in-house for usage-model-level validation,
but it is not compatible with the VCS simulator. As a result, the SIMIX API
needed to be modified for hand-shaking. This required additional hard work and
had never been done before. Such an unknown scenario almost brought the progress
to a halt. Other simulators and languages were not good options due to limited
System Verilog support, incompatible environments, lack of free licensing, etc.
At that point in time, it would have been too late to restart from scratch.
When the going gets tough, the tough get going. Extra effort was invested in
various path-finding activities, and finally it was decided to manually enhance
and adapt the in-house SIMIX and Mace library to interface with the VCS
simulator. This required a few weeks of engineering work, some in-depth
understanding of the Perl and simulator interfacing modules, as well as
programming. This was a tough decision, as the effort incurred was huge, but it
was better than restarting from scratch.
After weeks of code hacking and simulation trials, the efforts and
persistence finally paid off, as the following problems were resolved:
• Mace will work for Usage Model level validation
• Mace will interface with VCS via SIMIX API
• VCS will guarantee full support for System Verilog RTL coding
• VCS will work for RTL simulation with in-house license available
The following figure illustrates the final working simulation environment,
which has been successfully proven in this project, yielding satisfactory
results and meeting all desired project goals.
Figure 5.5    MACE Environment Diagram
VCS          = Vendor RTL simulator
Simix        = In-house Event API
Csim API     = In-house Test simulation API
Mace Library = Test writing language
Emacro       = Reusable Mace functions
Tests        = Hand-written test in mace to validate RTL in simulation
5.4 Discussion
During the RTL coding, implementation and simulation phases, two
contemporary designs involving multi-core CPUs in two of the latest industry CPU
projects were noticed. They are illustrated in figure 5.6 and figure 5.7
respectively.
Figure 5.6    Contemporary Design #1 on an industry Multi-core CPU (2003)
(Figure labels: Prescott Die 0 and Prescott Die 1, each with Disable, TDI, TDO,
TMS, TCK and TRST connections.)
Figure 5.7    Contemporary Design #2 on an industry Multi-core CPU (2004)
(Figure labels: TBOX with TDI, TDO, TMS, TCK and TRST; Core 0; TNC = Test Nerve
Center; TNS = Test Nerve Station.)
As a demonstration of the success of my design in this project, a comparison
table is presented below. From all three perspectives, namely design scalability
(both lateral and hierarchical), concurrent core testability and chip-level
trace reusability, it is clearly seen that my design compares favorably with all
the other tabulated multi-TAP architectures. Parulkar’s design comes close in
the comparison, but its hardcoded 2-way broadcast is clearly a disadvantage.
Other than that, none of the other designs has demonstrated itself as a
favorable solution, since they fail to fulfill all three major requirements for
a novel multi-TAP architecture in a multicore CPU design.
Table 5.2    Comparisons to Illustrate the Advantage of My Design
Furthermore, it is worth mentioning that these advantages come with minimal
gate count and die-area overhead. While the various challenges faced were
painful, the efforts bore fruit: simulations have proven the functionality and
all desired project goals have been met. In conclusion, this project is a
success.
CHAPTER 6
PROPOSAL FOR PRE-SILICON VALIDATION METHODOLOGIES AND TOOLS
6.1 Purpose and Importance of Pre-Silicon Validation for DFx Design
In an ideal world, a design would be fully functional as soon as its RTL
coding is completed. By the time first silicon comes out of fabrication and
assembly, the silicon would perform as designed, without any functional flaw.
However, this is only an ideal case. In reality, a bug can be introduced at
any time during the design and RTL coding phases. The later a bug or functional
problem is found, the higher the risk it imposes on the project. A functional
bug can be serious enough to cause a product to be killed, costing billions of
dollars, literally.
For DFx design, bugs may not result in a product being recalled or killed,
but the costs they incur can still be in the hundreds of millions. For example,
if the multi-TAP feature is not working as expected, it can easily lead to
longer test time, more test equipment, etc., resulting in additional hundreds of
millions of dollars in production and equipment costs. As a result, it is
undeniable that pre-silicon validation is very important. The purpose is clear:
to ensure that features are fully functional prior to TapeOut, and to ensure
healthy first silicon.
6.2 Proposal for Pre-Silicon Validation Methodologies, Flow and Tool
As mentioned above, the purpose of pre-silicon validation is to catch any
existing or hidden bug as early as possible, to minimize the impact on design
health and project schedule. As such, the validation effort shall begin in the
early phase of RTL coding.
The modular RTL design approach makes early validation possible, as each
modular block can be viewed as an independent, self-contained functional block,
regardless of the progress of any adjacent blocks. Therefore, low-level logic
validation using the System Verilog language and the VCS simulator is proposed
during the RTL coding phase.
As the design stabilizes over time, a higher-level validation against the
usage model shall be carried out. In order to perform high-level validation on
the usage model, fullchip RTL model readiness is important. With a fullchip RTL
model available, most of the modular blocks are already connected together via
interfaces. However, flexibility shall be allowed, whereby any late design
blocks can still be black-boxed, while allowing the validation to proceed with
other design blocks at fullchip level. This is important in cases where late
design specification changes delay the functionality and RTL coding of certain
modular blocks.
In the last mile of the project, a month or two prior to TapeOut, it may
also be important to perform some corner-case or paranoia checking, to ensure
that nothing is left behind and to increase the level of confidence for a
healthy TapeOut.
The simplified flow chart below shows the proposed pre-silicon validation
flow in its various stages.
Figure 6.1    Proposed Pre-Silicon Validation Flows
For the low-level validation, the System Verilog language and the VCS
simulator are the natural choice. This low-level validation shall be performed
by RTL coders during their coding phase. For example, when an RTL coder is
coding the TAP Finite State Machine block, without even interfacing to any other
modular blocks, the RTL coder shall implement a few lines of code to ensure that
the internal TAP signals, state transitions and control signal generation are
working correctly.
During this early validation phase, RTL coders can use various tricks to
simplify the validation. They can use signal injection or signal hardcoding
whenever necessary. This is because their main focus is not the entire usage
model, but rather localized functional logic in small design blocks.
As the design phase continues, at some stage a fullchip RTL model shall be
available. Although the fullchip model may not yet be stable and fully
functional, high-level validation can already get started.
To perform high-level, usage-model type of validation, the System Verilog
language may no longer be sufficient. This is because, at this level, many
considerations need to be taken care of, such as the level of abstraction for
the validation test coding, as well as the capability to emulate silicon and
production testing environments. System Verilog is simply too low-level and too
unfriendly in these respects.
As such, the potential choices of language and tool for high-level
validation are MACE macros and environment, as well as the e-Language and
Specman tool. MACE macros and their collateral environments, which were used in
this project for simulation on VCS, have been elaborated in the previous
chapter. Another good option, not used in this project, is the e-Language and
Specman tools from Verisity. One of the key reasons for choosing e-Language and
Specman is their good integration with System Verilog and VCS. In addition,
e-Language and Specman provide many other highly desirable benefits, such as
being Object-Oriented Programming centric, the capability to randomize values,
automated and manual constraint generation, data and protocol checking
mechanisms, automated stimulus generation, coverage-based validation, and even
powerful coverage hole analysis and debug capability. However, it is a
relatively new language in the DFT validation field, and has only recently been
used in some new Intel projects.
As modular-level and fullchip-level tests are written, pass and are centrally
archived, they shall be run regularly as a regression suite against every new RTL
model release. This is important to ensure that the latest RTL models are healthy, as
a bug can be introduced at any time, even in logic that has been validated in the past.
Along the design and validation process, the fullchip model health shall
improve, and so shall the regression passing rate. By the time the project is a month
or two before TapeOut, it is critical that there is no hidden bug or test hole that can
potentially impact project TapeOut, or even silicon health. It is therefore proposed
that corner-case checking and paranoia checking be carried out. To begin such
checking, a workgroup or task force shall be formed, with participation from
micro-architects, logic designers, RTL coders, validators and usage model engineers,
such as engineers from production testing who will use the DFx functions or features
in post-silicon testing. The task force shall brainstorm all possible scenarios, put
up a checklist, prioritize the collected items, and assign ownership.
By the time the paranoia checklist items are completed, and no new bug is
caught by the on-going regression, the design health shall hit a very high confidence
level, and can be declared TapeOut ready.
CHAPTER 7
CONCURRENT TESTABILITY AND
TEST CONTENT REUSABILITY MODEL
7.1
Concurrent Testability
The basic job of testing is to provide a stimulus, observe a response, apply a
mask to that response to filter out nondeterministic X values, compare the masked
response with an expected response, and compress the result of the comparison into
one pass/fail bit per vector and/or one per test.
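[Block diagram with a TESTER block and a DUT block. Labels: STIMULUS, MASK,
EXPECTATION, FILTER, COMPARE, COMPRESS and PASS / FAIL RESULT on the tester side;
TEST CONTENT, CORE 0, RESPONSE and OBSERVATION on the DUT side.]
Figure 7.1
Traditional Testing with Most Functionality on the Tester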
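The stimulus, mask, compare and compress sequence described at the start of this section can be sketched in a few lines of Python. This is only an illustrative model: the function names, the convention that a mask bit of 1 means "compare this bit", and the sample vectors are all assumptions made for the example, not taken from the project's test program.

```python
def run_vector(response, expected, mask):
    """Compare one response word with its expectation.

    Bits where mask is 0 are treated as nondeterministic (X) and ignored.
    Returns True for pass, False for fail.
    """
    return ((response ^ expected) & mask) == 0


def run_test(responses, expecteds, masks):
    """Return the per-vector pass/fail list and one compressed per-test bit."""
    per_vector = [
        run_vector(r, e, m) for r, e, m in zip(responses, expecteds, masks)
    ]
    return per_vector, all(per_vector)


# Hypothetical 4-bit responses for three vectors.
responses = [0b1010, 0b0111, 0b1100]
expecteds = [0b1010, 0b0101, 0b1101]
masks     = [0b1111, 0b1101, 0b1111]  # vector 1 has an X on bit 1

per_vector, test_pass = run_test(responses, expecteds, masks)
```

In this example the second vector differs from its expectation only in the masked bit, so it passes, while the third vector has a real mismatch and fails the test as a whole.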
Traditionally, all filtering, comparison and compression have been performed
on the tester, as shown in Figure 7.1. However, the presence of multiple instances of
the same core on the chip, together with test content reusability, provides an
opportunity to save test time and test data volume by moving some or all of this
functionality from the tester to the Design-Under-Test (DUT).
A variety of test resource partitioning schemes are possible, including
performing all comparison on-chip, interchanging the filter and compare steps, and
breaking up the compression into core-level, bit-level, and cycle-level compression
and moving the pieces to different stages in the flow.
Each test resource partitioning scheme has limitations, as illustrated by the
following two scenarios which assume that chip pin constraints allow the tester to
strobe either one core’s observation or the comparison between the cores but not both
at the same time.
In the first scenario, the tester observes only the comparator output while
the test is running. This can detect a defect in one core that causes the responses
of the two cores to mismatch, and the tester can choose not to strobe on a cycle
which is known to have an X value in the core output. However, this approach cannot
detect a design-induced speed path that causes both cores to produce the same
incorrect response.
In the second scenario, the tester observes only one core's output while the
test is running, and that output is compared on-chip with the other cores' outputs,
with the result compressed into a sticky bit. This can detect an error in an
unobserved core provided the observed core is fault-free, but it does not reveal
which bit of the unobserved core failed or on which cycle of the test it failed.
It also does not prove the unobserved core to be fault-free when the observed core
is faulty, so if testing does not stop at the first failure, the fail flow must
re-run the test while observing the second core's output. The sticky bit requires
the core output to be X-free, and tests must be run on one core at a time while
debugging the X values.
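The limitations of the two scenarios can be made concrete with a small Python sketch. The response values, the function names and the two-core setup are hypothetical and chosen only to illustrate the argument above.

```python
def comparator_stream(core_a, core_b):
    """Per-cycle on-chip comparator output: 1 where the two responses differ."""
    return [int(a != b) for a, b in zip(core_a, core_b)]


def sticky_bit(observed, unobserved):
    """Compress the per-cycle comparison into one bit that stays set."""
    sticky = 0
    for a, b in zip(observed, unobserved):
        sticky |= int(a != b)
    return sticky


golden = [0x3, 0x7, 0x1, 0x4]         # expected per-cycle response
single_defect = [0x3, 0x7, 0x0, 0x4]  # one core fails on cycle 2
speed_path = [0x3, 0x6, 0x1, 0x4]     # the same wrong value in both cores

# Scenario 1: the tester strobes only the comparator output.
scenario1_defect = comparator_stream(golden, single_defect)   # mismatch visible
scenario1_escape = comparator_stream(speed_path, speed_path)  # errors cancel

# Scenario 2: the tester strobes one core and reads the sticky bit at the end.
observed = golden
observed_ok = observed == golden                    # tester-side check
scenario2_flag = sticky_bit(observed, single_defect)
```

Here `scenario1_escape` contains no mismatch even though both cores are wrong, which is the escape described for the first scenario. In the second scenario the sticky bit is set, but it carries no information about the failing bit or cycle.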
As seen in Chapter 4, the design goal of this project is to observe both one
core's response and the other cores' comparison results simultaneously, as shown in
Figure 7.2. Concurrent core testability requires all cores to be initialized to exactly
the same state, be clocked at exactly the same frequency, and receive the same
cycle-by-cycle input stimulus.
[Block diagram with a TESTER block and a DUT block. Labels: STIMULUS, MASK,
EXPECTATION, TEST CONTENT, CORE 0 and CORE 1 with their RESPONSE and OBSERVATION
paths, and two FILTER / COMPARE / COMPRESS chains, each producing a PASS / FAIL
RESULT.]
Figure 7.2
Multicore Testing with Most Functionality on the Chip
Since this project is purely simulation-based, there is no real silicon for test
vector verification. However, to show how this Multicore DFx design allows either
on-die or on-tester comparison for concurrent testability, sample code of test flow
routines is shown. The test flow routine is an important part of test vector
generation: it dictates the sequence of CPU and TAP instructions and test vectors in
the test program released to the tester for silicon testing.
The following test flow routine sample shows the quad-core test program
making use of the on-die comparison capability, i.e. the on-die “COREMATCH”
comparison prior to outputting the pass/fail indicator to the last TDO. In this
scenario, the tests are loaded into each core in parallel, and executed in parallel
as well. Once the tests have completed execution in all cores, the test flow routine
reads out only a one-bit pass/fail indicator from the TDO. This one-bit pass/fail
indicator is generated when the internal COREMATCH logic compares the signatures
from all cores via comparator logic. No tester-side comparison is needed.
### Initialization routine ###
&power_up(default);
&reset_sequence(default);
&multicore_init(default);
### Test loading (test vector contains the COREMATCH instruction) ###
&parallel_test_load(core0, core1, core2, core3);
### Test execution ###
&parallel_test_run(core0, core1, core2, core3);
### Test pass/fail check on TDO ###
### tdo_out = 0 indicates pass, tdo_out = 1 indicates fail ###
&tdo_out_check(default);
### continue next test loading###
………
Figure 7.3
Test Flow Routine demonstrating on-die comparison
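The on-die comparison can be modelled behaviorally as follows. This is a sketch under stated assumptions, not the project's COREMATCH logic: it assumes each core compresses its observation stream into a 16-bit MISR signature using the characteristic polynomial x^16+x^5+x^3+x^2+1 quoted in the MISR listing in the appendix, and that COREMATCH then reduces the per-core signatures to one bit using the convention of Figure 7.3 (0 for pass, 1 for fail). The function names and the sample data are hypothetical.

```python
# Feedback from bit 15 is XORed into stages 0, 2, 3 and 5 (assumed layout).
TAPS = (0, 2, 3, 5)


def misr_step(state, data):
    """One clock of a 16-bit MISR: shift, XOR in the data word, apply feedback."""
    feedback = (state >> 15) & 1
    nxt = ((state << 1) & 0xFFFF) ^ (data & 0xFFFF)
    if feedback:
        for tap in TAPS:
            nxt ^= 1 << tap
    return nxt


def signature(words):
    """Compress a stream of 16-bit observation words into one signature."""
    state = 0
    for word in words:
        state = misr_step(state, word)
    return state


def corematch(signatures):
    """Return 0 if every core signature matches the first one, else 1."""
    return int(any(sig != signatures[0] for sig in signatures[1:]))


good_stream = [0x1234, 0xABCD, 0x0F0F]  # hypothetical per-cycle observations
bad_stream = [0x1234, 0xABCC, 0x0F0F]   # one bit flipped in the second cycle

tdo_all_good = corematch([signature(good_stream)] * 4)
tdo_one_bad = corematch(
    [signature(good_stream)] * 3 + [signature(bad_stream)]
)
```

Because only the single result bit leaves the chip, the tester data volume is independent of the number of cores. The trade-off, as noted earlier in this section, is that identical wrong signatures in all cores would still compare as a match.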
On the other hand, the following test flow routine sample shows a quad-core
test program that does not make use of the on-die comparison capability. In this
scenario, an additional test flow routine calling the on-tester comparison function
needs to be inserted to compare the results from each core for pass/fail indication.
### Initialization routine ###
&power_up(default);
&reset_sequence(default);
&multicore_init(default);
### Test loading (test vector contains no COREMATCH instruction) ###
&parallel_test_load(core0, core1, core2, core3);
### Test execution ###
&parallel_test_run(core0, core1, core2, core3);
### TDO read-out from each core ###
&tdo_out_read(core0, core1, core2, core3);
### Test compare function call ###
&tester_compare(tdo_out0, tdo_out1, tdo_out2, tdo_out3);
### continue next test loading###
………
Figure 7.4
Test Flow Routine demonstrating on-tester comparison
7.2
Test Content Reusability
Any multicore design has some unique test content for coverage in its
non-core. It is also likely to have some unique test content targeting logic which can
only be exercised when cores interact with each other. Core-level test content should
be as reusable as the core design itself. It should be reusable across multiple
instances of the core in a single design, across multiple products created from fuses
and chops of a single design, or across multiple designs with different non-cores. It
may also be reusable across different configurations of the design; for example, a
one-core configuration used during a fail flow entered after detecting an error in the
multicore configuration.
Reusable core-level test content comes in three forms: the test source, the
chip-level trace, or the core-level trace.
The test source is either an ATPG tool or a test which is hand-written in a
high-level language. The trace is created by applying the source to a model of the
design and capturing the behavior of a set of signals such as the chip interface, core
interface, or scan nodes.
The test source allows the most uncore design flexibility but is the most
costly to back-end development. It is very tolerant of asymmetric core-and-non-core
interfaces and differences in the non-core design across products. Reusing the test
source avoids having to repeat the early stages of test development; however, a
unique set of traces must be maintained and possibly fault graded separately for each
product.
The following is sample test content: simple functional assembly code for
basic PCI configuration register write operations. Such assembly code is CPU core
specific and does not depend on the CPU pin definitions. Hence the code is
completely reusable for any of the cores, as well as for any core configuration, such
as single-core, dual-core or quad-core.
&pci_config_wr_test (
mov dx , 0cf8h
mov eax, 0800000d4h
out dx , eax
mov dx , 0cfch
mov eax, 000000019h
out dx, eax
mov dx , 0cf8h
mov eax, 0800000d0h
out dx , eax
mov dx , 0cfch
mov eax, 0e00100f0h
out dx, eax
)
Figure 7.5
Sample assembly code test content reusable for any core
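To make the listing easier to read, the Python sketch below decodes the values written to I/O port 0CF8h, assuming the conventional PCI configuration address layout (enable bit 31, bus in bits 23:16, device in bits 15:11, function in bits 10:8, dword-aligned register offset in bits 7:2). The function name and the dictionary keys are chosen for this example only.

```python
def decode_config_address(value):
    """Split a 32-bit value written to port 0CF8h into its address fields."""
    return {
        "enable":   (value >> 31) & 0x1,
        "bus":      (value >> 16) & 0xFF,
        "device":   (value >> 11) & 0x1F,
        "function": (value >> 8) & 0x7,
        "register": value & 0xFC,  # dword-aligned register offset
    }


# (address written to 0CF8h, data written to 0CFCh) pairs from Figure 7.5.
writes = [
    (0x800000D4, 0x00000019),
    (0x800000D0, 0xE00100F0),
]

decoded = [(decode_config_address(addr), data) for addr, data in writes]
```

Under this layout both writes target bus 0, device 0, function 0, at register offsets D4h and D0h respectively. Nothing in the sequence refers to a chip pin or a core index, which is why the same source can be applied to any core.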
The chip-level trace is produced by simulating the test source on the design in
a 1-core configuration. The trace is then applied to multiple configurations of the
design which have different subsets of cores chopped or disabled. Chip-level trace
reusability allows the least uncore design flexibility but is also the least costly to
back-end development. It requires front-end design to make the non-core transparent
or at least symmetric using DFx hooks in the mission mode logic or a dedicated Test
Access Mechanism. A transparent non-core enables trace reuse with any non-core
by exposing the core interface signals necessary for application of the test at the chip
pins while isolating other core interface signals from the uncore to remove any core
behavior dependence on the core’s environment.
A symmetric non-core enables trace reuse across products with the same
uncore by establishing cycle-by-cycle identical behavior in any 1-core configuration.
Reusing the chip-level trace avoids having to repeat the trace generation and fault
grade process, reduces the volume of test data and the number of testers required to
store it, and requires fewer people to qualify and maintain the traces.
The following sample code for trace generation flow demonstrates the reuse
of the above-mentioned assembly code at chip-level, due to the symmetrical property
of the multicore and multi-TAP design, as well as the commonly shared TAP pins.
Such a high degree of reuse completely eliminates the need of trace re-generation.
&load_pin_def(xxtms, xxtck, xxtrst_b, xxtdi0, xxtdi1, xxtdi2, xxtdi3, xxtdo0,
xxtdo1, xxtdo2, xxtdo3);
For (N=0, N<4, N++)
{
&multicore_init;
&pci_config_wr_test();
}
Figure 7.6
Sample test routine code showing test content at chip-level reuse
The core-level trace is also produced by simulating a 1-core design, but it is
then translated into each version of the chip-level trace either by a script which
models the uncore design variations or by simulating the full-chip model with the
core black-boxed. Core-level trace reuse is a compromise between non-core design
flexibility and efficient test generation and maintenance. Reusing the core-level trace
speeds up the chip-level trace generation process as compared to test source reuse
and avoids the need to fault grade the core for each product configuration.
However, it does not relieve the tester and headcount costs as does chip-level
trace reuse. Furthermore, the hard-to-validate script or the black-box model needs to
be maintained as the design changes. Core-level trace reuse is common among IP
core designers who bundle test content with the core but can make very few
assumptions about the chip in which the core will be embedded, leaving the Test
Access Mechanism design to the System-on-Chip integrator.
With the symmetrical property of the multicore and multi-TAP design, as
well as the commonly shared TAP connectivity, reuse at core-level is achievable in a
similar manner. The following sample code for the trace generation flow demonstrates
the reuse of the above-mentioned assembly code at core-level, eliminating the need
for trace re-generation for any core configuration.
For (N=0, N<4, N++)
{
&load_core_interface(
tms_N,
tck_N,
trst_b_N,
tdi_N,
tdo_N
);
&multicore_init;
&pci_config_wr_test();
}
Figure 7.7
Sample test routine code showing test content at core-level reuse
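The translation of a core-level trace onto each core's interface can be sketched as a simple signal renaming step. This is a deliberately reduced model: it assumes the trace is a list of per-cycle dictionaries keyed by the core-level TAP signal names, and it only substitutes the core index into the `tms_N` style names of Figure 7.7. A real translation script would also have to model the uncore variations discussed above. The function name and the sample trace are hypothetical.

```python
CORE_SIGNALS = ("tms", "tck", "trst_b", "tdi", "tdo")


def retarget(core_trace, core_index):
    """Rename the core-level signals of every cycle to one core's pins."""
    return [
        {f"{sig}_{core_index}": cycle[sig] for sig in CORE_SIGNALS}
        for cycle in core_trace
    ]


# Hypothetical two-cycle core-level trace ("X" marks a don't-care TDO strobe).
core_trace = [
    {"tms": 1, "tck": 1, "trst_b": 1, "tdi": 0, "tdo": "X"},
    {"tms": 0, "tck": 1, "trst_b": 1, "tdi": 1, "tdo": 0},
]

# One translated trace per core of a quad-core configuration.
chip_traces = [retarget(core_trace, n) for n in range(4)]
```

The values in every translated trace are identical to the source trace; only the signal names change, which reflects the symmetry assumption behind core-level reuse.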
The design in this project is able to support all three types of test content
and trace reusability to a certain extent, as demonstrated in the various sample
codes above.
CHAPTER 8
SUMMARY AND FUTURE WORK
8.1
Summary
In this project, the Multicore DFx literature review and analysis have been
completed. As a result of the analysis, the pros and cons of various industrial
multicore DFx and multi-TAP architectures have been identified. An improved design
has been proposed, and it has met all the necessary objectives, namely design
scalability, trace reusability and concurrent testability.
As part of the design implementation via RTL coding and proof of concept
via simulation, the functionality of the system has been demonstrated through the
scalable serial-mode and parallel-mode Multicore interface in dual-core and
quad-core configurations. In addition, a pre-silicon validation tool, flow and
methodology have been proposed, together with a test content reuse and concurrent
testing strategy for the Multicore DFx design.
Throughout this project, the challenges of multicore DFx and multi-TAP
design, as elaborated in Chapter 5, have been learnt. The difficulty of the initial RTL
modeling of the TAP logic, as well as of simulation in VCS, has also been
experienced. In conclusion, this project is a success, meeting all the desired
objectives.
8.2
Future Work
Since Multicore DFx Interface design is a relatively new topic, there is a lot
more that can be done by anyone interested in extending research on this topic.
The following are some suggestions for possible future work.
• The Multicore DFx design can be further refined for structural test mode (for
cost-saving structural testing platforms)
• It can be further improved to allow high-speed parallel test modes
• It can also be refined to support multiple heterogeneous cores
• Proper core isolation integration will improve chip-level trace reusability
significantly
• Core isolation can also help the Multicore DFx design to improve its
debuggability
• Effort can also be spent on improving the compatibility and features of external
or proprietary tester-level and system-level debug tools to take full advantage of
the Multicore DFx interface benefits
REFERENCES
1. Francisco DaSilva, Yervant Zorian, Lee Whetsel, Karim Arabi, and Rohit Kapur.
“Overview of the IEEE P1500 Standard”. 2003 International Test Conference, pp.
988-997.
2. IEEE P1500 Web Site. http://grouper.ieee.org/groups/1500/.
3. Lee Whetsel. “An IEEE 1149.1 Based Test Access Architecture for ICs with
Embedded Cores”. 1997 International Test Conference, pp. 69-78.
4. Steve Oakland. “Considerations for Implementing IEEE 1149.1 on System-on-a-Chip
Integrated Circuits”. 2000 International Test Conference, pp. 628-637.
5. Ishwar Parulkar, Thomas Ziaja, Rajesh Pendurkar, Anand D’Souza, and Amitava
Majumdar. “A Scalable, Low Cost Design-for-Test Architecture for UltraSPARC
Chip Multi-Processors”. 2002 International Test Conference, pp. 726-735.
6. Yuejian Wu and Paul MacDonald. “Testing ASICs with Multiple Identical Cores”.
2003 IEEE Trans. Comput., pp. 327-336.
7. Bart Vermeulen, Tom Waayers, and Sjaak Bakker. “IEEE 1149.1-compliant Access
Architecture for Multiple Core Debug on Digital System Chips”. 2002 International
Test Conference, pp. 55-63.
8. Adam Osseiram. “Test Standards (With Focus on IEEE1149.1)”. 1996 IEEE,
pp. 708-711.
9. “IEEE Standard Test Access Port and Boundary-Scan Architecture”. IEEE
Computer Society, February 1990.
APPENDICES A
TAP FSM RTL Coding
//
// Module tap_fsm
//
module tap_fsm(
CkTapT1N00,
tmsT731H,
trstT731H_b,
cap_DRT731H,
cap_IRT731H,
rtidleT731H,
shift_DRT731H,
shift_IRT731H,
tlresetT731H,
update_DRT731H,
update_IRT731H
);
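// Internal Declarations
input        CkTapT1N00;
input        tmsT731H;
input        trstT731H_b;
output logic cap_DRT731H;
output logic cap_IRT731H;
output logic rtidleT731H;
output logic shift_DRT731H;
output logic shift_IRT731H;
output logic tlresetT731H;
output logic update_DRT731H;
output logic update_IRT731H;
// Module Declarations
logic [15:0] psntst; // present state (one-hot)
logic [15:0] nextst; // next state (one-hot)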
// State encoding parameters
localparam
TEST_LOGIC_RESET = 16'b0000000000000001 ,
RUN_TEST_IDLE = 16'b0000000000000010 ,
SELECT_DR_SCAN = 16'b0000000000000100 ,
SELECT_IR_SCAN = 16'b0000000000001000 ,
CAPTURE_DR = 16'b0000000000010000 ,
SHIFT_DR = 16'b0000000000100000 ,
EXIT1_DR = 16'b0000000001000000 ,
PAUSE_DR = 16'b0000000010000000 ,
EXIT2_DR = 16'b0000000100000000 ,
UPDATE_DR = 16'b0000001000000000 ,
CAPTURE_IR = 16'b0000010000000000 ,
SHIFT_IR = 16'b0000100000000000 ,
EXIT_IR = 16'b0001000000000000 ,
PAUSE_IR = 16'b0010000000000000 ,
EXIT2_IR = 16'b0100000000000000 ,
UPDATE_IR = 16'b1000000000000000 ;
//--------------------------------------------------------------
// Next State Block for machine csm
//--------------------------------------------------------------
always_comb
begin : csm_next_state_block_proc
// Default Assignment
// Default Assignment To Internals
// Combined Actions
(* name = "nhm_tap_fsm_1" *)
unique casex (psntst)
TEST_LOGIC_RESET:
begin
(* name = "nhm_tap_fsm_2" *)
unique casex(1'b1)
tmsT731H == 1'b0:
nextst = RUN_TEST_IDLE;
tmsT731H == 1'b1:
nextst = TEST_LOGIC_RESET;
endcase
end
RUN_TEST_IDLE:
begin
(* name = "nhm_tap_fsm_3" *)
unique casex(1'b1)
tmsT731H == 1'b1:
nextst = SELECT_DR_SCAN;
tmsT731H == 1'b0:
nextst = RUN_TEST_IDLE;
endcase
end
SELECT_DR_SCAN:
begin
(* name = "nhm_tap_fsm_4" *)
unique casex(1'b1)
tmsT731H == 1'b1:
nextst = SELECT_IR_SCAN;
tmsT731H == 1'b0:
nextst = CAPTURE_DR;
endcase
end
SELECT_IR_SCAN:
begin
(* name = "nhm_tap_fsm_5" *)
unique casex(1'b1)
tmsT731H == 1'b1:
nextst = TEST_LOGIC_RESET;
tmsT731H == 1'b0:
nextst = CAPTURE_IR;
endcase
end
CAPTURE_DR:
begin
(* name = "nhm_tap_fsm_6" *)
unique casex(1'b1)
tmsT731H == 1'b0:
nextst = SHIFT_DR;
tmsT731H == 1'b1:
nextst = EXIT1_DR;
endcase
end
SHIFT_DR:
begin
(* name = "nhm_tap_fsm_7" *)
unique casex(1'b1)
tmsT731H == 1'b1:
nextst = EXIT1_DR;
tmsT731H == 1'b0:
nextst = SHIFT_DR;
endcase
end
EXIT1_DR:
begin
(* name = "nhm_tap_fsm_8" *)
unique casex(1'b1)
tmsT731H == 1'b0:
nextst = PAUSE_DR;
tmsT731H == 1'b1:
nextst = UPDATE_DR;
endcase
end
PAUSE_DR:
begin
(* name = "nhm_tap_fsm_9" *)
unique casex(1'b1)
tmsT731H == 1'b1:
nextst = EXIT2_DR;
tmsT731H == 1'b0:
nextst = PAUSE_DR;
endcase
end
EXIT2_DR:
begin
(* name = "nhm_tap_fsm_10" *)
unique casex(1'b1)
tmsT731H == 1'b1:
nextst = UPDATE_DR;
tmsT731H == 1'b0:
nextst = SHIFT_DR;
endcase
end
UPDATE_DR:
begin
(* name = "nhm_tap_fsm_11" *)
unique casex(1'b1)
tmsT731H == 1'b1:
nextst = SELECT_DR_SCAN;
tmsT731H == 1'b0:
nextst = RUN_TEST_IDLE;
endcase
end
CAPTURE_IR:
begin
(* name = "nhm_tap_fsm_12" *)
unique casex(1'b1)
tmsT731H == 1'b0:
nextst = SHIFT_IR;
tmsT731H == 1'b1:
nextst = EXIT_IR;
endcase
end
SHIFT_IR:
begin
(* name = "nhm_tap_fsm_13" *)
unique casex(1'b1)
tmsT731H == 1'b1:
nextst = EXIT_IR;
tmsT731H == 1'b0:
nextst = SHIFT_IR;
endcase
end
EXIT_IR:
begin
(* name = "nhm_tap_fsm_14" *)
unique casex(1'b1)
tmsT731H == 1'b0:
nextst = PAUSE_IR;
tmsT731H == 1'b1:
nextst = UPDATE_IR;
endcase
end
PAUSE_IR:
begin
(* name = "nhm_tap_fsm_15" *)
unique casex(1'b1)
tmsT731H == 1'b1:
nextst = EXIT2_IR;
tmsT731H == 1'b0:
nextst = PAUSE_IR;
endcase
end
EXIT2_IR:
begin
(* name = "nhm_tap_fsm_16" *)
unique casex(1'b1)
tmsT731H == 1'b1:
nextst = UPDATE_IR;
tmsT731H == 1'b0:
nextst = SHIFT_IR;
endcase
end
UPDATE_IR:
begin
(* name = "nhm_tap_fsm_17" *)
unique casex(1'b1)
tmsT731H == 1'b1:
nextst = SELECT_DR_SCAN;
tmsT731H == 1'b0:
nextst = RUN_TEST_IDLE;
endcase
end
default: begin
nextst = TEST_LOGIC_RESET;
end
endcase
// Global Actions
cap_DRT731H    = psntst[4];
shift_DRT731H  = psntst[5];
update_DRT731H = psntst[9];
cap_IRT731H    = psntst[10];
shift_IRT731H  = psntst[11];
update_IRT731H = psntst[15];
rtidleT731H    = psntst[1];
tlresetT731H   = psntst[0];
end // Next State Block
//------------------------------------------------------------
// Clocked Block for machine csm
//------------------------------------------------------------
always_ff @(
posedge CkTapT1N00 or negedge trstT731H_b
) begin : csm_clocked_block_proc
if (!trstT731H_b) begin
psntst <= TEST_LOGIC_RESET;
// Reset Values
end
else
begin
psntst <= nextst;
end
end // Clocked Block
endmodule
State-Element Macro RTL Coding
///==============================================
/// This file contains macro for state-elements
/// macro.vs
///==============================================
///
///   LATCH          (Dest, Src, Clk)
///   EN_MSFF        (Dest, Src, Clk, En)
///   ASYNC_RST_MSFF (Dest, Src, Clk, Rst)
/// LATCH MACRO
///
`define LATCH(q,i,clock) \
always_latch \
begin \
if (clock) q <= i; \
end
/// EN_MSFF MACRO
///
`define EN_MSFF(q,i,clock,enable) \
always_ff @(posedge clock) \
begin \
if ((enable)) q <= i; \
end
/// ASYNC_RST_MSFF MACRO
///
`define ASYNC_RST_MSFF(q,i,clock,rst) \
always_ff @(posedge clock or posedge rst) \
begin \
if (rst) q <= '0; \
else \
q <= i; \
end
TAP Instruction Register RTL Coding
///===============================================
/// Includes
///===============================================
`include "tap_fsm.vs"
`include "macro.vs"
//----------------------------------------------------
// The Instruction Register - IEEE 1149.1, Chapter 6
//----------------------------------------------------
//
// This code implements the 1149.1 instruction register.
// The TAP Instruction Register (IrT731L) is asynchronously reset
// to the IDCODE instruction by entry into the TLReset state.
//
// Value to be shifted into IR Shift Reg
node [7:0] IrShiftInT731H;
// Value of IR Shift Register
node [7:0] IrShiftT731H;
// Enable to IR Shift Register
node IrShiftEnT731H;
// Input to instruction register
node [7:0] IrInT731H;
always_comb begin : IR_shift
// Shift right by one and load the MSB with TDI
//
if (ShiftIRT731H == 1)
IrShiftInT731H = {TdiT731H, IrShiftT731H[7:1]} ;
else
// Fixed value which is loaded into TAP IR on Capture IR
IrShiftInT731H = 8'b00000001;
// enable shifting in CaptureIR or Shift IR state
//
IrShiftEnT731H = CaptureIRT731H | ShiftIRT731H;
unique casex(1'b1)
UpdateIRT731H: IrInT731H = IrShiftT731H;
TLResetT731H: IrInT731H = 8'b00000010;
default: IrInT731H = IrT731L;
endcase // case(1'b1)
end // block: IR_shift
// Store the state of the incoming OPCODE as it shifts
//
`EN_MSFF(IrShiftT731H, IrShiftInT731H, CkTapT1N22,
IrShiftEnT731H)
// If TRST is pulled then go to IDCODE opcode as per 1149.1
// Asynchronously set the instruction register with TRST.
// Outputs are the IR.
`ASYNC_RST_MSFF(IrT731L[0], IrInT731H[0], CkTapT1N11,
taResetT731H)
`ASYNC_SET_MSFF(IrT731L[1], IrInT731H[1], CkTapT1N11,
taResetT731H)
`ASYNC_RST_MSFF(IrT731L[7:2], IrInT731H[7:2], CkTapT1N11,
taResetT731H)
Instruction Decode Control Logic RTL Coding
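//---------------------------------------------
// Instruction Register Decoder
//---------------------------------------------
//
struct {
node iExtest;      // EXTEST instruction select
node iSampPre;     // Sample/Preload instruction select
node iIdcode;      // IDCODE instruction select
node iClamp;       // Clamp instruction select
node iHighZ;       // HighZ instruction select
node iBypass;      // BYPASS instruction select
// Members below are assigned in the decode block that follows
node iMCImode;     // multicore dfx interface instruction select
node iCoreConnect; // multicore connection instruction select
node iCoreSelect;  // multicore selection instruction select
node iMISREN;      // MISR enable instruction select
node iMsgBusRd;    // MISR read out instruction select
} idecT731L;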
always_comb begin : TCLK_INST_DECODE
idecT731L = '{default: '0};
//
// IEEE 1149.1 compliant instructions decoding
idecT731L.iExtest      = (IrT731L[7:0] == 8'b00000000);
idecT731L.iSampPre     = (IrT731L[7:0] == 8'b00000001);
idecT731L.iIdcode      = (IrT731L[7:0] == 8'b00000010);
idecT731L.iClamp       = (IrT731L[7:0] == 8'b00000100);
idecT731L.iHighZ       = (IrT731L[7:0] == 8'b00000111);
//
// multicore dfx interface instruction coding
idecT731L.iMCImode     = (IrT731L[7:0] == 8'b00010000);
//
// multicore selection and connection instruction coding
idecT731L.iCoreConnect = (IrT731L[7:0] == 8'b00010001);
idecT731L.iCoreSelect  = (IrT731L[7:0] == 8'b00010010);
//
// MISR Enable and Read Out instruction coding
idecT731L.iMISREN      = (IrT731L[7:0] == 8'b00010011);
idecT731L.iMsgBusRd    = (IrT731L[7:0] == 8'b00010100);
//
// choosing bypass if instructions failed to be decoded
//
idecT731L.iBypass      = (IrT731L[7:0] == 8'b00000011) ||
                         (IrT731L[7:0] == 8'b00000101) ||
                         (IrT731L[7:0] == 8'b00000110) ||
                         (IrT731L[7:3] == 5'b00001);
end // block: TCLK_INST_DECODE
// Note:
//   The BYPASS instruction is not decoded because it is the
//   default instruction. If none of the below instructions
//   decode to true then the TAP acts as if the bypass
//   instruction is loaded
Bypass Register RTL Coding
//-----------------------------------------------
// The Bypass Register - IEEE 1149.1, Chapter 9
//-----------------------------------------------
// Only enable in the Capture-DR and Shift-DR states,
// Update-DR has no meaning
//
node BypassT731H;   // bypass register
node BypassEnT731H; // bypass enable
node BypassInT731H; // bypass register input
always_comb begin
BypassEnT731H = CaptureDRT731H | ShiftDRT731H;
// The bypass flop input is TdiT731H if the TAP
// is in the Shift_DR state. Otherwise the flop
// input is zero.
//
BypassInT731H = ShiftDRT731H & TdiT731H;
end
// flop bypass when BYPASS instruction is active and
// in Capture_DR or Shift_DR.
//
`EN_MSFF(BypassT731H, BypassInT731H,
CkTapT1N22, BypassEnT731H)
IDCODE Register RTL Coding
//-------------------------------------------------------------
// The Device Identification Register - IEEE 1149.1, Chapter 11
//-------------------------------------------------------------
node ShfIdCodeT731H;
node CapIdCodeT731H;
node IdCodeSelT731H;
node [7:0] IdCodeInT731H;
node [7:0] IdCodeT731H;
always_comb begin
// Signals to control the shifting and capturing of the
// identification code. When any of these signals go
// high, the corresponding action is executed.
//
ShfIdCodeT731H = idecT731L.iIdcode & ShiftDRT731H;
CapIdCodeT731H = idecT731L.iIdcode & CaptureDRT731H;
IdCodeSelT731H = ShfIdCodeT731H | CapIdCodeT731H;
unique casex(1'b1)
//IDCODE=10101010101010101010101010101011
CapIdCodeT731H : IdCodeInT731H =
'b10101010101010101010101010101011;
ShfIdCodeT731H : IdCodeInT731H =
'b10101010101010101010101010101011;
default : IdCodeInT731H = '0;
endcase // casex(1'b1)
end // always_comb begin
`EN_MSFF(IdCodeT731H, IdCodeInT731H, CkTapT1N22, IdCodeSelT731H)
Boundary Scan Signals RTL Coding
//---------------------------------------------------------
// Signals for the BScan Register - IEEE 1149.1, Chapter 10
//---------------------------------------------------------
node BscanMuxSelT731L;
always_comb begin : Boundary_Scan_and_HighZ
unique casex (1'b1)
idecT731L.iSampPre : {BscanMuxSelT731L, pttaBscanChainSelT731H} = 3'b1_01;
idecT731L.iClamp   : {BscanMuxSelT731L, pttaBscanChainSelT731H} = 3'b0_10;
idecT731L.iExtest  : {BscanMuxSelT731L, pttaBscanChainSelT731H} = 3'b1_11;
default            : {BscanMuxSelT731L, pttaBscanChainSelT731H} = 3'b0_00;
endcase // casex(1'b1)
pttaHighZSelT731H = idecT731L.iHighZ | PfPOCTriStateNnnnH;
end
always_comb begin
CapRunBistT731H = idecT731L.iRunBist & CaptureDRT731H;
ShfRunBistT731H = idecT731L.iRunBist & ShiftDRT731H;
UpdRunBistT731H = idecT731L.iRunBist & UpdateDRT731H;
unique casex(1'b1)
CapRunBistT731H : RunBistInT731H = RunBistResultU75nH;
ShfRunBistT731H : RunBistInT731H = TdiT731H;
default         : RunBistInT731H = BigShiftT731H[0];
endcase // casex(1'b1)
end
TDO Mux RTL Coding
//--------------------------------------------------
// TDO driving pins - IEEE 1149.1, Chapter 5.2
//--------------------------------------------------
node TdoDRMuxT731H; // DR scan chains output
always_comb begin
// all DR scan chains are muxed together in this mux.
casex (1'b1)
idecT731L.iExtest  : TdoDRMuxT731H = BypassT731H;
idecT731L.iSampPre : TdoDRMuxT731H = BypassT731H;
idecT731L.iIdcode  : TdoDRMuxT731H = IdCodeT731H[0];
idecT731L.iClamp   : TdoDRMuxT731H = BypassT731H;
idecT731L.iHighZ   : TdoDRMuxT731H = BypassT731H;
iBigShiftT731H     : TdoDRMuxT731H = BigShiftT731H[0];
default            : TdoDRMuxT731H = BypassT731H;
endcase // casex(1'b1)
end
//--------------------------------------------------
// Other TDO Muxing
//--------------------------------------------------
always_comb begin
// all DR scan chains are muxed together in this mux.
//
// Note: The BYPASS instruction is not decoded because it is the
//       default instruction. If none of the below instructions
//       decode to true then the TAP acts as if the bypass
//       instruction is loaded.
unique casex (1'b1) // TDO Mux select
BscanMuxSelT731L       : TdoDRMuxT731H = pataBscanTdoT731H;
idecT731L.iMCImode     : TdoDRMuxT731H = PttaMciTaTdoT731H;
idecT731L.iMISREN      : TdoDRMuxT731H = PttaMIRSEnTaTdoT731H;
idecT731L.iMsgBusRd    : TdoDRMuxT731H = pttaMsbBusRdOutTaTdoT731H;
idecT731L.iCoreConnect : TdoDRMuxT731H = pttaCoreConnectTnnnH[1:0];
idecT731L.iCoreSelect  : TdoDRMuxT731H = pttaCoreSelectTnnnH;
default                : TdoDRMuxT731H = BypassT731H;
endcase // casex(1'b1)
end
//--------------------------------------------------
// Final TDO Muxing
//--------------------------------------------------
//
// All TDO sources must be muxed together prior to latching
// for the final data out of this tap.
//
node TapClkTdoT731H; // TDO for tap clock logic
// mux to select between DR output mux and the IR shift chain
always_comb begin
casex (1'b1)
ShiftIRT731H    : TapClkTdoT731H = IrShiftT731H[0];
mltaTdoSelT731L : TapClkTdoT731H = mltaTdoT731H;
default         : TapClkTdoT731H = TdoDRMuxT731H;
endcase // case(1'b1)
end
//--------------------------------------------------
// TDO output - IEEE 1149.1, Chapter 5.2
//--------------------------------------------------
// TDO is only driven low during ShiftIR and ShiftDR.
node TdoEnableT731H;
node TapTdoEnabledT731H;
assign TdoEnableT731H = ShiftDRT731H | ShiftIRT731H;
assign TapTdoEnabledT731H = TapClkTdoT731H | (~TdoEnableT731H);
// Tap clock TDO data is mux selected is not sent directly to
`LATCH(TdoT731L, TapTdoEnabledT731H, CkTapT1N11)
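The final TDO selection and gating above can be restated as a small behavioural model. The sketch below is in Python rather than RTL, and the function and argument names are illustrative only; it assumes the plain `casex (1'b1)` resolves in the order the items are written and it ignores the output latch.

```python
def final_tdo(shift_ir, shift_dr, ir_shift0, mlta_sel, mlta_tdo, dr_mux_tdo):
    """Behavioural sketch of the final TDO mux and its enable gating.

    All arguments are single bits (0 or 1). Names are hypothetical
    stand-ins for the RTL nodes, not part of the original design.
    """
    # Priority mux: IR shift chain first, then the mlta source, then the DR mux.
    if shift_ir:
        tap_clk_tdo = ir_shift0
    elif mlta_sel:
        tap_clk_tdo = mlta_tdo
    else:
        tap_clk_tdo = dr_mux_tdo

    # TDO is enabled only while shifting; otherwise the value is forced high.
    tdo_enable = shift_dr | shift_ir
    return tap_clk_tdo | (tdo_enable ^ 1)
```

Outside Shift-IR and Shift-DR the function always returns 1, which is the behaviour the comment "TDO is only driven low during ShiftIR and ShiftDR" describes.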
MISR Macro Coding (32-bit inputs coded, upper 16-bits for redundancy)
//-----------------------------------------------------------------
// dfx 32 bit MISR module w/o input compression
// 4 MISR modules provided
// upper 16-bits for redundancy are unused
// unused bits will be tied to zeros
//-----------------------------------------------------------------
// Interface
//-----------------------------------------------------------------
module dfxmisr32 (
  // MISR I/Os
  input  logic        obs_clk,
  input  logic        obs_clk_usync,
  input  logic [31:0] obs_din,
  input  logic        nctap_misrstart,
  input  logic        sys_reset,
  input  logic        msg_bus_wr,
  output logic [31:0] msg_bus_rd
);
//-----------------------------------------------------------------
// Local Node/Variable Declarations
//-----------------------------------------------------------------
logic [15:0] misr0_in;    // MISR0 input
logic [15:0] misr0_out;   // MISR0 output
logic [15:0] misr1_in;    // MISR1 input
logic [15:0] misr1_out;   // MISR1 output
logic        misr_run;    // MISR run
logic        misr_reset;  // MISR reset
logic [15:0] misr_poly;   // MISR Polynomial
genvar misr_i;
//-----------------------------------------------------------------
// Main Procedure
//-----------------------------------------------------------------
//-----------------------------------------------------------------
// MISR
//-----------------------------------------------------------------
// There are two 16 bit MISRs connected to a 32 bit message bus
//
// The 16 bit MISR characteristic polynomial is x^16 + x^5 + x^3 + x^2 + 1
//
//  +-------+---+-------+-------------------------------------------------+
//  |       |   |       |                                                 |
//  V       V   V       V                                                 |
//  X-0-X-1-X-2-X-3-X-4-X-5-X-6-X-7-X-8-X-9-X-10-X-11-X-12-X-13-X-14-X-15-+ MISR
//  A   A   A   A   A   A   A   A   A   A   A    A    A    A    A    A
//  |   |   |   |   |   |   |   |   |   |   |    |    |    |    |    |
//  0   1   2   3   4   5   6   7   8   9   10   11   12   13   14   15  OBSERVATION
//
// MISR Controls
//
always_comb misr_reset = msg_bus_wr | sys_reset;
`EN_RST_MSFF (misr_run,nctap_misrstart,obs_clk,obs_clk_usync,sys_reset)
// MISR
//
always_comb misr_poly = 16'b0000000000101101;
// MISR0
generate
for (misr_i = 15; misr_i >= 1; misr_i--) begin : misr0_loop
always_comb
if (misr_poly[misr_i]) begin
misr0_in[misr_i] = misr0_out[15] ^ misr0_out[misr_i-1] ^
obs_din[misr_i];
end
else
misr0_in[misr_i] = misr0_out[misr_i-1] ^ obs_din[misr_i];
end
always_comb misr0_in[0] = misr0_out[15] ^ obs_din[0];
endgenerate
`EN_SET_MSFF (misr0_out,misr0_in,obs_clk,misr_run,misr_reset)
// MISR1
generate
for (misr_i = 15; misr_i >= 1; misr_i--) begin : misr1_loop
always_comb
if (misr_poly[misr_i]) begin
misr1_in[misr_i] = misr1_out[15] ^ misr1_out[misr_i-1] ^
obs_din[misr_i+16];
end
else
misr1_in[misr_i] = misr1_out[misr_i-1] ^ obs_din[misr_i+16];
end
always_comb misr1_in[0] = misr1_out[15] ^ obs_din[0+16];
endgenerate
`EN_SET_MSFF (misr1_out,misr1_in,obs_clk,misr_run,misr_reset)
always_comb msg_bus_rd = {misr1_out,misr0_out};
endmodule
//==================================================================
// End of File
//==================================================================
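As a cross-check of the generate loops above, the next-state equations of the two 16-bit MISRs can be written as a short behavioural model. This is a Python sketch, not part of the design: the function names are illustrative, the enable (`misr_run`) is assumed to be active on every modelled clock, and the register value after reset is taken as a `seed` argument because the `EN_SET_MSFF` macro definition is not shown in the listing.

```python
WIDTH = 16
MASK = (1 << WIDTH) - 1
POLY = 0b0000000000101101  # same constant as misr_poly (taps at bits 0, 2, 3, 5)


def misr16_step(state: int, din: int) -> int:
    """Advance one 16-bit MISR by one clock (misr_run assumed high)."""
    feedback = (state >> (WIDTH - 1)) & 1   # misrN_out[15]
    shifted = (state << 1) & MASK           # misrN_out[i-1] moves into bit i
    if feedback:
        shifted ^= POLY                     # feedback XORed into the tapped stages
    return shifted ^ (din & MASK)           # fold in the observation inputs


def misr32_step(state: int, din: int) -> int:
    """Two independent 16-bit MISRs packed as {misr1_out, misr0_out}."""
    lo = misr16_step(state & MASK, din & MASK)
    hi = misr16_step((state >> WIDTH) & MASK, (din >> WIDTH) & MASK)
    return (hi << WIDTH) | lo


def signature(din: int, cycles: int, seed: int = 0) -> int:
    """Signature after `cycles` clocks with a constant input word.

    `seed` is an assumption: the reset value of the flops is not given
    in the source.
    """
    state = seed
    for _ in range(cycles):
        state = misr32_step(state, din)
    return state
```

With a constant tied-off input, as used for the per-core instances in this appendix, `signature(tied_value, n, seed)` gives the value a test would expect to read back on `msg_bus_rd` after `n` enabled clocks, provided the seed matches the real reset value.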
CPU Coding on MISR tied values for Core0
module core0(
input mclk,
input bresetB,
input vcc,
input vss,
input tamisrstart
wire [31:0] misr_data;
wire loadmisr;
.loadmisr(loadmisr),
.misr_data(misr_data),
dfxmisr32 dfxmisr32
(
.obs_clk(mclk),
.obs_din({32'b00000000000000000111111111010111}),
.tamisrstart(tamisrstart),
.sys_reset(~bresetB),
.msg_bus_wr(loadmisr),
.msg_bus_rd(misr_data)
);
)
endmodule
CPU Coding on MISR tied values for Core1
module core1(
input mclk,
input bresetB,
input vcc,
input vss,
input tamisrstart
wire [31:0] misr_data;
wire loadmisr;
.loadmisr(loadmisr),
.misr_data(misr_data),
dfxmisr32 dfxmisr32
(
.obs_clk(mclk),
.obs_din({32'b00000000000000001110011111110111}),
.tamisrstart(tamisrstart),
.sys_reset(~bresetB),
.msg_bus_wr(loadmisr),
.msg_bus_rd(misr_data)
);
)
endmodule
CPU Coding on MISR tied values for Core2
module core2(
input mclk,
input bresetB,
input vcc,
input vss,
input tamisrstart
wire [31:0] misr_data;
wire loadmisr;
.loadmisr(loadmisr),
.misr_data(misr_data),
dfxmisr32 dfxmisr32
(
.obs_clk(mclk),
.obs_din({32'b00000000000000001111111011010011}),
.tamisrstart(tamisrstart),
.sys_reset(~bresetB),
.msg_bus_wr(loadmisr),
.msg_bus_rd(misr_data)
);
)
endmodule
CPU Coding on MISR tied values for Core3
module core3(
input mclk,
input bresetB,
input vcc,
input vss,
input tamisrstart
wire [31:0] misr_data;
wire loadmisr;
.loadmisr(loadmisr),
.misr_data(misr_data),
dfxmisr32 dfxmisr32
(
.obs_clk(mclk),
.obs_din({32'b00000000000000001111111110110111}),
.tamisrstart(tamisrstart),
.sys_reset(~bresetB),
.msg_bus_wr(loadmisr),
.msg_bus_rd(misr_data)
);
)
endmodule
Multicore DFX Interface Logics RTL Coding
module mci_router (
  input  node CkGridT1N00,
  input  node TLResetT731H,
  output node PtMciTaTdoT731H,        // MCI TDO to uncore TAP
  output node ptNHTdiT731H[NHM_MAX_NUM_CORES-1:0], // TDI signals to cores
  output node ptNHTmsT731H,           // TMS signal to cores
  output node ptNHTrstT731H_b,        // TRST_b to cores
  output node CkGridCrTapT1N00,       // TCLK to cores
  output node MciParallelModeT731H,   // TCLK mci parallel mode
  output node iScanModeT731H,         // TCLK iScan Mode
  output node DatModeT731H,           // TCLK DAT Mode signal
  output node pttaMciBGFrunUWnnnL,    // Mci BGF run signal for all MCI BGFs
  // MCI Outputs
  output node MciPwrUpH733H,          // Mci HCLK Power enable
  output node MciPwrUpU733H,          // Mci UCLK Power enable
  output t_mci_pd20pv ptNHPdiUpH740H, // MCI Parallel data to cores 1 & 3
  output t_mci_pd20pv ptNHPdiDnH740H, // MCI Parallel data to cores 0 & 2
  output t_mci_pd20pv UncMciPdiH740H, // MCI Parallel data to the uncore
  output node PtTdoT731L,             // Final TDO to PADs
  output node PTMciVldOutH770H,
  output node [MCI_PDO_WIDTH-1:0] ptcsPdoH747H // Final Parallel Data to Flitout
);
///========================================
/// Node declarations
///========================================
// iMCIMODE Chain
//
t_tapi_mcimode MciModeT731H;                            // MCI Mode DR structure
node MciModeTdoT731H;                                   // TDO from last bit of MCI Mode DR chain.
node [TA_MCI_DATA_DUP_CNT_WIDTH - 1:0] DataDupCntT731H; // Valid bit generator data duplication Counter value
node TapModeT731H;                                      // Serial/Concurrent TAP Mode
node [TA_MCI_MCIMODE_DR_WIDTH-1:0] MciModeResetNnnnH;
node [TA_MCI_MCIMODE_DR_WIDTH-1:0] MciModeSetNnnnH;
// MCICONFIG DR Chain
//
t_tapi_mciconfig MciConfigT731H;                         // Mci Config DR structure
node MciConfigTdoT731H;                                  // TDO from last bit of MCI Config DR Chain
node [TA_MCI_REF_CORE_SEL_WIDTH-1:0] refcoreselectT731H; // Reference core select
node [TA_MCI_CORE_CONNECT_WIDTH-1:0] coreconnectT731H;   // Core connect select
node [TA_MCI_OUT_SELECT_WIDTH-1:0] outselectT731H;       // Output routing select
node MciPwrUpT731H;                                      // TCLK power enable
node [TA_MCI_CHAINSEL_WIDTH-1:0] PtTaMciChnSelT732H;
// TAP steering nodes
//
node [3:0] TapSteerOutT731L;  // TAP core output from steering logic
node [3:0] corebypassT731H;   // Bypass selected core
node tdo_mci_outT731L;        // tdo output for tap concurrent mode
node tdo_unc_refT731L;        // mux output for tdo_ref and tdo_uncore
// TAP compare nodes
node [3:0] TmatchT731L;       // core output matches reference
node tdo_refT731L;            // tdo reference signal
node TmatchAllT731L;          // all TDOs from all cores match
// Parallel Data Control - clocks and clock enables
node CkPdH1N44;               // ph1 gated clock
node CkPdC1N22;
node CkPdC1N44;
node CkMciPdiH1N66;           // ph1 gated clock for pdi path
node CkMciPdoH1N66;           // ph1 gated clock for pdo path
node CkFreeH1N44;             // free running ph1 clock
node CkFreeH1N66;             // free running ph1 clock
node PwrGoodNnnnH_b;                               // Internal power good bar
// Parallel Data Control - data nodes
node [3:0] PdoEnableH745H;                         // Flop enable for PDO input data
node uncPdoEnableH745H;                            // Flop enable for PDO input data
t_mci_pd20pv ptMciPdiH740H;                        // Flopped data from csi flit-in
t_mci_pd20pv UncMciPdiUnn0H;                       // UNC PDI Data and valid
node MciPdiUnGatedH740H;
node PdiValidH740H;                                // Generated valid bit to ship with data to cores/uncore
node [MCI_CORE_PDO_WIDTH-1:0] NHptPdoH746H[3:0];   // flop core pdo data on the input to the fub
node [NHM_MAX_NUM_CORES-1:0] NHptPdoValH746H;      //
node [MCI_UNC_PDO_WIDTH-1:0] UncPdoH746H;          // flop uncore pdo out of the bgf
node UncMciPdoValH746H;                            // flop uncore valid pdo out
node [MCI_CORE_PDO_WIDTH-1:0] PdoRefH746H;         // pdo reference signal
node PdoValRefH746H;                               // pdo valid ref signal
node [MCI_CORE_PDO_WIDTH-1:0] CBitMatchH746H_b[NHM_MAX_NUM_CORES-1:0]; // All core bits match (inverted)
node [MCI_CORE_PDO_WIDTH-1:0] CBitMatchH746H[NHM_MAX_NUM_CORES-1:0];   // All core bits match
node PMatchH746H[3:0];                             // All bits of a core match the reference
node PMatchAllH746H;                               // All bits from all core match the reference
node [MCI_PDO_WIDTH-1:0] PdoH746H;                 // PDO selected with outselect
node [MCI_PDO_WIDTH-1:0] PdoFinalH746H;            // Final PDO data prior to output flop
node PdoValFinalH746H;                             // Final PDO valid prior to output flop
node PMatchAllT731H;                               // PMatchAll for TAP output
node PMatchAllT732H;                               // PMatchAll for TAP output
///========================================
/// mci Internal Signal Declarations
///========================================
node CkTapT1N22;
`CLKBF(CkTapT1N22, CkGridT1N00)
assign PwrGoodNnnnH_b = ~PfVttIntPwrGoodNnnnH;
always_comb begin
// TAP control distribution
CkGridCrTapT1N00 = CkTapT1N22;
ptNHTrstT731H_b = PaTrstT731H_b;
ptNHTmsT731H = PaTmsT731H;
end
///========================================
/// TAP DR Chains
///========================================
`MSFF(PtTaMciChnSelT732H, PtTaMciChnSelT731L, CkTapT1N22)
always_comb begin : MCI_GLOBAL_TAP
  // Mux DR Chain outputs for Mode and Config instructions to Mci TDO
  unique casex (PtTaMciChnSelT732H)
    2'b01   : PtMciTaTdoT731H = MciModeTdoT731H;
    2'b10   : PtMciTaTdoT731H = MciConfigTdoT731H;
    default : PtMciTaTdoT731H = '0;
endcase // casex(PtTaMciChnSelT732H)
end
always_comb begin : TAP_iMCIMODE_DR
MciModeT731H.shift
= pttaTapTclkCntlT731H.shift &
PtTaMciChnSelT731L[0];
MciModeT731H.capture = pttaTapTclkCntlT731H.capture &
PtTaMciChnSelT731L[0];
MciModeT731H.update = PfResetXXnnnL & PtTaMciChnSelT731L[0] &
pttaTapTclkCntlT731H.rti;
unique casex (1'b1)
MciModeT731H.shift : begin
MciModeT731H.DR_primin[TA_MCI_MCIMODE_DR_WIDTH-1] =
pttaTapTclkCntlT731H.tdi;
MciModeT731H.DR_primin[TA_MCI_MCIMODE_DR_WIDTH-2:0]
= MciModeT731H.DR_prim[TA_MCI_MCIMODE_DR_WIDTH-1:1];
end
MciModeT731H.capture :
MciModeT731H.DR_primin[TA_MCI_MCIMODE_DR_WIDTH-1:0] =
MciModeT731H.DR_shdw[TA_MCI_MCIMODE_DR_WIDTH-1:0];
default :
MciModeT731H.DR_primin[TA_MCI_MCIMODE_DR_WIDTH-1:0] =
MciModeT731H.DR_prim[TA_MCI_MCIMODE_DR_WIDTH-1:0];
endcase // casex(1'b1)
// MCIMODE DR chain is updated on reset.
MciModeResetNnnnH = (~MciModeT731H.DR_prim &
{TA_MCI_MCIMODE_DR_WIDTH{mcp_MModeUpdT731H}}) |
{TA_MCI_MCIMODE_DR_WIDTH{PwrGoodNnnnH_b}};
MciModeSetNnnnH = (MciModeT731H.DR_prim &
{TA_MCI_MCIMODE_DR_WIDTH{mcp_MModeUpdT731H}}) &
{TA_MCI_MCIMODE_DR_WIDTH{PfVttIntPwrGoodNnnnH}};
MciModeTdoT731H = MciModeT731H.DR_prim[0]; // DR chain TDO is the LSB of the Chain
TapModeT731H = MciModeT731H.DR_shdw[TA_MCI_SER_CON_TAP_MODE];
DataDupCntT731H =
MciModeT731H.DR_shdw[TA_MCI_DATA_DUP_CNT_MSB:TA_MCI_DATA_DUP_CNT_LSB
];
end // block: TAP_iMCIMODE_DR
// Sample MciModeT731H.update if the uncore tdo or core tdi values
// are zero.
// These all will default to 1'b1 if not in shift IR|DR.
node MModeUpdT731H;
node h_MModeUpdT731H;
assign MModeUpdT731H = MciModeT731H.update;
node instTapShiftZeroT731L;
node instptNHTdiT731H;
`ifdef INST_ON
assign instptNHTdiT731H = ~ptNHTdiT731H[0] | ~ptNHTdiT731H[1]
|~ptNHTdiT731H[2] | ~ptNHTdiT731H[3];
assign instTapShiftZeroT731L = ~ptuncTdoT731L | ~NHptTdoT731L[0]
| ~NHptTdoT731L[1] | ~NHptTdoT731L[2] | ~NHptTdoT731L[3] |
instptNHTdiT731H;
`endif
`CUTMCP_HP1(mcp_MModeUpdT731H, MModeUpdT731H, PfVttIntPwrGoodNnnnH,
`INST_ARG(CkFreeT1N00_inst),
`INST_ARG(instTapShiftZeroT731L), 3,
`INST_ARG(TLResetT731H))
`MCP_HP1(h_MModeUpdT731H, PfVttIntPwrGoodNnnnH, CkFreeH1N22,
CkMciPdiH1N66, 99,
`INST_ARG(TLResetT731H), h_MModeUpdT731H)
// TAPMODE needs to hold past the LAST TDO that is zero. This is a
// one TCLK cycle MCO.
node [TA_MCI_CORE_CONNECT_WIDTH-1:0] instcoreconnectT732H;
node instTapModeT732H;
`ifdef INST_ON
`MSFF(instTapModeT732H, TapModeT731H, CkTapT1N22)
`MSFF(instcoreconnectT732H, coreconnectT731H, CkTapT1N22)
`endif // `ifdef INST_ON
`MSFF(MciModeT731H.DR_prim[TA_MCI_MCIMODE_DR_WIDTH-1:0],
MciModeT731H.DR_primin[TA_MCI_MCIMODE_DR_WIDTH-1:0],
CkTapT1N22)
always_comb begin : TAP_iMCICONFIG_DR
  MciConfigT731H.shift   = pttaTapTclkCntlT731H.shift & PtTaMciChnSelT731L[1];
  MciConfigT731H.capture = pttaTapTclkCntlT731H.capture & PtTaMciChnSelT731L[1];
  MciConfigT731H.update  = pttaTapTclkCntlT731H.update & PtTaMciChnSelT731L[1];
  // Primary input mux for MCICONFIG DR
  unique casex (1'b1)
    MciConfigT731H.shift : begin
      MciConfigT731H.DR_primin[TA_MCI_MCICONFIG_DR_WIDTH-1] = pttaTapTclkCntlT731H.tdi;
      MciConfigT731H.DR_primin[TA_MCI_MCICONFIG_DR_WIDTH-2:0] =
        MciConfigT731H.DR_prim[TA_MCI_MCICONFIG_DR_WIDTH-1:1];
    end
    MciConfigT731H.capture :
      MciConfigT731H.DR_primin[TA_MCI_MCICONFIG_DR_WIDTH-1:0] =
        MciConfigT731H.DR_shdw[TA_MCI_MCICONFIG_DR_WIDTH-1:0];
    default :
      MciConfigT731H.DR_primin[TA_MCI_MCICONFIG_DR_WIDTH-1:0] =
        MciConfigT731H.DR_prim[TA_MCI_MCICONFIG_DR_WIDTH-1:0];
  endcase // casex(1'b1)
  // Shadow input mux for MCICONFIG DR
  if (MciConfigT731H.update)
    MciConfigT731H.DR_shdwin[TA_MCI_MCICONFIG_DR_WIDTH-1:0] =
      MciConfigT731H.DR_prim[TA_MCI_MCICONFIG_DR_WIDTH-1:0];
  else
    MciConfigT731H.DR_shdwin[TA_MCI_MCICONFIG_DR_WIDTH-1:0] =
      MciConfigT731H.DR_shdw[TA_MCI_MCICONFIG_DR_WIDTH-1:0];
  MciConfigTdoT731H = MciConfigT731H.DR_prim[0]; // DR chain TDO is the LSB of the Chain
  refcoreselectT731H =
    MciConfigT731H.DR_shdw[TA_MCI_REG_CORE_SEL_MSB:TA_MCI_REG_CORE_SEL_LSB];
  coreconnectT731H =
    MciConfigT731H.DR_shdw[TA_MCI_CORE_CONNECT_MSB:TA_MCI_CORE_CONNECT_LSB];
  outselectT731H =
    MciConfigT731H.DR_shdw[TA_MCI_OUT_SELECT_MSB:TA_MCI_OUT_SELECT_LSB];
  MciPwrUpT731H = MciConfigT731H.DR_shdw[TA_MCI_POWER_UP];
  MciParallelModeT731H = MciConfigT731H.DR_shdw[TA_MCI_PARALLEL_MODE];
end // block: TAP_iMCICONFIG_DR
node [TA_MCI_CORE_CONNECT_WIDTH-1:0] h_coreconnectT731H;
node [TA_MCI_REF_CORE_SEL_WIDTH-1:0] h_refcorselT731H;
node [TA_MCI_OUT_SELECT_WIDTH-1:0] h_outselectT731H;
node InstPdoData1H745H;
`ifdef INST_ON
// Is the PDO data 0 when the MCICONFIG register is changed?
assign InstPdoData1H745H = (|(NHptPdoH745H[0].data)) |
(|(NHptPdoH745H[1].data)) |
(|(NHptPdoH745H[2].data)) | (|(NHptPdoH745H[3].data)) |
(|(UncMciPdoH745H.data));
`endif
`CUTMCP_HP1(h_coreconnectT731H, coreconnectT731H,
PfVttIntPwrGoodNnnnH, CkFreeH1N22,
`INST_ARG(InstPdoData1H745H), 30,
`INST_ARG(TLResetT731H))
`CUTMCP_HP1(h_outselectT731H,
outselectT731H,
PfVttIntPwrGoodNnnnH, CkFreeH1N22,
`INST_ARG(InstPdoData1H745H), 30,
`INST_ARG(TLResetT731H))
`CUTMCP_HP1(h_refcorselT731H,
refcoreselectT731H,
PfVttIntPwrGoodNnnnH, CkFreeH1N22,
`INST_ARG(InstPdoData1H745H), 30,
`INST_ARG(TLResetT731H))
///========================================
/// Serial/Concurrent Mode TAP control
///========================================
node [TA_MCI_CORE_CONNECT_WIDTH-1:0] h_corebypassT731H;
always_comb begin : TAP_SER_CON_CNTL
// initialize TapSteerOutT731L prior to using in the comb block
TapSteerOutT731L = '0;
corebypassT731H = ~coreconnectT731H ;
h_corebypassT731H = ~h_coreconnectT731H ;
// core 0 input is from the uncore tap.
if (corebypassT731H[0])
ptNHTdiT731H[0] = 1'b1;
else
ptNHTdiT731H[0] = ptuncTdoT731L;
// Is core0 input bypassed or not?
if (corebypassT731H[0] | TapModeT731H)
TapSteerOutT731L[0] = ptuncTdoT731L;
else
TapSteerOutT731L[0] = NHptTdoT731L[0];
  // Use generate loop to connect all core taps
  // MCI is designed for 4 cores even for the 1, 2, and 3 core parts.
  for (int core = 1; core < 4; core++)
  begin
    // core inputs.
    if (corebypassT731H[core])
      ptNHTdiT731H[core] = 1'b1;
    else
      ptNHTdiT731H[core] = TapSteerOutT731L[core-1];
    // Is core input bypassed or not?
    if (corebypassT731H[core] | TapModeT731H)
      TapSteerOutT731L[core] = TapSteerOutT731L[core-1];
    else
      TapSteerOutT731L[core] = NHptTdoT731L[core];
  end // for (int core = 1; core < 4; core++)
end // block: TAP_SER_CON_CNTL
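The steering block above daisy-chains the core TAPs and lets individual cores be skipped. A compact behavioural restatement may make the data flow easier to follow. The sketch is in Python; the function name and list-based arguments are illustrative and not taken from the design, and core 0 is folded into the same loop by treating the uncore TDO as the previous stage.

```python
def tap_steer(core_connect, tap_mode, unc_tdo, core_tdo):
    """Behavioural sketch of TAP_SER_CON_CNTL.

    core_connect : list of 0/1, one entry per core (1 = core TAP connected)
    tap_mode     : TapMode bit (1 makes every stage forward its input)
    unc_tdo      : TDO bit coming from the uncore TAP
    core_tdo     : list of TDO bits returned by the cores
    Returns (core_tdi, steer_out), one entry per core.
    """
    core_tdi, steer_out = [], []
    prev = unc_tdo  # core 0 is fed from the uncore TAP
    for connect, tdo in zip(core_connect, core_tdo):
        bypass = not connect
        # A bypassed core sees a constant 1 on its TDI.
        core_tdi.append(1 if bypass else prev)
        # A bypassed core, or any core when tap_mode is set, forwards its input.
        prev = prev if (bypass or tap_mode) else tdo
        steer_out.append(prev)
    return core_tdi, steer_out
```

With `tap_mode` at 0 and every core connected, each core receives the TDO of its predecessor, so the chain is serial. With `tap_mode` at 1, every connected core receives the uncore TDO directly.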
always_comb begin : TAP_COMPARE
// Compare each core tdo to the reference tdo
  // When a core is bypassed or disabled its match bit is driven high.
  // NHptTdoT731L is an unpacked vector so each core TDO will need
  // to be done individually.
  for (int i = 0; i < NHM_MAX_NUM_CORES; i++)
    TmatchT731L[i] = corebypassT731H[i] | (NHptTdoT731L[i] ~^ tdo_refT731L);
  // Generate a single match bit when all cores match the tdo reference.
TmatchAllT731L = &TmatchT731L;
end // block: TAP_COMPARE
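The compare logic reduces to an XNOR of each core TDO against the reference, with bypassed cores forced to "match". A minimal Python sketch of that rule follows; the function name and the list arguments are illustrative, and the reference core is passed as an index instead of being decoded from `refcoreselectT731H`.

```python
def tap_compare(core_bypass, core_tdo, ref_core):
    """Behavioural sketch of TAP_COMPARE.

    core_bypass : list of 0/1, 1 = core bypassed (its match bit is forced high)
    core_tdo    : list of TDO bits from the cores
    ref_core    : index of the core whose TDO is the reference
    Returns (tmatch, tmatch_all).
    """
    tdo_ref = core_tdo[ref_core]
    tmatch = [int(bool(byp) or tdo == tdo_ref)
              for byp, tdo in zip(core_bypass, core_tdo)]
    return tmatch, int(all(tmatch))
```

A single mismatching core pulls `tmatch_all` low unless that core is bypassed, which is what allows a failing core to be isolated while the remaining cores are still compared.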
always_comb begin : TDO_STEERING
unique casex (refcoreselectT731H)
2'b00 : tdo_refT731L = NHptTdoT731L[0];
2'b01 : tdo_refT731L = NHptTdoT731L[1];
2'b10 : tdo_refT731L = NHptTdoT731L[2];
2'b11 : tdo_refT731L = NHptTdoT731L[3];
default : tdo_refT731L = NHptTdoT731L[0];
endcase // casex(refcoreselectT731H)
// Mux to select between uncore tdo and tdo_ref
unique casex (outselectT731H[2])
1'b0 : tdo_unc_refT731L = ptuncTdoT731L;
1'b1 : tdo_unc_refT731L = tdo_refT731L;
endcase // casex(outselectT731H[2])
  // Mux to select between tmatchall, pmatchall, and tdo_unc_ref
  //
  unique casex (outselectT731H[1:0])
    2'bx1 : tdo_mci_outT731L = TmatchAllT731L;
    // 2'b10 : tdo_mci_outT731L = PMatchAllT731H;
    2'b10 : tdo_mci_outT731L = PMatchAllT732H;
    2'b00 : tdo_mci_outT731L = tdo_unc_refT731L;
  endcase // casex(outselectT731H[1:0])
  // Final tdo mux select between parallel or serial tap tdo output
  //
  unique casex (TapModeT731H)
    1'b1 : PtTdoT731L = tdo_mci_outT731L;
    1'b0 : PtTdoT731L = TapSteerOutT731L[3];
  endcase // casex(TapModeT731H)
end // block: TDO_STEERING
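The output selection in TDO_STEERING is a three-level mux controlled by `outselect` and `TapMode`. The Python sketch below restates the decode as read from the case statements; the function and argument names are illustrative, and `outselect` is passed as a 3-bit integer.

```python
def final_tdo_select(tap_mode, outselect, unc_tdo, tdo_ref,
                     tmatch_all, pmatch_all, steer_out_last):
    """Behavioural sketch of the TDO_STEERING output muxes (single-bit values)."""
    # outselect[2]: choose between the uncore TDO and the reference core TDO.
    tdo_unc_ref = tdo_ref if (outselect >> 2) & 1 else unc_tdo

    # outselect[1:0]: x1 -> TAP match flag, 10 -> parallel match flag, 00 -> TDO.
    low = outselect & 0b11
    if low & 0b01:
        tdo_mci_out = tmatch_all
    elif low == 0b10:
        tdo_mci_out = pmatch_all
    else:
        tdo_mci_out = tdo_unc_ref

    # TapMode: 1 -> MCI output mux, 0 -> end of the serial core chain.
    return tdo_mci_out if tap_mode else steer_out_last
```

For example, `final_tdo_select(1, 0b001, 0, 0, 1, 0, 0)` returns the TAP match flag, while any call with `tap_mode` at 0 returns the last stage of the steering chain regardless of `outselect`.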
///========================================
/// Test Parallel Data control
///========================================
`MAKE_CLK_RPH1(CkFreeH1N44, CkFreeH1N22, 1'b1, 1'b0, 1'b0, 1'b1,
1'b1, HclkLcpCtrlTnnnH)
`MAKE_CLK_LOC(CkFreeH1N66, CkFreeH1N44, 1'b1, 1'b1, 1'b0)
// Sync st_modet731h to HCLK (use custom sync cell for these two
// flops)
`METAFLOP_2(MciPwrUpH733H, MciPwrUpT731H, CkFreeH1N66)
// Make HCLKs Since we do not want to have
`MAKE_CLK_RPH1 (CkPdH1N44, CkFreeH1N22, MciPwrUpH733H, 1'b0, 1'b0,
1'b1, 1'b1, HclkLcpCtrlTnnnH)
`MAKE_CLK_LOC (CkMciPdiH1N66, CkPdH1N44, 1'b1, 1'b1, 1'b0)
`MAKE_CLK_LOC (CkMciPdoH1N66, CkPdH1N44, 1'b1, 1'b1, 1'b0)
// Clock generator
//
`MAKE_CLK_RPH1(CkPdC1N22, CkGridC1N00, MciPwrUpH733H, '0, '0,
1'b1, 1'b1, HclkLcpCtrlTnnnH)
`CLKBF (CkPdC1N44, CkPdC1N22)
///========================================
// st_mode global register assignments
///========================================
//
// Generate free running UCLK for power up synchronizer
`MAKE_CLK_RPH1(CkfreeU1N22, CkGridU1N00, 1'b1, 1'b0, 1'b0, 1'b1,
1'b1, UclkLcpCtrlTnnnH)
`MAKE_CLK_LOC(CkfreeU1N44, CkfreeU1N22, 1'b1, 1'b1, 1'b0)
`METAFLOP_2(MciPwrUpU733H, MciPwrUpT731H, CkfreeU1N44)
// Generate UW clock
`MAKE_DERIVED_UWCLK(CkMciUW1N44, CkGridU1N00, UncClkSyncUnn1H,
PwrGoodNnnnH_b,
1'b1, 1'b0, 1'b1, MciPwrUpU733H, 1'b0, 1'b0, 1'b1,
1'b1, UclkLcpCtrlTnnnH)
`CLKINV(CkMciUW1N55, CkMciUW1N44)
assign st_modeHXnnnH = CSptStModCrHXnnnH[0];
assign Cnt_OffsetHXnnnH = CSptStModCrHXnnnH[6:1];
`ASYNC_RST_MSFF(pttaMciBGFrunUWnnnL, CSptStModCrHXnnnH[7],
CkMciUW1N55, PwrGoodNnnnH_b)
assign Rx_Valid_MaskHXnnnH
= CSptStModCrHXnnnH[8];
assign Tx_Valid_MaskHXnnnH
= CSptStModCrHXnnnH[9];
assign Tx_Valid_LoopbackHXnnnH = CSptStModCrHXnnnH[10];
// Mci BGFrun is also st_modeHXnnnH
// moved to mci.vs level
// assign MciBGFrunHXnnnH = csptStModeCrHXnnnH[1];
///========================================
// Core and Uncore PDO Steering and compare
///========================================
// Create clocks enabled with the valid returning from the cores and
// uncore
//
`CLKBF(CkPdoCore0H1N66,   CkPdH1N44)
`CLKBF(CkPdoCore1H1N66,   CkPdH1N44)
`CLKBF(CkPdoCore2H1N66,   CkPdH1N44)
`CLKBF(CkPdoCore3H1N66,   CkPdH1N44)
`CLKBF(CkPdoUncCoreH1N66, CkPdH1N44)
// Data flops for parallel data returning from the cores and uncore.
//
assign PdoEnableH745H[0] = NHptPdoH745H[0].valid | (MciPwrUpH733H & ~st_modeHXnnnH);
assign PdoEnableH745H[1] = NHptPdoH745H[1].valid | (MciPwrUpH733H & ~st_modeHXnnnH);
assign PdoEnableH745H[2] = NHptPdoH745H[2].valid | (MciPwrUpH733H & ~st_modeHXnnnH);
assign PdoEnableH745H[3] = NHptPdoH745H[3].valid | (MciPwrUpH733H & ~st_modeHXnnnH);
assign uncPdoEnableH745H = UncMciPdoH745H.valid | (MciPwrUpH733H & ~st_modeHXnnnH);
`EN_MSFF (NHptPdoH746H[0], NHptPdoH745H[0].data, CkPdoCore0H1N66, PdoEnableH745H[0])
`EN_MSFF (NHptPdoH746H[1], NHptPdoH745H[1].data, CkPdoCore1H1N66, PdoEnableH745H[1])
`EN_MSFF (NHptPdoH746H[2], NHptPdoH745H[2].data, CkPdoCore2H1N66, PdoEnableH745H[2])
`EN_MSFF (NHptPdoH746H[3], NHptPdoH745H[3].data, CkPdoCore3H1N66, PdoEnableH745H[3])
`EN_MSFF (UncPdoH746H, UncMciPdoH745H.data, CkPdoUncCoreH1N66, uncPdoEnableH745H)
`MSFF(NHptPdoValH746H[0], NHptPdoH745H[0].valid, CkPdoCore0H1N66)
`MSFF(NHptPdoValH746H[1], NHptPdoH745H[1].valid, CkPdoCore1H1N66)
`MSFF(NHptPdoValH746H[2], NHptPdoH745H[2].valid, CkPdoCore2H1N66)
`MSFF(NHptPdoValH746H[3], NHptPdoH745H[3].valid, CkPdoCore3H1N66)
`MSFF(UncMciPdoValH746H , UncMciPdoH745H.valid, CkPdoUncCoreH1N66)
always_comb begin : MCI_PDO_COMP
// 4:1 mux of core pdo signals
unique casex (h_refcorselT731H)
2'b00 : begin
      PdoRefH746H = NHptPdoH746H[0];
      PdoValRefH746H = NHptPdoValH746H[0];
    end
    2'b01 : begin
      PdoRefH746H = NHptPdoH746H[1];
      PdoValRefH746H = NHptPdoValH746H[1];
    end
    2'b10 : begin
      PdoRefH746H = NHptPdoH746H[2];
      PdoValRefH746H = NHptPdoValH746H[2];
    end
    2'b11 : begin
      PdoRefH746H = NHptPdoH746H[3];
      PdoValRefH746H = NHptPdoValH746H[3];
    end
    default : begin
      PdoRefH746H = NHptPdoH746H[0];
      PdoValRefH746H = NHptPdoValH746H[0];
    end
  endcase // casex(h_refcorselT731H)
//===============================================================
// PDO Match logic
//===============================================================
for(int core = 0; core < NHM_MAX_NUM_CORES; core++)
begin
// Compare each core to core selected as reference
//
CBitMatchH746H_b[core] = NHptPdoH746H[core] ^ PdoRefH746H;
// Invert result
//
CBitMatchH746H[core] = ~(CBitMatchH746H_b[core]);
// Determine if all bits indicate a match and gate with core bypass
PMatchH746H[core] = h_corebypassT731H[core] |
(&CBitMatchH746H[core][MCI_CORE_PDO_WIDTH-1:0]);
end // for (int core = 0; core < NHM_MAX_NUM_CORES; core++)
// Do all bits match on all cores?
PMatchAllH746H = PMatchH746H[0] & PMatchH746H[1] & PMatchH746H[2]
& PMatchH746H[3];
//===============================================================
// Select the source for PDO to flitout
//===============================================================
// valid        outselect[2:0]   PdoFinalVal
//------------------------------------------------------
//              xx1              all zeros
//              x10              Pdo Reference valid
//              100              Pdo Reference valid
//              000              Pdo Uncore valid
unique casex ({h_outselectT731H, Tx_Valid_LoopbackHXnnnH})
4'bxxx1 : PdoValFinalH746H = csptPdiValidH783H;
4'bxx10 : PdoValFinalH746H = ptMciPdiH740H.valid;
4'bx100 : PdoValFinalH746H = PdoValRefH746H;
4'b1000 : PdoValFinalH746H = PdoValRefH746H;
4'b0000 : PdoValFinalH746H = UncMciPdoValH746H;
endcase // casex(outselectT731H)
// bits [14:0]  outselect[2:0]   PdoFinal[14:0]
//------------------------------------------------------
//              xx1              all zeros
//              x10              Pdo Reference [14:0]
//              100              Pdo Reference [14:0]
//              000              Pdo Uncore [14:0]
unique casex (h_outselectT731H)
  3'bxx1 : PdoFinalH746H[14:0] = '0;
  3'bx10 : PdoFinalH746H[14:0] = PdoRefH746H[14:0];
  3'b100 : PdoFinalH746H[14:0] = PdoRefH746H[14:0];
  3'b000 : PdoFinalH746H[14:0] = UncPdoH746H[14:0];
endcase // casex(h_outselectT731H)
// Bit 15       outselect[1:0]   PdoFinal[15]
//------------------------------------------------------
//              xx1              uncore tdo
//              x10              Pdo Reference [15]
//              100              Pdo Reference [15]
//              000              Pdo Uncore [15]
unique casex (h_outselectT731H)
3'bxx1 : PdoFinalH746H[15] = ptuncTdoH733H;
3'bx10 : PdoFinalH746H[15] = PdoRefH746H[15];
3'b100 : PdoFinalH746H[15] = PdoRefH746H[15];
3'b000 : PdoFinalH746H[15] = UncPdoH746H[15];
endcase // casex(outselectT731H)
// bits [19:16] outselect[1:0]   PdoFinal[19:16]
//------------------------------------------------------
//              xx1              Tdo from cores[3:0]
//              x10              Pmatch core [3:0]
//              100              all zeros
//              000              Pdo Uncore [19:16]
unique casex (h_outselectT731H[2:0])
3'bxx1 : begin
PdoFinalH746H[19] = NHptTdoH733H[3];
PdoFinalH746H[18] = NHptTdoH733H[2];
PdoFinalH746H[17] = NHptTdoH733H[1];
PdoFinalH746H[16] = NHptTdoH733H[0];
end
3'bx10 : begin
PdoFinalH746H[19] = PMatchH746H[3];
PdoFinalH746H[18] = PMatchH746H[2];
PdoFinalH746H[17] = PMatchH746H[1];
PdoFinalH746H[16] = PMatchH746H[0];
end
3'b100 : PdoFinalH746H[19:16] = '0;
3'b000 : PdoFinalH746H[19:16] = '0;
endcase // casex(outselectT731H[2:0])
end // block: MCI_PDO_COMP
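The parallel-data compare in MCI_PDO_COMP follows the same pattern as the TAP compare, applied to a whole PDO word per core. The Python sketch below models only the match logic (XOR against the reference, invert, AND-reduce, gate with bypass). The function name is illustrative, and the default `width` of 20 bits is an assumption, since the value of `MCI_CORE_PDO_WIDTH` is not given in the listing.

```python
def pdo_compare(core_pdo, core_bypass, ref_core, width=20):
    """Behavioural sketch of the PDO match logic.

    core_pdo    : list of integers, one parallel data word per core
    core_bypass : list of 0/1, 1 = core bypassed (match forced high)
    ref_core    : index of the reference core
    width       : PDO word width (assumed; MCI_CORE_PDO_WIDTH in the RTL)
    Returns (pmatch, pmatch_all).
    """
    mask = (1 << width) - 1
    ref = core_pdo[ref_core] & mask
    pmatch = []
    for pdo, byp in zip(core_pdo, core_bypass):
        mismatch_bits = (pdo ^ ref) & mask          # CBitMatch_b: 1 where bits differ
        all_bits_match = mismatch_bits == 0         # AND-reduction of the inverted bits
        pmatch.append(int(bool(byp) or all_bits_match))
    return pmatch, int(all(pmatch))
```

Because every core is compared against one selected reference, a single pass reports which cores disagree (`pmatch`) as well as the overall result (`pmatch_all`).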
// Pass tdo from the uncore and the cores through metastable flops prior to
// sending them out to CSI
`METAFLOP_2(ptuncTdoH733H,   ptuncTdoT731L,   CkPdC1N44)
`METAFLOP_2(NHptTdoH733H[0], NHptTdoT731L[0], CkPdC1N44)
`METAFLOP_2(NHptTdoH733H[1], NHptTdoT731L[1], CkPdC1N44)
`METAFLOP_2(NHptTdoH733H[2], NHptTdoT731L[2], CkPdC1N44)
`METAFLOP_2(NHptTdoH733H[3], NHptTdoT731L[3], CkPdC1N44)
// flop final parallel data prior to sending it to flit-out
node PTMciVldOutUnmaskedH770H;
`MSFF (ptcsPdoH747H, PdoFinalH746H, CkMciPdoH1N66)
`MSFF (PTMciVldOutUnmaskedH770H, PdoValFinalH746H, CkMciPdoH1N66)
assign PTMciVldOutH770H = ~Tx_Valid_MaskHXnnnH & PTMciVldOutUnmaskedH770H;
// Flop for PMatch to TDO.
`ASYNC_RST_MSFF (PMatchAllT731H, 1'b1, CkTapT1N22, ~PMatchAllH746H)
// Flop to prevent pulse evaporation on mismatches close to the
// rising edge of TCLK
`MSFF (PMatchAllT732H, PMatchAllT731H, CkTapT1N22)
Dualcore System.vs
module system_dualcore(
);
tri xxtck ;
tri xxtdi0 ;
tri xxtdi1 ;
tri xxtdo0 ;
tri xxtdo1 ;
tri xxtms ;
tri xxtrst_b ;
xunit xunit(
  .xxtck    (xxtck ),
  .xxtdi0   (xxtdi0 ),
  .xxtdi1   (xxtdi1 ),
  .xxtms    (xxtms ),
  .xxtrst_b (xxtrst_b ),
  .xxtdo0   (xxtdo0 ),
  .xxtdo1   (xxtdo1 )
);
endmodule
Quadcore System.vs
module system_quadcore(
);
tri xxtck ;
tri xxtdi0 ;
tri xxtdi1 ;
tri xxtdi2 ;
tri xxtdi3 ;
tri xxtdo0 ;
tri xxtdo1 ;
tri xxtdo2 ;
tri xxtdo3 ;
tri xxtms ;
tri xxtrst_b ;
xunit xunit(
  .xxtck    (xxtck ),
  .xxtdi0   (xxtdi0 ),
  .xxtdi1   (xxtdi1 ),
  .xxtms    (xxtms ),
  .xxtrst_b (xxtrst_b ),
  .xxtdo0   (xxtdo0 ),
  .xxtdo1   (xxtdo1 ),
  .xxtdi2   (xxtdi2 ),
  .xxtdi3   (xxtdi3 ),
  .xxtdo2   (xxtdo2 ),
  .xxtdo3   (xxtdo3 )
);
endmodule
APPENDICES B
MACE Code for TAP Init
# Start Mace and Tap initialization routines
##########################################
# All Mace and Tap related initialization routines go here.
# -------------------------------------------------------
# Include Mace M4 macro definitions, use predefined macros
# -------------------------------------------------------
m4_include(Mace.h4);
# -------------------------------------------
# Need some defines
# -------------------------------------------
package def;
# --------------------------------------------------
# TAP.mace related defines are in the below def file
# --------------------------------------------------
require "sys_resolved_defs.pl";
# -------------------------------------------------------------
# Array of Streams. All the streams in the test need to be
# defined in this array.
# -------------------------------------------------------------
@Mace::STREAMS = (\&taptest1);
# Stream Subroutine
# --------------------------------------------------
sub taptest1 {
_start_stream("taptest1");
# -------------------------------------------
# Call the init subroutine to install tap sigs
# -------------------------------------------
&Tap::init();
&sys::init_clocks();
# -----------------------------------------------------
# Initialize TAP and put TAP FSM in Run-Test/Idle state
# -----------------------------------------------------
&Mace::inform("Initializing the TAP\n");
my $tapinit = &Tap::init_tap();
WAIT_UNTIL($tapinit->is_complete);
&Mace::inform("Finished TAP initialization\n");
########################################
# END OF Mace and Tap initialization
########################################
IDCODE test (For sanity check)
#!/usr/intel/97r1.3/bin/perl -w
# Include Mace M4 macro definitions, use predefined macros
#
m4_include(Mace.h4);
#
# Need some defines from the fsb.def file
#
package def;
require "sys_resolved_defs.pl";
$dbg_all = 0;
$dbg_all = $ENV{DEBUG_ALL} if (defined $ENV{DEBUG_ALL});
@Mace::STREAMS = (\&taptest1);
sub taptest1 {
_start_stream("taptest1");
# -------------------------------------------
# Call the init subroutine to install tap sigs
# -------------------------------------------
&Tap::init();
&sys::init_clocks();
# --------------------------------------------
# Initialize TAP and put in Run-Test/Idle state
# --------------------------------------------
my $tapratio = 8;
if (exists $ENV{CSIM_CLK_RATIO}) {
$tapratio = $ENV{CSIM_CLK_RATIO};
} elsif (exists $ENV{CLK_RATIO}){
$tapratio = $ENV{CLK_RATIO};
}
if (exists $ENV{CSIM_TAP_BCLK_RATIO}) {
$tapratio = $tapratio * $ENV{CSIM_TAP_BCLK_RATIO};
} else {
$tapratio = $tapratio * 1;
}
&Mace::inform("Initializing the TAP\n");
my $tapinit = &Tap::init_tap();
WAIT_UNTIL($tapinit->is_complete);
&Mace::inform("Finished TAP initialization\n");
# --------------------------------------------
# Run the IR emacro
# --------------------------------------------
&Mace::inform("Shift IDCODE into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x02);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("IDCODE is shifted in\n");
WAIT(5);
# Checking the instruction signal here
$idcode_instr = &Mace::Signal::new('/system/tap/iidcode');
if ($idcode_instr->val != 1)
{
&Mace::error("IDCODE instruction is not shifted-in to the IR\n");
_exit_stream();
}
# ----------------------------------------
# Run the DR emacro
# ----------------------------------------
$dr = BitVec::new(32, "10101010101010101010101010101011");
$E1 = &Tap::dr (NUM_DATA_OUT => 32, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
&Mace::inform("IDCODE: %s\n", $E1->{DATA}->to_hex);
# ----------------------------------------
# Check the IDCODE
# ----------------------------------------
$IDCODE = BitVec::new(32, "0xAAAAAAAB");
if ($E1->{DATA} != $IDCODE )
{
&Mace::error("IDCODE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $IDCODE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("IDCODE matched!\n");
WAIT(5);
_exit_stream();
}
1;
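The test above relies on the TAP being in Run-Test/Idle before the IR and DR scans are issued. The internals of `Tap::init_tap`, `Tap::ir` and `Tap::dr` are not shown in this appendix, so the following Python sketch only illustrates the standard IEEE 1149.1 TAP controller state walk that such routines would normally drive on TMS. The helper names and the 8-bit IR length used in the usage note are assumptions.

```python
# IEEE 1149.1 TAP controller: next state for TMS = 0 and TMS = 1.
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def walk(state, tms_bits):
    """Apply a sequence of TMS values (one per TCK) and return the final state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


def ir_scan_tms(num_bits):
    """TMS sequence for an IR scan that starts and ends in Run-Test/Idle."""
    # Enter Shift-IR, stay for all but the last bit, exit on the last bit,
    # pass through Update-IR and return to Run-Test/Idle.
    return [1, 1, 0, 0] + [0] * (num_bits - 1) + [1, 1, 0]


def dr_scan_tms(num_bits):
    """TMS sequence for a DR scan that starts and ends in Run-Test/Idle."""
    return [1, 0, 0] + [0] * (num_bits - 1) + [1, 1, 0]
```

As a usage example, `walk("Run-Test/Idle", ir_scan_tms(8))` returns `"Run-Test/Idle"` for a hypothetical 8-bit instruction register, and `dr_scan_tms(32)` gives the TMS pattern for shifting a 32-bit data register such as the one read in the IDCODE check. Holding TMS high for five clocks reaches Test-Logic-Reset from any state.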
Multicore Interface test for Dualcore Serial Mode Configuration
#!/usr/intel/97r1.3/bin/perl -w
# Include Mace M4 macro definitions, use predefined macros
#
m4_include(Mace.h4);
#
# Need some defines from the fsb.def file
#
package def;
require "sys_resolved_defs.pl";
$dbg_all = 0;
$dbg_all = $ENV{DEBUG_ALL} if (defined $ENV{DEBUG_ALL});
@Mace::STREAMS = (\&taptest1);
sub taptest1 {
_start_stream("taptest1");
# -------------------------------------------
# Call the init subroutine to install tap sigs
# -------------------------------------------
&Tap::init();
&sys::init_clocks();
# Commented out the code because TAP can't start until
# $dfx_reset is off.
#WAIT_UNTIL($Tap::xxreset_bar->val);
#&Mace::inform("Reset signal (active low) has been deasserted\n");
# --------------------------------------------
# Initialize TAP and put in Run-Test/Idle state
# --------------------------------------------
my $tapratio = 8;
if (exists $ENV{CSIM_CLK_RATIO}) {
$tapratio = $ENV{CSIM_CLK_RATIO};
} elsif (exists $ENV{CLK_RATIO}){
$tapratio = $ENV{CLK_RATIO};
}
if (exists $ENV{CSIM_TAP_BCLK_RATIO}) {
$tapratio = $tapratio * $ENV{CSIM_TAP_BCLK_RATIO};
} else {
$tapratio = $tapratio * 1;
}
&Mace::inform("Initializing the TAP\n");
my $tapinit = &Tap::init_tap();
WAIT_UNTIL($tapinit->is_complete);
&Mace::inform("Finished TAP initialization\n");
# --------------------------------------------
# Run the IR emacro for MCIMODE
# --------------------------------------------
&Mace::inform("Shift MCIMODE into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x10);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MCIMODE instruction is shifted in\n");
WAIT(5);
# Checking the instruction signal here
$mcimode_instr = &Mace::Signal::new('/system/tap/imcimode');
if ($mcimode_instr->val != 1)
{
&Mace::error("MCIMODE instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MCIMODE, "1" = Serial mode
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the mcimode control signals here
$mcimode_sig0 = &Mace::Signal::new('/system/tap/MciModeT731H');
WAIT(5);
## Checking the result here
if ($mcimode_sig0->val != 1)
{
&Mace::error("MCIMODE signal value %s does not match expected value %s", $mcimode_sig0->val, "1");
}
else
{
&Mace::inform("MCIMODE control signal matched!!!\n");
}
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "01" = bypass Core0 TAP
# -----------------------------------------------
$dr = &BitVec::new(2, "01");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 01)
{
&Mace::error("CORECONNECT signal value %s does not match expected value %s", $coreconnect_sig0->val, "01");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s", $MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000EFFD");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "10" = bypass Core1 TAP
# -----------------------------------------------
$dr = &BitVec::new(2, "10");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 10)
{
&Mace::error("CORECONNECT signal value %s does not match expected value %s", $coreconnect_sig0->val, "10");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s", $MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000EFEB");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
_exit_stream();
}
1;
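The CoreConnect checks in this listing compare the signal value against bare numeric literals such as `01` and `10`, and the quad-core listing uses `0100` and `1000`. How `Mace::Signal::val` represents a multi-bit value is not shown in this appendix, so the following is only a sketch of how plain Perl itself reads those literals: a leading zero marks an octal number, and a literal without a prefix is decimal. The helper `bits_to_int` is hypothetical and simply converts a bit string to an integer with the core `oct` function.

```perl
use strict;
use warnings;

# How Perl reads the bare literals used in the CoreConnect comparisons.
my %literal = (
    '01'   => 01,      # leading zero: octal, value 1
    '10'   => 10,      # no prefix: decimal ten
    '0100' => 0100,    # leading zero: octal, value 64
    '1000' => 1000,    # no prefix: decimal one thousand
);

# Hypothetical helper (not part of Mace): bit string to integer.
sub bits_to_int {
    my ($bits) = @_;
    return oct("0b$bits");
}

for my $bits (sort keys %literal) {
    printf "%-4s  as literal: %-4d  as binary: %d\n",
        $bits, $literal{$bits}, bits_to_int($bits);
}
```

If the signal value is returned as an integer, a binary literal such as `0b10` states the intended pattern directly; if it is returned as a bit string, a string comparison avoids the numeric reading altogether. Which of the two applies depends on the Mace API and is left open here.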
Multicore Interface test for Dualcore Parallel Mode Configuration
#!/usr/intel/97r1.3/bin/perl -w
# Include Mace M4 macro definitions, use predefined macros
#
m4_include(Mace.h4);
#
# Need some defines from the fsb.def file
#
package def;
require "sys_resolved_defs.pl";
$dbg_all = 0;
$dbg_all = $ENV{DEBUG_ALL} if (defined $ENV{DEBUG_ALL});
@Mace::STREAMS = (\&taptest1);
sub taptest1 {
_start_stream("taptest1");
# -------------------------------------------
# Call the init subroutine to install tap sigs
# -------------------------------------------
&Tap::init();
&sys::init_clocks();
# Commented out the code because TAP can't start until
# $dfx_reset is off.
#WAIT_UNTIL($Tap::xxreset_bar->val);
#&Mace::inform("Reset signal (active low) has been deasserted\n");
# --------------------------------------------
# Initialize TAP and put in Run-Test/Idle state
# --------------------------------------------
my $tapratio = 8;
if (exists $ENV{CSIM_CLK_RATIO}) {
$tapratio = $ENV{CSIM_CLK_RATIO};
} elsif (exists $ENV{CLK_RATIO}){
$tapratio = $ENV{CLK_RATIO};
}
if (exists $ENV{CSIM_TAP_BCLK_RATIO}) {
$tapratio = $tapratio * $ENV{CSIM_TAP_BCLK_RATIO};
} else {
$tapratio = $tapratio * 1;
}
&Mace::inform("Initializing the TAP\n");
my $tapinit = &Tap::init_tap();
WAIT_UNTIL($tapinit->is_complete);
&Mace::inform("Finished TAP initialization\n");
# --------------------------------------------
# Run the IR emacro for MCIMODE
# --------------------------------------------
&Mace::inform("Shift MCIMODE into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x10);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MCIMODE instruction is shifted in\n");
WAIT(5);
# Checking the instruction signal here
$mcimode_instr = &Mace::Signal::new('/system/tap/imcimode');
if ($mcimode_instr->val != 1)
{
&Mace::error("MCIMODE instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MCIMODE, "0" = Parallel mode
# -----------------------------------------------
$dr = &BitVec::new(1, "0");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the mcimode control signals here
$mcimode_sig0 = &Mace::Signal::new('/system/tap/MciModeT731H');
WAIT(5);
## Checking the result here
if ($mcimode_sig0->val != 0)
{
&Mace::error("MCIMODE signal value %s does not match expected value %s", $mcimode_sig0->val, "0");
}
else
{
&Mace::inform("MCIMODE control signal matched!!!\n");
}
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "01" = bypass Core0 TAP
# -----------------------------------------------
$dr = &BitVec::new(2, "01");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 01)
{
&Mace::error("CORECONNECT signal value %s does not match expected value %s", $coreconnect_sig0->val, "01");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for Coreselect
# --------------------------------------------
&Mace::inform("Shift COREselect into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("COREselect instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreselect_instr =
&Mace::Signal::new('/system/tap/icoreselect');
if ($coreselect_instr->val != 1)
{
&Mace::error("COREselect instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for Coreselect, "1" = select Core1
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreselect control signals here
$coreselect_sig0 =
&Mace::Signal::new('/system/tap/CoreselectT731H');
WAIT(1);
## Checking the result here
if ($coreselect_sig0->val != 1)
{
&Mace::error("COREselect signal value %s does not match expected value %s", $coreselect_sig0->val, "1");
}
else
{
&Mace::inform("COREselect control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s", $MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000EFFD");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "10" = bypass Core1 TAP
# -----------------------------------------------
$dr = &BitVec::new(2, "10");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 10)
{
&Mace::error("CORECONNECT signal value %s does not match expected value %s", $coreconnect_sig0->val, "10");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for Coreselect
# --------------------------------------------
&Mace::inform("Shift COREselect into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("COREselect instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreselect_instr =
&Mace::Signal::new('/system/tap/icoreselect');
if ($coreselect_instr->val != 1)
{
&Mace::error("COREselect instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for Coreselect, "0" = select Core0
# -----------------------------------------------
$dr = &BitVec::new(1, "0");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreselect control signals here
$coreselect_sig0 =
&Mace::Signal::new('/system/tap/CoreselectT731H');
WAIT(1);
## Checking the result here
if ($coreselect_sig0->val != 0)
{
&Mace::error("COREselect signal value %s does not match expected value %s", $coreselect_sig0->val, "0");
}
else
{
&Mace::inform("COREselect control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s", $MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000EFEB");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
_exit_stream();
}
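Each step in the listing above repeats the same pattern: shift an opcode into the instruction register, shift a bit pattern into the data register, then compare a control signal against the expected value. One way to keep such repeated steps consistent is to describe them in a table and build the messages from the table rows. The sketch below is plain Perl and does not call Mace or Tap; the names `@steps`, `mismatch_message` and `width_ok` are hypothetical. The opcodes and bit patterns are copied from the parallel-mode listing, and the Coreselect step is left out on purpose.

```perl
use strict;
use warnings;

# One row per TAP step: instruction name, IR opcode, DR width, DR bits.
# Values are copied from the parallel-mode listing; the table is illustrative.
my @steps = (
    { name => 'MCIMODE',     opcode => 0x10, width => 1, bits => '0'  },
    { name => 'CORECONNECT', opcode => 0x11, width => 2, bits => '01' },
    { name => 'MISREN',      opcode => 0x13, width => 1, bits => '1'  },
    { name => 'MsgBusRd',    opcode => 0x14, width => 1, bits => '1'  },
);

# Hypothetical formatter: the message is built from the row itself, so the
# reported name and expected value always belong to the step being checked.
sub mismatch_message {
    my ($step, $actual) = @_;
    return sprintf("%s signal value %s does not match expected value %s",
                   $step->{name}, $actual, $step->{bits});
}

# Sanity check: the DR bit string must have exactly the declared width.
sub width_ok {
    my ($step) = @_;
    return length($step->{bits}) == $step->{width};
}

for my $step (@steps) {
    die "bad width for $step->{name}\n" unless width_ok($step);
    printf "IR 0x%02X (%s), DR %d'b%s\n",
        $step->{opcode}, $step->{name}, $step->{width}, $step->{bits};
}
```

In a real test the loop body would issue the IR and DR operations for each row; here it only prints the plan, so the sketch can be run without the simulation environment.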
Multicore Interface test for Quadcore Serial Mode Configuration
#!/usr/intel/97r1.3/bin/perl -w
# Include Mace M4 macro definitions, use predefined macros
#
m4_include(Mace.h4);
#
# Need some defines from the fsb.def file
#
package def;
require "sys_resolved_defs.pl";
$dbg_all = 0;
$dbg_all = $ENV{DEBUG_ALL} if (defined $ENV{DEBUG_ALL});
@Mace::STREAMS = (\&taptest1);
sub taptest1 {
_start_stream("taptest1");
# -------------------------------------------
# Call the init subroutine to install tap sigs
# -------------------------------------------
&Tap::init();
&sys::init_clocks();
# Commented out the code because TAP can't start until
# $dfx_reset is off.
#WAIT_UNTIL($Tap::xxreset_bar->val);
#&Mace::inform("Reset signal (active low) has been deasserted\n");
# --------------------------------------------
# Initialize TAP and put in Run-Test/Idle state
# --------------------------------------------
my $tapratio = 8;
if (exists $ENV{CSIM_CLK_RATIO}) {
$tapratio = $ENV{CSIM_CLK_RATIO};
} elsif (exists $ENV{CLK_RATIO}){
$tapratio = $ENV{CLK_RATIO};
}
if (exists $ENV{CSIM_TAP_BCLK_RATIO}) {
$tapratio = $tapratio * $ENV{CSIM_TAP_BCLK_RATIO};
} else {
$tapratio = $tapratio * 1;
}
&Mace::inform("Initializing the TAP\n");
my $tapinit = &Tap::init_tap();
WAIT_UNTIL($tapinit->is_complete);
&Mace::inform("Finished TAP initialization\n");
# --------------------------------------------
# Run the IR emacro for MCIMODE
# --------------------------------------------
&Mace::inform("Shift MCIMODE into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x10);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MCIMODE instruction is shifted in\n");
WAIT(5);
# Checking the instruction signal here
$mcimode_instr = &Mace::Signal::new('/system/tap/imcimode');
if ($mcimode_instr->val != 1)
{
&Mace::error("MCIMODE instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MCIMODE, "1" = Serial mode
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the mcimode control signals here
$mcimode_sig0 = &Mace::Signal::new('/system/tap/MciModeT731H');
WAIT(5);
## Checking the result here
if ($mcimode_sig0->val != 1)
{
&Mace::error("MCIMODE signal value %s does not match expected value %s", $mcimode_sig0->val, "1");
}
else
{
&Mace::inform("MCIMODE control signal matched!!!\n");
}
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "0100" = bypass Core0,2,3 TAP
# -----------------------------------------------
$dr = &BitVec::new(4, "0100");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 0100)
{
&Mace::error("CORECONNECT signal value %s does not match expected value %s", $coreconnect_sig0->val, "0100");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s", $MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000EFFD");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "1000" = bypass Core1,2,3 TAP
# -----------------------------------------------
$dr = &BitVec::new(4, "1000");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 1000)
{
&Mace::error("CORECONNECT signal value %s does not match expected value %s", $coreconnect_sig0->val, "1000");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s",
$MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000EFEB");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to
the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "0010" = bypass Core0,1,3 TAP
# -----------------------------------------------
$dr = &BitVec::new(4, "0010");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 0010)
{
&Mace::error("CORECONNECT signal value %s does not match
expected value %s", $coreconnect_sig0->val, "0010");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected
value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s",
$MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000FDFE");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to
the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "0001" = bypass Core0,1,2 TAP
# -----------------------------------------------
$dr = &BitVec::new(4, "0001");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 0001)
{
&Mace::error("CORECONNECT signal value %s does not match
expected value %s", $coreconnect_sig0->val, "0001");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected
value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s",
$MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000FEAF");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
_exit_stream();
}
1;
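A note on the literal comparisons used in this listing: checks such as `$coreconnect_sig0->val != 0010` compare against an unquoted numeric literal, and Perl reads a numeric literal with a leading zero as octal, so `0010` is the number 8 and `1000` is one thousand rather than a 4-bit pattern. What `->val` returns is not shown in the listing, so the sketch below only illustrates the difference between the intended bit pattern and the literal's numeric value. It is written in Python for easy standalone checking; the helper names `parse_bits` and `perl_numeric_literal` are hypothetical and are not part of Mace.

```python
# Sketch only: contrasts the intended 4-bit pattern with the value Perl
# would assign to the same digits written as an unquoted numeric literal.
# Both helper names are hypothetical; neither belongs to the Mace framework.

def parse_bits(bits: str) -> int:
    """Interpret a string of '0'/'1' characters as an unsigned binary value."""
    if not bits or any(ch not in "01" for ch in bits):
        raise ValueError(f"not a bit string: {bits!r}")
    return int(bits, 2)


def perl_numeric_literal(text: str) -> int:
    """Numeric value of a plain digit literal under Perl's rules:
    a leading zero means octal, otherwise the digits are decimal."""
    if not text.isdigit():
        raise ValueError(f"not a plain digit literal: {text!r}")
    if len(text) > 1 and text[0] == "0":
        return int(text, 8)
    return int(text, 10)


# CoreConnect patterns that appear in the listing.
PATTERNS = ["1000", "0100", "0010", "0001"]


def compare(patterns=PATTERNS):
    """Return (pattern, value as bits, value as Perl literal) for each pattern."""
    return [(p, parse_bits(p), perl_numeric_literal(p)) for p in patterns]


if __name__ == "__main__":
    for pattern, as_bits, as_literal in compare():
        print(f"{pattern}: as bits = {as_bits}, as Perl literal = {as_literal}")
```

If the signal value is delivered as a bit string, a string comparison against a quoted pattern, or a conversion like `parse_bits` on both sides, would avoid depending on how the literal is parsed.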
Multicore Interface test for Quadcore Parallel Mode Configuration
#!/usr/intel/97r1.3/bin/perl -w
# Include Mace M4 macro definitions, use predefined macros
#
m4_include(Mace.h4);
#
# Need some defines from the fsb.def file
#
package def;
require "sys_resolved_defs.pl";
$dbg_all = 0;
$dbg_all = $ENV{DEBUG_ALL} if (defined $ENV{DEBUG_ALL});
@Mace::STREAMS = (\&taptest1);
sub taptest1 {
_start_stream("taptest1");
# -------------------------------------------
# Call the init subroutine to install tap sigs
# -------------------------------------------
&Tap::init();
&sys::init_clocks();
# Commented out the code because TAP can't start until
# $dfx_reset is off.
#WAIT_UNTIL($Tap::xxreset_bar->val);
#&Mace::inform("Reset signal (active low) has been deasserted\n");
# --------------------------------------------
# Initialize TAP and put in Run-Test/Idle state
# --------------------------------------------
my $tapratio = 8;
if (exists $ENV{CSIM_CLK_RATIO}) {
$tapratio = $ENV{CSIM_CLK_RATIO};
} elsif (exists $ENV{CLK_RATIO}){
$tapratio = $ENV{CLK_RATIO};
}
if (exists $ENV{CSIM_TAP_BCLK_RATIO}) {
$tapratio = $tapratio * $ENV{CSIM_TAP_BCLK_RATIO};
} else {
$tapratio = $tapratio * 1;
}
&Mace::inform("Initializing the TAP\n");
my $tapinit = &Tap::init_tap();
WAIT_UNTIL($tapinit->is_complete);
&Mace::inform("Finished TAP initialization\n");
# --------------------------------------------
# Run the IR emacro for MCIMODE
# --------------------------------------------
&Mace::inform("Shift MCIMODE into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x10);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MCIMODE instruction is shifted in\n");
WAIT(5);
# Checking the instruction signal here
$mcimode_instr = &Mace::Signal::new('/system/tap/imcimode');
if ($mcimode_instr->val != 1)
{
&Mace::error("MCIMODE instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MCIMODE, "0" = Parallel mode
# -----------------------------------------------
$dr = &BitVec::new(1, "0");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the mcimode control signals here
$mcimode_sig0 = &Mace::Signal::new('/system/tap/MciModeT731H');
WAIT(5);
## Checking the result here
if ($mcimode_sig0->val != 0)
{
&Mace::error("MCIMODE signal value %s does not match expected
value %s", $mcimode_sig0->val, "0");
}
else
{
&Mace::inform("MCIMODE control signal matched!!!\n");
}
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to
the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "0100" = bypass Core0,2,3 TAP
# -----------------------------------------------
$dr = &BitVec::new(4, "0100");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 0100)
{
&Mace::error("CORECONNECT signal value %s does not match
expected value %s", $coreconnect_sig0->val, "0100");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for Coreselect
# --------------------------------------------
&Mace::inform("Shift COREselect into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("COREselect instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreselect_instr =
&Mace::Signal::new('/system/tap/icoreselect');
if ($coreselect_instr->val != 1)
{
&Mace::error("COREselect instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for Coreselect, "00" = select Core0
# -----------------------------------------------
$dr = &BitVec::new(2, "00");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreselect control signals here
$coreselect_sig0 =
&Mace::Signal::new('/system/tap/CoreselectT731H');
WAIT(1);
## Checking the result here
if ($coreselect_sig0->val != 00)
{
&Mace::error("COREselect signal value %s does not match
expected value %s", $coreselect_sig0->val, "00");
}
else
{
&Mace::inform("COREselect control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected
value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s",
$MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000EFEB");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to
the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "1000" = bypass Core1,2,3 TAP
# -----------------------------------------------
$dr = &BitVec::new(4, "1000");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 1000)
{
&Mace::error("CORECONNECT signal value %s does not match
expected value %s", $coreconnect_sig0->val, "1000");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for Coreselect
# --------------------------------------------
&Mace::inform("Shift COREselect into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("COREselect instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreselect_instr =
&Mace::Signal::new('/system/tap/icoreselect');
if ($coreselect_instr->val != 1)
{
&Mace::error("COREselect instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for Coreselect, "01" = select Core1
# -----------------------------------------------
$dr = &BitVec::new(2, "01");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreselect control signals here
$coreselect_sig0 =
&Mace::Signal::new('/system/tap/CoreselectT731H');
WAIT(1);
## Checking the result here
if ($coreselect_sig0->val != 01)
{
&Mace::error("COREselect signal value %s does not match
expected value %s", $coreselect_sig0->val, "01");
}
else
{
&Mace::inform("COREselect control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected
value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s",
$MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000EFFD");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to
the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "0010" = bypass Core0,1,3 TAP
# -----------------------------------------------
$dr = &BitVec::new(4, "0010");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 0010)
{
&Mace::error("CORECONNECT signal value %s does not match
expected value %s", $coreconnect_sig0->val, "0010");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for Coreselect
# --------------------------------------------
&Mace::inform("Shift COREselect into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("COREselect instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreselect_instr =
&Mace::Signal::new('/system/tap/icoreselect');
if ($coreselect_instr->val != 1)
{
&Mace::error("COREselect instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for Coreselect, "10" = select Core2
# -----------------------------------------------
$dr = &BitVec::new(2, "10");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreselect control signals here
$coreselect_sig0 =
&Mace::Signal::new('/system/tap/CoreselectT731H');
WAIT(1);
## Checking the result here
if ($coreselect_sig0->val != 10)
{
&Mace::error("COREselect signal value %s does not match
expected value %s", $coreselect_sig0->val, "10");
}
else
{
&Mace::inform("COREselect control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected
value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s",
$MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000FDFE");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n", $E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for CoreConnect
# --------------------------------------------
&Mace::inform("Shift CORECONNECT into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("CORECONNECT instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreconnect_instr =
&Mace::Signal::new('/system/tap/icoreconnect');
if ($coreconnect_instr->val != 1)
{
&Mace::error("CORECONNECT instruction is not shifted-in to
the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for CoreConnect, "0001" = bypass Core0,1,2 TAP
# -----------------------------------------------
$dr = &BitVec::new(4, "0001");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreconnect control signals here
$coreconnect_sig0 =
&Mace::Signal::new('/system/tap/CoreConnectT731H');
WAIT(1);
## Checking the result here
if ($coreconnect_sig0->val != 0001)
{
&Mace::error("CORECONNECT signal value %s does not match
expected value %s", $coreconnect_sig0->val, "0001");
}
else
{
&Mace::inform("CORECONNECT control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for Coreselect
# --------------------------------------------
&Mace::inform("Shift COREselect into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x11);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("COREselect instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$coreselect_instr =
&Mace::Signal::new('/system/tap/icoreselect');
if ($coreselect_instr->val != 1)
{
&Mace::error("COREselect instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for Coreselect, "11" = select Core3
# -----------------------------------------------
$dr = &BitVec::new(2, "11");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the coreselect control signals here
$coreselect_sig0 =
&Mace::Signal::new('/system/tap/CoreselectT731H');
WAIT(1);
## Checking the result here
if ($coreselect_sig0->val != 11)
{
&Mace::error("COREselect signal value %s does not match
expected value %s", $coreselect_sig0->val, "11");
}
else
{
&Mace::inform("COREselect control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MISREN
# --------------------------------------------
&Mace::inform("Shift MISREN into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x13);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MISREN instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$misren_instr = &Mace::Signal::new('/system/tap/iMISREn');
if ($misren_instr->val != 1)
{
&Mace::error("MISREN instruction is not shifted-in to the
IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MISREN, "1" = Enable MISR
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the misren control signals here
$misren_sig0 = &Mace::Signal::new('/system/tap/misrenT731H');
WAIT(1);
## Checking the result here
if ($misren_sig0->val != 1)
{
&Mace::error("MISREN signal value %s does not match expected
value %s", $misren_sig0->val, "1");
}
else
{
&Mace::inform("MISREN control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# --------------------------------------------
# Run the IR emacro for MsgBusRd
# --------------------------------------------
&Mace::inform("Shift MsgBusRd into inst. reg.\n");
$E1 = &Tap::ir(OPCODE => 0x14);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("MsgBusRd instruction is shifted in\n");
WAIT(1);
# Checking the instruction signal here
$MsgBusRd_instr = &Mace::Signal::new('/system/tap/iMsgBusRd');
if ($MsgBusRd_instr->val != 1)
{
&Mace::error("MsgBusRd instruction is not shifted-in to the IR\n");
_exit_stream();
}
# -----------------------------------------------
# Run the DR emacro for MsgBusRd, "1" = Perform Read
# -----------------------------------------------
$dr = &BitVec::new(1, "1");
$E1 = &Tap::dr (NUM_DATA_OUT => 1, DATA_IN => $dr, PAUSES => 0);
WAIT_UNTIL($E1->is_complete);
&Mace::inform("Data is shifted in and out\n");
# Installing the MsgBusRd control signals here
$MsgBusRd_sig0 =
&Mace::Signal::new('/system/tap/MsgBusRdT731H');
WAIT(1);
## Checking the result here
if ($MsgBusRd_sig0->val != 1)
{
&Mace::error("MsgBusRd signal value %s does not match expected value %s",
$MsgBusRd_sig0->val, "1");
}
else
{
&Mace::inform("MsgBusRd control signal matched!!!\n");
}
WAIT(1);
_exit_stream();
}
# -----------------------------------------------
# Check to verify if the correct signature is read
# -----------------------------------------------
$SIGNATURE = BitVec::new(32, "0x0000FEAF");
if ($E1->{DATA} != $SIGNATURE )
{
&Mace::error("SIGNATURE %s is not correct value %s!\n",
$E1->{DATA}->to_hex, $SIGNATURE->to_hex);
$Mace::sim->kill(1);
exit(1);
}
&Mace::inform("SIGNATURE matched!\n");
WAIT(1);
_exit_stream();
}
1;
MACE Codes for MISR Signature Calculation
# Include Mace M4 macro definitions, use predefined macros
#
m4_include(Mace.h4);
package def;
require "sys_resolved_defs.pl";
$dbg_all = 0;
$dbg_all = $ENV{DEBUG_ALL} if (defined $ENV{DEBUG_ALL});
@Mace::STREAMS = (\&misr_calc);
sub misr_calc {
_start_stream("misr_calc");
# -------------------------------------------
# Call the init subroutine to install tap sigs
# -------------------------------------------
&Tap::init();
&sys::init_clocks();
# --------------------------------------------
# Initialize TAP and put in Run-Test/Idle state
# --------------------------------------------
&Mace::inform("Initializing the TAP\n");
my $tapinit = &Tap::init_tap();
WAIT_UNTIL($tapinit->is_complete);
&Mace::inform("Finished TAP initialization\n");
@misr_input = ();
# Initial value of the misr (after reset).
@misr_calc = (1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1);
WAIT_UNTIL($misr_calc->is_complete);
WAIT($sys::mclk_cycle);
WAIT_UNTIL($mclk_cycle->val);
#--------------------------------------------
#Tied values for core0 MISR inputs
#--------------------------------------------
# core0
@misr1in_val = (0,1,1,1,1,1,1,1,1,1,0,1,0,1,1,1);
#--------------------------------------------
#Check misr signature for core0
#--------------------------------------------
for ($j=0; $j<15; $j++) {
$misr1in_val = $misr1in->val;
$misr1_calc_val = &misr_calc_routine($misr1in_val);
&Mace::inform("Input of misr is $misr1in_val\n");
$misr1_out_val = &BitVec::new(16, $sigsir_out->val);
&Mace::inform("Signature of the core 0 misr is 0x%x\n",
$misr1_calc_val);
WAIT($sys::mclk_cycle); #Proceed to the next clock cycle
} #End for
#--------------------------------------------
#Tied values for core1 MISR inputs
#--------------------------------------------
# core1
@misr1in_val = (1,1,1,0,0,1,1,1,1,1,1,1,0,1,1,1);
#--------------------------------------------
#Check misr signature for core1
#--------------------------------------------
for ($j=0; $j<15; $j++) {
$misr1in_val = $misr1in->val;
$misr1_calc_val = &misr_calc_routine($misr1in_val);
&Mace::inform("Input of misr is $misr1in_val\n");
$misr1_out_val = &BitVec::new(16, $sigsir_out->val);
&Mace::inform("Signature of the core 1 misr is 0x%x\n",
$misr1_calc_val);
WAIT($sys::mclk_cycle); #Proceed to the next clock cycle
} #End for
#--------------------------------------------
#Tied values for core2 MISR inputs
#--------------------------------------------
# core2
@misr1in_val = (1,1,1,1,1,1,1,0,1,1,0,1,0,0,1,1);
#--------------------------------------------
#Check misr signature for core2
#--------------------------------------------
for ($j=0; $j<15; $j++) {
$misr1in_val = $misr1in->val;
$misr1_calc_val = &misr_calc_routine($misr1in_val);
&Mace::inform("Input of misr is $misr1in_val\n");
$misr1_out_val = &BitVec::new(16, $sigsir_out->val);
&Mace::inform("Signature of the core 2 misr is 0x%x\n",
$misr1_calc_val);
WAIT($sys::mclk_cycle); #Proceed to the next clock cycle
} #End for
#--------------------------------------------
#Tied values for core3 MISR inputs
#--------------------------------------------
# core3
@misr1in_val = (1,1,1,1,1,1,1,1,1,0,1,1,0,1,1,1);
#--------------------------------------------
#Check misr signature for core3
#--------------------------------------------
for ($j=0; $j<15; $j++) {
$misr1in_val = $misr1in->val;
$misr1_calc_val = &misr_calc_routine($misr1in_val);
&Mace::inform("Input of misr is $misr1in_val\n");
$misr1_out_val = &BitVec::new(16, $sigsir_out->val);
&Mace::inform("Signature of the core 3 misr is 0x%x\n",
$misr1_calc_val);
WAIT($sys::mclk_cycle); #Proceed to the next clock cycle
} #End for
#==================================================
# Subroutines used
#==================================================
##################################################
#
# misr out calculation
#
##################################################
sub misr_calc_routine {
#-----------------------------------------------
# Calculate the expected value of the misr
#-----------------------------------------------
#for ($j = 0; $j < 15; $j++) {
local ($so_misr1_in) = @_;
@misr_input = @misr_calc;
my ($i, $m1, $m1_val); # parentheses are required to declare all three variables
for ($i=0; $i<=15; $i++) {
$m1 = $i-1;
$m1_val = $misr_input[$m1];
if ($i == 0) {
$misr_calc[$i] = &xor($so_misr1_in, $misr_input[15]);
} elsif (($i == 1) || ($i == 2) || ($i == 3) || ($i == 5)
|| ($i == 16)) {
$misr_calc[$i] = &xor($m1_val, $misr_input[15]);
} else {
$misr_calc[$i] = $m1_val;
}
}
#-------------------------------------------------------
#Convert the misr output to hex and store it in an array
#-------------------------------------------------------
$misr_calc_out = $misr_calc[15];
for ($i=14; $i>=0; $i--) {
$misr_calc_out = $misr_calc_out.$misr_calc[$i];
}
$misr_calc_hex = &bin2hex($misr_calc_out);
return $misr_calc_hex;
#}
}
#==============================================
#The subroutine below calculates the XOR value.
#==============================================
sub xor {
local ($in2, $in1)=@_;#the two inputs
local ($xor_out); #xor output,
if ((($in1==1) && ($in2==0)) || (($in2==1) && ($in1==0))) {
$xor_out = 1;
} else {
$xor_out = 0;
}
return $xor_out;
}
#==============================================
#Binary to hexadecimal conversion
#==============================================
sub bin2hex {
local($data) = @_;
local($out_hex,$out);
$out=0;
$out_hex = "";
$data =~ s/'//g;
foreach($i=0;$i<length($data);$i++) {
$out= ($out<<1) + ((substr($data,$i,1) eq "1")?1:0);
if ( ((length($data) - $i - 1) % 4) == 0 ) {
$out_hex = $out_hex . sprintf("%x",$out);
$out=0;
}
}
return(&hex2hex($out_hex));
} ### bin2hex ###
sub hex2hex {
local($data) = @_;
$data =~ s/'h//;
if (length($data)<9) {
$data = eval ("0x" . $data);
} else {
$data = "\"$data\"";
}
return($data);
} ### hex2hex ###
#/////////////////////////////////////////////////////////////////////////////////
_exit_stream();
}
1;
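The signature update performed by misr_calc_routine can also be checked outside the MACE environment. The following is a minimal standalone sketch in Python, not part of the original MACE listing, that assumes the same register behaviour as the routine above: a 16-bit register reset to all ones, the serial input XORed with the feedback bit (stage 15) into stage 0, and the feedback bit XORed into stages 1, 2, 3 and 5. The names misr_step, signature and run_misr are hypothetical and chosen only for this illustration. The `$i == 16` condition in the listing is left out because the loop index never reaches 16.

```python
def misr_step(state, serial_in):
    """Advance the 16-bit signature register by one clock.

    state is a list of 16 bits (index 0 = stage 0); a new list is returned.
    """
    feedback = state[15]
    taps = {1, 2, 3, 5}  # stages that XOR in the feedback bit, as in the listing
    nxt = [0] * 16
    nxt[0] = serial_in ^ feedback
    for i in range(1, 16):
        # Conditional expression binds loosest: (state[i-1] ^ feedback) if tapped
        nxt[i] = state[i - 1] ^ feedback if i in taps else state[i - 1]
    return nxt


def signature(state):
    """Pack the register into an integer, stage 15 as the most significant bit."""
    value = 0
    for i in range(15, -1, -1):
        value = (value << 1) | state[i]
    return value


def run_misr(bits, state=None):
    """Shift a sequence of input bits through the register and return the signature."""
    state = [1] * 16 if state is None else list(state)
    for b in bits:
        state = misr_step(state, b)
    return signature(state)


if __name__ == "__main__":
    # One clock with input 0 from the all-ones reset state
    print(hex(run_misr([0])))  # 0xffd1
```

Feeding the tied input values of a core into run_misr one bit per clock gives a reference signature that can be compared with the value reported by the simulation, provided the hardware register really follows the assumptions stated above.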