Formal definition of the DSS Alarm-Action Matrix
for a « Subsystem »
Giulio Morpurgo, CERN/EN/ICE
Introduction
The problem we are trying to solve is the following: how to formally express a subset of the Detector
Safety System (DSS) Alarm-Action Matrix (AAM) in such a way that a) experts can validate it and b) at
a later stage a script can check whether what is currently in the DSS still faithfully reproduces what was
validated (and, if not, produce a list of differences).
Moving towards a solution to this problem, we first describe the validation process itself. Then,
after reminding the reader of the different components of the AAM, we propose a
method to identify the AAM for a Subsystem. Finally, after a useful digression about the UIDs
(Unique Identifiers)1 of the DSS items, we outline the solution, the first step of which is to define the
information which needs to be stored.
The validation process
While tools exist to output the totality of the AAM, the size of the resulting document makes it
impractical in many cases. The typical scenario is that an Experiment can be considered as divided into
smaller and almost self-contained parts (here called “Subsystems”)2.
For each Subsystem:
• The Subsystem experts define the DSS safety requirements, with the help of the DSS
experiment experts.
• The DSS experts implement the requirements by configuring the DSS AAM (i.e. defining
sensors, alarms, actions and their links).
• The DSS experts together with the Subsystem experts test and validate the implementation;
in parallel with this activity a document describing this part of the DSS AAM is compiled by
the DSS experts, read and approved by the Subsystem experts, and stored in EDMS for
future consultation.
But the DSS is a system which can be reconfigured at any time. On top of that, it has been common practice
to rename sensors, alarms and actuators. So, if at validation time we had a sensor A which was
triggering an alarm B, which was executing an action C, and now the sensor is called Z, the alarm Y and
the action X, is the AAM still the same? Is the safety of the Subsystem still assured? Clearly the EDMS
document, unless it is updated at each renaming, is now difficult to read, because the original names
may no longer exist in the current DSS. Therefore a tool able to distinguish between “cosmetic”
changes and real changes would be very useful. In the case of renaming, if the tool were able to establish
the correspondence between old and new names, it could also be used to update the EDMS
document by replacing the old names with the new ones (this of course would require precision
when compiling the EDMS document for the first time).
1 The DSS UID (Unique Identifier) is a 16-bit integer number used to unambiguously identify each DSS item
(Sensors, Actuators, Alarms etc.). Each item type has its own range of UIDs; for instance, for Digital Inputs it is
1..2048, for Analogue Inputs it is 8193..9216, etc.
2 For an example of a “Validation Document” for a Subsystem, see the PDF DSS_VELO_vers2.1_.pdf from
Laurent at https://indico.cern.ch/event/392445/
A look at what composes the AAM
The DSS is based on the following item categories:
• Sensors (digital or analogue), which can be in a “normal” or “abnormal” state
o A digital sensor has one abnormal state
o An analogue sensor can have two abnormal states (“too high” and “too low”)
A sensor is a piece of physical equipment, connected to the DSS PLC system at a well-defined
hardware address. From this address we derive an index, or UID, which gives the position of
the information (status and parameters) relating to the sensor in the PLC memory
datablocks. The same is true for the actuators (== Actions).
A sensor also has some properties (persistency, thresholds) which have a direct impact on
the safety provided to the detector.
• Alarms. An alarm is a software entity. The “Alarm Condition”, which will cause the alarm to
trigger when it becomes TRUE, is defined as a logical expression of the states of up to 64
sensors (although most of the time just one or two sensors are used). The Alarm also has a
UID, which is automatically generated by the DSS when the Alarm is created, using the first free
slot in the corresponding PLC memory datablock.
• Actuator (i.e. Action): like the Sensor, this is a piece of physical equipment (a digital output), and its
UID derives from its hardware address. The Actuator, which can be seen as an interlock,
ultimately protects the detector, for example by switching off power. The Actuator does not
really have configurable properties. When the state of an Action becomes TRUE, we say that
the DSS “has executed the Action”, and the reason for this is that an Alarm to which the
Action is linked has triggered. But the Action is not executed directly by the Alarm;
instead, the DSS uses another layer, the Alarm-to-Action link (A2A), to store the information about
which Actions have to be executed when an Alarm triggers, and when3.
• A2A (Alarm-to-Action link). This is again a software entity. Its UID is generated by the DSS (as
for the Alarm). Besides the names of the alarm and the action it links, it contains the
value (default == 0) specifying the time interval between the triggering of the alarm and the
execution of the action.
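The four item categories can be pictured as plain records. The following is only a minimal sketch: the field names, the Python types, the unit of the delay and the sample values are assumptions made for illustration, not the actual DSS data model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Sensor:
    uid: int                                 # derived from the hardware address
    name: str
    analogue: bool = False                   # analogue sensors have "too high" / "too low"
    persistency: Optional[float] = None      # type and unit are assumptions
    low_threshold: Optional[float] = None    # analogue sensors only
    high_threshold: Optional[float] = None   # analogue sensors only


@dataclass(frozen=True)
class Actuator:
    uid: int                                 # derived from the hardware address
    name: str


@dataclass(frozen=True)
class Alarm:
    uid: int                                 # first free slot, allocated by the DSS
    name: str
    condition: str                           # placeholder for the logical expression
    sensor_uids: Tuple[int, ...]             # sensors appearing in the condition

    def __post_init__(self) -> None:
        if len(self.sensor_uids) > 64:
            raise ValueError("an alarm condition can use at most 64 sensors")


@dataclass(frozen=True)
class A2A:
    uid: int                                 # allocated by the DSS, like the Alarm UID
    alarm_name: str
    action_name: str
    delay: int = 0                           # alarm-to-action interval (unit assumed)


# Illustrative values only: names and UIDs are hypothetical.
temp = Sensor(uid=8193, name="SENSOR_A", analogue=True,
              low_threshold=5.0, high_threshold=30.0)
alarm = Alarm(uid=1, name="ALARM_B", condition="SENSOR_A too high",
              sensor_uids=(temp.uid,))
link = A2A(uid=1, alarm_name=alarm.name, action_name="ACTION_C")
```

Keeping the delay on the A2A record, rather than on the Alarm or the Actuator, mirrors the point made above: the same Alarm can execute different Actions after different intervals.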
Identification of the AAM for a Subsystem
When we try to identify the part of the AAM corresponding to a Subsystem, we will have a list of
items of the different categories we just mentioned. But we must be aware that it will not always be
possible to cut the AAM cleanly into totally separate Subsystem parts; some of the items might
naturally appear in different Subsystems. So once more we go through the above list of categories
and see where the Subsystem split is well defined, and where it is not. This time we start from the
bottom, i.e. from the Actuators, and move up towards the Sensors.
• Actuator. Most of the time an Actuator belongs to a well-specified Subsystem. This is
because it typically acts on a very specific and restricted part of the equipment (e.g. it switches
off a single rack)4.
• A2A. There is a direct correspondence between an A2A and an Action, so we can say that a
given A2A “belongs” to the same Subsystem as its Action.
• Alarm. Here things are less well defined. An Alarm could be linked (through
the A2As) only to Actions of a given Subsystem, or to Actions of different Subsystems.
Pragmatically, we can consider that an Alarm “belongs” to every Subsystem to which at least one
of its linked Actions belongs.
• Sensors. Same situation as for the Alarms. A Sensor can appear in the alarm conditions of
many different Alarms; these Alarms may belong to different Subsystems. So, a Sensor
“belongs” to every Subsystem to which at least one Alarm influenced by the Sensor belongs.
This will not pose any practical problem when we talk about “a Subsystem’s AAM”; we simply
must be aware that any tool we build must allow Sensors and Alarms to belong to several
“owners”.
3 The reasons for this choice are that in this way it was easier to have an alarm triggering a large number of
actions and actions being triggered by a large number of alarms, and it was possible to specify, for each alarm
and action pair, a specific time interval between the triggering of the alarm and the execution of the action.
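The ownership rules above can be sketched as a bottom-up propagation, from Actions to Alarms to Sensors. This is an illustration only: the function name, the input shapes and all item names are hypothetical, and each Action is assumed to have exactly one owning Subsystem.

```python
from collections import defaultdict


def assign_owners(action_owner, a2a_links, alarm_sensors):
    """Propagate Subsystem ownership from Actions up to Alarms and Sensors.

    action_owner:  {action name: subsystem name}
    a2a_links:     [(alarm name, action name), ...]
    alarm_sensors: {alarm name: [sensor names used in its alarm condition]}
    """
    # An Alarm belongs to every Subsystem owning at least one linked Action.
    alarm_owners = defaultdict(set)
    for alarm, action in a2a_links:
        if action in action_owner:
            alarm_owners[alarm].add(action_owner[action])

    # A Sensor belongs to every Subsystem owning an Alarm it influences.
    sensor_owners = defaultdict(set)
    for alarm, owners in alarm_owners.items():
        for sensor in alarm_sensors.get(alarm, ()):
            sensor_owners[sensor] |= owners
    return dict(alarm_owners), dict(sensor_owners)


# Hypothetical example: AL_COOLING acts on racks of two Subsystems.
action_owner = {"AC_RACK_1": "SUB_1", "AC_RACK_2": "SUB_2"}
a2a_links = [("AL_TEMP", "AC_RACK_1"),
             ("AL_COOLING", "AC_RACK_1"),
             ("AL_COOLING", "AC_RACK_2")]
alarm_sensors = {"AL_TEMP": ["SE_T1"], "AL_COOLING": ["SE_FLOW"]}
alarm_owners, sensor_owners = assign_owners(action_owner, a2a_links, alarm_sensors)
```

Using sets for the owners is what lets an Alarm or a Sensor have several “owners” without any special casing.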
A consideration about UIDs
We have seen that there is a distinction in the way the UIDs are derived for Sensors and Actuators
on one side (from the hardware address) and for Alarms and A2As on the other (by allocating a free slot in the PLC
memory).
For Sensors and Actuators, the UID is really important; if the User just renames a Sensor from “A” to
“Z”, the name information changes, but the UID remains. The Sensor “Z” with the new name is still
the same Sensor “A” as before. Conversely, if, as a consequence of DSS configuration
operations, the old Sensor “A” was renamed, and another Sensor, with a different UID, is now called
“A”, the old “A” and the new “A” are in principle5 not the same.
On the other hand, the UID of an Alarm or of an A2A is much less important; suppose the old
“validation document” contained an Alarm called X for which UID_X was used, and that Alarm had a
certain alarm condition, and was linked to a certain set of Actions. Suppose that now Alarm X does
not exist anymore, but that there is another Alarm Y, using UID_Y, which has the same alarm
condition and is linked to the same Actions. As a matter of fact, the old Alarm X and the new Alarm Y
are completely equivalent, and implement the same safety. The same is true for the A2As: as long as
for each old A2A there is a new one which does the same thing, the safety is the same.
4 Indeed, when the DSS was designed, it was clearly stated that there would be point-like Actions, but also
higher-level Actions (which would switch off an entire Subdetector). Unfortunately, due to the implementation
of the electrical distribution in the experiments, these high-level Actions could not be implemented, and this
led to the need for Alarms which execute hundreds of point-like Actions.
5 The only exception is when a channel in an Input module is broken, and the old sensor “A” is disconnected
from the broken channel and reconnected to another one. Such an infrequent operation has to be noted
explicitly in the “validation document”, to keep track of it.
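The role of the UID for Sensors (and, equally, Actuators) can be illustrated with a small sketch that treats the UID as the identity and the name as a mere label. The function name and the data are hypothetical, and names are assumed to be unique within the current configuration.

```python
def classify_sensors(validated, current):
    """Compare two {uid: name} maps, using the UID as the sensor identity.

    Returns (renamed, missing, reused):
      renamed: {uid: (old name, new name)}   same UID, new label (cosmetic)
      missing: [uid, ...]                    validated UID no longer defined
      reused:  {name: (old uid, new uid)}    old name now labels another UID
    """
    current_uid_by_name = {name: uid for uid, name in current.items()}
    renamed, missing, reused = {}, [], {}
    for uid, old_name in validated.items():
        if uid not in current:
            missing.append(uid)
        elif current[uid] != old_name:
            renamed[uid] = (old_name, current[uid])
        other_uid = current_uid_by_name.get(old_name)
        if other_uid is not None and other_uid != uid:
            reused[old_name] = (uid, other_uid)
    return renamed, missing, reused


# Hypothetical data: sensor 12 was renamed from "A" to "Z", and a different
# sensor (UID 40) is now called "A"; sensor 14 has disappeared.
validated = {12: "A", 13: "B", 14: "C"}
current = {12: "Z", 13: "B", 40: "A"}
renamed, missing, reused = classify_sensors(validated, current)
```

A reused name is exactly the case where a name-based comparison would be misleading: the label “A” still exists, but it no longer designates the validated sensor. The broken-channel exception of footnote 5 is not handled here and would need an explicit, documented mapping.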
Information which needs to be stored to characterize the Subsystem AAM
The DSS Experiment expert can write the “Validation Document”, which describes the Sensors,
Alarms and Actuators involved in the safety of the Subsystem, and their relations.
From this, or (better) from an explicit list, the following lists can be extracted:
• The list of Sensors which belong to the Subsystem (in the sense described in the section
“Identification of the AAM for a Subsystem”). Optional: hardware address, sensor type,
persistency, thresholds.
• The list of Actions which belong to the Subsystem. Optional: their hardware addresses.
• The list of Alarms which belong to the Subsystem. Optional: the definition of the alarm
condition.
• Optional: for each Alarm, the list of Actions to be executed, together with their delays.
The optional information may be omitted if everything already exists and has been tested, because a
script can easily extract it from the PVSS database. If instead the validation document is
produced before the configuration of the AAM, it is necessary to have this optional information,
so that the tool can check, once everything is implemented, that what has been implemented
matches what is described in the document. Pragmatically, we should now mostly be in the first
situation, as the DSS has been in operation for quite some time. But even if a totally new part of
the AAM had to be built, one can consider creating and testing the Alarms before connecting
them to any Action. In this way the alarm condition, which is the most complicated thing to
convert from a text description into a data structure, could still be extracted directly from the PVSS
database.
From these lists, a script can extract from the PVSS database the information it will later need to
check whether what exists is what was defined:

• For Sensors: Sensor Name, UID, Persistency, Sensor Type*, Low Threshold*, High Threshold*
(* only for analogue sensors)
• For Actuators: Actuator Name, UID
• For Alarms: Alarm Name, UID, plus the following dpes, all below <alarm>.Definition., which
define the Alarm Condition:
o Type, Nmin,
o Lev1_Op, Lev1_Sign, Lev2_Nmin, Lev2_Persistency,
o Lev2_Dp, Lev2_Op, Lev2_Sign
• For A2A: Alarm Name, Actuator Name, Delay
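As an illustration, the extracted information could be gathered into a single text record. The sketch below uses JSON purely as an assumption, since the actual text format is not specified here; the function name, the key names and the sample values are hypothetical, and the alarm dpe values are left as empty placeholders.

```python
import json

# The dpe names listed above, found below <alarm>.Definition.
ALARM_DPES = ("Type", "Nmin", "Lev1_Op", "Lev1_Sign", "Lev2_Nmin",
              "Lev2_Persistency", "Lev2_Dp", "Lev2_Op", "Lev2_Sign")


def build_snapshot(subdetector, sensors, actuators, alarms, a2as):
    """Serialise the validated items of one Subsystem as JSON text."""
    for alarm in alarms:
        absent = [dpe for dpe in ALARM_DPES if dpe not in alarm["definition"]]
        if absent:
            raise ValueError(f"alarm {alarm['name']}: missing dpes {absent}")
    record = {"subdetector": subdetector, "sensors": sensors,
              "actuators": actuators, "alarms": alarms, "a2as": a2as}
    # sort_keys gives a stable text, convenient for later comparisons
    return json.dumps(record, indent=2, sort_keys=True)


# Hypothetical content; the dpe values are placeholders (None), not real data.
text = build_snapshot(
    "SUB_1",
    sensors=[{"name": "SENSOR_A", "uid": 12, "persistency": 0}],
    actuators=[{"name": "ACTION_C", "uid": 1}],
    alarms=[{"name": "ALARM_B", "uid": 1,
             "definition": dict.fromkeys(ALARM_DPES)}],
    a2as=[{"alarm": "ALARM_B", "actuator": "ACTION_C", "delay": 0}],
)
```

Checking that every alarm carries all nine dpes at storage time avoids discovering an incomplete alarm condition only when the comparison is run.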
How to compare the current AAM with the information linked to the validation document
The process of comparing the current DSS database with the validation document consists of
answering the following questions:
1. Do the sensors specified in the list still exist (maybe renamed, but with the same UID)? Do
they have the same persistency, sensor type and thresholds (where applicable) as the ones
described in the validation document?
2. Do the actuators still exist (maybe renamed, but with the same UID)?
3. For each alarm described in the validation document: does an alarm exist with the same alarm
condition and with links to the same actions? (When checking, one has to take into
account the renaming discovered in steps 1 and 2.)
4. For each A2A in the validation document: is there currently an A2A linking the same Alarm
and Action (allowing for the renaming discovered in steps 2 and 3) and with the same delay?
If the answer to all these questions is affirmative, the AAM for the Subsystem is still the same;
otherwise the experts should be shown a list of the differences this process has found. The experts
might also be interested in seeing all the renaming operations that the tool has detected.
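The four questions can be sketched as one comparison function. This is a simplified illustration under several assumptions: the snapshot layout is hypothetical, alarm conditions and links are expressed in terms of UIDs (so that renaming does not affect them), and steps 3 and 4 are folded together by comparing, for each alarm, its condition plus its set of (actuator UID, delay) links.

```python
def alarm_signature(alarm):
    # Name-independent identity of an alarm: its condition and its links.
    return (alarm["condition"], frozenset(alarm["links"]))


def compare_aam(validated, current):
    """Return (renames, diffs) between a validated and a current snapshot."""
    renames, diffs = [], []

    # Step 1: sensors are matched by UID; names may change, properties may not.
    for uid, old in validated["sensors"].items():
        new = current["sensors"].get(uid)
        if new is None:
            diffs.append(f"sensor {old['name']} (UID {uid}) no longer exists")
            continue
        if new["name"] != old["name"]:
            renames.append(f"sensor {old['name']} -> {new['name']}")
        for prop in ("persistency", "type", "low", "high"):
            if old.get(prop) != new.get(prop):
                diffs.append(f"sensor UID {uid}: {prop} changed")

    # Step 2: actuators are matched by UID as well.
    for uid, old_name in validated["actuators"].items():
        new_name = current["actuators"].get(uid)
        if new_name is None:
            diffs.append(f"actuator {old_name} (UID {uid}) no longer exists")
        elif new_name != old_name:
            renames.append(f"actuator {old_name} -> {new_name}")

    # Steps 3 and 4: alarm UIDs are ignored; look for an equivalent alarm.
    current_by_signature = {alarm_signature(a): a["name"]
                            for a in current["alarms"]}
    for alarm in validated["alarms"]:
        match = current_by_signature.get(alarm_signature(alarm))
        if match is None:
            diffs.append(f"no equivalent of alarm {alarm['name']}")
        elif match != alarm["name"]:
            renames.append(f"alarm {alarm['name']} -> {match}")
    return renames, diffs


# Hypothetical snapshots: sensor A, alarm B, action C renamed to Z, Y, X.
validated = {"sensors": {12: {"name": "A", "persistency": 2}},
             "actuators": {7: "C"},
             "alarms": [{"name": "B", "condition": ("ANY", 12), "links": [(7, 0)]}]}
renamed_only = {"sensors": {12: {"name": "Z", "persistency": 2}},
                "actuators": {7: "X"},
                "alarms": [{"name": "Y", "condition": ("ANY", 12), "links": [(7, 0)]}]}
delay_changed = {"sensors": {12: {"name": "Z", "persistency": 2}},
                 "actuators": {7: "X"},
                 "alarms": [{"name": "Y", "condition": ("ANY", 12), "links": [(7, 5)]}]}

renames_ok, diffs_ok = compare_aam(validated, renamed_only)
renames_bad, diffs_bad = compare_aam(validated, delay_changed)
```

In the first case only renaming is reported and the list of differences is empty, so the AAM is still the same; in the second, the changed delay means that no current alarm reproduces the validated one. A real tool would report the changed delay itself rather than a missing equivalent.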
Two panels to help the experiment expert in doing this work
I am currently developing two panels to assist you in producing the information discussed above.
The two panels are called DefineSubdetector and CompareSubdetector.
DefineSubdetector
With this panel, you select a number of Actions (the ones which belong to the Subsystem in the
sense discussed above). The panel will then find all the A2As, Alarms and Sensors related to these
Actions, and will output all the required information in two forms: one to be used by the tool
when it has to compare the validated AAM with the current one, and one for your documentation.
The information produced by the panel for a later comparison will be stored, in text format, in a
datapoint called VALIDA_<subdetector>, where <subdetector> is the name you provide.
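The selection step can be pictured as a walk from the chosen Actions up to the Sensors. The sketch below is not the panel's code: the function name, the input shapes and the item names are hypothetical.

```python
def collect_subsystem(selected_actions, a2as, alarm_sensors):
    """Find the A2As, Alarms and Sensors related to the selected Actions.

    a2as:          [{"alarm": ..., "action": ..., "delay": ...}, ...]
    alarm_sensors: {alarm name: [sensor names used in its alarm condition]}
    """
    selected = set(selected_actions)
    # Keep only the links that execute one of the selected Actions.
    links = [link for link in a2as if link["action"] in selected]
    alarms = sorted({link["alarm"] for link in links})
    sensors = sorted({sensor for alarm in alarms
                      for sensor in alarm_sensors.get(alarm, ())})
    return {"actions": sorted(selected), "a2as": links,
            "alarms": alarms, "sensors": sensors}


# Hypothetical configuration: AL_2 also acts on AC_1, outside the selection.
a2as = [{"alarm": "AL_1", "action": "AC_1", "delay": 0},
        {"alarm": "AL_2", "action": "AC_2", "delay": 10},
        {"alarm": "AL_2", "action": "AC_1", "delay": 0}]
alarm_sensors = {"AL_1": ["SE_1"], "AL_2": ["SE_1", "SE_2"]}
subsystem = collect_subsystem(["AC_2"], a2as, alarm_sensors)
```

Note that the link from AL_2 to AC_1 is left out: it belongs to the Subsystem owning AC_1, even though the Alarm AL_2 itself is shared.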
CompareSubdetector
With this panel you can check, at a later stage, whether the validated AAM for a subdetector is still
present in the DSS. The panel reads from the VALIDA_<subdetector> datapoint the list of DSS items
belonging to the subdetector and their properties, and then checks whether these items are still defined in
the DSS database, whether their properties have changed, and so on. The panel is able to deal with items
which might have been renamed in the meantime (it will also report the renaming it has found).
The panel produces a report, and stores it in a file and in the datapoint
VALIDAREPORT_<subdetector>.
The report informs the User whether the AAM for the subdetector is still there, whether there have been renaming
operations or modifications of some properties (sensor persistency, thresholds, action delays,
structure of the alarm conditions), and whether some items are missing. It will tell the User either that the
comparison was successful, or that some properties have changed and need to be checked carefully,
or that the AAM was clearly modified and needs to be validated again.
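The three possible conclusions of the report can be sketched as a small decision function. The mapping used here is an assumption made for illustration: renaming alone is treated as cosmetic, changed properties call for a careful check, and missing items are taken as the sign that the AAM was clearly modified.

```python
from enum import Enum


class Verdict(Enum):
    SUCCESS = "comparison successful"
    CHECK = "some properties have changed and need to be checked carefully"
    REVALIDATE = "the AAM was modified and needs to be validated again"


def verdict(renames, property_changes, missing_items):
    """Summarise the findings of a comparison (thresholds are assumptions)."""
    if missing_items:
        return Verdict.REVALIDATE
    if property_changes:
        return Verdict.CHECK
    # Renaming alone is cosmetic: it is reported but does not fail the check.
    return Verdict.SUCCESS


# Hypothetical findings
only_renamed = verdict(["sensor A -> Z"], [], [])
threshold_moved = verdict([], ["sensor UID 12: high threshold changed"], [])
alarm_gone = verdict([], [], ["no equivalent of alarm B"])
```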