Supporting material for the paper:

FBMTP: An automated fault and behavioural anomaly detection and isolation tool for PLC controlled manufacturing systems

Arup Ghosh, Shiming Qin, Jooyeoun Lee, and Gi-Nam Wang

In this supplementary file, we present the experimental results validating the proposed approach on a wide range of manufacturing system scenarios. These experimental results further illustrate the importance and the effectiveness of the proposed approach.

***In this file, the term “PAPER” refers to the original article i.e., “FBMTP: An automated fault and behavioural anomaly detection and isolation tool for PLC controlled manufacturing systems”.***

For any query, please contact the author directly: Arup Ghosh (email: [email protected])

I. EXPERIMENTS WITH A SMALL VIRTUAL MANUFACTURING SYSTEM (EXPERIMENTAL STUDY, RESULTS, AND DISCUSSION)

The FBMTP tool is implemented in the C# programming language. We have tested FBMTP on various real-world as well as simulated (or virtual) manufacturing systems (obtained from UDMTEK Co., Ltd. [1]) to assess its practical effectiveness. Here, we report our experimental results based on a small-sized virtual manufacturing system, because it would be impossible to discuss a real-world manufacturing system comprehensively in such a short space. Moreover, a real system cannot be used freely (for example, faults or anomalies cannot be inserted into the system arbitrarily, and the system cannot easily be partitioned into subsystems). The architecture of the virtual manufacturing system on which our experiments are performed is presented in Fig. 1. This virtual system was designed and implemented very carefully (using PLC Studio Software [2]) so that it behaves in the same manner as the real system.

Fig. 1. A BIW automotive manufacturing system model.
Just to remind the readers, in the context of the PAPER, the term ‘large system’ refers to a system where the number of signals that can be accessed in each PLC scan cycle is substantially lower than the total number of signals. So, even if the given system is small in size, we can still convert it into a large system by limiting the access of the data logger to very few signals in each PLC scan cycle. The architecture of this virtual system (or the underlying PLC program) is taken from a real-world Body-In-White (BIW) automotive manufacturing subsystem. It works as follows (also see Fig. 1):

I. If the part storage location is not empty (detected by using the sensor Str_Part_CHK), then the green light of the signal lamp Lmp will be ON; otherwise, the red light of the signal lamp Lmp will be ON. If the green light is ON, then the system works as follows:
   o a part (i.e., a door panel) is taken out from the storage location and is loaded onto the part loader L (this is a manual operation).
   o the part loader L moves along the rail track (towards the robot RB1).
   o after the part loader L reaches the end position of the rail track (its advanced position), the robot RB1 starts its operation. The complete operation of the robot RB1 comprises two sub-tasks: i) first task: move the robotic arm close to the part loader L and then grasp the part on it; and ii) second task: pick the part from the part loader L and then load it onto the daecha D.
   o after the robot RB1 finishes its operation, the daecha clamp DCLP grasps the part on the daecha D (the clamp closing operation of the daecha clamp DCLP).
   o then, the robot RB1 starts to return to its home position.
   o after that, the welding robot RB2 moves its arm close to the daecha D and starts to perform the sealing operation (this whole operation is referred to as the welding task of the robot RB2).
   o after the robot RB1 reaches its home position, the part loader L starts to return to its home position.
   o the robot RB2 finishes its welding task and then starts to return to its home position.
   o after the part loader L and the robot RB2 reach their corresponding home positions, the daecha D starts to move towards its advanced position.
   o when the daecha D is in its advanced position, the daecha clamp DCLP opens and the part is immediately removed (manually) from the system.
   o then, the daecha D returns to its home position.
   [this cycle starts again when another part is set on the part loader L]
II. If the red light is ON, then the operator has to wait until the storage is partially or completely filled (the storage can be filled only after the completion of a system cycle).

We should mention that the daecha clamp DCLP is actually composed of two clamps, i.e., the left clamp and the right clamp. A simulation video of this virtual manufacturing system can be found at: https://www.youtube.com/watch?v=gB0Q5C3qGWo (PLC Studio Software [2] is used for the simulation). The whole system is controlled by 24 sensor and 15 actuator signals (in total 39 PLC I/O signals). A complete list of those signals and their corresponding objectives is given in Fig. 2. The signal names are arranged in that list according to the names of the devices that they operate. The DSVTF model of the manufacturing system described above is given in Fig. 3 (Graphviz Software [3] is used for the graphical visualization).

Fig. 2. List of the PLC I/O signals and their respective functions.
FBMTP generated this DSVTF model from the log data records of twenty-five consecutive runs of the manufacturing system of Fig. 1. The DSVTF model states are represented by state numbers (for example, X1, X2, and so on) instead of the actual boolean vectors (or the hash codes) in order to make the model simpler and easier to understand. The states are numbered according to their order of appearance in the log data records. A part is loaded onto the part loader, or the storage is filled, after an arbitrary time interval (recall that these are manual operations). Please note that the transition between states X2 and X27 is executed only when the last remaining part is taken out from the part storage location. In Fig. 3, the brown coloured arrows represent the transitions with high-variance transition times; the blue coloured arrows represent the low-frequency transitions; and the black coloured arrows represent the rest of the transitions (recall that this differentiation is required for the behavioural anomaly detection). The TTO times associated with the transitions (see Fig. 3) are given in milliseconds (MS), where TTO Time = (corresponding maximum transition time × 1.02) [this is a virtual system and hence, the state transitions occur very quickly]. There is no Type II system state in the DSVTF model of Fig. 3 because no signal changes its status value too frequently. The parameter values of the time inaccuracy bound TIB are as follows (the symbols have the same meanings as in Definition 5; see Subsection 4.2.1 of the PAPER): i) N = 39; ii) n = 10 (meaning that four PLC scan cycles are required to obtain all the PLC I/O signal data); iii) δ = 10 MS; and iv) ∂ = 10 MS (so, TIB = 40 MS). In our original system, three PLC scan cycles are required to obtain all the PLC I/O signal data.
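The arithmetic behind the values quoted above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of FBMTP (which is implemented in C#). The state numbering by order of appearance and the TTO rule (maximum transition time × 1.02) are taken from the text. The TIB expression is only an assumption: Definition 5 is given in the PAPER and is not reproduced here, so the sketch uses "number of scan cycles needed to read all N signals × 10 MS", which is consistent with the TIB values reported in this file (40 MS for n = 10, 100 MS for n = 4, and 10 MS for n = 39) but may not be the exact form of Definition 5.

```python
import math


def number_states(state_vectors):
    """Name states X1, X2, ... in order of first appearance in the log."""
    names = {}
    for vector in state_vectors:
        if vector not in names:
            names[vector] = f"X{len(names) + 1}"
    return names


def tto_time_ms(observed_times_ms, margin=1.02):
    """TTO time = maximum observed transition time x 1.02 (rule from the text)."""
    return max(observed_times_ms) * margin


def scan_cycles_needed(total_signals, signals_per_cycle):
    """PLC scan cycles needed to read every I/O signal once."""
    return math.ceil(total_signals / signals_per_cycle)


def tib_ms(total_signals, signals_per_cycle, cycle_time_ms=10):
    """ASSUMED stand-in for Definition 5 of the PAPER (not reproduced here).

    Scan cycles needed x 10 MS matches the reported TIB values, nothing more.
    """
    return scan_cycles_needed(total_signals, signals_per_cycle) * cycle_time_ms


if __name__ == "__main__":
    # Made-up boolean vectors and transition times, for illustration only.
    log = [(0, 1, 0), (1, 1, 0), (0, 1, 0), (1, 0, 1)]
    print(number_states(log))
    print(tto_time_ms([120, 135, 150]))
    for n in (10, 4, 39):
        print(n, scan_cycles_needed(39, n), tib_ms(39, n))
```

Keeping the scan-cycle count separate from the TIB expression means that only `tib_ms` has to change if Definition 5 combines δ and ∂ differently.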
Here, we have increased the number of PLC scan cycles required to obtain all the PLC I/O signal data from three to four in order to assess the effectiveness of FBMTP on an even larger system (given that the state transitions occur very rapidly in this virtual system, this is quite a high factor). As can be seen from Fig. 3, FBMTP is able to define the complete process behaviour by using only 51 states and 53 transitions (that means almost linear state and space complexity). It is highly unfair to compare FBMTP with other approaches (for details see Subsection 2 of the PAPER) because, as stated earlier, those approaches are not intended to handle large manufacturing processes or data inaccuracy issues. However, for the sake of comparison, if we apply the NDAAO approach [4] to the same set of log data records, then it generates 103 states and 141 transitions [we set the preceding sequence length Lps = 0 (see Subsection 4.3 of the PAPER); otherwise, it generates an extremely large and complex NDAAO model]. Please note that the number of states required to express the same system behaviour is approximately doubled (103 versus 51), and the number of transitions grows even more (141 versus 53), because the NDAAO approach does not eliminate the redundant transition paths and the unstable system states from the control process model. This ratio generally increases unboundedly with the growing number of signals and TIB time and hence, may lead to a state or space explosion problem for large-sized systems [here, we have used the NDAAO approach [4] as an explanatory example; the other existing approaches cited in the PAPER induce the same problems]. The existing literature on this subject does not provide many experimental details about the accuracy of the proposed FDI approaches or the procedures used to evaluate them (most of them do not solve the BADI problem either).
In FBMTP, the information related to the system alarm, transition time, transition execution pattern, undetected fault propagation, infinite looping at a state, etc. has been taken into consideration, and sufficient measures have been implemented to handle the data inaccuracy issues. These measures theoretically make FBMTP much more accurate and effective than the other existing approaches (see Subsection 2 of the PAPER). In order to evaluate this in practice, we selected seven participants and asked each of them to insert two faults into the virtual system of Fig. 1 (the fault categories were specified beforehand in order to avoid redundant faults). The faults are generated by inserting incorrect signal status values into the PLC memory using the KEPServerEX Software [5], and by activating or deactivating the device components using the PLC Studio Software. The results of this experiment are summarized in Fig. 4 [in Fig. 3 and Fig. 4, the SSC element sets of the DSVTF model transitions and the (stable) faulty transitions are sorted according to a predefined signal numbering scheme]. The state expiry time associated with each DSVTF model state is set equal to the maximum transition path time over all possible transition paths of length six from that state (see Subsection 5.2 of the PAPER). In Fig. 4, fault numbers 3, 8 and 12 are examples of a fault without a faulty transition. Among the others, fault numbers 1, 4, 7, 9, 13 and 14 are examples of the Cause I fault case scenarios and fault numbers 2, 5, 10 and 11 are examples of the Cause II fault case scenarios (these are examples of a fault with a faulty transition; for details see Subsection 5.3 of the PAPER).

Fig. 3. DSVTF model of the manufacturing system of Fig. 1.
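The state expiry rule stated above (the maximum transition path time over all transition paths of length six from a state) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the FBMTP implementation: the function name and the graph encoding are hypothetical, the "transition path time" is assumed to be the sum of the per-transition times along the path, and a path that reaches a state with no outgoing transition is assumed to simply end there. Which per-transition time is summed (for example, the TTO time) is defined in Subsection 5.2 of the PAPER and is left to the caller here.

```python
from functools import lru_cache


def state_expiry_ms(transitions, start, path_length=6):
    """Largest total time over all paths of `path_length` transitions from `start`.

    `transitions` maps a state name to a list of (next_state, time_ms) pairs.
    Paths may revisit states, since the process model is cyclic.
    """

    @lru_cache(maxsize=None)
    def longest(state, remaining):
        if remaining == 0:
            return 0
        best = 0  # assumption: a dead end terminates the path early
        for next_state, time_ms in transitions.get(state, ()):
            best = max(best, time_ms + longest(next_state, remaining - 1))
        return best

    return longest(start, path_length)


if __name__ == "__main__":
    # Toy model with made-up times; not taken from Fig. 3.
    toy = {
        "X1": [("X2", 10)],
        "X2": [("X3", 20), ("X1", 5)],
        "X3": [("X1", 30)],
    }
    print(state_expiry_ms(toy, "X1", path_length=3))
    print(state_expiry_ms(toy, "X1"))
```

Memoising on the pair (state, remaining transitions) keeps the computation proportional to the number of transitions times the path length, rather than to the number of distinct paths.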
Fig. 4. The output results of the FDI method of FBMTP.

As can be seen in Fig. 4, FBMTP is able to detect 13 out of 14 faults correctly (accuracy rate: 93%). Moreover, it does not produce any false positives. Fault number 6 remains undetected because the actuator associated with the signal Lmp_Red does not have any impact on the rest of the system operations. In fact, fault number 6 causes both the red and the green lights of the signal lamp Lmp to glow together. We assume that the operator continues to supply parts to the system, ignoring such display signs. This makes the actuator associated with the signal Lmp_Red irrelevant and hence, that soft fault (i.e., fault number 6) remains undetected. Fault number 11 of Fig. 4 is another example of a soft fault [here also, we assume that the operator overlooks the display signs (i.e., the red light of the signal lamp Lmp)]. However, as can be seen, FBMTP is able to detect this soft fault accurately. In our practical experience, soft faults mostly remain undetected for the following two reasons: i) similar types of soft faults (or anomalies) have already been assimilated into the nominal DSVTF model; and ii) the system has some structural or functional deficiencies (for example, consider the objectives of the sensor and actuators associated with the signal lamp Lmp). So, these issues need to be addressed first by the system engineers in order to significantly reduce the probability of occurrence of an undetectable soft fault (as there is no other way to deal with such issues). If all the structural and functional deficiencies associated with the system of Fig. 1 (particularly, the signal lamp Lmp) are resolved, then FBMTP can accurately detect all the faults of Fig. 4.

Fig. 5. The output results of the BADI method of FBMTP.

Recall from Subsection 5.3.3 of the PAPER that FBMTP can produce an inaccurate initial fault candidate set only in the following two cases: i) when the fault is not detected at the source state (the exact state where the fault has actually taken place); and ii) when the data logger fails to detect all the faulty SSC events. As can be seen in Fig. 4, the SSC event/s of the faulty signal/s is correctly included in the initial fault candidate set in 12 out of 13 fault cases (accuracy rate: 92%). In the case of fault number 9, FBMTP fails to include any SSC event of the faulty sensor signal D_RET in the initial fault candidate set because of the faulty SSC event miss incidents (for details, see Subsection 5.3.3 of the PAPER). However, even in that case, FBMTP is able to correctly identify that the sensor signal D_RET is changing its status value irregularly. So, the actual faulty signal is ultimately isolated.

Recall that when determining the exact fault candidate set, FBMTP may produce an incorrect exact fault candidate set (in other words, may misclassify the cause of the fault), especially if the user-defined time threshold value is not set appropriately (see Rule 5 in the PAPER). In the above experiment, we set the path length parameter value to 5 and the time threshold parameter value to approximately 650 MS (actually, a small extra value is dynamically added to that time depending on the transition path time). As can be seen in Fig. 4, FBMTP is able to capture the SSC event/s of the faulty signal/s in the exact fault candidate set in 8 out of 9 Cause I and Cause II fault cases (accuracy rate: 89%) [fault number 9 is excluded because of the faulty SSC event miss incidents].
If the time threshold value is not set too low (approximately, < 240 MS) or too high (approximately, > 1970 MS), then FBMTP provides the same accurate results as in Fig. 4 (given that it is a virtual system, this time range is quite wide). Only in the case of fault number 11 is FBMTP unable to correctly identify the real cause of the fault. This is because its corresponding transition times fluctuate immensely (as they depend on the users’ inputs) and hence, it becomes very difficult to set an appropriate time threshold value without any prior knowledge. For real-world systems, we recommend that users set the time threshold parameter to a few seconds, depending on the time fluctuations of the state transitions. If we limit the number of signals that can be accessed by FBMTP in each PLC scan cycle, i.e., n, to 4 (meaning that ten PLC scan cycles are required to collect all the PLC I/O signal data), then the TIB time becomes 100 MS. As is easy to see from Fig. 3, two DSVTF model states, i.e., states X26 and X51, become unstable as a consequence and hence, are discarded from the model. However, this does not alter the FDI results of Fig. 4 much. Only in the case of fault number 5, two extra SSC elements, i.e., RB2_RUNNING_OFF and RB2_READY_ON, are added to the initial and exact fault candidate sets. So, basically, it does not affect the accuracy of the FDI results (it only marginally increases the number of SSC elements in the fault candidate sets). Theoretically, following the same procedure, we can arbitrarily increase the TIB time until the complete DSVTF model is scaled down to very few states, so that FBMTP produces incorrect FDI results in most of the fault cases.
However, doing so does not make any practical sense because we have already set the N/n ratio of the TIB time (see Definition 5 in the PAPER) to 10, which is quite high compared to the N/n ratio of a real-world PLC controlled manufacturing system (also keep in mind that the state transitions occur very rapidly in this virtual manufacturing system). Similarly, if we set the number of accessible signals n to 39 (which implies that all the I/O signals can be captured in a single PLC scan cycle, the property of a small manufacturing system), then the TIB time becomes 10 MS. This measure introduces 18 additional states and transitions in the DSVTF model of Fig. 3 and hence, the FDI results of Fig. 4 are completely modified. The most significant change it brings is that FBMTP produces the correct initial and exact fault candidate sets for fault number 9. This becomes possible because the data logger is now capable of detecting all the SSC events of all the PLC I/O signals accurately. So, in this setting (TIB = 10 MS), FBMTP is able to correctly identify the real cause of the fault in 9 out of 10 Cause I and Cause II fault cases (accuracy rate: 90%). Moreover, the number of (irrelevant) SSC elements in the fault candidate sets is also reduced by 1 to 2 elements for 5 out of the 13 detected faults (fault numbers 2, 5, 9, 12 and 13). The experimental results discussed above provide enough evidence to support the claims made throughout the PAPER. If we apply the NDAAO approach to detect the same set of faults (as in Fig. 4), then two faults, i.e., fault numbers 6 and 11, remain undetected (accuracy rate: 86%). Fault number 11 remains undetected because, in the NDAAO approach, the transition execution pattern (or transition time) information is not incorporated into the control process model.
Moreover, for the settings n = 10 and 4 (that is, TIB = 40 and 100 MS), the NDAAO approach generates an excessive number of false positives during the fault detection phase, which makes it practically infeasible for large manufacturing systems. However, if we set the n value to 39, then the NDAAO approach does not produce any false positives and provides almost the same fault isolation results as FBMTP (both of them are able to incorporate the SSC events of the faulty signals into the fault candidate sets for all the detected faults, except fault number 9). Please note that even in that setting, FBMTP provides more restricted fault candidate sets than the NDAAO approach (by applying the exact fault candidate set finding method). From the discussions throughout the PAPER, it is easy to see that FBMTP will substantially outperform the NDAAO approach for large manufacturing systems; and for small manufacturing systems, FBMTP will generally provide more accurate FDI results (because, as stated earlier, the information related to the system alarm, transition time, transition execution pattern, undetected fault propagation, etc. has also been taken into account in FBMTP). These claims are also empirically validated through several other experiments (in addition to the above experiment; see below):
   o experiments with virtual systems: these experiments are performed by inserting dozens of software-generated faults into seven different virtual manufacturing systems (the N/n ratio is set to 1, 4 or 5, and 10).
   o experiments with real-world systems: these experiments are performed on six log databases taken from four different automotive manufacturing systems (they have in total eight faulty device components). We have also inserted twenty-two software-generated faults into those databases (the simulated faults are carefully generated with the help of the engineers of UDMTEK Co., Ltd.).
In all the above experiments, it is found that FBMTP provides more accurate FDI results than the NDAAO approach (for all the settings of the N/n ratio). Moreover, FBMTP gave more than 83% accurate FDI results (which is quite a high accuracy rate) in all the above cases (by the term ‘accurate FDI result’, we mean that the fault is detected correctly and the faulty signal/s is isolated accurately). Similar experiments are also carried out to evaluate the accuracy of the BADI approach of FBMTP. We have found that FBMTP can accurately detect and isolate most of the behavioural anomalies present in the manufacturing system (recall that the BADI approach of FBMTP always provides an accurate isolation result). As an example, the participants were asked to insert several behavioural anomalies into the virtual system of Fig. 1. The corresponding BADI results (for a few examples) are shown in Fig. 5 (given for exemplification purposes only). In fact, the accuracy rate of identifying behavioural anomalies (such as transition time errors, and transition and transition time probability errors) is hard to determine, as it varies depending on the system user’s perspective (in other words, it varies based on the definition of the behavioural anomalies). A system user often assigns different values to the time and probability threshold parameters in order to find out how compactly the system is working (mostly performed off-line). In any case, if all the parameter values are set appropriately, then a behavioural anomaly that causes a significant deviation in the system behaviour is detected and isolated precisely (for details see Subsection 5.1 of the PAPER; also see Fig. 5). At this point, we must clarify that it is not always possible to automatically (and correctly) identify the device components that carry out the physical/mechanical operations associated with a particular state transition (required for determining the DDC candidate set; see Fig. 5 and also Subsection 5.1 of the PAPER). A system engineer can easily determine the device components connected with a state-transition operation by inspecting the corresponding DSVTF model and/or the PLC program. However, we strongly recommend that users create a separate file explaining the linkages between the transition operations and the device components. This file can also be used by FBMTP to produce the needed DDC candidate set automatically. As argued previously, in FBMTP, some faults and behavioural anomalies that do not have much impact on the system operation can remain unidentified. However, that is not really a matter of concern because, from the point of view of the system engineers, those faults or anomalies are unimportant or irrelevant.

ACKNOWLEDGMENT

The authors would like to thank UDMTEK Co., Ltd. for the use of its research facilities during this study.

REFERENCES

[1] UDMTEK Co., Ltd., Website: http://www.udmtek.com, last retrieved on November 20th, 2016.
[2] PLC Studio Software, Website: http://www.udmtek.com/esub04_04_01, last retrieved on November 20th, 2016.
[3] Graphviz Software, Website: http://www.graphviz.org, last retrieved on November 20th, 2016.
[4] M. Roth, S. Schneider, J. J. Lesage, and L. Litz, “Fault detection and isolation in manufacturing systems with an identified discrete event model,” Int. J. of Syst. Sci., vol. 43, no. 10, pp. 1826–1841, Oct. 2012.
[5] KEPServerEX Software, Website: https://www.kepware.com/products/kepserverex/, last retrieved on November 20th, 2016.