Process MD - Air Products and Chemicals, Inc.

Discover
what you love to do at Air Products.
Process MD
Capturing the Heartbeat
of Plants through
Advanced Analytics
Presented by
Carlos A. Henao
SHPE Conference
November, 2016
Agenda
• About Air Products
• What is and Why Process MD* ?
• What is Behind Process MD* ?
• Multivariate Analysis at Air Products
• Case Study
Nov 2016
Air Products Public
About Me
• Born and raised in Medellin-Colombia
• B.S. Chemical Engineering
Universidad Pontificia Bolivariana - Medellin
• Worked 5 years for EPC Technip as a
process engineer
• PhD Chemical Engineering,
University of Wisconsin-Madison
• Joined Air Products in 2012
About Air Products
We…
• Are a world-leading industrial gases company
• Supply atmospheric and process gases and
related equipment to manufacturing markets,
including refining and petrochemicals, metals,
electronics, food and beverage
• Strive to be the world’s safest and best
performing industrial gases company
• Are the world’s largest supplier of hydrogen
to the energy market sector
• Are the world’s leading supplier of helium and
liquefied natural gas process technology and
equipment
About Air Products
The numbers
• $9.5 billion in sales
• 17,000 employees
• ~$30 billion in market capitalization
• 700+ production facilities
• Operations in over 50 countries
• 30+ industries served
About Air Products
Processes: Cryogenics (Air separation, helium,..)
[Figure: distillation column and column packing]
About Air Products
Processes: Methane reforming to produce H2
[Figure: reformer; labels: burners, catalyst, tubes]
What is and Why Process MD ?
What is Process MD ?
Web-based application developed by Air Products to conduct
real-time monitoring and diagnosis of production facilities.
Process MD includes a data-driven modeling environment
based on advanced multivariate analysis techniques suitable
for quick development and deployment of models capturing
the behavior of entire plants or plant subsystems.
Models created with Process MD® are used to monitor process
Key Performance Indicators (KPIs) and diagnose process upset conditions
to quickly determine causes and suggest corrective actions.
Why Process MD ?
Enterprise-wide Asset Performance Management
• 1,800 miles of industrial gas pipeline
• 700+ production facilities
• Operations in over 50 countries
• Asset safety, reliability and efficiency are key to value generation
• Large number of geographically dispersed and constantly evolving assets
represents a significant monitoring challenge
• Classical approaches are limiting due to volume and interrelated complexity
of process data
• Advanced multivariate monitoring and fault diagnostic platform is key to
guiding value-added decision-making
Why Process MD ?
Nature of Process Data
With the advent of digital control systems, data
historians, and new developments such as the Internet of
Things (IoT) and cloud computing, manufacturing
companies now have access to huge amounts of data.
Characteristics
• Very high dimensional data matrices
• 10-20% missing data is common
• Low signal-to-noise ratio
• Non-causal in nature
• High degree of correlation
• Indirect representation of overall asset performance
Why Process MD ?
Data to Insight
Transforming huge datasets coming from the
plant/equipment into actionable information
Use models to quickly capture the heartbeat of the plant
The key lies in the interaction/correlation amongst the
variables, even more than the variables themselves…
At Air Products, young engineers like you help us
develop, maintain and use plant and equipment models,
saving the company tens of millions of dollars a year.
What is Behind Process MD ?
What is Behind Process MD ?
State-of-the-art Data Mining Technologies
Reduces a problem from n variables to a much smaller
number of composite variables (principal components/scores).
[Figure: many system variables (e.g. v1, v5, v6, v9) reduced to two composite variables (t1, t2)]
What is Behind Process MD ?
Multivariate Techniques
PRINCIPAL COMPONENT ANALYSIS (PCA)
Projection technique in which the variability of a data set (X or Y)
in a high dimensional space is captured using a lower dimensional
representation.
PROJECTION TO LATENT STRUCTURES (PLS)
Projection technique in which the variability of a data set of process
conditions (X) and its relationship with a data set of quality
indicators (Y) is captured using lower dimensional representations.
In the traditional implementation, these techniques assume the
existence of a linear relationship between the system variables
What is Behind Process MD ?
Principal Component Analysis (PCA)
Characteristics
• Basic concept: A set X of N data points (i.e. $X = \{x_n, n = 1, \dots, N\}$) is
"summarized" by representing each data point $x_n \in \mathbb{R}^K$
as a point $t_n \in \mathbb{R}^A$ using a coordinate system with fewer
dimensions (i.e. $A < K$).
• Key idea: The axes of the new coordinate system are orthogonal
to each other and are oriented in the directions
($w_a \in \mathbb{R}^K, a = 1, \dots, A$) where the original data set X has
maximum variability (i.e. the principal components).
• Results:
- Most of the important information in the original data
set X is captured in the new representation $T = \{t_n, n = 1, \dots, N\}$
using only a few new variables (i.e. $A \ll K$).
- The new axes represent "Latent Variables" = linear
combinations of original variables.
What is Behind Process MD ?
Principal Component Analysis (PCA)
Geometric illustration: with $X \in \mathbb{R}^{N \times 3}$ and $T \in \mathbb{R}^{N \times 2}$
[Figure: data points in the (x1, x2, x3) space with directions w1 and w2, and their projection onto the (t1, t2) plane]
• First Principal Component: Orientation w1 of coordinate axis t1 in the X
space is selected as the direction in which the data has the maximum
variability (first eigenvector of the covariance matrix $\Sigma_X = X^T X / (N-1)$).
• Second Principal Component: Orientation w2 of coordinate axis t2 in the
X space is selected as the direction perpendicular to w1 in which the data
has the maximum variability (second eigenvector of the covariance matrix $\Sigma_X$).
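The two bullets above can be sketched in a few lines of NumPy. This is a minimal illustration, not the Process MD implementation: the function name `pca_loadings` and the synthetic data are assumptions made for the example, and the data are centered and scaled first, as the deck assumes for X.

```python
import numpy as np

def pca_loadings(X, n_components):
    """Return (W, eigvals) from the eigen-decomposition of X^T X / (N - 1)."""
    N = X.shape[0]
    cov = X.T @ X / (N - 1)                  # covariance of centered, scaled data
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder by decreasing variance
    return eigvecs[:, order[:n_components]], eigvals[order]

# Synthetic data (illustrative only): x3 is strongly correlated with x1.
rng = np.random.default_rng(0)
raw = rng.normal(size=(200, 3))
raw[:, 2] = raw[:, 0] + 0.1 * raw[:, 2]
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)   # center and scale

W, eigvals = pca_loadings(X, n_components=2)   # w1, w2 as columns of W
T = X @ W                                      # scores in the (t1, t2) plane
```

The columns of `W` are orthonormal, and the sample variance of each score column equals the corresponding eigenvalue, which is what "direction of maximum variability" means in practice.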
What is Behind Process MD ?
Principal Component Analysis (PCA)
Geometric illustration: with $X \in \mathbb{R}^{N \times 3}$ and $T \in \mathbb{R}^{N \times 2}$
[Figure: data points in the (x1, x2, x3) space with directions w1 and w2, and their projection onto the (t1, t2) plane]
In this illustration, finding the first two components is equivalent to finding
the line of sight that gives the most information about the swarm
of points X. By projecting the original points onto the plane (t1-t2)
perpendicular to that line, you gain much better insight
into the data (i.e. differences and similarities between points) than
from any other line of sight.
What is Behind Process MD ?
Principal Component Analysis (PCA)
Linear algebra details
Finding vectors W:
$$\Sigma_X = \frac{X^T X}{N-1} = W \Lambda W^T = [W_1 \cdots W_H] \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_H \end{bmatrix} [W_1 \cdots W_H]^T$$
• $X \in \mathbb{R}^{N \times K}$: data matrix with centered and scaled data, that is, N total observations of K variables
• $H = \mathrm{rank}(\Sigma_X)$
• $W_h \in \mathbb{R}^K, h = 1, \dots, H$: eigenvectors of $\Sigma_X$; H vectors of dimension K ($H \le K$). If all x variables are independent, $H = K$.
• $\lambda_h \in \mathbb{R}, h = 1, \dots, H$: eigenvalues
What is Behind Process MD ?
Principal Component Analysis (PCA)
Linear algebra details
Calculating scores T:
$$T = X W$$
Model predictions $X^{Pred}$:
$$X^{Pred} = T W^T$$
Model residuals E:
$$E = X - X^{Pred}$$
• $W \in \mathbb{R}^{K \times A}$: matrix with the first A eigenvectors of $\Sigma_X$ ($A \le H \le K$)
• $T \in \mathbb{R}^{N \times A}$: matrix with the t-scores of X
• $X^{Pred} \in \mathbb{R}^{N \times K}$: reconstructed data matrix using the first A principal components. Row n is the prediction $x_n^{Pred}$ of data point $x_n$.
• $E \in \mathbb{R}^{N \times K}$: model residuals. Row n is the residual $e_n$ of data point $x_n$.
[Figure: data point x_n, its prediction x_n^Pred on the (t1, t2) plane defined by w1 and w2, and the residual e_n, drawn in the (x1, x2, x3) space]
What is Behind Process MD ?
Principal Component Analysis (PCA)
Linear algebra details
Squared prediction errors $SPE_n$:
$$SPE_n = \|e_n\|^2 = \sum_{k=1}^{K} \left( x_{n,k} - x_{n,k}^{Pred} \right)^2$$
Model correlation coefficient $R^2$:
$$R^2 = 1 - \frac{\sum_{n=1}^{N} SPE_n}{\sum_{n=1}^{N} SS_n}, \qquad SS_n = \|x_n\|^2 = \sum_{k=1}^{K} x_{n,k}^2$$
• $SPE_n \in \mathbb{R}$: squared error in the prediction of data point n (i.e. $x_n$)
• $R^2 \in \mathbb{R}$: model correlation coefficient
• $X^{Pred} \in \mathbb{R}^{N \times K}$: reconstructed data matrix using the first A principal components
• $E \in \mathbb{R}^{N \times K}$: model residuals
[Figure: data point x_n, its prediction x_n^Pred on the (t1, t2) plane defined by w1 and w2, and the residual e_n, drawn in the (x1, x2, x3) space]
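As a hedged sketch of the quantities defined on the last two slides (scores T, predictions, residuals E, SPE per observation and the overall R²), the following NumPy snippet computes them for a small synthetic data set. The function name `pca_fit_metrics` and the data are assumptions for illustration only.

```python
import numpy as np

def pca_fit_metrics(X, A):
    """Scores, residuals, per-row SPE and R^2 for a PCA model with A components."""
    N = X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(X.T @ X / (N - 1))
    W = eigvecs[:, np.argsort(eigvals)[::-1][:A]]   # first A eigenvectors
    T = X @ W                                       # scores
    X_pred = T @ W.T                                # model predictions
    E = X - X_pred                                  # model residuals
    spe = np.sum(E ** 2, axis=1)                    # SPE_n = ||e_n||^2
    ss = np.sum(X ** 2, axis=1)                     # SS_n = ||x_n||^2
    r2 = 1.0 - spe.sum() / ss.sum()
    return T, E, spe, r2

# Synthetic, centered and scaled data with one nearly redundant variable.
rng = np.random.default_rng(0)
raw = rng.normal(size=(200, 3))
raw[:, 2] = raw[:, 0] + 0.1 * raw[:, 2]
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

T, E, spe, r2 = pca_fit_metrics(X, A=2)
_, _, _, r2_full = pca_fit_metrics(X, A=3)   # A = K reproduces X exactly
```

Using all K components drives every residual to zero, so R² reaches 1; the interesting case is how close R² gets with A much smaller than K.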
What is Behind Process MD ?
Principal Component Analysis (PCA)
Model Creation
[Flowchart]
1. Select training data: X
2. Covariance matrix decomposition: $(X^T X)/(N-1) = W \Lambda W^T$
3. Select number of PCs: A
4. Data projection: $T = X W$
5. Plotting and model validation: $E = X - T W^T$, $R^2 = 1 - SPE/SS$, score plot (t1 vs. t2)
6. Satisfactory? If no, revise and repeat; if yes, keep the model parameters W, Λ, R²
• From historical data deemed to be extracted during the normal operation of
the system (e.g. chemical plant), a model is produced.
• During the validation step the model is evaluated to determine if it is
suitable, i.e. it captures most of the variability in the original data, and
no clustering or nonlinearities are identified in the T-scores.
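One common way to approach the "Select Number of PCs" step is to keep the smallest A whose cumulative R² reaches a target. The sketch below assumes that rule; the deck does not state how A is chosen, so the threshold `r2_target=0.90` and the function name `select_n_components` are illustrative assumptions. It relies on the fact that, for a PCA model, R² with A components equals the sum of the first A eigenvalues divided by the sum of all eigenvalues.

```python
import numpy as np

def select_n_components(X, r2_target=0.90):
    """Smallest A whose cumulative explained variance reaches r2_target (assumed rule)."""
    N = X.shape[0]
    eigvals = np.linalg.eigvalsh(X.T @ X / (N - 1))[::-1]   # descending eigenvalues
    r2_by_A = np.cumsum(eigvals) / eigvals.sum()            # R^2 for A = 1, ..., K
    A = int(np.searchsorted(r2_by_A, r2_target) + 1)
    return min(A, len(eigvals)), r2_by_A

# Synthetic data: five measured variables driven by two hidden factors.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 500))
noise = 0.05 * rng.normal(size=(500, 5))
raw = np.column_stack([f1, f1, f1, f2, f2]) + noise
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

A, r2_by_A = select_n_components(X)
```

The validation step on the slide also calls for inspecting the score plot for clustering or nonlinearity, which a variance threshold alone does not check.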
What is Behind Process MD ?
Principal Component Analysis (PCA)
Model Creation
• Data scores $T = \{t_n \in \mathbb{R}^A, n = 1, \dots, N\}$:
coordinates of the original data in the new
coordinate space
• Loadings $W = \{w_a \in \mathbb{R}^K, a = 1, \dots, A\}$: vectors in
the original space indicating the directions of
the new coordinate axes
• Proximity of scores is representative of
similarity between data points
• Clustering in the score space (i.e. latent
variable space) can help to identify a
shift/change in process conditions
• Build different models for different
operating modes
[Figure: score plot]
What is Behind Process MD ?
Principal Component Analysis (PCA)
Model use in process monitoring
• Squared Prediction Error (model error):
$$SPE^{New} = \sum_{k=1}^{K} \left( x_k^{New\_Pred} - x_k^{New} \right)^2$$
- Measures distance from the model
- Assesses if process data is consistent with
historically captured relationships
• Squared Mean Shift:
$$SMS^{New} = \sum_{a=1}^{A} \frac{\left( t_a^{New} \right)^2}{\lambda_a}$$
- Measures distance from the process mean
- Assesses how different an operating point is
from the historical mean
[Figure: new data point x^New in the (x1, x2, x3) space, its prediction x^New_Pred and its score t^New on the (t1, t2) plane defined by w1 and w2]
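The two monitoring metrics can be sketched as follows, assuming a PCA model fitted as on the earlier slides and a new observation scaled with the training mean and standard deviation. The helper names `fit_pca` and `monitor` and the synthetic data are assumptions for illustration; SMS is written with each squared score divided by its eigenvalue.

```python
import numpy as np

def fit_pca(X, A):
    """Loadings W (K x A) and eigenvalues lam (A,) of X^T X / (N - 1)."""
    N = X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(X.T @ X / (N - 1))
    order = np.argsort(eigvals)[::-1][:A]   # largest eigenvalues first
    return eigvecs[:, order], eigvals[order]

def monitor(x_new, W, lam):
    """Return (SPE, SMS) for one centered and scaled observation x_new."""
    t_new = x_new @ W                       # score of the new point
    x_new_pred = t_new @ W.T                # projection back onto the model plane
    spe = float(np.sum((x_new_pred - x_new) ** 2))
    sms = float(np.sum(t_new ** 2 / lam))
    return spe, sms

# Synthetic training data: x3 = x1 + x2 plus small noise (illustrative only).
rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=(2, 500))
raw = np.column_stack([x1, x2, x1 + x2 + 0.05 * rng.normal(size=500)])
mean, std = raw.mean(axis=0), raw.std(axis=0, ddof=1)
W, lam = fit_pca((raw - mean) / std, A=2)

def scale(x):
    return (np.asarray(x, dtype=float) - mean) / std

spe_shift, sms_shift = monitor(scale([3.0, 3.0, 6.0]), W, lam)    # keeps x3 = x1 + x2
spe_break, sms_break = monitor(scale([1.0, 1.0, -2.0]), W, lam)   # violates it
```

The first test point respects the historical relationship but sits far from the mean, so its SMS is large while its SPE stays small; the second breaks the relationship, which shows up in SPE.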
What is Behind Process MD ?
Principal Component Analysis (PCA)
Model use in process monitoring
• Squared Prediction Error:
$$SPE^{New} = \sum_{k=1}^{K} \left( x_k^{New\_Predicted} - x_k^{New} \right)^2$$
[Chart: SPE over time against the upper control limit (UCL); annotation: "Switch to a proper model"]
SPE is constantly monitored for new data
coming in. If it goes beyond the control limit,
it could mean the process has shifted to an
operating mode not covered correctly by the
model, so a different model is needed.
• Squared Mean Shift:
$$SMS^{New} = \sum_{a=1}^{A} \frac{\left( t_a^{New} \right)^2}{\lambda_a}$$
[Chart: SMS over time against the upper control limit (UCL); annotation: "Control action required and taken"]
SMS is constantly monitored for new data
coming in. If it goes beyond the control limit
while SPE is under control, it could mean the
process has shifted from its target state and
corrective actions have to be taken.
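The decision logic described above can be written as a small function. This is a sketch under stated assumptions: the deck does not say how the upper control limits are computed, so `empirical_ucl` uses a plain quantile of the training values as a stand-in, and the function names and returned messages are hypothetical.

```python
import numpy as np

def empirical_ucl(train_values, quantile=0.99):
    """Assumed UCL: an empirical quantile of the metric over the training data."""
    return float(np.quantile(train_values, quantile))

def diagnose(spe, sms, spe_ucl, sms_ucl):
    """Map one (SPE, SMS) pair to the interpretation given on the slide."""
    if spe > spe_ucl:
        # Data no longer follows the relationships captured by the model.
        return "model mismatch: switch to a proper model"
    if sms > sms_ucl:
        # Relationships hold, but the operating point moved away from the mean.
        return "process shifted: control action required"
    return "in control"

ucl_example = empirical_ucl(np.arange(101), quantile=0.99)
status_a = diagnose(spe=5.0, sms=1.0, spe_ucl=2.0, sms_ucl=3.0)
status_b = diagnose(spe=1.0, sms=4.0, spe_ucl=2.0, sms_ucl=3.0)
status_c = diagnose(spe=1.0, sms=1.0, spe_ucl=2.0, sms_ucl=3.0)
```

SPE is checked first because an SMS value is only meaningful when the model still describes the data.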
What is Behind Process MD ?
Projection to Latent Structures (PLS)
Motivation
In PLS we are interested in finding the linear
relationship between predictor variables $x \in \mathbb{R}^K$
and response variables $y \in \mathbb{R}^M$, based on data
sets $X \in \mathbb{R}^{N \times K}$ and $Y \in \mathbb{R}^{N \times M}$:
$$y = x B, \qquad B \in \mathbb{R}^{K \times M}: \text{coefficient matrix}$$
Multilinear least squares regression would yield:
$$B = \left( X^T X \right)^{-1} X^T Y$$
Usually with real data, $X^T X$ is ill-conditioned or rank deficient.
[Figure: predictor data points in the (x1, x2) plane, predictor-response data points along the y axis, and predicted values]
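A short numerical illustration of why the least squares formula breaks down with correlated process variables. The data are synthetic and chosen only to make the effect visible: two predictors that are almost (or exactly) copies of each other.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100
x1 = rng.normal(size=N)

# Nearly collinear predictors: x2 is x1 plus a tiny perturbation.
x2 = x1 + 1e-6 * rng.normal(size=N)
X_near = np.column_stack([x1, x2])
cond_near = np.linalg.cond(X_near.T @ X_near)    # huge: inversion is unreliable

# Exactly duplicated predictor: X^T X is rank deficient and has no inverse.
X_dup = np.column_stack([x1, x1])
rank_dup = np.linalg.matrix_rank(X_dup.T @ X_dup)
```

With a condition number this large, small measurement noise in X produces large swings in the estimated coefficients, which is the motivation for projecting onto a few latent variables first.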
What is Behind Process MD ?
Projection to Latent Structures (PLS)
Characteristics
• Basic concept: A data set X for predictors (i.e. N data points $X = \{x_n, n = 1, \dots, N\}$)
is "summarized" to facilitate finding the
relationship between X and a data set Y of key
responses.
• Key idea: The data X is projected to a new coordinate system
where the new axes are oriented in the directions
($w_a \in \mathbb{R}^K, a = 1, \dots, A$) in which the original data set X has the most
variability that can be used to explain variability in Y.
• Results:
- Most of the variability in data set X that helps
explain the variation in Y is captured via the new
representation $T = \{t_n, n = 1, \dots, N\}$ using only a few
new variables (i.e. $A \ll K$).
- The new axes represent "Latent Variables" = linear
combinations of original variables.
What is Behind Process MD ?
Projection to Latent Structures (PLS)
Geometric illustration: with $X \in \mathbb{R}^{N \times 3}$, $Y \in \mathbb{R}^{N \times 1}$, $T \in \mathbb{R}^{N \times 2}$
[Figure: data points in the (x1, x2, x3) space with directions w1 and w2, and their projection onto the (t1, t2) plane]
• First Component: Orientation w1 of coordinate axis t1 in the X space is
selected as the direction in which X has the maximum variability which can
be connected to variability in Y (first eigenvector of $X^T Y Y^T X$).
• Second Component: Orientation w2 of coordinate axis t2 in the X space is
selected as the direction perpendicular to w1 in which X has the maximum
variability which can be connected to variability in Y (second eigenvector of
$X^T Y Y^T X$).
What is Behind Process MD ?
Projection to Latent Structures (PLS)
Geometric illustration: with $X \in \mathbb{R}^{N \times 3}$, $Y \in \mathbb{R}^{N \times 1}$, $T \in \mathbb{R}^{N \times 2}$
$$C = Y^T T \left( T^T T \right)^{-1}, \qquad Y^{Pred} = T C^T$$
[Figure: hyperplane with direction c fitted to the response y over the (t1, t2) score plane defined by w1 and w2]
• Once the T scores are calculated, you find the hyperplane in the T-Y space
that allows you to approximate the original Y data using T scores in the best
way possible (i.e. in the least squares sense).
• This is equivalent to finding the directions C in which the variability in T is
best linked to the variability in Y.
What is Behind Process MD ?
Projection to Latent Structures (PLS)
To build a PLS model, we need to regress the original data
onto the t-scores T, which are used to predict the y-loadings
C, which in turn are used to predict the responses Y.
Data matrices: $X \in \mathbb{R}^{N \times K}$, $Y \in \mathbb{R}^{N \times M}$
[Diagram: X → T → Y]
• $X = T P^T + E$ (x-loadings)
• $T = X W$ (t-scores)
• $Y = T C^T + F$ (y-loadings)
• $B = W C^T$ (coefficients)
• "Regression" expression: $Y = X B + F$
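The relations on this slide can be assembled into a compact sketch. It follows the deck's one-shot formulation, in which W is taken from the leading eigenvectors of X^T Y Y^T X; widely used PLS algorithms such as NIPALS instead deflate X between components, so results will generally differ from a library implementation. The function name `fit_pls` and the synthetic data (with M = 2 responses so that two components are meaningful) are assumptions for illustration.

```python
import numpy as np

def fit_pls(X, Y, A):
    """PLS parameters following the slide: W from eig(X^T Y Y^T X), then P, C, B."""
    N = X.shape[0]
    S = X.T @ Y @ Y.T @ X / (N - 1)
    eigvals, eigvecs = np.linalg.eigh(S)
    W = eigvecs[:, np.argsort(eigvals)[::-1][:A]]   # first A eigenvectors
    T = X @ W                                       # t-scores
    TtT = T.T @ T
    P = np.linalg.solve(TtT, T.T @ X).T             # P = X^T T (T^T T)^-1
    C = np.linalg.solve(TtT, T.T @ Y).T             # C = Y^T T (T^T T)^-1
    B = W @ C.T                                     # regression coefficients
    return W, T, P, C, B

def autoscale(M):
    return (M - M.mean(axis=0)) / M.std(axis=0, ddof=1)

# Synthetic data: two responses driven linearly by four predictors.
rng = np.random.default_rng(2)
X_raw = rng.normal(size=(300, 4))
B_true = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, -0.5], [0.0, 0.0]])
Y_raw = X_raw @ B_true + 0.1 * rng.normal(size=(300, 2))
X, Y = autoscale(X_raw), autoscale(Y_raw)

W, T, P, C, B = fit_pls(X, Y, A=2)
F = Y - T @ C.T                                     # y-residuals
r2_y = 1.0 - np.sum(F ** 2) / np.sum(Y ** 2)
```

Because T = XW, the prediction T C^T is identical to X B, which is the "regression" expression on the slide.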
What is Behind Process MD ?
Projection to Latent Structures (PLS)
Linear algebra details
Finding vectors W:
$$\Sigma_{Y^T X} = \frac{X^T Y Y^T X}{N-1} = W \Lambda W^T = [W_1 \cdots W_H] \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_H \end{bmatrix} [W_1 \cdots W_H]^T$$
• $X \in \mathbb{R}^{N \times K}$: data matrix with centered and scaled data, that is, N total observations of K variables
• $Y \in \mathbb{R}^{N \times M}$: data matrix with centered and scaled data, that is, N total observations of M variables
• $H = \mathrm{rank}(\Sigma_{Y^T X})$
• $W_h \in \mathbb{R}^K, h = 1, \dots, H$: eigenvectors of $\Sigma_{Y^T X}$; H vectors of dimension K ($H \le K$)
• $\lambda_h \in \mathbb{R}, h = 1, \dots, H$: eigenvalues
What is Behind Process MD ?
Projection to Latent Structures (PLS)
Linear algebra details
Squared prediction errors $SPE_{X,n}$:
$$SPE_{X,n} = \|e_n\|^2 = \sum_{k=1}^{K} \left( x_{n,k} - x_{n,k}^{Pred} \right)^2$$
Model correlation coefficient $R_X^2$:
$$R_X^2 = 1 - \frac{\sum_{n=1}^{N} SPE_{X,n}}{\sum_{n=1}^{N} SS_{X,n}}, \qquad SS_{X,n} = \|x_n\|^2 = \sum_{k=1}^{K} x_{n,k}^2$$
• $X^{Pred} \in \mathbb{R}^{N \times K}$: reconstructed data matrix using t-scores and x-loading vectors. Row n is the prediction $x_n^{Pred}$ of data point $x_n$.
• $E \in \mathbb{R}^{N \times K}$: model residuals. Row n is the residual $e_n$ of data point $x_n$.
• $SPE_{X,n} \in \mathbb{R}$: squared error in the prediction of data point n (i.e. $x_n$)
• $R_X^2 \in \mathbb{R}$: model correlation coefficient
[Figure: data point x_n, its prediction x_n^Pred on the (t1, t2) plane defined by p1 and p2, and the residual e_n, drawn in the (x1, x2, x3) space]
What is Behind Process MD ?
Projection to Latent Structures (PLS)
Linear algebra details
Squared prediction errors $SPE_{Y,n}$:
$$SPE_{Y,n} = \|f_n\|^2 = \sum_{m=1}^{M} \left( y_{n,m} - y_{n,m}^{Pred} \right)^2$$
Model correlation coefficient $R_Y^2$:
$$R_Y^2 = 1 - \frac{\sum_{n=1}^{N} SPE_{Y,n}}{\sum_{n=1}^{N} SS_{Y,n}}, \qquad SS_{Y,n} = \|y_n\|^2 = \sum_{m=1}^{M} y_{n,m}^2$$
• $Y^{Pred} \in \mathbb{R}^{N \times M}$: reconstructed data matrix using t-scores and y-loading vectors. Row n is the prediction $y_n^{Pred}$ of data point $y_n$.
• $F \in \mathbb{R}^{N \times M}$: model residuals. Row n is the residual $f_n$ of data point $y_n$.
• $SPE_{Y,n} \in \mathbb{R}$: squared error in the prediction of data point n (i.e. $y_n$)
• $R_Y^2 \in \mathbb{R}$: model correlation coefficient
[Figure: data point y_n, its prediction y_n^Pred on the (t1, t2) plane defined by c1 and c2, and the residual f_n, drawn in the (y1, y2, y3) space]
What is Behind Process MD ?
Projection to Latent Structures (PLS)
Model Creation
[Flowchart]
1. Select training data
2. Covariance matrix decomposition: $(X^T Y Y^T X)/(N-1) = W \Lambda W^T$
3. Select number of PCs: A
4. Data projection: $T = X W$
5. Least squares regression: $P = X^T T (T^T T)^{-1}$, $C = Y^T T (T^T T)^{-1}$
6. Plotting and model validation: $E = X - T P^T$, $F = Y - T C^T$, $R_X^2 = 1 - SPE_X/SS_X$, $R_Y^2 = 1 - SPE_Y/SS_Y$, score plot (t1 vs. t2)
7. Satisfactory? If no, revise and repeat; if yes, keep the model parameters W, P, C, R²
• From historical data deemed to be extracted during the normal operation of
the system (e.g. chemical plant), a model is produced.
• During the validation step the model is evaluated to determine if it is
suitable, i.e. it captures most of the variability in the original data, and
no clustering or nonlinearities are identified.
What is Behind Process MD ?
Projection to Latent Structures (PLS)
Model use in process monitoring: Adaptive Control Limits
Partial Least Squares (PLS) Approach
• Y variable is the monitored variable in
question
• X variables are used to predict the value
of the Y variable
• PLS takes into account the correlation
amongst the X and Y variables
• Calculate predicted Y and
compare to actual Y value
• Predicted Y offset to define
upper and lower control limits
based upon residual distribution
discrimination
[Chart: predicted Y value over time]
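A minimal sketch of the adaptive-limit idea: the limits move with the predicted Y, offset by quantiles of the training residuals. The deck does not specify how the residual distribution is turned into limits, so the quantile levels, the residual convention (actual minus predicted) and the function names below are assumptions.

```python
import numpy as np

def adaptive_limits(y_pred, train_residuals, lower_q=0.005, upper_q=0.995):
    """Lower and upper limits: predicted Y offset by residual quantiles (assumed rule)."""
    lo, hi = np.quantile(train_residuals, [lower_q, upper_q])
    return y_pred + lo, y_pred + hi

def out_of_bounds(y_actual, y_pred, train_residuals):
    """True where the actual Y falls outside the adaptive limits."""
    lcl, ucl = adaptive_limits(y_pred, train_residuals)
    return (y_actual < lcl) | (y_actual > ucl)

# Illustrative numbers only: residuals (actual - predicted) from a training period.
train_residuals = np.linspace(-1.0, 1.0, 201)
y_pred = np.array([10.0, 20.0])
y_actual = np.array([10.5, 25.0])
flags = out_of_bounds(y_actual, y_pred, train_residuals)
```

Because the limits follow the prediction, a value can be flagged even when it is well inside a fixed alarm threshold, as in the case study later in the deck.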
What is Behind Process MD ?
Projection to Latent Structures (PLS)
Model use in process monitoring: Identifying Key Drivers
• Predicted Y attempts to capture actual Y
variability based upon associated process data
(i.e., X data)
• Detect occurrences when process changes
cause a deviation in the KPI of interest
• Contribution plot generated from a group
from-to analysis of the predicted Y identifies
the key process variables which contributed to
the observed deviation:
$$\Delta_{m,k} = \frac{\sum_{n \in N^{From}} x_{n,k} B_{k,m}}{N_{From}} - \frac{\sum_{n \in N^{To}} x_{n,k} B_{k,m}}{N_{To}}, \qquad k = 1, \dots, K, \quad m = 1, \dots, M$$
[Figures: time series of y_m^Pred with the "Group From" and "Group To" windows marked; bar chart of contribution versus X-variable index (k)]
Analogous contribution analysis can be
performed using SPE and SMS metrics.
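The from-to contribution can be sketched as the difference between the two group means of each X variable, weighted by the matching regression coefficient. The symbol name Δ and the sign convention (From minus To, following the order of the terms on the slide) are assumptions, as are the function name and the toy numbers.

```python
import numpy as np

def from_to_contributions(X, B, idx_from, idx_to):
    """Contribution of each X variable k to the change in each predicted Y variable m.

    Returns an (M, K) array: (mean_from[k] - mean_to[k]) * B[k, m].
    The From-minus-To sign is an assumed convention.
    """
    mean_from = X[idx_from].mean(axis=0)             # group mean of each x variable
    mean_to = X[idx_to].mean(axis=0)
    return ((mean_from - mean_to)[:, None] * B).T

# Toy example: K = 2 predictors, M = 1 response.
X = np.array([[1.0, 0.0], [1.0, 0.0], [3.0, 2.0], [3.0, 2.0]])
B = np.array([[2.0], [0.5]])
contrib = from_to_contributions(X, B, idx_from=[0, 1], idx_to=[2, 3])

# Summing over k recovers the difference in mean predicted Y between the groups.
y_pred = X @ B
delta_y = y_pred[[0, 1]].mean(axis=0) - y_pred[[2, 3]].mean(axis=0)
```

In this example the first variable accounts for most of the change in the predicted response, which is the kind of ranking a contribution plot is meant to show.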
Multivariate Analysis at Air Products
Multivariate Analysis at Air Products
Our Journey at Air Products
• 1993: Dr. John MacGregor’s keynote speech at Lehigh University
• 1995: Join MACC
• 1996: First application of multivariate techniques: batch emulsion plant data
• 1997 – 2003: Numerous APCI applications
• 2004: Online monitoring prototype successful for a batch plant
• 2005: Resolving filtration and catalyst issues for batch plant
• 2005: HYCO plant
• 2005: Offline batch monitoring
• 2008: Diagnostic real-time monitoring
• 2009 – present: Continuous process and equipment condition monitoring
• Quality, safety and efficiency monitoring
Multivariate Analysis at Air Products
Progression of Asset Management
• Maximize utilization and efficiency of assets
• Identify opportunities for productivity improvement
• Increase the uptime of critical assets and reduce unscheduled maintenance
[Chart: value/reliability versus intelligence, showing Reactive Maintenance, Scheduled Maintenance, Predictive Maintenance and ProcessMD, with relative cost labels 1000, 100, 10 and 1]
• Reactive Maintenance: post-event analysis (10-15 years ago)
• Alerts on… What’s happening? Where? What’s affected? (10-12 years ago)
• How can I prevent downtimes and catastrophic failures? (6-8 years ago)
• What assets should be serviced/replaced because they are likely to fail? (6-8 years ago)
• Monitoring and Fault Diagnostics, real-time and web-based (last 6 years): relative cost = 1
Multivariate Analysis at Air Products
Real-time Monitoring and Fault Diagnostic Vision
• Create clear and consistent line-of-sight for asset performance
• Decompose complex production process to fundamental blocks
• Provide proactive notification of subtle process deviations
• MONITOR: Web-based, state-of-the-art real-time monitoring
• DETECT: Deviations in parameters; deviations in relationships
• ALERT: Push alerts via email
• DIAGNOSE: Identifies items correlated with the detected deviation
Advanced data modeling and analytics via
the ProcessMD© Technology Platform
Multivariate Analysis at Air Products
Architecture Breakdown
[Diagram: Assets → Data Storage (Historian 1, Historian 2, Historian 3, non-historian data) → Patented Solution (Modeling Engine; Database for Analysis and Reporting; Web service for end users) → End Users]
US patents 7606681 and 8882883
Case Study
Case Study
PLS Fan Monitoring
Motor temperature monitoring
• Out-of-bound condition detected in a Process KPI
• The value is well below the control system alarm threshold
• Engineer received an alert
Yellow = advisory
Red = Action Required
Case Study
PLS Fan Monitoring
Motor temperature monitoring
• Engineer performed deep-drill within the system
• Able to rule out likely causes quickly
• Escalated to site technician
Case Study
PLS Fan Monitoring
Motor temperature monitoring
• Manual inspection revealed excessively dirty filter
• Filter was replaced
• A problem that could take days of data analysis to resolve using
traditional techniques is solved in hours using Process MD
Case Study
PLS Fan Monitoring
Motor temperature monitoring
• Process KPIs came back within limits
• Avoided much more costly damage had the equipment been allowed to run with
a dirty filter (the action cost a couple of hundred USD, versus more extensive
damage of several thousand USD)
Thank you
tell me more
airproducts.com/shpe