Distance-based classification on EARLINET data - ACTRIS-2

Nikos Papagiannopoulos, Lucia Mona
ACTRIS-2 Second WP2 Workshop | Barcelona | 7-11 November
Aerosol classification
▪ Aerosol typing can help:
  ➢ understand sources, transformations, effects, and feedback mechanisms;
  ➢ improve the accuracy of satellite retrievals and test aerosol models; and
  ➢ quantify assessments of aerosol radiative impacts on climate.
▪ Lidars can provide a multitude of type-defining variables (e.g., Ångström exponents, color ratios, depolarization ratios).
▪ Automatic procedures for aerosol typing include:
  ➢ CALIOP uses a decision tree based on lidar and external information (Omar et al., 2009), while EarthCARE depends solely on lidar-derived information (Wandinger et al., 2016).
  ➢ An HSRL study uses a distance-based multivariate analysis that depends only on lidar intensive properties (Burton et al., 2012).
  ➢ Nicolae et al. (2015) and Mamun and Muller (2016) used artificial neural networks to derive aerosol types from Raman lidar data.
▪ What about automatic typing for EARLINET?
N. Papagiannopoulos | Distance-based classification on EARLINET data
Classification process
▪ Classification is the process of learning a function f that maps each attribute vector x to one of the previously defined class labels y.
[Diagram: Input, attribute vector (x) → Classification model → Output, class label (y)]
▪ First, a model is built that describes a previously determined set of classes (i.e., aerosol types). This is the training phase.
▪ Second comes the testing phase, in which test data are run through the model to measure how well it has learned.
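The two phases can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the model is a simple nearest-mean rule with Euclidean distance used as a stand-in, the function names are hypothetical, and the feature values and labels are invented.

```python
from collections import defaultdict
from math import dist


def train(samples):
    """Training phase: build one mean attribute vector per class."""
    grouped = defaultdict(list)
    for x, y in samples:
        grouped[y].append(x)
    return {
        y: tuple(sum(col) / len(col) for col in zip(*xs))
        for y, xs in grouped.items()
    }


def classify(model, x):
    """Apply the learned function f: attribute vector x -> class label y."""
    return min(model, key=lambda y: dist(model[y], x))


def success_rate(model, samples):
    """Testing phase: fraction of test samples whose label is recovered."""
    hits = sum(classify(model, x) == y for x, y in samples)
    return hits / len(samples)


# Invented (Angstrom exponent, lidar ratio in sr) pairs, for illustration only.
training = [((0.4, 55.0), "D"), ((0.5, 57.0), "D"),
            ((0.8, 24.0), "M"), ((0.9, 26.0), "M")]
testing = [((0.45, 54.0), "D"), ((0.85, 30.0), "M")]

model = train(training)
rate = success_rate(model, testing)
print(f"learning success on the test set: {rate:.0%}")
```

Keeping the training and testing samples separate is what makes the success rate a measure of learning rather than of memorization.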
Classification model
▪ As the training dataset we used coordinated CALIPSO/EARLINET measurements (Pappalardo et al., 2010; Wandinger et al., 2011).
▪ Characterized 3β+2α profiles (Schwarz, 2016) were enhanced with profiles from Papagiannopoulos et al. (2016a).
▪ The classification model consists of 64 samples that belong to 7 classes.

Type*   β-Ångström exponent (IR,UV)   Lidar ratio @ 532 nm [sr]   Ratio of lidar ratios   # Profiles
        Mean    SD                    Mean    SD                  Mean    SD
D       0.4     0.1                   55      7                   0.9     0.1             9
PD      0.9     0.3                   64      9                   1.3     0.3             5
MD      0.5     0.2                   47      6                   1.1     0.2             10
PC      1.3     0.3                   63      15                  0.9     0.2             16
CC      1.0     0.2                   41      6                   0.8     0.2             9
S       1.3     0.1                   78      11                  1.0     0.2             7
M       0.8     0.1                   24      7                   0.9     0.1             8

* D = Dust; PD = Polluted Dust (Smoke/Pollution + Dust); MD = Mixed Dust (Marine + Dust); PC = Polluted Continental; CC = Clean Continental; S = Smoke; M = Marine
Sensitivity analysis
▪ Two statistical parameters were examined.
  ➢ Total Wilks' lambda shows the tendency of the clusters to separate.
  ➢ Partial Wilks' lambda shows the discriminatory power of each intensive parameter.
▪ The best-performing set is: B-AE(IR,UV), LR(VIS), RLR.
▪ The total Wilks' lambda is close to 0 (λ ~0.1), which indicates good cluster separation.
▪ The Ångström exponent has the most weight in the classification (lowest partial λ).

Parameter         Partial Wilks' λ
B-AE (IR,UV)      0.18
LR [sr] (UV)      0.28
RLR               0.51

Total Wilks' λ    ~0.1
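The slide does not state how the two statistics were computed. The sketch below assumes the usual textbook definitions: total Wilks' lambda as det(W) / det(T), with W the within-group and T the total sum-of-squares-and-cross-products matrix, and partial lambda as the ratio of lambda with all variables to lambda with one variable left out. Function names and the two-group data are invented for illustration.

```python
def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, det = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return det


def mean(rows):
    """Column-wise mean of a list of attribute vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]


def sscp(rows, center):
    """Sum of squares and cross-products of rows about a center vector."""
    n = len(center)
    out = [[0.0] * n for _ in range(n)]
    for x in rows:
        d = [xi - ci for xi, ci in zip(x, center)]
        for i in range(n):
            for j in range(n):
                out[i][j] += d[i] * d[j]
    return out


def wilks_lambda(groups):
    """Total Wilks' lambda = det(W) / det(T); values near 0 mean good separation."""
    all_rows = [x for rows in groups.values() for x in rows]
    n = len(all_rows[0])
    total = sscp(all_rows, mean(all_rows))
    within = [[0.0] * n for _ in range(n)]
    for rows in groups.values():
        part = sscp(rows, mean(rows))
        within = [[w + p for w, p in zip(wrow, prow)]
                  for wrow, prow in zip(within, part)]
    return determinant(within) / determinant(total)


def partial_wilks_lambda(groups, index):
    """Lambda with all variables divided by lambda without variable `index`."""
    reduced = {
        g: [[v for k, v in enumerate(x) if k != index] for x in rows]
        for g, rows in groups.items()
    }
    return wilks_lambda(groups) / wilks_lambda(reduced)


# Invented data: variable 0 separates the groups, variable 1 barely does.
groups = {
    "A": [[0.4, 50.0], [0.5, 60.0], [0.6, 55.0]],
    "B": [[1.2, 52.0], [1.3, 58.0], [1.4, 57.0]],
}
lam = wilks_lambda(groups)
partials = [partial_wilks_lambda(groups, i) for i in range(2)]
```

Under these definitions, a smaller partial lambda means the variable contributes more to the separation, which is how the table above ranks the Ångström exponent first.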
Classifier: Mahalanobis distance
▪ The Mahalanobis distance (Mahalanobis, 1936) of a test point $x_i = (x_1, x_2, \ldots, x_N)$ from a cluster $j$ with mean $\mu_j = (\mu_1, \mu_2, \ldots, \mu_N)$ is given by:
$$D_{ij} = \sqrt{(x_i - \mu_j)^{T} S_j^{-1} (x_i - \mu_j)}$$
▪ Here $S_j$ is the variance-covariance matrix of the distribution of cluster $j$.
▪ Used in aerosol classification by Burton et al. (2012), Russell et al. (2014), and Hamill et al. (2016).
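A minimal sketch of the distance and the resulting assignment rule follows. It is not the authors' implementation: the function names are hypothetical, and the covariance matrices are assumed to be diagonal, built from the D and M means and standard deviations of the classification-model table, because the slides give no covariances between parameters.

```python
from math import sqrt


def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^-1 (x - mean)) for one test point and one cluster."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    z = solve(cov, d)  # z = cov^-1 d, without forming the inverse explicitly
    return sqrt(sum(di * zi for di, zi in zip(d, z)))


def assign(x, clusters):
    """Return the label of the cluster with the smallest Mahalanobis distance."""
    return min(clusters, key=lambda label: mahalanobis(x, *clusters[label]))


def diagonal(sds):
    """Assumption: uncorrelated parameters, so cov = diag(sd^2)."""
    return [[s * s if i == j else 0.0 for j in range(len(sds))]
            for i, s in enumerate(sds)]


# (B-AE, LR at 532 nm in sr, RLR): means and SDs for Dust and Marine.
clusters = {
    "D": ([0.4, 55.0, 0.9], diagonal([0.1, 7.0, 0.1])),
    "M": ([0.8, 24.0, 0.9], diagonal([0.1, 7.0, 0.1])),
}
label = assign((0.75, 28.0, 0.9), clusters)  # hypothetical test point
```

Because each parameter is scaled by the spread of the cluster, a difference of 0.1 in the Ångström exponent counts as much as a difference of 7 sr in the lidar ratio in this example.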
[Figure: clusters C1 and C2 with a test point x; x is assigned to C2]
Results
▪ EARLINET data collected during the summer 2012 ACTRIS campaign were chosen to test the automatic algorithm.
▪ A detailed aerosol typing for that campaign was provided by Papagiannopoulos et al. (2016b), presented at EGU 2016.
[Figure panels: using 7 clusters; using 6 clusters (merged S+PC); using 4 clusters (merged S+PC and D+PD+MD)]
▪ The prediction rate increases as the number of clusters decreases.
▪ What happens when the depolarization ratio is available? Note that there are 21 depolarization profiles.
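The effect of merging clusters on the prediction rate can be sketched as follows. The labels are invented for illustration and are not the campaign results; the function names are hypothetical.

```python
def merge_labels(labels, groups):
    """Relabel each class that belongs to a merged group with the group's name."""
    lookup = {member: name for name, members in groups.items() for member in members}
    return [lookup.get(label, label) for label in labels]


def prediction_rate(predicted, reference):
    """Fraction of cases where the automatic type matches the manual type."""
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Invented labels, for illustration only.
manual = ["D", "PD", "MD", "S", "PC", "CC", "M", "PC"]
automatic = ["D", "MD", "MD", "PC", "PC", "CC", "M", "S"]

six = {"S+PC": ["S", "PC"]}
four = {"S+PC": ["S", "PC"], "D+PD+MD": ["D", "PD", "MD"]}

rate7 = prediction_rate(automatic, manual)
rate6 = prediction_rate(merge_labels(automatic, six), merge_labels(manual, six))
rate4 = prediction_rate(merge_labels(automatic, four), merge_labels(manual, four))
print(rate7, rate6, rate4)
```

Confusions between classes that are later merged stop counting as errors, so the rate can only stay the same or rise as clusters are combined, at the cost of a coarser typing.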
Results
▪ Literature depolarization values were used in the training phase of the algorithm.

Type   LPDR [%]   Reference
D      27-35      Groß et al., 2011
PD     10-20      Groß et al., 2011
MD     10-17      Groß et al., 2016
PC     2-10       Burton et al., 2013
CC     2-6        Omar et al., 2009
S      2-8        Burton et al., 2013
M      1-9        Groß et al., 2013

▪ Depolarization measurements facilitate correct typing, although for a limited number of aerosol types they complicate the selection compared with typing based only on LR, B-AE, and RLR.
[Figure panels: using 7 clusters; using 6 clusters (merged S+PC); using 4 clusters (merged S+PC and D+PD+MD)]
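The literature LPDR ranges on this slide can be read as a lookup. The sketch below is not the authors' training procedure, which the slides do not describe; it only lists which types are compatible with a given LPDR, using a hypothetical helper name, to show where the ranges overlap.

```python
# Literature LPDR ranges in percent, copied from the table on this slide.
LPDR_RANGES = {
    "D": (27, 35),
    "PD": (10, 20),
    "MD": (10, 17),
    "PC": (2, 10),
    "CC": (2, 6),
    "S": (2, 8),
    "M": (1, 9),
}


def compatible_types(lpdr_percent):
    """Return the types whose literature LPDR range contains the given value."""
    return [
        name
        for name, (low, high) in LPDR_RANGES.items()
        if low <= lpdr_percent <= high
    ]


print(compatible_types(30))  # dust-like depolarization
print(compatible_types(5))   # low depolarization
```

A value of 30 % matches only D, whereas 5 % matches PC, CC, S, and M, so on its own the depolarization ratio separates dust clearly but leaves the low-depolarization types ambiguous.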
Conclusions
▪ The automatic procedure (Burton et al., 2012) was modified to satisfy the needs of EARLINET.
▪ The predictions of the automatic classification compared favorably with manually classified data.
▪ Training the algorithm with literature depolarization values strengthens correct prediction.
▪ Positive remarks
  ➢ The code is adaptable and runs fast.
  ➢ The training dataset can easily be enlarged with high-confidence data.
  ➢ New classifying parameters and aerosol types can be integrated.
▪ Negative remarks
  ➢ Prediction quality declines for complicated aerosol scenes.
  ➢ The quality of the training dataset is critical for supervised learning techniques.
Future work
▪ Enlarge both the training and testing datasets to assess the method and its stability.
▪ Easy implementation in the Single Calculus Chain (SCC).
  ➢ Can be part of new EARLINET products (level 2 layer products).
▪ Comparison with other methods such as artificial neural networks (ANN).
Acknowledgements
▪ The financial support for EARLINET in the ACTRIS Research Infrastructure Project by the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 654169 in the Seventh Framework Programme (FP7/2007–2013) is gratefully acknowledged.
References
▪ Burton, S. P., et al., 2012: Aerosol classification using airborne High Spectral Resolution Lidar measurements – methodology and examples, Atmos. Meas. Tech., 5, 73–98, doi:10.5194/amt-5-73-2012.
▪ Burton, S. P., et al., 2013: Aerosol classification from airborne HSRL and comparisons with the CALIPSO vertical feature mask, Atmos. Meas. Tech., 6, 1397–1412, doi:10.5194/amt-6-1397-2013.
▪ Groß, S., et al., 2011: Characterization of Saharan dust, marine aerosols and mixtures of biomass-burning aerosols and dust by means of multi-wavelength depolarization and Raman lidar measurements during SAMUM 2, Tellus B, 63, 706–724, doi:10.3402/tellusb.v63i4.16369.
▪ Groß, S., et al., 2013: Aerosol classification by airborne high spectral resolution lidar observations, Atmos. Chem. Phys., 13, 2487–2505, doi:10.5194/acp-13-2487-2013.
▪ Groß, S., et al., 2016: Saharan dust contribution to the Caribbean summertime boundary layer – a lidar study during SALTRACE, Atmos. Chem. Phys., 16, 11535–11546, doi:10.5194/acp-16-11535-2016.
▪ Hamill, P., et al., 2016: An AERONET-based aerosol classification using the Mahalanobis distance, Atmos. Env., 140, 213–233, doi:10.1016/j.atmosenv.2016.06.002.
▪ Mahalanobis, P. C., 1936: On the generalized distance in statistics, Proceedings of the National Institute of Science of India, 12, 49–55.
▪ Nicolae, D., et al., 2015: Using artificial neural networks to retrieve the aerosol type from multi-spectral lidar data, European Geosciences Union, General Assembly, Vol. 17, EGU2015-9793.
▪ Omar, A., et al., 2009: The CALIPSO Automated Aerosol Classification and Lidar Ratio Selection Algorithm, J. Atmos. Ocean. Tech., 26, 1994–2014, doi:10.1175/2009JTECHA1231.1.
▪ Papagiannopoulos, N., et al., 2016a: CALIPSO climatological products: evaluation and suggestions from EARLINET, Atmos. Chem. Phys., 16, 2341–2357, doi:10.5194/acp-16-2341-2016.
▪ Papagiannopoulos, N., et al., 2016b: Aerosol classification using EARLINET measurements for an intensive observational period, European Geosciences Union, General Assembly, Vol. 18, EGU2016-16026.
▪ Pappalardo, G., et al., 2010: EARLINET correlative measurements for CALIPSO: First intercomparison results, J. Geophys. Res., 115, D00H19, doi:10.1029/2009JD012147.
▪ Russell, P. B., et al., 2014: A multiparameter aerosol classification method and its application to retrievals from spaceborne polarimetry, J. Geophys. Res. Atmos., 119, 9838–9863, doi:10.1002/2013JD021411.
▪ Schwarz, A., 2016: Aerosol typing over Europe and its benefits for the CALIPSO and EarthCARE missions – statistical analysis based on multiwavelength aerosol lidar measurements from ground-based EARLINET stations and comparison to spaceborne CALIPSO data, Dissertation, University of Leipzig, 178 pp., 289 Ref., 43 Fig., 20 Tab.
▪ Wandinger, U., et al., 2011: Aerosols and Clouds: Long-Term Database from Spaceborne Lidar Measurements, Tech. rep., final report, ESTEC Contract 21487/08/NL/HE, ESA Publications Division, Noordwijk, the Netherlands.