
Supplementary materials
SPIN: A Method of Skeleton-based Polarity Identification for Neurons
Yi-Hsuan Lee  Yen-Nan Lin  Chung-Chuan Lo
Chung-Chuan Lo (corresponding author), Yi-Hsuan Lee, Yen-Nan Lin
Institute of Systems Neuroscience, National Tsing Hua University, Hsinchu 30013, Taiwan;
e-mail: [email protected]
Tel: +886-3-574-2014, +886-3-571-5131 ext. 80390
Fax: +886-3-571-5934
Chung-Chuan Lo
Brain Research Center, National Tsing Hua University, Hsinchu 30013, Taiwan
Contents
A. Terminology
B. Lists of neurons used in the study
C. Definition of morphological features
C.1 Length-related features
C.2 Branch-related features
C.3 Volume-related features
D. Parameter definition and default values
E. Program tutorial
E.1 Quick guide
E.2 Operation instructions
E.2.1 The very first step
E.2.2 Manual data preprocessing
E.2.3 Classifier training
E.2.4 Polarity identification
F. References
A. Terminology
The terms used in the present paper follow those used in the TREES toolbox (Cuntz,
Forstner, Borst, & Häusser 2010). A neuron's structure can be represented by a set of
interconnected nodes. There are three types of nodes: continuation points, branch
points, and terminal points. Two connected nodes form a segment. A branch may be
composed of more than one segment and is delimited either by two branch points or
by a branch point and a terminal point.
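The three node types can be told apart by counting children. The following is a minimal sketch, assuming the skeleton is stored as a parent-index list in which the root has parent -1; the function name `classify_nodes` and this data layout are illustrative and are not part of SPIN or the TREES toolbox. The root is labeled by its child count like any other node, which is also an assumption.

```python
from collections import Counter

def classify_nodes(parent):
    """Label each node of a rooted tree by its number of children.

    parent[i] is the index of node i's parent; the root has parent -1.
    Returns one label per node: "terminal" (no children),
    "continuation" (one child) or "branch" (two or more children).
    """
    n_children = Counter(p for p in parent if p >= 0)
    labels = []
    for node in range(len(parent)):
        k = n_children[node]
        if k == 0:
            labels.append("terminal")
        elif k == 1:
            labels.append("continuation")
        else:
            labels.append("branch")
    return labels

# Toy skeleton: 0 -> 1, node 1 splits into 2 and 3, and 3 -> 4.
toy_parent = [-1, 0, 1, 1, 3]
print(classify_nodes(toy_parent))
```

In this toy skeleton, the branch running from node 1 to node 4 consists of two segments (1-3 and 3-4) and is delimited by a branch point and a terminal point.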
B. Lists of neurons used in the study
PB neurons listed by their IDs in the FlyCircuit database (Chiang et al. 2011). For
the sake of simplicity, in this paper we refer to specific neurons by their numbers
(first column).

Number  Training neuron ID  Test neuron ID
1       Cha-F-400006        Gad1-F-300123
2       Cha-F-400012        Gad1-F-500035
3       Cha-F-500009        Cha-F-500028
4       Cha-F-200009        Cha-F-000098
5       Cha-F-400017        Cha-F-100041
6       Cha-F-200013        Gad1-F-300027
7       Cha-F-000014        Cha-F-200084
8       Cha-F-000023        Cha-F-000050
9       Cha-F-500046        Gad1-F-300066
10      Cha-F-200046        Gad1-F-500065
11      Cha-F-200068        Gad1-F-300029
12      Cha-F-700086        Gad1-F-600081
13      Cha-F-500109        Cha-F-100032
14      Cha-F-100065        Cha-F-300072
15      Cha-F-300152        Gad1-F-800013
16      Cha-F-000106        Cha-F-600001
17      VGlut-F-300517      Gad1-F-100004
18      Gad1-F-400017       Cha-F-300160
19      Gad1-F-600003       Gad1-F-800025
20      Gad1-F-900011       Cha-F-000031
21      Gad1-F-600025
22      Gad1-F-600033
23      Gad1-F-800046
24      Gad1-F-600077
25      Gad1-F-600084
26      TH-F-000048
27      Tdc2-F-300003
28      Tdc2-F-500000
29      Tdc2-F-600000
30      Tdc2-F-400002
MED neurons listed by their IDs in the FlyCircuit database (Chiang et al. 2011).
For the sake of simplicity, in the paper we refer to specific neurons by their
numbers (first column).

Number  Training neuron ID  Test neuron ID
1       5-HT1B-F-500013     fru-F-300050
2       Cha-F-300010        VGlut-F-500012
3       Cha-F-100027        VGlut-F-400884
4       Cha-F-400101        VGlut-F-400671
5       Cha-F-700121        VGlut-F-300600
6       Cha-F-100052        VGlut-F-900011
7       VGlut-F-200401      VGlut-F-000130
8       VGlut-F-400521      Cha-F-500093
9       VGlut-F-400577      Tdc2-F-000022
10      VGlut-F-800081      fru-F-800011
11      VGlut-F-700226      VGlut-F-000188
12      VGlut-F-800100      Cha-F-500044
13      VGlut-F-300494      VGlut-F-300391
14      VGlut-F-100277      VGlut-F-300560
15      VGlut-F-900093*     fru-F-700075
16      VGlut-F-000557      VGlut-F-400142
17      VGlut-F-000600      VGlut-F-400360
18      VGlut-F-300602      fru-F-800015
19      VGlut-F-200012      Tdc2-F-100067
20      VGlut-F-400013      Trh-F-300093
21      VGlut-F-300212
22      VGlut-F-800017
23      VGlut-F-300037
24      VGlut-F-200114
25      VGlut-F-300103
26      VGlut-F-400133
27      fru-F-300054*
28      fru-F-000053
29      Gad1-F-400107
30      Trh-F-400043
31      Trh-F-300113
32      Tdc2-F-100013
33      Tdc2-F-100037
34      Tdc2-F-200049
35      Tdc2-F-200058
36      Tdc2-F-200065
37      Tdc2-F-200066

* These neurons were not included in Vaa3D analysis. See Methods in the main text.
C. Definition of morphological features
All morphological features are defined relative to a substructure. The definitions can
be extended to the complete neural skeleton by replacing all occurrences of the term
"root of the substructure" with simply "soma."
C.1 Length-related features
1. Summation of segment lengths
2. Maximum path length
4. Mean ratio of path length to Euclidean length
7. Mean branch length
8. Mean path length
18. Balancing factor
19. Path length to soma
• Path length: For a node within the substructure, the path length is the summation
of segment lengths between the node and the root of the substructure. SPIN
calculates path lengths for all branch points and terminal points within a
substructure.
• Euclidean length: For a node within the substructure, the Euclidean length is the
Euclidean distance between the node and the root of the substructure. SPIN
calculates Euclidean lengths for all branch points and terminal points within a
substructure.
• Branch length: A branch length is the summation of segment lengths for a branch,
i.e., the summation of segment lengths between two branch points or between one
branch point and one terminal point. SPIN calculates branch lengths for all possible
branches within a substructure.
• Path length to soma: The path length to soma for a substructure is the
summation of segment lengths between the root of a substructure and the soma.
Note that this feature is normalized by the longest possible path length in the
neuron.
• Balancing factor: Cuntz et al. (2010) proposed the idea of a balancing
factor as a weighting between the material cost and conduction time during the
construction of neuronal branches. The authors proposed that the construction of
neuronal branches should minimize a total cost given by:
$$\text{Total cost} = \text{wiring cost} + bf \cdot \text{path length cost},$$
where the wiring cost is given by the Euclidean distance between the carrier
points (unconnected points) and the node on the tree, the path length cost is
given by the path length between the carrier points and the node on the tree,
and bf is the balancing factor.
However, the authors only used the equation to construct a tree structure
with an assumed balancing factor but no method was proposed to estimate the
balancing factor for a given neuronal structure. To extract the balancing factor as
a morphological feature, SPIN adopts the following procedure: A series of tree
structures is constructed based on the nodes of a target neuron with assumed
balancing factors ranging from 0 to 1 with an interval of 0.1. Next, the
constructed structure that most resembles the actual skeleton of the target
neuron is selected. To find the best fit, we evaluate the similarity between two
structures (the actual and the constructed) by calculating an error score defined
as:
$$\text{Error score} = \frac{\left|\,\text{Sum of segment lengths in actual tree} - \text{Sum of segment lengths in constructed tree}\,\right|}{\text{Sum of segment lengths in actual tree}}$$
Then SPIN uses the Nelder-Mead simplex algorithm (Lagarias, Reeds, Wright,
& Wright 1998) to find the balancing factor that minimizes the error score.
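The scan over assumed balancing factors can be illustrated with a small sketch. It is not the SPIN or TREES toolbox implementation: the greedy tree construction below is a simplified stand-in, the path length cost is assumed to be the path length from the root through the candidate tree node to the carrier point, the Nelder-Mead refinement is omitted, and all function names (`grow_tree`, `error_score`, `scan_balancing_factor`) are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def grow_tree(points, bf):
    """Greedily connect all points to a tree rooted at points[0].

    At each step the unconnected point with the lowest total cost
    (wiring cost + bf * path length cost) is attached to the tree.
    Returns a parent list; the root has parent -1.
    """
    parent = [-1] * len(points)
    path_len = [0.0] * len(points)  # path length from the root to each node
    in_tree = [0]
    free = list(range(1, len(points)))
    while free:
        best = None
        for c in free:
            for t in in_tree:
                d = dist(points[c], points[t])
                cost = d + bf * (path_len[t] + d)  # assumed cost definition
                if best is None or cost < best[0]:
                    best = (cost, c, t, d)
        _, c, t, d = best
        parent[c] = t
        path_len[c] = path_len[t] + d
        in_tree.append(c)
        free.remove(c)
    return parent

def total_length(points, parent):
    """Sum of segment lengths of the tree described by `parent`."""
    return sum(dist(points[i], points[p]) for i, p in enumerate(parent) if p >= 0)

def error_score(actual_len, constructed_len):
    """Relative difference in total segment length, as defined above."""
    return abs(actual_len - constructed_len) / actual_len

def scan_balancing_factor(points, actual_parent):
    """Try bf = 0, 0.1, ..., 1 and return the best bf with all scores."""
    actual_len = total_length(points, actual_parent)
    scores = {}
    for k in range(11):
        bf = k / 10
        built = grow_tree(points, bf)
        scores[bf] = error_score(actual_len, total_length(points, built))
    return min(scores, key=scores.get), scores

# Toy example: three nodes whose actual skeleton connects both
# non-root nodes directly to the root.
toy_points = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
best_bf, toy_scores = scan_balancing_factor(toy_points, [-1, 0, 0])
print(best_bf, toy_scores[best_bf])
```

In the toy example, small balancing factors chain the two non-root nodes together, whereas larger ones attach both to the root, so the scan favors a larger balancing factor for a skeleton of the second kind.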
C.2 Branch-related features
3. Number of branch points
5. Maximum branch order
6. Mean branch angle
9. Mean branch order
16. Mean asymmetry at branch points
20. Branch order in a complete neuron
• Branch order: For a node within a substructure, the branch order is the number
of branch points along the path between the node and the root of the
substructure. SPIN calculates branch orders for all branch points and terminal
points within a substructure.
• Branch angle: For a branch point within a substructure, the branch angle is the
angle between the two descending segments of a branch point. SPIN calculates
branch angles for all branch points within a substructure.
• Branch order in a complete neuron: The branch order in a complete neuron for a
substructure is the number of branch points along the path between the root of
the substructure and the soma. Note that this feature is normalized by the largest
possible branch order in the neuron.
• Branch point asymmetry: The asymmetry at a branch point is the asymmetry in
the numbers of descending terminal points arising from the two directly
descending nodes. Assuming the number of descending terminal points is $\Sigma_1$
for one directly descending node and $\Sigma_2$ for the other directly descending
node, the branch point asymmetry is defined as $\Sigma_1/(\Sigma_1+\Sigma_2)$ (assuming
$\Sigma_1<\Sigma_2$). SPIN calculates asymmetry values for all branch points within a
substructure.
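Branch order and branch point asymmetry can be computed from a parent-index list, as in the sketch below. It assumes that every parent appears before its children (as is common in SWC files) and that the branch order counts the branch points among a node's ancestors, excluding the node itself; both are assumptions, and the function names are hypothetical rather than SPIN functions.

```python
def children_of(parent):
    """Return the list of children for every node (root has parent -1)."""
    kids = [[] for _ in parent]
    for node, p in enumerate(parent):
        if p >= 0:
            kids[p].append(node)
    return kids

def branch_orders(parent):
    """Number of branch points among the ancestors of each node."""
    kids = children_of(parent)
    order = [0] * len(parent)
    for node, p in enumerate(parent):  # parents are visited before children
        if p >= 0:
            order[node] = order[p] + (1 if len(kids[p]) >= 2 else 0)
    return order

def terminal_counts(parent):
    """Number of terminal points at or below each node."""
    kids = children_of(parent)
    count = [0] * len(parent)
    for node in reversed(range(len(parent))):  # children before parents
        if not kids[node]:
            count[node] = 1
        if parent[node] >= 0:
            count[parent[node]] += count[node]
    return count

def branch_point_asymmetry(parent):
    """Sigma_1 / (Sigma_1 + Sigma_2) for every branch point with two children."""
    kids = children_of(parent)
    count = terminal_counts(parent)
    asym = {}
    for node, ks in enumerate(kids):
        if len(ks) == 2:
            s1, s2 = sorted(count[k] for k in ks)
            asym[node] = s1 / (s1 + s2)
    return asym

# Toy substructure: 0 -> 1; 1 splits into 2 and 3; 3 splits into 4 and 5.
toy_parent = [-1, 0, 1, 1, 3, 3]
print(branch_orders(toy_parent))
print(branch_point_asymmetry(toy_parent))
```

Averaging the returned values over all branch points gives quantities of the kind used for features 9 and 16.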
Figure C.2.1. A schematic structure illustrating the branch angle (red arc) for one
branch point and the branch order (numbers) for each node.
Figure C.2.2. A schematic plot illustrating how the branch point asymmetry is
calculated.
(Figures were adapted from Cuntz et al., 2010 under the Creative Commons Attribution License (CC-BY))
C.3 Volume-related features
10. Ratio of width (x direction) to height (y direction) of the substructure
11. Ratio of width (x direction) to depth (z direction) of the substructure
12. Center of mass of the substructure in the x direction
13. Center of mass of the substructure in the y direction
14. Center of mass of the substructure in the z direction
15. Volume of the convex hull
17. Mean volume of Voronoi pieces
• Convex hull: The convex hull of a substructure is the smallest convex set
containing all nodes of the substructure.
• Voronoi pieces: The Voronoi algorithm subdivides the convex hull enclosing a
substructure into regions, called Voronoi pieces, each of which contains exactly
one node.
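The simpler volume-related features (10 to 14) can be sketched in a few lines. The sketch assumes that width, height and depth are the extents of the node coordinates along x, y and z, and that the center of mass is the unweighted mean of the node coordinates; SPIN may weight or normalize these quantities differently, and the function name is hypothetical.

```python
import math

def extent_ratios_and_center(nodes):
    """Width/height and width/depth ratios plus the center of mass.

    nodes is a list of (x, y, z) tuples. Ratios with a zero denominator
    are reported as infinity.
    """
    xs, ys, zs = zip(*nodes)
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    depth = max(zs) - min(zs)
    n = len(nodes)
    return {
        "width_to_height": width / height if height > 0 else math.inf,
        "width_to_depth": width / depth if depth > 0 else math.inf,
        "center_of_mass": (sum(xs) / n, sum(ys) / n, sum(zs) / n),
    }

toy_nodes = [(0, 0, 0), (4, 0, 0), (4, 2, 0), (0, 2, 1)]
print(extent_ratios_and_center(toy_nodes))
```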
Figure C.3.1 A schematic convex hull (colored area) that encloses the substructure.
Figure C.3.2 A schematic plot showing Voronoi pieces that enclose each node of the
substructure.
(Figures were adapted from Cuntz et al., 2010 under the Creative Commons Attribution License (CC-BY))
D. Parameter definition and default values
Processing                 Sub-step                                  Parameter name  PB    MED   Blowfly
Artificial branch removal  Trunk isolation                           nCleanTimes     3     3     3
                                                                     ThRemoveLen     0.22  0.37  0.37
Dividing point scan        Undivided branch scan                     ThLongBranch    0.10  0.12  0.15
                           Critical point scan (splits from trunks)  ThCritP         0.35  0.35  0.35
                                                                     ThnumTP         0.85  0.85  0.85
                           Dividing point determination              ThDP            0.01  0.08  0.001

Explanation
nCleanTimes: Number of cleaning iterations. To locate the neuron trunk, terminal
branches that are shorter than the longest branch length are removed. This process
is repeated nCleanTimes times.
ThRemoveLen: The maximum relative branch length to be removed completely or the
percentage of branch lengths to remove (see text for details).
ThLongBranch: The minimum length of an undivided branch.
ThCritP: The threshold for defining critical points (see text for details).
ThnumTP: The minimum percentage of descending terminal points following a
critical point.
ThDP: The minimum number of descending terminal points exclusively following a
real dividing point.
E. Program tutorial
E.1 Quick guide
The SPIN software package comes with a sample classifier and a set of sample
neurons for testing. Here we guide users to apply the enclosed classifier and neuron
data to polarity identification, the last stage of the SPIN system:
Step1: Setting up a Matlab search path
1) Execute goInclude.m. This script automatically adds required functions to the
Matlab search path.
Step2: Performing neuron polarity identification
1) Execute goIdentifyPolarity_GUI.m. The script opens a GUI (graphical user
interface) of SPIN for polarity identification.
2) Follow steps a-f shown in the figure below.
a. Enter the classifier name (here, MED) and select the type (e.g., Exhaustive)
of the desired classifier.
b. Give an arbitrary name to your test data (here we use “test”).
c. Click on the “Browse” buttons to select a file for the list of neuron names
(here, ‘./fileList.txt’) and a directory (here, ‘./SWC_labeled’) that contains
the SWC files of your data.
d. If your SWC files contain known neuronal polarity information (which
usually comes from experimental results), select “Compare with
experimental results” to export terminal-level accuracies.
e. You can use default parameters, which are the PB parameters for neurons
having fewer than 50 terminal points and MED parameters for neurons
having more than 50 terminal points.
f. Click on “GO” to begin polarity identification.
Fig. E.1.1 Control panel for polarity identification
Upon completion, a message box will appear (Fig. E.1.2):
Fig. E.1.2 Message box upon completion of polarity identification
Identified results will be displayed on the panel (Fig. E.1.3).
Fig. E.1.3 Click on “Previous” and “Next” to browse through results of different
neurons
Users can also locate the results under the directory ‘./Result/test’. These are:
• classifiedResultMat: The .mat file of each neuron (including their
substructures).
• classifiedResultPlot: Resulting plots of each neuron.
• classifiedResultSWC: Resulting SWC files.
• cleanedTreePlot: Resulting plot after artificial branch removal.
• morphoClustPlot: Resulting plot after morphological clustering.
• resultRecord_test.txt: Table that summarizes the results. The names of
neurons are displayed in the first column. If “Compare with experimental
results” was selected, the terminal-level accuracies are displayed in
the second column. The following column displays whether warnings were
issued in the corresponding substructures. 1: type I warning; 2: type II
warning; 3: type I + type II warning. Taking neuron “VGlut-F-900011” as an
example, the terminal-level accuracy was around 72%, and the neuronal
skeleton was divided into three substructures, with warnings issued for
each of them. Their warning types were type II, type I, and type II,
respectively.
Fig. E.1.4 An example of a summary table
• logFile.txt: A text file that displays error messages (if any).
• metaData_test.txt: A text file that records the date, classifier source,
classifier type, SWC files source, and parameters.
E.2 Operation instructions
In this set of instructions, we show how to use the entire SPIN system starting with
training data. We will guide users through all three stages of SPIN: Manual data
preprocessing, classifier training and polarity identification.
E.2.1 The very first step
1) Execute goInclude.m. This script automatically adds required functions to the
Matlab search path.
2) Make sure that the main soma is the first node in the SWC file. If not, set the main
soma as the root by using the TREES toolbox function redirect_tree.
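Whether a file already satisfies this requirement can be checked before loading it into SPIN. The sketch below assumes the standard SWC column layout (id, type, x, y, z, radius, parent), in which a root has parent -1 and a soma has type 1; the function name is hypothetical, and files that label the soma with a different type code would need `require_soma_type=False`.

```python
def first_node_is_soma_root(swc_text, require_soma_type=True):
    """Return True if the first data line of an SWC file is a root node.

    Comment lines (starting with '#') and blank lines are skipped.
    With require_soma_type=True the node must also have type 1 (soma).
    """
    for line in swc_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        node_type = int(fields[1])
        parent_id = int(fields[6])
        is_root = parent_id == -1
        is_soma = node_type == 1
        return is_root and (is_soma or not require_soma_type)
    return False  # no data lines found

good = "# toy neuron\n1 1 0 0 0 1 -1\n2 3 1 0 0 0.5 1\n"
bad = "1 3 0 0 0 1 2\n2 1 1 0 0 1 -1\n"
print(first_node_is_soma_root(good), first_node_is_soma_root(bad))
```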
E.2.2 Manual data preprocessing
Artificial branch removal
Step1: Data preparation
1) Place neuronal skeleton data files (in SWC format) under the directory
‘swc_rawdata’.
2) Make a text file containing the list of names of the training neurons. Name the
file as ‘fileList.txt’ and place it under the SPIN directory. The name of each
training neuron should be identical to its SWC file name (without the file
extension).
Note: You can change the paths and the filenames directly by editing the values of
three variables, data.tarDir (line 41), data.srcDir (line 42) and data.nameList (line 46),
in the file ‘./data_preprocessing/GUI_ManualDenoise.m’. Be sure to put a ‘/’ at
the end of each path.
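The list of neuron names can be generated from the SWC directory rather than typed by hand. This is a minimal sketch that assumes one neuron name per line in ‘fileList.txt’, which the text does not specify; the function name `write_file_list` is hypothetical.

```python
from pathlib import Path

def write_file_list(swc_dir, list_path):
    """Write the names of all .swc files in swc_dir (without extension).

    One name per line is an assumed layout for the list file.
    Returns the sorted list of names that was written.
    """
    names = sorted(
        p.stem for p in Path(swc_dir).iterdir() if p.suffix.lower() == ".swc"
    )
    Path(list_path).write_text("".join(name + "\n" for name in names))
    return names

# Example call (paths follow the defaults named in the text):
# write_file_list("swc_rawdata", "fileList.txt")
```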
Step2: Removing artificial branches manually
1) Execute goManualDenoise.m and two panels will show up (Figs. E.2.1 and E.2.2).
Fig. E.2.1 Control panel. The buttons are used to: go back to the previous neuron;
go to the next neuron; remove selected branches; reset to the original state; export
the cleaned neuron in SWC format; and skip manual denoising, directly exporting all
the neurons to the SWC_cleaned directory for the next stage.
Fig. E.2.2 Display panel
2) Select a piece of trunk that you would like to clean by first clicking on the starting
point of the trunk (Fig. E.2.3) and then clicking on the end point of the trunk (Fig.
E.2.4). The selected trunk will turn red.
3) Click on “Clean” to remove all terminal branches on the selected trunk (Fig.
E.2.5).
4) Repeat the process until all artificial branches are removed.
5) Click on “Export” to export a cleaned neuron skeleton (the files can be found
under ‘./SWC_cleaned’ with the default setting). If you want to skip this step,
simply click “Export all” to copy all the files in the swc_rawdata directory to
the SWC_cleaned directory for the next step.
Fig. E.2.3. Clicking on the neuron skeleton to select the starting point of the trunk
Fig. E.2.4. Clicking on the neuron skeleton to select the end point of the trunk
Fig. E.2.5. Clicking on “Clean” to remove all the terminal branches on the selected trunk
Note: Be sure to unselect the figure tool while clicking on the skeleton.
Morphological clustering & polarity labeling
Note: Here SPIN automatically reads SWC files (under ‘./SWC_cleaned’) and the list
of neuron names (‘./fileList.txt’) from the previous step.
Step1: Neuronal polarity labeling
1) Execute goHandLabel.m and two panels will show up (Figs. E.2.6 and E.2.7)
Fig. E.2.6 Control panel. The buttons are used to: go back to the previous neuron;
go to the next neuron; label the current substructure as an axon; label the current
substructure as a dendrite; reset to the unlabeled state; and export labeled results
in SWC format.
Fig. E.2.7 Display panel
2) First click on a branch point in the display panel and then click on a structure type
button (axon or dendrite) in the control panel (Fig. E.2.6) to label the polarity. The
color of the substructure descending from the branch point will change to
indicate the polarity (Fig. E.2.8).
Fig. E.2.8 A neuron skeleton with manually labeled polarity
3) Click the “Export” button on the control panel to export labeled polarity in SWC
format. The files will be stored under the directory ‘./SWC_labeled’.
Note: Be sure to unselect the figure tool while clicking on the skeleton.
E.2.3 Classifier training
Step1: Performing classifier training
1) Execute goTrainClassifier_GUI.m. One panel will show up.
2) Follow steps a-g shown in the figure below:
a. Give an arbitrary name to the classifier (here, MED).
b. Click on “Browse” to select the directory that contains SWC files of training
neurons (here, ‘./SWC_labeled’).
c. Decide whether to perform feature extraction.
d. Enter the IDs of features you want to exclude from feature extraction (e.g.,
if you want to exclude features #1, #3, #4, #5, and #7, enter 1,3:5,7.
Please refer to the main text for the list of the features).
e. Select the algorithm(s) for feature selection (see footnote 2).
f. Enter the value of k for the k-nearest-neighbor classifier.
g. Click on ‘GO’ to begin classifier training.
After the training is done, the trained classifier is stored in ‘./Classifier/MED’.
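The exclusion syntax in step d combines single IDs with inclusive ranges written as start:stop. The sketch below shows how such a string expands into a list of feature IDs; it is an illustration of the notation only, and the function name `parse_feature_ids` is hypothetical rather than the parser used by the SPIN GUI.

```python
def parse_feature_ids(spec):
    """Expand a string such as "1,3:5,7" into a sorted list of feature IDs.

    Items are separated by commas; "a:b" denotes the inclusive range a..b.
    """
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            start, stop = (int(x) for x in part.split(":"))
            ids.update(range(start, stop + 1))
        else:
            ids.add(int(part))
    return sorted(ids)

excluded = parse_feature_ids("1,3:5,7")
# Feature IDs 1 to 20 are the ones defined in Section C.
kept = [i for i in range(1, 21) if i not in excluded]
print(excluded)
print(kept)
```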
Fig. E.2.9 Control panel for classifier training
Footnote 2. The algorithms for feature selection:
Sequential: Use the k-nearest-neighbor classifier and the leave-one-out test to find the feature that
correlates most strongly with the polarity. Next, find a second feature that, in combination with the
first feature, most improves the correlation. Repeat this procedure until the correlation can no longer
be improved.
Exhaustive: For every possible subset of features, use the k-nearest-neighbor classifier and the
leave-one-out test to evaluate the correlation between the feature set and the polarity.
See also
http://neural.cs.nthu.edu.tw/jang/books/dcpr/fsMethod.asp?title=10-2%20Feature%20Selection%20Methods%20(%AFS%BCx%BF%EF%A8%FA%A4%E8%AAk).
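The sequential algorithm can be sketched as a forward selection loop. The sketch below is not the SPIN implementation: it scores a feature set by the leave-one-out accuracy of a k-nearest-neighbor classifier with Euclidean distance, which is one assumed reading of the "correlation" mentioned above, and all function names are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Majority label among the k training points nearest to `query`."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(x, y, features, k):
    """Leave-one-out accuracy of k-NN using only the given feature columns."""
    sub = [[row[f] for f in features] for row in x]
    hits = 0
    for i in range(len(sub)):
        train_x = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_x, train_y, sub[i], k) == y[i]:
            hits += 1
    return hits / len(sub)

def sequential_selection(x, y, k=1):
    """Add one feature at a time while the leave-one-out accuracy improves."""
    remaining = list(range(len(x[0])))
    chosen, best = [], 0.0
    while remaining:
        scored = [(loo_accuracy(x, y, chosen + [f], k), f) for f in remaining]
        score, f = max(scored, key=lambda item: item[0])
        if score <= best:
            break  # no remaining feature improves the score
        chosen.append(f)
        remaining.remove(f)
        best = score
    return chosen, best

# Toy data: column 0 separates the two classes, column 1 does not.
toy_x = [[0, 50], [1, 10], [2, 90], [10, 51], [11, 11], [12, 91]]
toy_y = ["axon", "axon", "axon", "dendrite", "dendrite", "dendrite"]
print(sequential_selection(toy_x, toy_y, k=1))
```

The exhaustive algorithm would instead call a scoring function such as `loo_accuracy` on every subset of features, which grows as 2^n with the number of features n.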
E.2.4 Polarity identification
In this stage of SPIN, users use the classifier trained in the previous stage to perform
polarity identification for neurons with unknown polarity. Please see the Quick Guide
for details.
F. References
Chiang, A.-S., Lin, C.-Y., Chuang, C.-C., Chang, H.-M., Hsieh, C.-H., Yeh, C.-W., … Hwang, J.-K. (2011). Three-Dimensional Reconstruction of Brain-wide Wiring Networks in Drosophila at Single-Cell Resolution. Current Biology, 21(1), 1–11. doi:10.1016/j.cub.2010.11.056
Cuntz, H., Forstner, F., Borst, A., & Häusser, M. (2010). One Rule to Grow Them All: A General Theory of Neuronal Branching and Its Practical Application. PLoS Comput Biol, 6(8), e1000877. doi:10.1371/journal.pcbi.1000877
Lagarias, J. C., Reeds, J. A., Wright, M. H., & Wright, P. E. (1998). Convergence Properties of the Nelder-Mead Simplex Method in Low Dimensions. SIAM Journal on Optimization, 9(1), 112–147.