
CARMEN and spike
detection and sorting
Leslie S. Smith
University of Stirling, Scotland, UK
http://www.carmen.org.uk
Contents
• CARMEN architecture and project
• Neural Data Format (NDF)
• Workflows (?)
• Project status
  – Where we are now
• Some reflections
INCF June 17 2012
CARMEN ‘Cloud’ (CAIRN)
[Architecture diagram] Rich clients and a web portal (search & visualisation) sit above a workflow enactment engine that enacts scientific analysis processes on a compute cluster, on which services are dynamically deployed. Behind these sit a raw & derived data store (raw signal data, data and metadata), an analysis code store (service registry and repository), and a structured metadata store enabling search & annotation. Security policies control access to data & code, and users can search for data & analysis code.
CARMEN project status
• Initially a 4-year UK e-Science project
  – September 2006 to March 2011
• Extended with a BBSRC tools and techniques grant
  – To Sept 2014
• Major work in the last 2 years has been
  – Improving the User Interface
  – NDF implementation (working)
  – Workflow implementation (nearly there!)
CARMEN and spikes
[Recording-chain diagram] The signal path, with the parameters attached to each stage:
• Recording electrodes
• Amplifier: gain, limiting
• Filter: low-pass (anti-aliasing), cutoff below the Nyquist frequency
• Analogue/digital converter: sample rate, sample length, linear/log
• Filter: high- or band-pass filter to remove LFPs
• CARMEN spike detector: type of detector, detector parameters → spike times, segments
• Spike sorter: duration, projection technique, clustering technique, rejection criteria, collision handling → spike train outputs
Note that CARMEN also has other services, including higher-level services, etc.
Data, services and workflows
• CARMEN supports
  – Data and metadata
  – Services, which process data, and
  – Workflows (almost): concatenations of services.
• Initially:
– We allowed more or less any data format
– Services processed one data format and produced a different one
• …but to develop workflows (and to enable
interoperability between services)
– We now strongly recommend using our Neural Data
Format (NDF)
Neural Data Format (NDF)
• An NDF dataset consists of a configuration file in XML format, which contains metadata and references to the associated host data files.
• A special XML data element, History, is included within the header file for recording data-processing history. This element contains the full history (recording chain) of all previous processing.
• The NDF API has been implemented as a C library. It translates the XML tree/nodes into C-style data structures and insulates clients from the data structures within the binary data file.
NDF: Supported datatypes
NDF XML file
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<ndtfDataCfg xmlns="http://www.carmen.org.uk"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.carmen.org.uk ndtfDataCfg.xsd">
<Version>1.0.1</Version>
<NdtfDataID>897A9272-4E6F-4F32-8A63-89C699F99120</NdtfDataID>
<GeneralInfo>
<Description>NDF Spike Detector Service (COB) Version 2 - SNN</Description>
<Laboratory>Carmen VLE</Laboratory>
<CreateDate>2012-06-13</CreateDate>
<CreateTime>16:29:03</CreateTime>
<RecordID>N/A</RecordID>
</GeneralInfo>
<DataSet>
…
NDF XML file cont’d
<History>
<Processor>
<ProcessingDateTime StartDateTime="2012-06-01T16:16:08"/>
<CommandLine>mat2ndf (m0192_all.mat,m0192_all.ndf)</CommandLine>
</Processor>
<Processor>
<ProcessingDateTime StartDateTime="2012-06-13T17:05:58"/>
<CommandLine>Spike detector COB NDF m0192_all.ndf, 256, 0.002, 15, 30,
no</CommandLine>
<ProcessingSettings>Spike Detector (COB) for NDF data</ProcessingSettings>
</Processor>
</History>
</ndtfDataCfg>
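The configuration header shown above can be read with any XML parser; a minimal sketch in Python (function name and the returned dictionary layout are illustrative, not part of the NDF API, and only elements shown on these slides are assumed):

```python
import xml.etree.ElementTree as ET

# Default namespace declared in the ndtfDataCfg header above
NS = "{http://www.carmen.org.uk}"

def read_ndf_config(xml_text):
    """Extract basic metadata and the processing History from an NDF config file."""
    root = ET.fromstring(xml_text)
    info = root.find(NS + "GeneralInfo")
    meta = {
        "version": root.findtext(NS + "Version"),
        "description": info.findtext(NS + "Description"),
        "laboratory": info.findtext(NS + "Laboratory"),
        "create_date": info.findtext(NS + "CreateDate"),
    }
    history = []
    hist = root.find(NS + "History")
    if hist is not None:
        for proc in hist.findall(NS + "Processor"):
            dt = proc.find(NS + "ProcessingDateTime")
            history.append({
                "start": dt.get("StartDateTime") if dt is not None else None,
                "command": proc.findtext(NS + "CommandLine"),
            })
    return meta, history
```

Because every processing step appends a Processor element, reading History back like this recovers the full provenance chain of a dataset.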
NDF based services
• Filtering services:
  – HPF, LPF, BPF
• Spike detectors:
  – Single- or multiple-channel signals
  – Simple thresholding (positive/negative/both-sided), NEO (Teager energy operator), Cepstrum of Bispectrum
• Spike sorters:
  – K-means
  – Waveclus (superparamagnetic clustering)
• We can add new spike detectors and new spike sorters reasonably easily
  – By wrapping services
• The User Interface allows specific channels and sections of the dataset to be selected
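Two of the detector families above are simple enough to sketch. This pure-Python version is illustrative only (the function names, the fixed threshold, and the refractory gap are assumptions, not the CARMEN implementation); it shows amplitude thresholding and the nonlinear energy operator (NEO/Teager):

```python
def neo(x):
    """Nonlinear (Teager) energy operator: psi[n] = x[n]^2 - x[n-1]*x[n+1].

    Emphasises high-frequency, high-amplitude transients such as spikes."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def threshold_detect(x, thresh, sign="both", refractory=16):
    """Return sample indices where the signal crosses the amplitude threshold.

    sign selects positive-, negative-, or both-sided detection; a simple
    refractory gap (in samples) avoids reporting one spike many times."""
    events, last = [], -refractory
    for n, v in enumerate(x):
        hit = ((sign in ("positive", "both") and v > thresh) or
               (sign in ("negative", "both") and v < -thresh))
        if hit and n - last >= refractory:
            events.append(n)
            last = n
    return events
```

A NEO-based detector simply thresholds neo(x) instead of x, which is why the two approaches share the same parameter style (type of detector, detector parameters) in the chain above.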
Workflows
• Currently in alpha testing:
  – Can create workflows (graphically), generate scripts, store them, and apply security and sharing appropriately; execution of workflows is almost ready.
• Workflows will be generable either graphically or using a scripting language.
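Since a workflow is just a concatenation of services that each read and write NDF, it can be sketched as function composition. The service names below are purely illustrative placeholders (this is not the CARMEN scripting language):

```python
def run_workflow(dataset, services):
    """Apply each service in turn: every service maps an NDF dataset to a
    new NDF dataset, so chaining them mirrors a CARMEN workflow."""
    for service in services:
        dataset = service(dataset)
    return dataset

# Hypothetical stand-ins for real services (filter, detector, sorter); each
# just records that it ran, standing in for real NDF-to-NDF processing.
bandpass = lambda d: d + ["bandpass"]
detect = lambda d: d + ["spike_detect"]
sort_spikes = lambda d: d + ["spike_sort"]

result = run_workflow(["raw"], [bandpass, detect, sort_spikes])
```

The same chain could equally be drawn in the graphical interface or written as a script, which is why both front-ends can target one enactment engine.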
Workflow graphical interface
Where are we now? On the cluster
• Can run single services:
– But joining them together requires user intervention
– (NDF services do read each other’s data correctly)
– New NDF services can be “wrapped”
• Have run workshops on this
• Can convert a variety of formats into NDF
  – mcd, nev, nex, plx, map, smr, abf, abf2
• Spike detectors of three sorts
– Can process multi-electrode data in one service
• Spike sorters of two sorts (K-means and Waveclus)
• But no workflows yet (promised soon!)
– … also not really enough public datasets
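Of the two sorters listed, K-means is easy to sketch. This pure-Python version is a simplified illustration (a real sorter would extract features by a projection technique first and seed the centres more carefully); it clusters detected spike segments by their raw waveforms:

```python
def kmeans(waveforms, k, iters=20):
    """Cluster spike waveforms (equal-length lists of samples) with k-means.

    Centres start from the first k waveforms -- a simplification; real
    sorters use better initialisation."""
    centres = [list(w) for w in waveforms[:k]]
    labels = [0] * len(waveforms)
    for _ in range(iters):
        # Assignment step: nearest centre by squared Euclidean distance
        for i, w in enumerate(waveforms):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(w, centres[c])),
            )
        # Update step: each centre moves to the mean of its members
        for c in range(k):
            members = [w for w, lab in zip(waveforms, labels) if lab == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres
```

The clustering-technique and rejection-criteria parameters in the recording chain slot in around exactly this step: choose the clustering method, then discard segments that fit no cluster well.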
Where are we now? Local systems
• NDF toolbox is available for Matlab (downloadable).
– Runs on recent versions of Matlab: PC, Mac, Linux.
• Services which run on the cluster are/will shortly be
available to run locally under Matlab
– Not really the intent of the project, but does enable service
running and testing (and debugging!) to be carried out
locally
• The environment is essentially the same as on the cluster.
• Local workflows are enabled by writing XML files and simple(ish) scripts.
  – Can test the wrapping of scripts locally
CARMEN and validation
• Validation of services
• Testing services on
multiple datasets
• Testing multiple services
on datasets
– Locally
– On the Portal.
Why is this so difficult? Why has it taken so long?
What lessons can we learn?
• Initially allowed users to write services for their own data types
  – Ties services to specific types: not easily shareable
• NDF proved complex to implement
– Generality, multiple language support
• User Interface for portal was difficult
– Wanted to support non-technical users
• Existing software proved difficult to use
– Software developed for R&D systems proved not to be robust:
expecting it to be was overly optimistic
• Insufficient development staff
– Underestimated programming/development requirements
– Supporting early development projects (“low hanging fruit”) took a
lot of time.
Carmen consortium