Collective Analysis of RAF Lessons using QDA Miner

Steve Redmond
Head of Lessons Analysis
HQ Air, RAF High Wycombe
01494 49 6682
[email protected]

UNCLASSIFIED © UK MOD Crown Copyright, 2010

Contents
• Background
• Research
• Solution
• Future
• Demonstration
  – The data in this presentation has been changed to maintain MoD integrity.

Please interject at any time

Background

Overview
• RAF
• HQ Air Command, High Wycombe
• Air Lessons Cell
• Small team
• Aim is to capture, learn, analyse and exploit the lessons from operations and exercises
• Improvement

What is a lesson?

Lessons database
• Defence Lessons Identified Management System (DLIMS)
• Classified data stored separately
• Two main functions:
  – Process the analysis of individual lessons
  – Enable basic search and collation of lessons for deeper analysis

The Lessons Process
(Diagram with stages labelled Capture, Learn, Exploit, Analyse)

Individual analysis
• Lessons are entered into DLIMS
• Subject matter experts (SMEs) are appointed to each lesson to validate, consider and recommend action
• Ultimately each lesson gets closed (individual analysis complete)

Deeper analysis
• Aim unclear, something like:
  – Key points
  – Aggregates
  – Trends
• Basic search functionality
• High volume of written information
• Requires significant time and effort
• Sensitivity analysis hard
• Feeling there must be a better way

Research

Challenges
• New
• Text – acronyms, terminology, spelling, waffle, subjective, volume
• Tasking – informal, unclear
• Capability – simplistic, unscientific
• Resource – subject skills and experience

Start
• Early 2009
• Blank sheet of paper
• How do we do “deeper analysis”?
• Research into
  – Summarisers of content
  – Mining of text
  – Metadata analysis
  – Other users (what do they do)

Findings
• Summer 2009
• Identified key terms
• Established scientific techniques
• Established processes for text mining
• Short list of preferred tools
• Tested shortlist – online, email, phone

Text mining
Text mining applies automation and science to the analysis of large volumes of written data

Text mining
Intelligent text mining requires a taxonomy and thesaurus on the subject matter of interest. The more words and phrases that are recognised, the more intelligent the mining. A taxonomy is the classification of words and phrases by various criteria, possibly hierarchy, similarity or type. Supporting the taxonomy will be a thesaurus or dictionary. In the example on the left, the recognised words and phrases are coloured and the most common classifications shown.
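
The interplay of thesaurus and taxonomy described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how QDA Miner or WordStat work internally: the names THESAURUS, TAXONOMY, classify and category_counts are hypothetical, and every entry is invented for the example rather than taken from the MOD Taxonomy or DLIMS.

```python
import re
from collections import Counter

# Hypothetical thesaurus: maps acronyms, abbreviations and synonyms
# to one preferred term. Entries are invented for illustration.
THESAURUS = {
    "a/c": "aircraft",
    "airframe": "aircraft",
    "helo": "helicopter",
    "trg": "training",
}

# Hypothetical taxonomy: maps each preferred term to a two-level
# classification (top level, sub level). Entries are invented.
TAXONOMY = {
    "aircraft": ("Equipment", "Air platforms"),
    "helicopter": ("Equipment", "Air platforms"),
    "training": ("Personnel", "Training"),
    "fuel": ("Logistics", "Supply"),
}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9/]+", text.lower())


def classify(text):
    """Return (recognised, residual) for one piece of text.

    recognised: list of (token, preferred term, classification)
    residual:   tokens that the taxonomy does not recognise
    """
    recognised, residual = [], []
    for token in tokenize(text):
        term = THESAURUS.get(token, token)  # normalise via the thesaurus
        if term in TAXONOMY:
            recognised.append((token, term, TAXONOMY[term]))
        else:
            residual.append(token)
    return recognised, residual


def category_counts(text):
    """Count how often each classification occurs in the text."""
    recognised, _ = classify(text)
    return Counter(path for _, _, path in recognised)
```

Calling category_counts(...).most_common() on a lesson's text gives the most common classifications, and the residual list shows which words the dictionary still needs to cover. The more entries the two tables hold, the fewer words fall into the residual list, which is the sense in which more recognised words make the mining "more intelligent".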

Techniques
• Word analysis – frequencies, patterns
• Key Word In Context (KWIC)
• Hierarchical analysis – higher links
• Cluster analysis – groups, data reduction
• Cross tabulation – control factors
• Correspondence analysis – word associations
• Heatmaps – uses colour, maintains fidelity
• Thematic analysis – categories and links

Recommendation
• Collective Analysis
• Define – themes, trends, unusual effects, correlations, cause/effect, test hypothesis
• Routine – apply to all lessons
• Bespoke – respond to specific requests
• Manage – apply basic controls
• Software – purchase
• Science – apply
• Subject Matter Expertise (SME) – involve

Expected benefits
• Time – text mining speeds up the processing of lessons
• Cost – text mining and consultancy reduce the effort required by the analyst
• Quantity – text mining increases the capability to analyse more lessons
• Quality – scientific techniques make analysis more objective, impartial and auditable

Solution

Tools
• Purchase suite of tools from Provalis Research:
  – QDA Miner
  – WordStat
  – SimStat
• Not networked
• Copy of RESTRICTED DLIMS database
• Soon to get copy of SECRET DLIMS database
• Accepts non DLIMS sources

QDA Miner
• Enables filtering/searching, and then coding/annotating/retrieving/analysing of documents and images
  – Projects
  – Cases
  – Variables
  – Codes

QDA Miner - Cases

QDA Miner - Variables

QDA Miner - Codes

WordStat
• Enables text mining and content analysis of large amounts of unstructured information
  – Dictionaries
  – Frequencies
  – Phrase finder
  – Crosstab
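
To make the frequency and phrase-finding ideas listed above concrete, here is a rough Python sketch. It is an assumption-laden illustration and not WordStat's implementation: the exclusion list EXCLUDED, the function names word_frequencies and find_phrases, and the rule of dropping phrases that start or end with an excluded word are all choices made for this example.

```python
import re
from collections import Counter

# Hypothetical exclusion list: common words to ignore when counting.
EXCLUDED = {"the", "a", "of", "to", "was", "and", "in", "for"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(documents, excluded=EXCLUDED):
    """Count every non-excluded word across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in excluded)
    return counts


def find_phrases(documents, n=2, min_count=2, excluded=EXCLUDED):
    """Find n-word sequences that occur at least min_count times.

    Sequences that start or end with an excluded word are skipped,
    which is one simple way to avoid fragments such as "of spare".
    """
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            if gram[0] in excluded or gram[-1] in excluded:
                continue
            counts[gram] += 1
    return {" ".join(g): c for g, c in counts.items() if c >= min_count}
```

Passing a list of lesson texts to word_frequencies gives candidate terms for a dictionary, while find_phrases surfaces recurring multi-word expressions that are worth treating as single dictionary entries.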

Dictionaries
• Exclusion dictionary: words/phrases to ignore
• Inclusion dictionary: MOD Taxonomy
• Residual words: not excluded or included
• Hierarchies
• Synonyms
• Duplicates

MOD Taxonomy Level 1

MOD Taxonomy Level 2

MOD Taxonomy Level 4

Dictionary rules

Dictionary: Excluded words

Frequencies: Included words

Frequencies: Leftover words

Frequencies: Phrase Finder

Frequencies: Dictionary

Keyword in Context (KWIC)

Keyword Retrieval

Hierarchical clustering

Concept mapping

Proximity plot

Cross tabulation

Correspondence map

Future

Schedule
• Publish Air themes Nov
• Paper Design of Collective Analysis by Nov
• Build Air dictionary (80%) by Dec
• Analyse non DLIMS lessons by Feb
• Offer generic capability in text analysis by Mar

Demonstration
• QDA Miner
• WordStat