Auto-classification and Taxonomies

Just the Facts
Don Miller
Vice President of Sales
Concept Searching
[email protected]
Twitter @conceptsearch
© Concept Searching 2014
Expert Speakers
Doug Miles – Director of Market Intelligence at AIIM International, researches and documents end user drivers and priorities for information management and collaboration as AIIM’s chief analyst. He has surveyed users and authored reports on ECM, records management, scanning and capture, BPM, mobile, social, cloud, and big data.
Don Miller – Vice President of Sales at Concept Searching, has over 20 years’ experience in knowledge management. He is a frequent speaker on records management and information architecture challenges and solutions, and has been a guest speaker at Taxonomy Boot Camp and numerous SharePoint events about information organization and records management.
Agenda
• Introduction
• The Role of Metadata
• Auto-classification Primer
• Example: Auto-classification in Records Management
• Typical Enforcement Challenges
• Responsibilities
• The Role of Taxonomies
• Evaluating Solutions – Food for Thought
• Types of Questions to Ask Solution Providers
• Calculating ROI
• Addendums
  • Case Studies
Best Practice – Training – Thought Leadership
• Download at: www.aiim.org/research
• Plus many other reports and AIIM White Papers
• Training at: www.aiim.org/training
Volume of Electronic Records
• Have had significant compliance or litigation issues in the last 2 years
• Keep ALL emails, or have no policy on email archiving
• Agree: “Our strategy for managing increasing content volume is to buy more discs”
• Of content is of no business value, i.e. is ROT (redundant, obsolete or trivial)
Information Chaos
Automating Compliance
How do we encourage, enforce and automate information governance?
• Know they are storing stuff they don’t need
• Not confident that they know what is safe to delete
• Feel automated classification is the only way to keep up with the volumes
Delete?
Automated Classification – plans
How would you best describe your overall plans for automated declaration/classification of records?
• We have no plans: 24%
• It’s something we plan to do in the future: 28%
• We’re just getting started: 25%
• We are keen to automate as soon as we can: 10%
• We are doing it successfully across one or two content types: 6%
• We are doing it successfully across a number of content types: 8%
14% mature across content types; 25% just getting started; 10% keen to get going; half overall have immediate interest.
© AIIM 2014
Automated Classification – benefits
What would you expect to be/what have been the two biggest benefits from automated classification? (Max TWO)
[Bar chart: share of respondents, axis 0% to 50%]
• Improved searchability (#1 Searchability)
• General staff productivity (#2 Productivity)
• Defensible compliance (#3 Compliance)
• Repository alignment
• Storage volume reduction
• Staff cooperation
• Big data readiness
• Migration
• Not sure yet
The Global Leader in Managed Metadata Solutions
• Company founded in 2002
  • Product launched in 2003
  • Focus on management of structured and unstructured information
• Technology Platform
  • Delivered as a web service
  • Automatic concept identification, content tagging, auto-classification, and taxonomy management
  • Only statistical vendor that can extract conceptual metadata
• 8 years KMWorld ‘100 Companies that Matter in Knowledge Management’
• 7 years KMWorld ‘Trend Setting Product’
• Authority to operate enterprise wide US Air Force, enterprise wide NETCON US Army, and Canadian SLSA
• Locations: US, UK, and South Africa
• Client base: Fortune 500/1000 organizations
• Microsoft Gold Certification in Application Development
• Microsoft Business-Critical SharePoint elite program partner
• Smart Content Framework™ for information governance comprising
  • conceptClassifier for SharePoint and conceptClassifier for Office 365
  • Concept Searching Technology Platform and conceptClassifier Platform
  • Add on – conceptTaxonomyWorkflow, and conceptClassifier for OneDrive for Business
Metadata
Before we start...
auto-classification is great
but what about metadata?
Metadata
Definition
Metadata describes other data. It provides
information about a certain item's content.
For example, an image may include
metadata that describes how large the
picture is, the color depth, the image
resolution, when the image was created,
and other data. A text document's
metadata may contain information about
how long the document is, who the author
is, when the document was written, and a
short summary of the document.
TechTerms.com
Types of Classification Metadata
Intrinsic
Information that can be extracted directly
from an object (file name, size)
Administrative/Management
Information used to manage the document
(author, date created, date to be reviewed)
Descriptive
Information that describes the object
(title, subject, audience)
Semantic
Ability to extract concepts from within content
and generate the metadata (intelligent metadata)
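As a rough illustration of the "Intrinsic" category above, the sketch below reads metadata straight from a file using only the Python standard library. The function name `intrinsic_metadata`, the field names, and the sample file are assumptions made for this example, not part of any product described in these slides.

```python
import os
import tempfile
from datetime import datetime, timezone


def intrinsic_metadata(path):
    """Return metadata that can be read directly from the file itself."""
    info = os.stat(path)
    return {
        "file_name": os.path.basename(path),
        "extension": os.path.splitext(path)[1].lower(),
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(
            info.st_mtime, tz=timezone.utc
        ).isoformat(),
    }


# Hypothetical sample file, created only so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "Retention_Policy.TXT")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("Records are retained for seven years.")
    metadata = intrinsic_metadata(sample)

print(metadata)
```

Administrative, descriptive, and semantic metadata cannot be read this way: they need either human input or analysis of the content itself.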
Metadata Matters
The challenges of content overload
• 80% of enterprise data is unstructured (IDC)
• 60% of documents are obsolete (eLaw)
• 50% of documents are duplicates (Equivio)
The benefits of automatic semantic metadata generation
• Elimination of costs and errors associated with end user tagging
• Identification and protection of secure content assets from
unauthorized access and portability in accordance with compliance
procedures
• Automatic identification and tagging of documents of record
• Normalization of content across functional and geographic boundaries
• Integration with search
• Ability to apply policy consistently across diverse repositories and
environments
Garbage In – Garbage Out
The quality of your metadata determines the quality of auto-classification.
Poor metadata will ultimately undermine your records management outcomes and
increase organizational risk and non-compliance
• Manual tagging efforts are labor-intensive, error prone,
redundant, and inconsistent
• Best solution is automated semantic metadata generation
• Defines which words are meaningful in which contexts
• Creates facets for drill-down capabilities
(user does not necessarily know the ‘right’ words to
use to achieve the ‘right’ results when searching)
• Automatically tags and classifies content in real time
• Workflows to any metadata application
(records management, security)
• Eliminates end user tagging, yet provides overrides for
authorized users
Why is metadata so hard to get right?
A manual metadata approach will fail 95%+ of the time
Issue: Organizational Impact
• Inconsistent: Less than 50% of content is correctly indexed, meta-tagged or efficiently searchable, rendering it unusable to the organization (IDC)
• Subjective: Highly trained information specialists will agree on meta tags between 33% and 50% of the time (C. Cleverdon)
• Cumbersome and expensive: Average cost of manually tagging one item runs from $4 to $7 per document and does not factor in the accuracy of the meta tags nor the repercussions from mistagged content (Hoovers)
• Malicious compliance: End users select first value in list (Perspectives on Metadata, Sarah Courier)
• No perceived value for end user: What’s in it for me? End user creates document, does not see value for organization nor risks associated with litigation and non-conformance to policies
• What have you seen: Metadata will continue to be a problem due to inconsistent human behavior
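To put the $4 to $7 per document figure in context, here is a small worked calculation. The volume of 100,000 documents is a hypothetical number chosen for illustration; only the per-document cost range comes from the slide.

```python
def manual_tagging_cost(document_count, cost_per_document):
    """Total cost of tagging every document by hand."""
    return document_count * cost_per_document


DOCUMENTS = 100_000  # hypothetical volume, not a figure from the slides

low = manual_tagging_cost(DOCUMENTS, 4)   # lower bound: $4 per document
high = manual_tagging_cost(DOCUMENTS, 7)  # upper bound: $7 per document

print(f"Manual tagging cost: ${low:,} to ${high:,}")
```

As the slide notes, this range excludes the cost of inaccurate tags, so it is best read as a floor.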
OK, we have our metadata, what’s next?
Auto-classification
What is Auto-classification?
• A feature found in some content management systems or records management applications that will scan the contents of a document and automatically assign metadata, categories, and keywords based on the document contents
• Content based assignment of one or more pre-defined categories to documents (records), usually machine learning, statistical pattern recognition, or neural network approaches that are used to construct classifiers automatically
The History of Auto-classification
Content Based
Weight given to a particular subject in a document determines
the class to which the document is assigned
Request Based
Sometimes referred to as indexing, classification in which the
anticipated requests from users influence how documents are
classified
Policy Based
Classification that is aimed at a particular audience or
user group
Automatic Document Classification
Supervised
Some external mechanism, such as human feedback,
provides information on the correct classification
Unsupervised
Also known as document clustering, where the
classification has no reference to external information
Semi-supervised
Where only some of the documents are labeled by an
external mechanism, such as human feedback, and the rest are unlabeled
Auto-classification Systems – Statistical
Taxonomies and thesauri are the foundation of an auto-classifier. They provide the
vocabulary against which rules are built and ‘teach’ the machine how to ‘understand’
and categorize content
Accuracy rates with auto-classification systems are approximately 60% - 85%
Statistical
• Often use Bayes’ theorem: measures ‘degrees of belief’ (or ‘degrees of aboutness’)
• Use frequency and location to determine important or useful concepts
• Feed the system example text for the specific category
• Statistically identifies and extracts significant keywords and patterns
• Document training sets
• Match word/concept patterns to categories
• Often need sets of 50+ documents, or more!
• Poor document choice can cause pollution/noise
• Drawbacks
• Effort required to create the training set
• Relies on the availability of keyword-rich text
• Hard to determine problems
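The statistical approach above can be sketched as a tiny naive Bayes classifier, which applies Bayes' theorem to word frequencies learned from a training set. This is a teaching sketch only: the class name, the two categories, and the four training sentences are invented for illustration, and a real system would need far larger training sets, as the slide notes.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> training documents
        self.vocabulary = set()

    def train(self, category, text):
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def scores(self, text):
        """Log-probability of each category given the words in the text."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        result = {}
        for category in self.doc_counts:
            log_prob = math.log(self.doc_counts[category] / total_docs)
            total_words = sum(self.word_counts[category].values())
            for word in words:
                # Add-one (Laplace) smoothing so unseen words do not zero the score.
                count = self.word_counts[category][word] + 1
                log_prob += math.log(count / (total_words + len(self.vocabulary)))
            result[category] = log_prob
        return result

    def classify(self, text):
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Hypothetical training set: two example documents per category.
classifier = NaiveBayesClassifier()
classifier.train("contracts", "This agreement is made between the supplier and the customer")
classifier.train("contracts", "The contract term and termination clause apply to both parties")
classifier.train("invoices", "Invoice number and amount due, payment within thirty days")
classifier.train("invoices", "Please remit payment for the invoice total")

print(classifier.classify("termination clause of the supplier agreement"))
print(classifier.classify("payment due on invoice"))
```

The sketch also shows why poor training documents cause "pollution": every word in a training document shifts the category's word counts, whether or not it is relevant to the category.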
Auto-classification Systems – Rules Based
Taxonomies and thesauri are the foundation of an auto-classifier. They provide the vocabulary
against which rules are built and ‘teach’ the machine how to ‘understand’ and categorize
content
Rules-based
• Rely on Boolean (and, or, not) categorization rules to find either positive or negative
evidence of a match to a category
• LexisNexis: and, not, not w/n, not w/para, or, pre/n, pre/, w/, not w/seg, not
w/sent, w/n, w/p, w/seg, w/s, atleast, allcaps, caps, nocaps, plural, singular
• More control over behavior
• More work!
• Success depends on quality of rules
• Example: (Google OR Salesforce) NOT LinkedIn
• Drawbacks
• Dependent on the richness of the taxonomy and collection of synonyms/keywords
• Creating and/or tweaking the rules for each category – can be onerous
Most popular taxonomy management suites include auto-classification modules
• With few exceptions, taxonomy tools are generally rules-based systems
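The example rule above, (Google OR Salesforce) NOT LinkedIn, can be evaluated with a few lines of code. The nested-tuple rule format and the function names below are assumptions made for this sketch; commercial rules engines use their own syntax and add proximity operators such as those in the LexisNexis list.

```python
import re


def tokens(text):
    """Return the set of lowercase words found in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(rule, words):
    """Evaluate a nested Boolean rule against a set of words."""
    operator, *operands = rule
    if operator == "TERM":
        return operands[0].lower() in words
    if operator == "AND":
        return all(matches(operand, words) for operand in operands)
    if operator == "OR":
        return any(matches(operand, words) for operand in operands)
    if operator == "NOT":
        return not matches(operands[0], words)
    raise ValueError(f"Unknown operator: {operator}")


# (Google OR Salesforce) NOT LinkedIn, read as "... AND NOT LinkedIn".
RULE = (
    "AND",
    ("OR", ("TERM", "Google"), ("TERM", "Salesforce")),
    ("NOT", ("TERM", "LinkedIn")),
)

for text in [
    "Google announced a new partnership",
    "Salesforce and LinkedIn signed an agreement",
    "No vendors are mentioned here",
]:
    print(text, "->", matches(RULE, tokens(text)))
```

When a document is misclassified, the rule that fired can be inspected and adjusted directly, which is the "more control, more work" trade-off described above.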
Auto-classification Systems – Other
Linguistic
• No commitment to a taxonomic tree, based on parts
of speech and their relationships, typically not
scalable
• Related to parts of speech, syntactic parses, or
semantic interpretations
Machine Learning
• Subfield of computer science (CS) and artificial
intelligence (AI) that deals with the construction and
study of systems that can learn from data, rather
than follow only explicitly programmed instructions
Semantic Networks
• Refers to a set of relationships between concepts
and words, including parts of speech and
real-world relationships
• These can include rules of various types – not just
Boolean
Pros and Cons of Most Widely Used Classification Techniques
• Most widely used are statistical and rules-based
• Several are a combination of both statistical and rules-based
Statistical
• Work involved in building good training sets
• If there’s a problem, can be difficult to diagnose and rectify/retrain
• Machine learning can augment accuracy or lead to pollution (accuracy can wax and wane)
Rules-based
• Work involved in building exhaustive rules (mitigated by taxonomy tools)
• If there’s a problem, go back to the rule set and tweak
• System doesn’t evolve without new rules, but high degree of control (accuracy mostly increases)
Auto-classification Systems – What do they do?
Not all classification solutions are created equal!
Document Preparation
• Split into language blocks (paragraphs, headings), formatting, layout
Parsing
• Entity extraction
• NLP: parts of speech, phrases
• Terms, variants
Weighting
• Frequency
• Location in text, phrase
• Proximity
• Combination
• Format of text
Classification
• If threshold reached
• Can influence search results
This is where rules vs statistics come into play…
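The parsing, weighting, and classification stages can be sketched in miniature as follows. Everything specific here is an assumption for illustration: the category term lists, the weight of 3.0 for heading terms, and the threshold of 4.0 are invented, and only two of the weighting factors (frequency and location in text) are modeled.

```python
import re

HEADING_WEIGHT = 3.0  # assumed: a term in a heading counts three times
BODY_WEIGHT = 1.0
THRESHOLD = 4.0       # assumed minimum score before a category is assigned

# Hypothetical category vocabularies.
CATEGORY_TERMS = {
    "Records Management": {"retention", "disposition", "record", "records"},
    "Data Privacy": {"privacy", "personal", "breach"},
}


def parse(text):
    """Parsing stage: reduce a block of text to lowercase terms."""
    return re.findall(r"[a-z]+", text.lower())


def score(heading, body, terms):
    """Weighting stage: combine term frequency with location in the document."""
    heading_hits = sum(1 for word in parse(heading) if word in terms)
    body_hits = sum(1 for word in parse(body) if word in terms)
    return heading_hits * HEADING_WEIGHT + body_hits * BODY_WEIGHT


def classify(heading, body):
    """Classification stage: keep every category whose score reaches the threshold."""
    assigned = {}
    for category, terms in CATEGORY_TERMS.items():
        category_score = score(heading, body, terms)
        if category_score >= THRESHOLD:
            assigned[category] = category_score
    return assigned


result = classify(
    "Records Retention Schedule",
    "Each record has a retention period and a disposition action. "
    "Personal data is out of scope.",
)
print(result)
```

The single mention of "personal" is not enough to reach the threshold, so the document is not tagged as Data Privacy. Tuning the weights and threshold is where much of the ongoing management effort goes.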
What Types of Tools are Available?
There has been an influx of information governance tools in
recent years – know the difference between the problem you
are trying to solve and the solution you choose
• eDiscovery indexing engines that aim to index all of an
organization’s content across however many
repositories/applications it uses
• Records management tools that apply classification and
retention rules to content
• Email archive tools that are more ambitious than the
previous generation of email archives and offer features
supporting the auto-classification of email
• Plugins for SharePoint to bridge the gaps, often used for
enterprise search and content management
• Clean-up tools for shared drives
• Migration tools
Example – Compliance and Records Management
“More than 100,000 international laws and regulations are
potentially relevant to Forbes Global 1000 companies – ranging
from financial disclosure requirements to standards for data
retention and privacy. Additionally, many of these regulations are
evolving and often vary or even contradict one another across
borders and jurisdictions.”
Lorrie Luellig, of counsel, Ryley Carlock & Applewhite, PC
Auto-classification in Records Management
Has different requirements from general document classification systems or approaches
Classification
• Records management repository is typically the only definitive information management taxonomy managed by an organization
Declaration
• Conscious decision to determine what is a business record and what is not
• Goal is to remove decision making from the end user
• Requires classification and workflow
Retention Management
• Information governance
• Preserve, retain, move, dispose
• In-place retention management
  • Retention and disposition managed across multiple retention management repositories, applications, collaboration environments, file systems
  • No need to relocate the content
Security and Auditability
• Safe-haven for records
  • A guarantee of integrity, security, authenticity, and availability
• Must provide a full audit trail that can withstand legal scrutiny
Difference between auto-classification and auto-declaration
Implications of the Cloud
• Cloud is not often used as the records management repository, nor for
applications that can impact compliance, due to where and how information is
stored
• Storage issue – what country?
• Privacy/security issue
• Integration issue
• Compliance issue
• Infrastructure, licensing decision, not necessarily functional
BUT….
• Can be used for auto-classification for any metadata application such as search,
text analytics, social tagging, migration, identification of data exposures
• Entirely dependent on classification/taxonomy software
• Evaluate vendors who have an integrated solution
• Ask for production clients using the integrated solution
Implications Across the Organization
eDiscovery
• Legal fees, fines and damages could be reduced by 25% if companies applied best practices to records management, security and eDiscovery (AIIM)
Even if content is not declared a record, it still must be governed
Search
• Estimates indicate that end users spend
2.5 hours per day to find information
necessary to do their jobs (IDC)
• 85% of relevant documents are never retrieved
in search (IDC)
Security and Privacy
• 70% of data breaches are due to a mistake or
malicious intent by end users
(Ponemon Institute)
• 88% are attributed to negligence
(Wharton Information Security Best
Practices Conference)
The Problem of Course is the End User
“74% of organizations continue to depend on individuals to manually
comply with legal, regulatory, and record management requirements.
Given the projected growth and the inability of employees to
manually manage information, organizations need to start
automating the tasks associated with classifying, managing, and
disposing of information assets.”
Council for Information Auto-Classification (CIAC)
“It is simply not realistic to expect broad sets of employees to
navigate extensive classification options while referring to a records
schedule that may weigh in at more than 100 pages.”
Forrester Research/ARMA International Survey
Typical Enforcement Challenges – Business User
What Are the Typical Challenges?
Electronic information is growing at a rate of 30% to 60% per year – electronic
records typically constitute 90% of an organization’s records
• Users
• Training (rarely done; if done, a minimalist approach)
• Policy enforcement
• Reality – biggest stumbling block in classification is the end user
• Impact
• Classification can be subjective, erroneous, or non-existent if content is not
tagged correctly
• Impacts productivity
• Increases organizational risk
• End user classification – 20% to 80% accurate (typically more on the low end)
• Automated classification – 80% to 90% accurate, if tuned and managed
Typical Challenges for IT and Management
Using auto-classification with the goal of streamlining the records management application, and removing the end user where possible from the process, will require
• Updating policies
• Creating new processes and workflows
• Creating content based retention schedules
• Identifying exemplar documents
• Creating test programs
• Audit and review processes
• Ensure overall transparency
Transitioning to metadata, auto-classification, and taxonomy technologies will require one or more of the items listed, depending on the application
We still have one more missing piece!
Taxonomies
Types of Taxonomies
Taxonomy is sometimes called the ‘world’s oldest profession’ (Aristotle)
Swedish botanist Carolus Linnaeus is regarded as the father of taxonomy
List, Picklist, Controlled Vocabulary, Authority Files
List of lead or preferred terms, selected by the end user,
may or may not have relationships among the terms, can
include a synonym ring
Synonym Lists
The use of synonyms allows one concept to be instantiated
as the same as the other, but still allows a term to be
preferred over another
Hierarchical
Each content item resides in only one category, referred to
as a ‘tree’
• Piano
• Musical instrument
© Concept Searching 2014
Types of Taxonomies
Polyhierarchical, Faceted, Thesauri
Content items can exist in more than one category,
more structured controlled vocabulary, provides
information about each term and its relationship to
other terms, features of a hierarchical taxonomy plus
associative relationships
• Piano
• Musical instrument
• Stringed instrument
• Percussion instrument
Ontology
Multiple taxonomies with additional relationships
added to specify concepts within a domain
Sources: Marlene Rockmore – The Taxonomy Blog, and Heather Hedden, author of ‘The Accidental Taxonomist’
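The difference between a strict hierarchy and a polyhierarchy can be made concrete with a small data structure. The sketch below stores each term's broader terms; placing Piano under both Stringed instrument and Percussion instrument is one reading of the slide's example, and the synonym entries are hypothetical.

```python
# Strict hierarchy: every term has at most one broader term (a tree).
HIERARCHY = {
    "Piano": ["Musical instrument"],
    "Musical instrument": [],
}

# Polyhierarchy: a term may sit under more than one broader term.
POLYHIERARCHY = {
    "Piano": ["Stringed instrument", "Percussion instrument"],
    "Stringed instrument": ["Musical instrument"],
    "Percussion instrument": ["Musical instrument"],
    "Musical instrument": [],
}

# Synonym list: variant -> preferred term (entries are illustrative).
SYNONYMS = {
    "pianoforte": "Piano",
    "grand piano": "Piano",
}


def is_strict_hierarchy(broader):
    """True when no term has more than one broader term."""
    return all(len(parents) <= 1 for parents in broader.values())


def ancestors(term, broader):
    """Every broader term reachable from the given term."""
    found = set()
    stack = list(broader.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(broader.get(parent, []))
    return found


def preferred_term(label):
    """Map a synonym to its preferred term; unknown labels pass through."""
    return SYNONYMS.get(label.lower(), label)


print(is_strict_hierarchy(HIERARCHY), is_strict_hierarchy(POLYHIERARCHY))
print(sorted(ancestors("Piano", POLYHIERARCHY)))
print(preferred_term("Pianoforte"))
```

An ontology would go further by adding typed relationships between terms, which a plain broader-term mapping like this cannot express.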
Evaluating Solutions
Evaluating Solutions
Solution Components *Basic*
• Semantic metadata generation, automated classification, taxonomy management
  • Stay away from lengthy implementations, required use of vendor consultants, stand-alone applications, use of new languages
• Look for a flexible solution that addresses more than ‘just’ records management and is extendable to other applications, such as search, privacy, migration, and text analytics
  • Accomplished through the development of an enterprise metadata repository
• Evaluate solution and richness of function as well as ease of use
  • Who will maintain? IT or Business?
  • How easy for Subject Matter Experts to contribute?
  • Identify how the taxonomy tool integrates with other applications and how this reduces time and effort, yet delivers high quality
Records Management *As an example, each application will have its own benefits*
• Automatic identification and declaration of documents of record as content is created
or ingested
• In-place retention management option
• Elimination of end user tagging
• Cloud integration
• Workflow capabilities including actions on documents/records
A Few Questions to Ask to Get You Started
• How often should a drive or repository be indexed for
new content?
• Does the system need to perform in real time?
• Should old content be re-classified to determine if it
should be classified according to a different
category?
• How are classification errors solved?
• Should the user have the ability to override the
classification assignment?
• Who should manage the system – IT or Business?
Or both?
• How long should deployment and ongoing
management take?
• Can end user involvement be eliminated?
• How does the system handle vocabulary and/or
language ambiguities?
Recommendation – Enterprise Metadata Repository
“The metadata infrastructure provides the critical glue that binds the
information infrastructure to the underlying IT infrastructure.
Sound information governance practices would take advantage of the
metadata infrastructure to ensure that content and data are managed
consistently and adhere to written policies, across on-premise and
cloud based environments.”
2010 IDC Digital Universe Study
The Advantages
• Ability to develop a single repository of organizationally relevant metadata to be made available to any application that requires the use of metadata
• Elimination of costs and errors associated with end user tagging
• Normalization of content across functional and geographic boundaries to remove ambiguity in vocabulary
• Metadata managed and changed in one place
• Ability to apply policy consistently across diverse repositories and applications
• Provides flexibility to rapidly make changes to the repository for regulatory compliance where changes are immediately available for use by applications
Calculating ROI
Real World Business Drivers
• Three areas
• Business Impact
• IT Impact
• Process Impact
• Most enterprises will
see highest ROI from
Process Impact
ROI Objectives
The primary research collected in this reference white paper illustrates
• There are several benefits spanning the categories of IT, process, and business impact, all of which have moderate to high levels of business impact
• The leading ROI value drivers are related to processes and effective decision making
• The value of process and IT drivers is manifested via their business impact
Pique Solutions – Connected Value: The ROI Benefits of Business Critical SharePoint
Real World Savings
The Business Solutions
• Search
• Records Management
• Migration
• Data Security
• eDiscovery/Litigation Support, FOIA
• Information Governance
• Text Analytics
• Business Social Networking
• Collaboration
• Content Management
• Metadata Management
Pique Solutions
Freebie
If you would like to calculate the ROI, including soft
and hard benefits for a specific application solution,
please contact us on
[email protected]
or 1 703 531 8564
Questions?
Thank You
Don Miller
Vice President of Sales
Concept Searching
[email protected]
Twitter @conceptsearch
Smart Content Framework™
The whole is greater than the sum of its parts
Metadata driven applications - conceptClassifier for SharePoint platform has been
deployed by clients in diverse industries to automatically generate metadata and use
that metadata to apply and enforce information governance policies
Building an Information Governance Concept Index - Example
Concept Searching has a unique approach to ensure success
• Concept Searching’s unique statistical concept identification underpins all technologies
• Multi-word suggestion is explicitly more valuable than single term suggestion algorithms
Concept Searching provides Automatic Concept Term Extraction
[Diagram: the single terms Triple, Heart, and Bypass shown with unrelated associations: Baseball, Three, Organ, Center, Highway, Avoid]
• conceptClassifier for SharePoint will generate conceptual metadata by extracting multi-word terms that identify ‘triple heart bypass’ as a concept as opposed to single keywords
• Metadata can be used by any search engine index or any application/process that uses metadata.
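To show why multi-word terms carry more meaning than single keywords, the sketch below finds phrases of two or three words that recur across documents. This is a generic n-gram illustration, not Concept Searching's algorithm, which the slides do not describe; the stop word list and the sample sentences are invented.

```python
import re
from collections import Counter

# Assumed stop words; phrases containing any of them are skipped.
STOP_WORDS = {"a", "an", "the", "of", "to", "and", "for", "after", "was", "is", "in"}


def candidate_phrases(text, max_words=3):
    """Yield every run of 2..max_words consecutive words with no stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    for size in range(2, max_words + 1):
        for start in range(len(words) - size + 1):
            phrase = words[start:start + size]
            if not any(word in STOP_WORDS for word in phrase):
                yield " ".join(phrase)


def recurring_phrases(documents, min_documents=2):
    """Phrases that appear in at least min_documents different documents."""
    counts = Counter()
    for document in documents:
        counts.update(set(candidate_phrases(document)))
    return {phrase for phrase, seen in counts.items() if seen >= min_documents}


DOCUMENTS = [
    "The patient recovered after a triple heart bypass.",
    "Triple heart bypass surgery was scheduled for Monday.",
    "The highway bypass opened after the baseball game.",
]

concepts = recurring_phrases(DOCUMENTS)
print(sorted(concepts))
```

A single-keyword index would treat the third document as related because it contains "bypass"; the multi-word phrase separates the medical concept from the highway.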
Case Study – Intelligent Search
Situation:
• Nonprofit organization that contributes to the prevention and cure
of cancer
• More than 30,000 users
• Outpatient treatment programs that record more than 328,300
visits a year
Challenge:
• Portal to enable patients to access information relevant to their
specific health situations
• Accurate, medically sound, and secure information necessary
• Aggregate content from internal and external sources
Solution:
• conceptClassifier for SharePoint platform
• SharePoint 2010
• Microsoft FAST Search
• Integrated solution with partner Aeturnum
“With more than 30,000 current users,
the MyMoffitt Patient Portal has seen
significant growth, and of the new
patients that come to Moffitt, 87%
register for a patient portal account. All
developments and enhancements are
about improving the patient experience.”
Jennifer Camps, Director of Portal
Technologies and Data Management,
Moffitt Cancer Center
Read the Case Study
Benefits:
• Accuracy of search
• Relevance of results
• Confidence in data
• Control and trust
Case Study - Records Management
Situation:
• UK County Council
• Serves approximately 1 million citizens
Challenge:
• Management of digital records
• Reduction in paper records
• Reduce physical and digital storage
Solution:
• conceptClassifier for SharePoint platform
• SharePoint Search
• SharePoint Records Management
• Managed Metadata Services
“One way to manage records
whatever their medium”
Benefits:
• Migration of thousands of documents from File Shares to SharePoint
to deliver an integrated view of all information
• Cleanse content existing in multiple repositories
• Real time identification and classification from all sources of ingestion
(fax, scanned, and email) as well as from all semi-structured and
unstructured content
• Improved search to quickly identify value of document/record in ‘plain
English’
Case Study - Search, Data Privacy, Records Management, Migration
Situation:
• Budget of $6.9 billion
• Over 60,000 users
• Runs 75 hospitals and clinics providing care to more than 2.6 million
beneficiaries
Challenge:
• Data Privacy
• Intelligent Migration
• Before and after
• Records Management
• Pilot project: 72,000 Site Collections, 5,300 retention codes,
classify 200,000 documents per hour with minimum resources
Solution:
• conceptClassifier for SharePoint platform
Benefits:
• Automatic tagging based on organizational vocabulary and descriptors
• Automatic routing and the ability to change the SharePoint content type
• Eliminated manual tagging, protecting content from unauthorized access and
portability
• No security exposures or breaches in 4 years
“Concept Searching’s Taxonomy Manager provides our Subject Matter Experts with a user friendly web interface enabling the development of controlled vocabularies that can be used to filter search results and auto-classify content to folder structures.”
J.D. Whitlock, Lt Col, USAF,
MSC, CPHIMS
Air Force Medical Service
Read the Case Study
Case Study - eDiscovery, Litigation Support, FOIA
Situation:
• Legal Department
• FOIA Processing
Challenge:
• Reduce costs associated with litigation support
• Overabundance of content - much was not tagged
• Increase relevance in finding appropriate information
Solution:
• conceptClassifier for SharePoint
Benefits:
• Vocabulary normalization
• Enables ‘concept based’ searching and eliminates the
construction of complex queries
• Removes the ambiguity in content
• Enables the identification of new information to be captured and
identified early in the discovery process and immediately made
available to the discovery team
• Eliminates manual tagging unless authorized
Case Study – Intelligent Migration
Situation:
• Multiple Clients
Challenge:
• Simply moving content to a new location did not provide any benefits
• Human error and time were too costly
• Quantity of content too great
Solution:
• conceptClassifier for SharePoint platform
conceptClassifier for SharePoint identified 66,000 duplicates out of a total of 270,000 documents, representing a 24% reduction in disk space.
Benefits:
• Cleanses irrelevant and unnecessary documents
• Dramatically reduces the time for migration
• Eliminates manual intervention
• Improves the outcome enabling improvements in:
• Search
• Records management
• Data privacy
• eDiscovery and litigation support
• Text analytics
Automotive Parts Company
The goal was to improve search for 147,000 business users, which required migrating literally millions of documents. conceptClassifier for SharePoint was used pre and post migration, and to enable concept based searching with their existing search engine and taxonomy based search after the migration.
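The duplicate figure quoted in this case study (66,000 of 270,000 documents) can be checked with simple arithmetic, and the idea of finding exact duplicates before a migration can be sketched with content hashing. Hashing is an assumption chosen for illustration; the slides do not say how conceptClassifier identifies duplicates, and a hash only catches byte-for-byte copies. The sample documents are invented.

```python
import hashlib


def find_duplicates(documents):
    """Return names of documents whose content exactly matches an earlier one.

    `documents` maps a document name to its raw bytes.
    """
    first_seen = {}  # content hash -> name of the first document with that content
    duplicates = []
    for name, content in documents.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            duplicates.append(name)
        else:
            first_seen[digest] = name
    return duplicates


# Hypothetical file share contents.
SAMPLE = {
    "policy_v1.txt": b"Records are retained for seven years.",
    "copy_of_policy.txt": b"Records are retained for seven years.",
    "minutes.txt": b"Meeting opened at 9am.",
}
print(find_duplicates(SAMPLE))

# Share of duplicates reported in the case study.
duplicate_share = 66_000 / 270_000
print(f"{duplicate_share:.1%} of documents were duplicates")
```

The share works out to roughly 24% of the document count, in line with the reduction quoted on the slide.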
Case Study – Automatic Tagging, Policy, and Governance
Situation:
• Budget of $6.9 billion
• Over 60,000 users
• Runs 75 hospitals and clinics providing care to more than 2.6 million
beneficiaries
Challenge:
• Data Privacy
• Intelligent Migration
• Before and after
• Records Management
• 72,000 Site Collections, 5,300 retention codes, classify 200,000
documents per hour with minimum resources (Proof of Concept)
Solution:
• conceptClassifier for SharePoint platform
The US Air Force deployed the
technologies to implement data
privacy protection processes and
after five years has not had a data
breach
Read the Case Study
Benefits:
• Automatic tagging based on organizational vocabulary and descriptors
• Automatic routing and the ability to change the SharePoint content type
• Eliminated manual tagging, protecting content from unauthorized access and
portability
• No security exposures or breaches in 5 years, since deployed
Case Study – Social Tagging and Collaboration
Situation:
• 8,000 registered users
• 800 virtual workspaces and communities of practice
• Share and disseminate mission critical information and technical
excellence
• Collaboration
Challenge:
• Poor search results
• No collaboration capabilities
• Lack of security and control of content assets for each end user
• No robust knowledge management capabilities (wikis, blogs,
collaboration, poor search, etc.)
Solution:
• Partner Triune Group provided the knowledge management and
collaboration platform
• conceptSearch embedded into the solution
Benefits:
• Accuracy of search and relevant results
• Confidence in data
• Increase in quality of decision making
“With the integration of the Concept
Searching intelligent search capability,
Triune Group was able to provide us with
a robust and scalable collaboration tool
that delivers not only powerful advanced
searching capabilities, but also a
controlled and secure environment.”
Brian Follen
NASA Safety Center (NSC)
KnowledgeNow Program Manager
Client received KMWorld ‘Reality Award
Winner’ for successful deployment and
recognition for a superior Knowledge
Management Solution
Read the Case Study