
Shedding Light on Building and Implementing Successful Taxonomies
A records taxonomy is a corporate-wide schema for the identification, retrieval, and disposition of all business records. It provides a system that allows people to easily store and retrieve the documents they use in their day-to-day business. (See the sidebar on page 43 for an example.) Although the term is often used in a limited sense to describe the hierarchical classification structure used for storing documents or records, in a modern technological environment it should include more.
For all the importance assigned to taxonomies, few studies
have shed light on what taxonomy design methods are
effective, how much should be spent on the design, or how
to make designs more successful. A September 2010 survey,
“Records Classification Systems – Taxonomies,” included
approximately 4,000 members from the United Kingdom
(UK), North America, and Records Management Association
of Australasia ListServes. The survey results, which included
more than 170 responses worldwide, provide clues about
how to improve taxonomy design methods and implementation planning to significantly increase their success rate.
James Connelly, CRM
[Figure: Records Taxonomy, made up of Ontology (Relationships), Classification (Hierarchy & Clustering), and Naming (Thesauri and Metadata)]
Taxonomies should comprise classification schemes and relationship
models, as well as detailed naming
conventions, as the figure shows.
With the advent of enterprise content
management (ECM) tools, we can
now easily manage “function-based”
groups of records within repositories
MAY/JUNE 2011 INFORMATIONMANAGEMENT
that allow for proper disposition and,
at the same time, we can display the
documents or records in user-friendly
structures that can include personalized taxonomies, faceted relationships, or business team-related
arrangements.
Along with policy and procedures, a
taxonomy is one of the most crucial elements of a records program. In fact, it
is a linchpin. Once it is built, other key
program elements can be established,
including a records retention schedule
and disposition program, a vital
records program, and a program to
manage personal information banks.
Without a taxonomy, these elements
are difficult to establish and maintain.
What were the main reasons your organization decided to build a records classification system/taxonomy? (Respondents could select more than one reply.)
Improved Records Control or System         73%
Improved Lifecycle Management              62%
Following RM Best Practices                60%
Enterprise Content Mgmt. Implementation    52%
Mission-Critical                           49%
Productivity Improvement                   35%
Improved Hard Copy Management              31%
Improved File Server Management            26%
Cost Savings                               22%
Needed to Improve SharePoint               18%
Providing Rationale for Building a Taxonomy
It is interesting that the most commonly selected rationales, “better access and control” and “improved
lifecycle management,” are traditional
and embody basic records management principles.
Access and Control
During the design of a taxonomy, it
is common to identify all business
records in the data-gathering process.
Simply knowing their location and
ownership gives an organization better control over existing records. Also,
providing a structure that simplifies storage and retrieval enhances access to
individual records or documents.
Improved Lifecycle Management
A taxonomy is always the foundation of a records retention schedule.
Legislation and policies often mandate recordkeeping for records that
support specific business functions.
By grouping similar business functions, it becomes easier to assess the
need for retention, select an appropriate retention period, and make
retention periods and “event triggers” more consistent throughout
the organization.
A Taxonomy Example
For the purpose of this example, assume that there are only two major business functions involved in managing vehicles: fleet management and risk management. A functional breakdown may look like the following:
Fleet Management                             Risk Management
Use (logs) & ownership (by business unit)    Vehicle insurance (policies) (by policy #)
Maintenance of vehicles (by year & make)     Vehicle accidents (by accident date)
Vehicle disposal (by disposal date & VIN)    Vehicle claims (payments) (by VIN)
Metadata elements that may be captured with respect to vehicles may include both “common” and “unique” identifiers: Vehicle Identification Number (VIN), Year of Manufacture, Make, Model, Year of Acquisition, Year of Disposal, Disposition Status (e.g., sale, transfer, and write-off), Claim No., and Accident Date.
Series Structure
Each records series may require folders and documents to be kept in different order. Vehicle maintenance may be kept by vehicle “Year of Manufacture” and “Make” for ease of reference, whereas vehicle accidents may be kept by “Incident/Accident Date.” This is why user involvement is so crucial in taxonomy design. Each business function must be approached and discussions held to help determine what hierarchical order will assist them in conducting business and in retrieving information.
Metadata
Each records series may need different metadata, but if common metadata can be found and used, the opportunity exists to ensure that all documents related to a specific vehicle can be easily located. In such a case, should there be an injury accident and litigation, legal discovery would be simple. Common metadata can be used within “folder names” in both hard copy and electronic environments or “document names” within a server environment. Naming conventions such as these are critical to successful implementations.
Ontologies
Note that the responsibility of records analysts does not end with the structural design. It needs to be carried on to metadata elements so that content management systems can be taken advantage of with respect to workflow or Really Simple Syndication (RSS) feeds. With proper metadata and naming, the relationships between records series can be maintained. In the above example, if “Year of Manufacture,” “Make,” and “Model” are used as naming conventions in ALL series, searches across the entire taxonomy will allow for easier identification of all information related to a specific vehicle, which could be crucial if there were litigation related to an injury accident.
Business Case Rationale
On the other hand, creating a business case for taxonomy development appears necessary. Expenditures on taxonomy development do not appear adequate for such a critical component of an organization’s RM infrastructure. (See the section on costs and budgeting.)
It would be logical also to play up the improvements in productivity. A simple return on investment (ROI) study would show that an intuitive taxonomy would considerably reduce the time taken to store and retrieve documents. Here is an example using assumptions:
Assume that each staff member creates about 800 documents a year (a little more than 3 documents x 250 workdays). Each document will have to be stored, so there will be 800 storage actions. If only 20% of documents are ever retrieved, there will be 160 retrievals. With an inadequate classification system, it could take as long as two minutes to name and store each document, and it could easily take 10 minutes to retrieve a document.
An intuitive system will certainly take less time to use. A conservative estimate is that for each of the 960 combined storage and retrieval actions, 15 seconds can be saved, for a total savings of 14,400 seconds, which is 240 minutes, or four hours, per year. At an average wage (including benefits) of $30 an hour, each employee would save $120 each year, so an organization of 1,000 people would potentially save a total of about $120,000.
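The ROI arithmetic above can be reproduced in a few lines. The following is a minimal Python sketch that uses only the figures stated in the example; the function name, parameter names, and dictionary keys are illustrative assumptions, not part of the survey.

```python
def annual_roi(docs_per_year=800, retrieval_percent=20,
               seconds_saved_per_action=15, hourly_wage=30, staff=1000):
    """Return the time and cost savings implied by the stated assumptions."""
    storage_actions = docs_per_year                        # every document is stored once
    retrieval_actions = docs_per_year * retrieval_percent // 100
    total_actions = storage_actions + retrieval_actions
    seconds_saved = total_actions * seconds_saved_per_action
    hours_saved = seconds_saved / 3600                     # per employee, per year
    savings_per_employee = hours_saved * hourly_wage
    return {
        "retrieval_actions": retrieval_actions,
        "total_actions": total_actions,
        "seconds_saved": seconds_saved,
        "hours_saved": hours_saved,
        "savings_per_employee": savings_per_employee,
        "organization_savings": savings_per_employee * staff,
    }


result = annual_roi()
```

Changing a single input, such as `seconds_saved_per_action`, shows how sensitive the business case is to each assumption.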
Choosing a Classification System Type
It is perhaps a tribute to ISO 15489
that functional classification continues to be the choice of most organizations. But in the last few years, the
survey shows a trend toward hybrid
systems.
Function-Based Taxonomies
Function- and activity-based systems do assist records and archives
staff in managing records over long
periods of time. They also ensure that
all aspects of a business are included
within the taxonomy. But there are
limits to strict functional systems.
An analysis of the survey results
shows that functional designs are
quite effective when implementing
large-scale ECM projects in large organizations. It appears that the uniformity of the functional design at
high levels creates consistent classification across large organizations.
However, it is also evident that
smaller and unique business units
may not gain much advantage from
the ECM project, as these broad functional classification systems appear
to cause problems at more granular
levels.
Hybrid Taxonomies
In the late 1990s, the limitations of
functional systems became apparent
in systems that had been recently introduced. This was readily apparent
in matrix-based organizations (i.e.,
those with a number of diverse business functions), but it was even more
so in case file systems where a number of business functions related to a
particular project, activity, or case file.
Taxonomies for projects or programs
that are matrix-based may need considerable adaptation. Also, case files,
which may not be linear in nature,
may cause problems for strict functional designs.
For example, when a purchasing
organization’s business unit needs to
acquire new software, it may conduct
a needs analysis; do market research;
prepare a request for proposal; interview prospective vendors; analyze
proposals; negotiate contracts; acquire and test software; and finally
approve a purchase. Although there
are eight distinct business functions,
documentation is often retained by
purchasing managers in a “project-based” folder for ease of reference
while the software is being acquired.
In such a situation, the taxonomy
may need to adjust its top-level functions, while retaining function activity structures within a “case” file.
Note that ECM solutions can allow
for function-based libraries while allowing individual business units to
view documents from a number of
functions within a case file folder.
This allows for a modified or personal
taxonomic display while retaining official records within a strict functional structure.
Subject-Based Taxonomies
Subject systems were popular in
the 1970s and 1980s. These were
based on library or encyclopedic-style
systems where records are grouped
by topic rather than by function. In
the sidebar example, the risk management function of the organization
managed vehicle insurance, accidents, and claims. In a subject-based
system, the topic might be “vehicles,”
and all six records series cited would
be grouped together irrespective of
who was using the records or how the
records were being used.
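The difference between the two groupings can be made concrete with the six records series from the vehicle example. This is a minimal Python sketch; the data structure and function names are assumptions made for illustration only.

```python
from collections import defaultdict

# (records series, owning business function), taken from the vehicle example
SERIES = [
    ("Use (logs) & ownership", "Fleet Management"),
    ("Maintenance of vehicles", "Fleet Management"),
    ("Vehicle disposal", "Fleet Management"),
    ("Vehicle insurance (policies)", "Risk Management"),
    ("Vehicle accidents", "Risk Management"),
    ("Vehicle claims (payments)", "Risk Management"),
]


def group_by_function(series):
    """Function-based view: each series sits under the function that uses it."""
    groups = defaultdict(list)
    for name, function in series:
        groups[function].append(name)
    return dict(groups)


def group_by_subject(series, subject="Vehicles"):
    """Subject-based view: every series collapses under one topic heading."""
    return {subject: [name for name, _ in series]}


by_function = group_by_function(SERIES)
by_subject = group_by_subject(SERIES)
```

In the subject-based view, the link between a series and the function responsible for it is lost, which is one reason retention decisions become harder to make.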
The advantage of subject-based systems is that they are easier to understand. But they often lead to classification errors and poor retention schedules. Also, they tend to be arbitrary and are frequently changed by new staff who prefer to use their own terminology.
Types of Classification Systems
Which type of classification system does your organization use or wish to develop?
Function-Based                               52%
Hybrid-Based (function, subject, project)    38%
Subject-Based                                10%
Project-Based                                1%
Choosing an Approach to Building Taxonomies
From these results and participant
comments, it is evident that designs
are following an in-house process similar to the business process analysis
that was used in the 1980s to assist in
the design of computer systems.
What methodology was used to develop the classification system?
Proprietary (in-house)                    37%
Proprietary (consultant)                  15%
Adapted Existing Classification System    14%
DIRKS (interview approach)                13%
DIRKS (user focus groups approach)        12%
Other                                     10%
Gathering Background Data
Considerable effort is required to
gather background data. Designers
must have access to organization
charts, job descriptions, websites, business models, and strategic plans to
help them understand the business
functions of the organization they are
trying to assist. Several participants
also commented on the support
received from their IT groups as the
design work parallels and often
complements business architecture
designs.
Shelf or high-level inventories are
needed to identify all hard copy
records, volumes, subject areas covered, existing identification schemes,
and naming conventions. Electronic
inventories should identify folder
structures, document volumes, and
storage space used. This volume information is crucial for planning and
costing implementations.
Armed with such information, designers frequently conduct interviews
to determine what business functions
are supported and how documents
should be stored. An interview can certainly elicit information as to business
functions, but it can also introduce designer bias. The classification system
often changes from supporting users to
supporting the records management
program, as reflected in this survey
comment, which represents a common
refrain of many participants:
“The classification system was
designed by records managers for
records managers, but unfortunately does not provide actual
business value to anyone else in
the organization. … It is not
merely a matter of change management and helping staff learn
a new way of working in adopting the classification scheme, it
actually works quite actively
against the information flows
within the organization. We are
now just starting to implement
our EDRMS … systems, which
use the classification scheme,
and it has become a problem –
the classification scheme will
probably work on the EDRMS
system, which is managed and
controlled centrally by records
managers, but on the more dispersed systems, including the
file-share network drives, it will
struggle.”
Insistence on meeting the needs of
the records community leads inevitably to failure or, at best, tepid acceptance of the new system.
This survey shows that relying on
user-based focus groups involving
management, professional, and technical staff is a more effective process
than one-on-one interviews. By working with three or four people from a
business unit, the process becomes inclusive and, in some cases, empowers
the users. A designer can control structure and format but still give the business unit the opportunity to build its
own system.
Adapting Taxonomies
It was somewhat surprising that a
considerable number of organizations
adapted existing systems from similar
organizations. Although some may see
this as a time-saver, it is more likely to
render the taxonomy unacceptable to
staff, who may say such things as, “I
wasn’t consulted,” “That’s not the way
we do things in this company,” or “I
don’t understand the reasoning behind
this design.”
A key element of taxonomy design is
obtaining buy-in. Those individuals
who don’t involve users in the process
and give them the opportunity for
input will not have their cooperation
during implementation.
On the other hand, having knowledge of other organizations’ designs
should be part of the research. This allows designers to offer users alternatives, design options, or different ways
of organizing their records. But, to
take a model and impose it on an organization is often a recipe for failure.
Budgeting for Design and Implementation
A cross-tabulation of responses reveals that larger organizations (more than 5,000 staff) either planned to spend or did spend more on their taxonomies (greater than $100,000) and were quite successful with both design and implementation.
However, the average organization, with a staff of 1,000 to 5,000, spent less
than $25,000 on either design or implementation. The caveat appears to
be that organizations should understand that designing a taxonomy is not
a clerical process. It includes business
process analysis, use of communication strategies and tools, and legislative and compliance reviews, as well as
engaging management, professional,
and technical staff in a work-altering
project.
Design Costs
Completing a design properly in a large organization (i.e., 1,000-plus people) may involve a project coordinator and two or three teams of designers, each of which could include a subject matter expert as well as an assistant who can ensure that the design is transcribed properly and that individuals’ needs or concerns are addressed.
During a design process or focus
group, the designer is often fixed on
the task or the results that are being
developed and can neglect individual
concerns, which can result in an overlooked individual becoming a thorn in
the side.
Each business unit may require
several days of data gathering, as well
as about a half-day focus session and
several days to compile the design and
provide follow up to portions of the design that are complex. If there are 30
to 40 business units, this could
amount to 500-plus person-days of
work. Using a mix of existing staff
and, perhaps, consultants, could
amount easily to $150,000 worth of
effort.
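The figures above imply a per-unit effort and a daily rate that the text does not state directly. The Python sketch below derives them from the quoted totals and wraps the same multiplication in a reusable function; the function name and the derived per-unit values are assumptions for illustration, not survey findings.

```python
# Totals quoted in the text
TOTAL_PERSON_DAYS = 500
TOTAL_COST = 150_000

# Values implied by those totals (derived here, not stated in the article)
implied_day_rate = TOTAL_COST / TOTAL_PERSON_DAYS      # dollars per person-day
implied_days_at_40_units = TOTAL_PERSON_DAYS / 40      # person-days per business unit
implied_days_at_30_units = TOTAL_PERSON_DAYS / 30


def design_cost(business_units, person_days_per_unit, day_rate):
    """Estimate design cost as units x person-days per unit x daily rate."""
    return business_units * person_days_per_unit * day_rate


estimate = design_cost(40, implied_days_at_40_units, implied_day_rate)
```

An organization can substitute its own unit count and staffing rates to see how quickly the estimate moves away from the $25,000 that most respondents reported spending.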
This survey suggests that timeframes and budgets for taxonomy designs should be commensurate with
the anticipated productivity gains and
value-added improvements that a
well-designed and user-accepted taxonomy offers. In other words, if the
ROI anticipates productivity gains of
$100,000, an organization should be
prepared to pay at least that for design and a similar amount for implementation.
Implementation Costs
Implementation costs within a
server structure are usually low. It
takes a team of two usually less than
a half-day to work with a business
unit to convert its electronic files to a
new system. Although some folder or
document-naming changes may take
longer, they can usually be assigned to
clerical staff to ensure the completeness of the change-over. The methodology for such conversions, although
technical, is usually simple to arrange
with IT staff.
Hard copy implementations normally take one linear foot of records
per day with a good classifier at the
helm. If case files are involved, speeds
of three linear feet per day can be
reached. Much of this estimate depends on the complexity of the records,
as well as the skills of the assigned
staff. So, for every 1,000 linear feet to
be converted, assume at least 600 days
of implementation time.
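The conversion rates above give a range for any volume of hard copy. The Python sketch below computes the bounds for 1,000 linear feet; the 400/600 split between general records and case files is a hypothetical mix, chosen only because it is one combination that yields 600 days, and the article does not say what mix its estimate assumes.

```python
GENERAL_FEET_PER_DAY = 1      # rate quoted for ordinary records
CASE_FILE_FEET_PER_DAY = 3    # rate quoted for case files


def hard_copy_days(general_feet, case_file_feet):
    """Classifier-days needed to convert the given volumes of hard copy."""
    return (general_feet / GENERAL_FEET_PER_DAY
            + case_file_feet / CASE_FILE_FEET_PER_DAY)


all_general = hard_copy_days(1000, 0)    # slowest case: no case files
all_case = hard_copy_days(0, 1000)       # fastest case: only case files
mixed = hard_copy_days(400, 600)         # hypothetical mix of record types
```

The spread between the slowest and fastest cases shows why the shelf inventory needs to record what kinds of records are held, not just how many feet of them.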
Determining how much should be
converted brings up the issue of legacy
data.
Legacy Data Costs
Of all the survey results, this was
the most surprising: 40% of respondents said they were leaving legacy
data behind. Of the remainder, 26%
were spending $50,000 or more. From
a simple cost analysis, it is understandable that organizations would feel
this way and try to minimize
how much information needs to be
brought forward into a new taxonomy.
As long as legacy data is accessible,
there may be no problem. For example,
if electronic records are moved into a
searchable archive drive and past hard
copy is carefully boxed, listed, and labeled – and if these repositories can
easily use the newly developed records
retention schedule – there may be little need to bring forward all the
records.
However, there are still dangers.
Litigation is often a concern, and since
legal discovery processes involve all
records, avoiding the implementation
of legacy data could be fatal to some organizations. It is clear that some form
of risk assessment must be done. At
minimum, each records series should
be reviewed. Anything that could put
the organization at risk should be assessed as to the cost of bringing it forward versus the potential for financial
risk should the organization be unable
to produce the document in the event
of litigation.
Quite simply, the value of bringing
forward documents/records to a new
structure or system must be balanced
against the costs of such an exercise.
Implementing a Taxonomy
Fifty-eight percent of replying organizations indicated that their plan
was to deploy their taxonomy across
the entire organization. Although most
organizations were simply planning
their roll-out, 20% of organizations said
they had completed their organization-wide roll-outs.
Once a design has been completed, it
is important to validate the design and
roll it out to each business unit as soon
as possible. Delays in implementation
can adversely affect the use of the new
classification system. In fact, many anecdotal comments indicated that many
of these roll-outs were bumpy at best.
Many expressed frustration at the
“stop-and-start” nature of the project.
Difficulties in implementation can
usually be traced to one of three things:
planning, validation, or training.
Planning
In the section “Budgeting for Design and Implementation,” there is
considerable information as to time,
effort, and resources required in an
implementation project. Most organizations underestimate the extent of
such a project.
At the outset of implementation,
the designers should be able to provide a list of supportive business
units. Also there should be a list of
those business units most in need.
From these lists, plan which business
units to address and in what order,
and then communicate this to all staff
so they know when to expect the implementation teams. Delays must
also be communicated to all staff. Periodic status
reports are an integral part of communication
strategies.
There should also be a large project
“task map” prominently displayed
that identifies each business unit and
the status of each implementation –
from the completed design, to the validated design, to the initial training,
to the support and follow-up needed
to ensure a successful implementation. Delays should be noted and the
reason for the delays addressed in a
“lessons learned” meeting at the end
of each business unit implementation.
Validating the Taxonomy
Will the taxonomy work? This is
such a fundamental question that it
is astounding that few organizations
focus on this. The most commonly
identified validation method indicated was to send a copy of the taxonomy to a business unit and ask them
to review it. Testing the classification
structure in a pilot project or sending
it to managers was not as effective as
a full review process.
Remember that most users are
not sure what to look for in the taxonomy.
If the structure is returned for review, ask the focus group or business
unit to identify perhaps three to five
commonly used documents. For each
document, ask the focus group or
business unit whether or not it is
immediately apparent where these
documents would be stored or found.
If difficulties are encountered, it is
often a result of inadequate or incorrect naming. Adapting records series
names at this juncture can significantly improve the implementation
success rate.
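The document-placement check described above can be tallied in a simple way. The following Python sketch is one possible approach, not a method reported by the survey: the function name, the 0.8 threshold, and the sample outcomes are all hypothetical.

```python
def findability_report(trials, threshold=0.8):
    """Summarize placement checks per records series.

    `trials` maps a series name to a list of True/False outcomes, where True
    means a reviewer saw immediately where the document belonged.
    """
    report = {}
    for series, outcomes in trials.items():
        if not outcomes:          # nothing tested for this series
            continue
        found_share = sum(outcomes) / len(outcomes)
        report[series] = {
            "found_share": found_share,
            # A low share suggests the series name should be reconsidered
            "review_name": found_share < threshold,
        }
    return report


# Made-up outcomes for illustration only
sample_trials = {
    "Vehicle accidents": [True, True, True, True],
    "Vehicle claims (payments)": [True, False, False, True],
}
report = findability_report(sample_trials)
```

Series flagged for review are candidates for renaming before roll-out, in line with the point that naming problems cause most of the difficulties.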
How was the records classification system validated?
Returned to Business Units    31%
Returned to Focus Groups      27%
Pilot Implementation          18%
Managerial Review             15%
None                          9%
Training Approach
Training is of fundamental importance when introducing a new taxonomy. Survey respondents used a
variety of approaches, but business
unit training followed by individual
and personal support was the most
common approach (27%).
Those 12% of respondents who used the “train the trainer” approach had much more successful implementations than those who relied on other approaches alone. (As with many of the statements in this article, this was determined by filtering the responses for each approach and reviewing the success rate of implementations.)
In this process, an individual from
each business unit is given extensive
training in the new system so he or
she has in-depth knowledge of the
new structure and a good understanding of:
• Why records or documents are grouped the way they are
• The relationships between records series
• The retention periods for groups of records
• The reasoning behind the selection of the retention period
That person becomes the “trainer”
for that business unit, conducting
the introductory session and following up with each staff member. This
approach essentially creates a “super
user,” a “champion,” or a “go-to” person in each business unit to help
anyone who is having difficulty finding or storing documents.
Identifying Critical Success Factors
It is interesting to note that 65%
of the “World” users rate their taxonomy development and implementation process as “excellent,” “very
good,” or “good,” while just 55% of
“North American” users did. There
are a number of reasons for this.
Early Involvement in Recordkeeping
The British Empire of the 19th
and 20th centuries had a tremendous impact on “records keeping”
across the globe, and, in particular,
the use of registry (records) offices
introduced formal records keeping to
many governments. As a result,
global records classification system
management has been slightly
ahead of North American systems
for some time.
More specifically, Australia was a
key driving force in the development
of ISO 15489-1 Information and
Documentation – Records Management – Part 1: General, the international standard for recordkeeping.
Both Australia and the UK were
quick to adopt the standard’s strong
recommendation for functional classification schema, and their longer
experience with taxonomy designs
and implementations has apparently led them to a better knowledge
of what works and what doesn’t.
Also, the survey revealed key differences in the approach the World
took in designing and implementing
taxonomies versus those North
America took.
User Participation and Management Involvement
The need for user participation is
clear; the global figure of 71% of
users participating in the design vs.
43% in North America is telling. By
filtering responses, the survey shows
that organizations had a 65-70%
success rate where user participation was a success factor.
By filtering responses again, the
survey shows that organizations had
a 75-80% success rate where management involvement was a success
factor. In a more detailed question
regarding management involvement, almost 60% of respondents
said they had received management
approval of their taxonomy outline
before proceeding with design and
implementation.
When both user participation and
management involvement were
identified as critical success factors,
the success rate jumped to 85%.
Although employing a communications strategy and design validation did not raise the success rate of taxonomies much above 85%, it appears these approaches also contribute to the success of projects.
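The filtering used to arrive at these success rates can be sketched as follows. This is a minimal Python illustration of the technique only: the record layout, the function name, and the five sample responses are made up and do not reproduce the survey data or its percentages.

```python
def success_rate(responses, required_factors):
    """Share of successful implementations among responses that cite
    every factor in `required_factors`; None if no response matches."""
    matching = [r for r in responses if required_factors <= r["factors"]]
    if not matching:
        return None
    return sum(r["successful"] for r in matching) / len(matching)


# Made-up responses for illustration only
responses = [
    {"factors": {"user_participation", "management_involvement"}, "successful": True},
    {"factors": {"user_participation"}, "successful": True},
    {"factors": {"user_participation"}, "successful": False},
    {"factors": {"management_involvement"}, "successful": True},
    {"factors": set(), "successful": False},
]

overall = success_rate(responses, set())
with_users = success_rate(responses, {"user_participation"})
with_both = success_rate(
    responses, {"user_participation", "management_involvement"}
)
```

Comparing the rate for a filtered subset against the overall rate is what allows statements such as "the success rate jumped" when two factors are combined.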
Communications Strategy
A communication strategy involves more than sending a periodic
newsletter to tell people what is happening. It must include change management techniques and allow the
project to respond to business units
where problems occur. It should also
focus on successes and the satisfaction of users. Through peer pressure,
organizations can move stalled implementations forward.
Design Validation
If users are not convinced that the
new system can work and is working, they will subvert the system.
Sending a design for review is useful, but working with groups to look
at specific examples of records and
document retrieval is better. This
will allow the team to adjust terminology so it reflects what users expect. Also, it allows designers to
assess the ability of users to
“browse” the structure.
Summing It Up
One survey respondent listed the
following five critical success factors:
1. A detailed roadmap or implementation plan
2. An executive steering committee,
chaired by an executive sponsor
3. Committed and budgeted resources,
both capital and human
4. External and unbiased subject matter experts/consultants
5. Detailed training/communications/
change management strategies
Although reflective of an individual organization’s approach, this is
a wonderfully concise explanation of
how to succeed in designing and implementing a taxonomy.
Editor’s Note: View the survey in its entirety at http://content.arma.org/IMM.
Jim Connelly can be contacted at
[email protected]. See his bio on
page 51.