
BIOVIA EKB FOR CATALYSIS
KNOWLEDGE MANAGEMENT
FOR THE CATALYST SCIENCES
WHITE PAPER
EXECUTIVE SUMMARY
Knowledge management issues in the catalyst sciences are as acute as they are unique. The
largely unaddressed need for effective knowledge management runs so deep that many
companies do not realise it is there until a key scientist retires, leaving behind a huge gap
in corporate knowledge filled only by a set of incomprehensible and disconnected spreadsheets.
Meanwhile, scientists elsewhere amass huge libraries of “single use” data dispersed in
disconnected or inaccessible stores.
This document discusses what makes the knowledge management challenge of catalysis unique
and why informatics is such an important part of the solution. We also explore how the right
knowledge management solution can also provide immediate benefits that not only streamline
the execution of new product development but also reduce the number of experiments required
to achieve new product innovation.
We introduce a new informatics product, BIOVIA EKB, which includes a number of catalysis-specific
features that have made it the knowledge management solution of choice for 3 out of
the top 10 catalysis companies.
WHAT MAKES CATALYSIS SPECIAL?
Overwhelmingly large experimental space
On first inspection, the synthesis of a catalyst may look to an outsider like any other type of
general formulation: a mixture of ingredients needing optimization of the relative amounts of
those ingredients.
Such a scenario resolves to a simple relationship between formulation and business value.
For example:
• Increase the amount of active ingredient for a more active product
• Replace one active ingredient with another to achieve a cheaper product with the same
performance.
On closer inspection though, this simple model breaks down in the case of catalysis because there
are so many other factors that significantly impact overall product performance:
• Active Ingredients: The materials that make the catalyst active; how much of each is present
and where in the catalyst it is located.
• Supplemental Ingredients: Materials used in preparation but not present in the final product
(though they may change its physical characteristics).
• Process and parameters: Temperatures, durations, die sizes, pressures, addition orders,
addition rates, stir speeds, stir types: the list is very long.
• Reactor Conditions: Temperature, pressure and flow rate of materials into and out of the
reaction. The requirement is to understand how different combinations of these impact
catalyst performance.
• Feedstock Range: Many catalysis applications need to deal with variation in feedstocks
which may contain elements that degrade or damage the catalyst over time.
• Reactor Design: Geometry and size of target reactors. The need is to understand their impact
on catalyst performance.
Taken together, these factors present a huge experimental space to cover (meaning that the
number of combinations of factors is very large). There is also a very complex connection between
these factors and their eventual impact on product performance. The simple axioms of the
standard formulation problem do not apply.
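To make the scale concrete, here is a minimal sketch of how quickly the combinations multiply. The factor names and level counts are invented for illustration, not figures from any real catalyst programme:

```python
# Illustrative only: hypothetical factors, each coarsely discretised into a
# handful of candidate levels. Even this modest grid explodes combinatorially.
factors = {
    "active_ingredient_loading": 10,  # 10 candidate loadings
    "support_material": 5,
    "calcination_temperature": 8,
    "calcination_duration": 6,
    "stir_speed": 4,
    "feed_pressure": 5,
}

total_combinations = 1
for name, levels in factors.items():
    total_combinations *= levels

print(total_combinations)  # 48000 combinations from just six coarse factors
```

Six factors at a handful of levels each already yield 48,000 candidate experiments; adding feedstock and reactor-design variables multiplies the space further still.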
This diversity of input factors presents both a big challenge and a big opportunity:
any company able to properly understand how these factors interact to determine final product
performance will be in a position to outperform its competitors by a considerable margin.
The challenge this large “experimental space” presents means that catalyst science often feels
more like “art” than science. Indeed, many senior scientists, when asked why they designed an
experiment a certain way answer that “it felt right.”
Shifting optimization targets
Optimization is the process of finding the combination of preparation variables that achieves the
maximum value for a given performance measure. As with preparation variables, in catalysis the
number of potential performance variables for assessing a catalyst is also large. Here are some
examples:
• Activity: The efficiency of desired product production, usually measured at a given
temperature or energy level
• Selectivity: Proportion of the “favoured” product in the output of the reaction versus other
reaction products
• Longevity: How long the catalyst is likely to last when run at given conditions (feed, pressure,
temperature) and how performance is likely to drop off
• Robustness: How sensitive is the catalyst to variations (feed, pressure, temperature, breaks
in production)
In catalysis, scientists are typically interested in more than one of these performance
characteristics over time. This is because the needs of the market shift continuously, driven by
economic growth, raw material costs, regulation changes and many other factors.
In an ideal world, a scientist who has recently become interested in the selectivity of a certain
group of catalysts previously prepared for activity tests would be able to trivially re-analyse the
data already collected. In practice, data collected for activity tests commonly includes only activity
data, making it necessary to repeat experiments.
To understand the statistical connections between these performance characteristics and the
preparation variables, one needs to maintain a consistently measured and complete set of
preparation variables and performance measures.
Interdisciplinary and Interdepartmental nature of work
Catalysis research typically involves three or more distinct groups of people working together.
• A preparation team typically creates relatively large numbers of research samples.
• These are then passed to a characterization team which performs standard tests to measure
or confirm the physical and chemical characteristics of the samples.
• The samples are then passed to a testing team responsible for determining the performance
characteristics of the catalysts.
In many cases the picture is made more complex by other factors.
• Preparation teams often undertake some of their own characterizations.
• Commonly, multiple characterization teams exist with different capabilities (and perhaps at
different sites).
• Testing teams are sometimes split according to the “scale” of the test (screen / high
throughput / low throughput / pilot).
• A common distinction between “scientists” and “technicians” within preparation and testing
teams creates a division within the group.
The number of organizational boundaries present in catalysis environments is unusually high
when looked at in the context of other research environments. In most cases, there is no unified
system of coordination (other than email and instant messages) between all of these silos,
resulting in huge information loss between these departments.
As we have already seen, the need to consistently record everything throughout the catalyst’s
life-cycle is critical to the success of any knowledge management strategy. Yet, a single catalyst
sample may well pass through three or more organization boundaries in its lifetime.
Nature of Catalysis data
Catalysis provides a unique challenge in terms of the diversity of its data. It produces:
• Low throughput, low content data: e.g., manual operations such as measuring LOI or the
dimensions of a sample
• Low throughput, high content data: e.g., a bench scale reactor logging 25-75 sensor
readings every 30 seconds, or an XRD instrument producing high resolution spectra data
• High throughput, low content data: e.g., a screening reactor, measuring pure catalyst
activity for 24-96 samples in parallel
• High throughput, high content data: e.g., a high throughput rig with 16-32 parallel
reactors logging 25+ sensor readings every 30 seconds or more, often with online product
characterization instruments (LCMS, GCMS) recording data at a lower frequency
Outside of catalysis, it is normal to find one or two of these; it would be very unusual to see all
four combinations at once.
Since each different type of data requires a different approach to data collection and storage,
there is normally no single system which adequately deals with all four. However, in catalysis it
is critical to be able to piece all of these different types of data together in order to get the most
complete picture of the catalysts being made and tested in the lab.
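Using the figures quoted above for the “high throughput, high content” case, a rough sketch of the resulting data volume illustrates why a single storage approach rarely fits all four quadrants:

```python
# Rough data-volume estimate for a high-throughput rig, using the upper
# reactor count and lower sensor count quoted in the text.
reactors = 32        # parallel reactors on the rig
sensors = 25         # sensor readings logged per reactor
interval_s = 30      # one reading set every 30 seconds

readings_per_day = reactors * sensors * (24 * 3600 // interval_s)
print(readings_per_day)  # 2304000 individual readings per day
```

Over two million readings per day from a single rig, before any online LCMS/GCMS characterization data is counted, sits at the opposite extreme from a handful of hand-entered LOI measurements, yet both must end up connected to the same catalyst samples.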
Diversity of Instrumentation
Related to the nature of data issue, catalysis also involves highly diverse instrumentation in
the making and testing of products. Unlike many other industries, there is no standard “set” of
instruments that all companies use.
Part of the reason for this is that there is still quite a lot of innovation going on in instruments
that can prepare and test catalysts, particularly in the high throughput area. The high rate of
innovation means that the instrumentation set a company has is largely driven by the budget to
purchase instrumentation and the availability of instruments at the time.
The net result of this is that the instrument-driven working practices at each company are quite
different.
There is complexity at every level, and often very little industry-wide consensus on the best
way to run certain types of instruments. It also means that thus far it has been impossible for
any information systems vendor to create a definitive software suite that meets the needs of all
catalysis companies.
Conclusion: Why is catalysis special?
Catalyst science presents a special challenge to knowledge management because the experimental
space is very large and the optimization targets change over time. Probably as a result of these
factors, companies engaged in catalyst development have created departmental boundaries
that facilitate “single task efficiency” but impede knowledge sharing. Whereas in other research
areas BIOVIA Laboratory Information Management System and BIOVIA Enterprise Lab Notebook
systems have been able to step in and fill the knowledge-sharing gap, in catalysis these types
of systems have seen very limited success because the nature of catalysis data and
instrumentation is highly variable.
WHY KNOWLEDGE MANAGEMENT IS IMPORTANT IN CATALYSIS
Sustainability
Knowledge management is of particular importance to a catalysis company because, up until
now, so much of the work of a catalyst scientist has been more like “art” than science. This
means a number of things, including that it is not possible to reconstitute the knowledge of a
senior scientist if he or she retires or leaves the company. Therefore, knowledge management is
a sustainability issue as well as a business continuity issue.
Most knowledge-driven companies are intrinsically aware of this issue, but don’t realise the
detrimental and long-term impact knowledge loss has until it happens.
The role of Informatics in Knowledge Management
Knowledge-driven companies are generally unaware that something can be done about this
issue. Addressing it requires a concerted strategic effort to bring about the changes needed to
move from “art” back into the realm of science.
The diagram above shows how the interconnectedness of data leads it to be more applicable
to more situations and, as a result, more valuable to the organization. The understanding of
relationships, patterns and principles is what moves an organization up the knowledge
ladder. These advances happen inside the minds of scientists over years of experimentation,
becoming manifest as “tacit knowledge.” This tacit knowledge is typically referred to as the “art”
part of their jobs and is very hard to transfer to others.
So what role can informatics play? It is critical to note that the upward transitions in the diagram
above rely on first having the data and then having the tools to connect the data together in ever
more abstract ways. This combination of storage and tools is something that can also be delivered
by an informatics system.
Importantly, if an informatics system is helping scientists understand patterns and principles,
then it is doing so in a sustainable way because the analyses and connections made within the
system are reproducible and can be recorded. An informatics system also has the potential to be
much faster at spotting patterns (and with less data) than an individual scientist can.
A complete knowledge management strategy can move a research organization quickly up
the “knowledge ladder.” In the next section, we look at the current approaches to knowledge
management that are common in catalysis companies today.
Speed of innovation
For our purposes, we will define “innovation” as the generation of new knowledge within the
corporation. This new knowledge is of course intended to lead directly to product innovation;
getting a new product to market first gives a competitive advantage.
Most research companies describe their innovation pipeline using some variation of the following
graphic:
In catalysis, a very large amount of effort goes into the “Development” part of the pipeline. This
is where teams are producing hundreds (if not thousands) of research samples per year which are
then characterized and tested. Typically, the results of the characterization and testing then feed
back into another cycle of preparation—and so the loop goes on until the project goals are met.
Clearly, the number of experimental cycles is a key driver of the overall length of time required
for a project to meet its goals. Reducing the number of cycles therefore has a positive impact on
the speed of innovation.
Critically, a proper knowledge management strategy can help the project team reduce the
number of cycles by providing:
• A head start: Easy access to data from previous projects (assuming the data has been
recorded with enough context to be useful in the new project) enables a project lead to reuse
knowledge from previous experiments when designing current project plans.
• Richer datasets for analysis: Teams can draw in relevant data from other projects (past and
present) to help support decision making about the next cycle of experimentation. More data
normally leads to better decisions.
• Better designed experiments: Teams can learn from past mistakes (what didn’t work
before) or by investigating things noticed in previous generations (promising leads). An ideal
knowledge management strategy will provide scientists with tools that help them make use
of statistical design of experiments (or DoE) to optimize the amount of new knowledge they
gain with each round of experimentation.
Efficiency of innovation
As well as reducing the number of innovation cycles, a knowledge management system can also
help reduce the time taken by each of the innovation cycles. While this is not strictly the direct
intent of a knowledge management strategy, it is a very welcome additional benefit of deploying
one.
A Knowledge Management (KM) system can accelerate experiment cycle execution by facilitating:
• All data in one place: A project team no longer has to spend time searching around in
different places for information.
• Clear, consistent plans: If the KM system encompasses the planning of experiments, then
the technicians get a clear and consistent picture of what is expected, regardless of which
scientist or department the requests come from. Time is saved through less back-and-forth
communication.
• Team coordination: If the KM system encompasses the tracking of experiment execution,
then everyone on the project team knows exactly what has been done and what needs to
be done next. This saves time because everyone knows what they are doing on a
day-to-day basis.
• Automated data analysis: Turning raw data into results for a given experiment step often
involves many manual steps. A KM system that encompasses data workup will provide tools
that speed this up, with the benefit of ensuring consistency in how the data is treated after
an experimental step.
• Improved collaboration: If the KM system includes “social” features, then team members can
write and share thoughts and ideas in a way likely to result in smart experimentation.
These are benefits that arrive early in the implementation of a KM strategy and often have
justification value of their own. Such short-term benefits are important milestones and proof
points in the execution of any long-term strategy. Indeed, they alone commonly form the
basis of the tangibles in Return on Investment (ROI) calculations made by companies when
making purchasing decisions.
CURRENT APPROACHES TO KNOWLEDGE MANAGEMENT
Data collection and analysis systems at catalysis companies are commonly a patchwork that
contains large gaps in any catalyst’s end-to-end workflow. This is partly due to the large number
of organizational boundaries and the natural tendency for each department to have a separate
approach to data collection and management. Common approaches are:
• Microsoft Excel
• Microsoft Access
• Laboratory Notebooks
• Laboratory Information Management Systems (LIMS)
• Instrument Vendor Systems
We discuss each of these in turn:
Microsoft Excel
Excel usage in its simplest manifestation involves users keeping worksheets on their own personal
drives, organized and formatted for themselves but not for anyone else. The accessibility and
understandability of such data is questionable.
As a result, organizations typically attempt to standardize on a location and format for
spreadsheet data. Locations used are typically shared network drives or document management
systems like Microsoft SharePoint.
In practice, standardization and centralization of Microsoft Excel is only a small improvement on
individual file management and the relative payback on the additional maintenance effort is small.
This is because:
• The spreadsheet formats change over time. This makes it impossible to compare past data
with present data.
• The flexibility of Excel means ‘inventive’ users can easily introduce their own columns,
coloring and organization
• The repository grows rapidly, resulting in an overwhelming number of files and directories.
This commonly leads individuals to take private copies of the data back to their personal
drives so they can find it easily—of course defeating the purpose of centralization.
• There is no effective way to search Microsoft Excel data when it is distributed across many
files. Since Excel files are normally organized along the lines of the projects or experiments
that the data originally came from, it becomes difficult to re-use data from multiple
experiments or projects because they are almost always contained in different workbooks.
The above four points are major data management issues in themselves. However, these are only
the issues internal to the department and do not describe an even bigger issue. Each department
will likely have a different approach to data management with Microsoft Excel, resulting in very
little possibility of sharing or combining data between silos.
Microsoft Access
Often seen as a step up from Excel, sometimes project teams choose to build a local database of
their project data. Typically the database schema is designed by a single IT-minded scientist who
then provides some data entry forms for the rest of the team to use in data entry.
The Access approach has a definite advantage over Excel when the project team wants to analyse
their data: SQL queries are possible, making it easy to join data across what would have been
multiple Excel sheets previously.
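As a sketch of this advantage (the table and column names below are hypothetical, not taken from any real project database), a single SQL join replaces what would otherwise be manual cross-referencing between workbooks. SQLite stands in here for Access:

```python
# Sketch of the cross-table query a project database makes possible.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE preparation (sample_id TEXT PRIMARY KEY, pt_loading_pct REAL);
CREATE TABLE testing     (sample_id TEXT, activity_pct REAL);
INSERT INTO preparation VALUES ('CAT-001', 4.5), ('CAT-002', 6.0);
INSERT INTO testing     VALUES ('CAT-001', 92.0), ('CAT-002', 88.0);
""")

# Join preparation and testing data in one statement -- the step that is
# painful across separate Excel workbooks.
rows = con.execute("""
    SELECT p.sample_id, p.pt_loading_pct, t.activity_pct
    FROM preparation p JOIN testing t ON t.sample_id = p.sample_id
    WHERE t.activity_pct > 90
""").fetchall()
print(rows)  # [('CAT-001', 4.5, 92.0)]
```

Queries like this, spanning preparation and testing data for the same samples, are exactly what flat spreadsheets cannot readily answer.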
There are also a number of disadvantages. The first is that the level of IT knowledge required to
set up and maintain such a system is quite high. As a result, the project team will have less ability
to do science because they are spending time maintaining and extending the Access system (not
part of their job role). This combination of time constraint and limited IT knowledge means that
such systems rarely grow beyond their original project intent, and fall into disuse.
The long-term legacy of such an approach can be a disjointed trail of defunct “project databases”
with inconsistent data structures that no one in the company understands any more.
Laboratory Notebooks
Use of laboratory notebooks within catalyst labs is very common. These are typically paper based,
but the use of electronic laboratory notebooks (ELNs) is increasing. It helps to remember that IP
protection is a primary benefit of the ELN. Notebooks have long been used as primary evidence
in patent disputes and applications.
Therefore, whether on paper or “on glass,” the notebook’s document-centric focus makes it
a good repository for documentary evidence of invention, but a poor repository for catalyst
experiment data.
In some cases, research departments in materials science companies have attempted to make an
ELN their primary knowledge management solution. Success at these companies has been limited
because of the ELN’s shortfalls when it comes to understanding the “structure” of data related to
materials. The shortfalls are:
• Searching an ELN yields a hit list of “documents.” Each document could include data
from anywhere between 1 and 100 materials depending on what the author decided the
experiment to be. There is normally no way of searching for the materials themselves.
• Any data pasted into an ELN typically gets translated to text and its context is lost. As a
result, it becomes a manual effort to extract data back out into an analysable format (e.g.,
Excel). This effort becomes even greater if the data needs to be extracted from multiple ELN
entries, especially when the data has not been recorded consistently.
• Because of the textual nature of the medium, search is commonly limited to text terms,
or a small number of metadata fields chosen at the time that the ELN was commissioned.
Searches like “Catalysts with activity over 90% that have less than 5% platinum” are not
possible.
Therefore, while switching from paper to an ELN can provide a quantum leap in data visibility and
IP protection, it will not on its own provide a complete knowledge management solution.
Laboratory Information Management System (LIMS)
Typically the domain of characterization groups, LIMS are used to manage the throughput of
samples and (in some cases) the registration of analysed data values. Since many characterization
departments operate as a service department within the wider company (with a pseudo
commercial charging system), the LIMS has an important role in managing the monthly “billing”
of customer departments.
In many cases, analytical instruments will produce very detailed data about the catalyst samples.
It is, however, not typical for all this data to be loaded directly into the LIMS. Instead, the raw
data is kept on a shared or local drive and the key values are manually entered into the LIMS. As a
result, there are two tiers of information: the information available to the characterization group
and the information available to their customer.
In practice, traditional LIMS have rarely been successful outside the bounds of the catalyst
characterization department for the following reasons:
• Lack of suitably designed experiments: Preparation typically involves the creation of
multiple similar samples, each with a large set of process variables. LIMS are designed for
requesting set methods with very little variation each time.
• Missing multi-step workflow features: Preparation normally involves a number of steps
before a catalyst is created (e.g., Mix, Extrude, Dry, Calcine). LIMS operate with single-step
requests for each method. This makes it cumbersome, as the user must make multiple
requests for each workflow and the LIMS won’t typically connect them.
• Inability to store high-throughput and high-context data: LIMS are typically designed
to store requests and results; the full data needs to be stored elsewhere. This means that
original data is lost over time or disconnected and hard to find. This has a particular impact
on the usefulness of LIMS for testing data.
• Limited data analysis and reporting options: LIMS are typically optimized for producing
“reports per request,” meaning that the requester gets a canned report when their sample
is processed. Reporting across requests (across experiments or projects) is usually not
possible because the LIMS isn’t designed to understand experiment and project relationships.
Therefore, while a LIMS system can satisfy the needs of many characterization departments,
there are shortfalls that make it a poor choice for preparation and testing departments. By
extension, this makes it a poor choice as an overarching knowledge management platform.
Instrument Vendor Systems
Instrument vendors such as Perkin Elmer, Chemspeed and Freeslate commonly choose to develop
the software that controls their hardware in such a way that they share common elements and
form a “platform.” Typically, this means there is a central data store (on the file system or in
a database) and each connected instrument has a “driver” that transfers data to and from the
central data store.
Increasingly, these instrument vendors are being asked to provide drivers for third-party
instruments that are not their own—so that a single instrument vendor’s system can provide a
central data system for a given laboratory. This approach makes a lot of sense for instrument
vendors who know they cannot provide 100% of the hardware solution for a given customer but
can partner with another vendor and provide an integrated software solution.
As a knowledge management system, however, there are a number of drawbacks:
• Motivation: Instrument vendors’ key motivation is to sell hardware, not software. Therefore,
software updates (including those necessary to keep up with versions of operating systems)
are commonly not available or require costly re-configuration.
• Data Structure: The data structure of the central repository is typically aimed only at data
capture and storage, since that is the instrument hardware’s purpose. However, the data
structure is typically not ideal for extraction and analysis. In many cases, the data model is
designed around a particular type of instrument the vendor sells, and everything else is
force-fitted to this model.
• Architecture: Since instrument control is normally only possible via software physically
installed on client PCs, it is common to find elements of an instrument vendor’s solution that
are not web based, which in turn creates an IT maintenance issue.
In conclusion, while an instrument vendor’s system has a definite role in the capture and
collection of raw instrument data, as a candidate for an enterprise-wide knowledge management
solution it has a number of key drawbacks.
Conclusion: Current Approaches
A typical catalyst company will have some or all of the above approaches in use. The most
common setup looks like this:
• Preparation: Notebooks + Excel
• Characterization: LIMS + Shared Drives + Vendor Tools (for raw data analysis)
• Testing: Instrument Vendor system + Excel
From preparation through characterization to testing, a catalyst may appear in multiple different
systems, typically with different IDs. This makes it hard to understand and piece together any
end-to-end workflow in full.
To conclude, current approaches to informatics in catalysis companies do not provide a viable
knowledge management solution. Moreover, none of the systems already in place could be
expanded to meet all knowledge management needs in a single system.
HOW BIOVIA HAS HELPED CATALYSIS CUSTOMERS
BIOVIA has a long history of working with companies engaged in scientific research, including a
good number of catalysis companies. The solutions we have provided cover both the informatics
and modelling and simulation spaces. Over the years, BIOVIA has acquired knowledge of what
makes the catalyst sciences special and has built a number of individual informatics solutions to
meet catalysis needs.
Introducing BIOVIA EKB
Beginning in 2009, BIOVIA launched a concerted effort to build an experiment knowledge
management software framework that was generic enough to be re-usable but configurable
enough to be adapted to meeting specific customer needs. The result was a framework called
BIOVIA Experiment Knowledge Base (EKB), used by BIOVIA Professional Services as a means of
accelerating the implementation of knowledge management systems.
In 2012 BIOVIA decided to make EKB a fully-fledged software product, and BIOVIA EKB was born.
BIOVIA EKB is now a central part of the BIOVIA Materials Science and Engineering software suite
along with BIOVIA Notebook, BIOVIA Materials Studio and BIOVIA CISPro for chemical inventory
management.
Special features of BIOVIA EKB for catalysis
The early customers of BIOVIA EKB were catalysis organizations, and as a result, the needs of the
catalysis sector have heavily shaped the out-of-the-box capabilities of the product. In this section
we will explore specifically what those capabilities are.
Experiment Design (Preparation, Characterization, Testing)
BIOVIA EKB allows a project team to define an experiment’s design before executing it. This
is key to driving the best practice of thoughtful and peer-reviewed experiment design before
committing resources to executing the experiment.
A BIOVIA EKB experiment is unique in the breadth of its scope because it encompasses:
• Steps of the experiment (preparation, characterization, testing)
• Process parameters of the steps
• The number and naming of the samples to be created
• The people to be involved in the experiment
• The equipment to be involved in the experiment
• The ingredients to be involved in the experiment
The EKB experiment plan defines the materials, resources and workflow of what is to follow.
Having a capability to plan so many aspects of experiments is important for catalysis because
it means the experiment can bring together departments and disciplines, thereby addressing
one of the special issues discussed in the “What Makes Catalysis Special” section above. BIOVIA
customers using EKB experiment planning see much greater collaboration and understanding
between different disciplines (technicians and scientists) and departmental groups (preparation,
characterization, testing).
Another special feature of the catalysis domain is the very large experimental space. The EKB
experiment plan allows the project team to decide which parameters to vary (the factors) and
which to maintain as fixed for all samples. The user interface presents scientists with a familiar
“design grid” of the chosen factors vs. the samples in the experiment.
The grid can be filled in manually, or as a best practice, by a DoE algorithm. EKB provides Factorial,
Central Composite and Taguchi as out-of-the-box options. Designs from external design programs
such as JMP, Minitab, Statistica and others can also be imported.
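As an illustrative sketch (factor names and levels are invented, and this is not EKB’s implementation), the simplest of these options, a full factorial design, can be generated with a few lines of Python:

```python
# Minimal full-factorial design grid: one row per sample, covering every
# combination of factor levels. Factor names and levels are hypothetical.
from itertools import product

factors = {
    "calcination_temp_C": [450, 500, 550],
    "pt_loading_pct": [2.0, 4.0],
}

design_grid = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]

for row in design_grid:
    print(row)
# 3 levels x 2 levels = 6 samples in the design
```

Central Composite and Taguchi designs pursue the same goal, covering the factor space, with far fewer runs than a full factorial requires as the number of factors grows.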
Diagram: a nested design tree of Mix, Extrude, Dry and Calcine steps, branching at each stage.
The ability to plan and design such broad experiments is unique to EKB and is essential in the
multidisciplinary, multi-department catalysis environment.
Branching Workflows (Preparation)
A “branching workflow” is common in catalysis preparation. This means that the preparation
team may make a large number of samples but start with a single mix and gradually split the
material throughout the workflow, treating each subsample differently to finally arrive at the
large number of planned samples with minimal effort.
Example of a branching workflow:
• Mix a big batch (1 sample)
• Extrude half with Die #1, the other half with Die #2 (2 samples)
• Dry at 250, 260, 270 and 280°C (8 samples)
• Calcine for 30, 40, 50 and 60 minutes (32 samples)
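The arithmetic of this workflow can be sketched as follows, using the step names and variant counts from the example above:

```python
# Each step applies its variants to every incoming sample, multiplying
# the sample count: 1 -> 2 -> 8 -> 32.
steps = [
    ("Mix", 1),      # one big batch
    ("Extrude", 2),  # Die #1 and Die #2
    ("Dry", 4),      # 250, 260, 270 and 280 C
    ("Calcine", 4),  # 30, 40, 50 and 60 minutes
]

count = 1
for name, variants in steps:
    count *= variants
    print(f"{name}: {count} samples")
# Mix: 1, Extrude: 2, Dry: 8, Calcine: 32
```

The multiplicative growth is what makes branching so efficient: 32 distinct samples are produced from a single initial mix and only a handful of physical operations.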
EKB supports branching workflows as part of the experiment plan by allowing users to define
“split steps” between the steps of the experiment. A visual representation of the branching
workflow as a tree is provided to help users understand the plan:
Screenshot of an EKB nested
design tree for a typical catalyst
preparation workflow starting
with 1 sample and finishing
with 32
Importantly, EKB keeps track of the unique identifiers of each of the samples in the plan—
providing label printing facilities where needed. It also understands the parent/child nature of the
split—that data properties recorded for parent samples are relevant to the children and grandchildren.
The ability to plan branching workflows and to capture the genealogy of samples as they are
split is unique to EKB. It is a vital piece needed to ensure the usability of the system in catalysis.
Technician Instructions (Preparation, Testing)
The existence of an experiment plan at the outset of experimentation also enables another very
important feature for catalysis groups: The translation of the plan to actionable instructions.
When part of an experiment plan is ready to execute, EKB produces a printable instruction sheet
for the technician to reference in the lab when performing the work. The clarity and consistency
of these instructions is a major improvement for many technicians BIOVIA has worked with,
saving them time and drastically reducing instances of misunderstanding.
When technicians have specific needs, EKB can also be configured to provide customized
instructions.
Screenshot of an example EKB
instruction sheet—this one to
assist the technician with a fixed
reactor bed layout.
For example, if scientists like to plan experiments in terms of ingredient proportions (percentages)
and technicians like to execute experiments in terms of absolute amounts (grams), then the
instruction sheet can perform the necessary calculations to convert from one to the other.
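A minimal sketch of that percentage-to-grams conversion (the ingredient names and batch size are hypothetical, and a real instruction sheet would also handle rounding rules and units):

```python
def to_absolute_amounts(percentages, batch_mass_g):
    """Convert ingredient proportions (wt%) into weighed-out amounts (g)."""
    assert abs(sum(percentages.values()) - 100.0) < 1e-6, "percentages must total 100"
    return {name: round(pct / 100.0 * batch_mass_g, 2)
            for name, pct in percentages.items()}

# Scientist's plan (percentages) -> technician's instruction sheet (grams)
amounts = to_absolute_amounts(
    {"Alumina support": 92.0, "Pt precursor": 3.0, "Binder": 5.0},
    batch_mass_g=250.0,
)
# {"Alumina support": 230.0, "Pt precursor": 7.5, "Binder": 12.5}
```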
By extension, this means that EKB can also automate more complex calculations that are
normally the domain of the technician’s special private Microsoft Excel worksheets. Common
examples include determining impregnation concentrations to achieve a specific metal loading, or
calculating out the times of day that reactor conditions need to be changed for a given reactor run.
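For example, an impregnation-concentration calculation might follow the standard incipient-wetness relationship sketched below. This is an illustration of the kind of calculation EKB can automate, not its actual implementation; the inputs are hypothetical:

```python
def impregnation_concentration(target_loading_wt, support_mass_g, pore_volume_ml_per_g):
    """Solution concentration (g metal / mL) needed so that filling the support's
    pores deposits enough metal to reach the target loading (wt% of final catalyst).
    """
    L = target_loading_wt / 100.0
    metal_mass_g = L * support_mass_g / (1.0 - L)               # metal on a final-catalyst basis
    solution_volume_ml = pore_volume_ml_per_g * support_mass_g  # incipient wetness: fill the pores
    return metal_mass_g / solution_volume_ml

# e.g. 5 wt% metal on 100 g of support with 0.8 mL/g water pore volume
conc = impregnation_concentration(5.0, 100.0, 0.8)
# ~0.0658 g metal per mL of impregnation solution
```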
Experiment Execution and Tracking
The EKB experiment plan becomes the execution plan. At a glance, the project team can see
exactly where each sample is and what needs to be done next.
Screenshot of a typical EKB
experiment during execution.
This experiment has 4 samples
and 6 steps. The dashboard
shows that currently the first 4
preparation steps are complete
and the 5th step is partially
complete.
This has a very positive impact on team efficiency, especially when the experiment execution
spans multiple departments or sites. Time is saved through reduced communication time and
fewer chase-up phone calls and emails.
This kind of end-to-end workflow planning and execution capability is unique to EKB. As discussed
in the earlier LIMS section, a LIMS is designed around the idea of individual requests and therefore
lacks the ability to connect requests to this kind of multi-step, multi-sample workflow.
Of course, not everyone involved is interested in the end-to-end workflow. For example, characterization technicians prefer to see a view of the world through a per-instrument, per-method or per-laboratory lens. For these users, EKB provides different views showing the queue of work appropriately filtered.
EKB’s “laboratory view,”
showing activity as a filtered list
of actionable tasks.
Integration to LIMS (Characterization)
Through its “laboratory view,” EKB can play the part of a traditional LIMS well. However, it is
commonly undesirable to replace a characterization group’s existing LIMS with EKB. This is
particularly the case in catalysis environments where the characterization team may be serving
many other types of customers not using EKB.
In these cases, EKB can be integrated with an existing LIMS so that EKB injects requests into the
LIMS on behalf of its users. EKB can then monitor the LIMS and pull result data out of the LIMS
when the request is completed (automatically progressing the EKB workflow view).
This arrangement provides the best of both worlds. The characterization team can continue as
before, and EKB will still contain the complete end-to-end data of the samples—all without the
user having to make manual uploads or interventions.
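The inject-and-monitor pattern can be sketched as below. The `FakeLims` class is a stand-in for whatever request API the site LIMS exposes; real integrations differ, but the shape is the same: create a request on the user's behalf, poll its status, and pull the result back into the workflow when it completes.

```python
class FakeLims:
    """Stand-in for a site LIMS API (illustrative; real integrations vary)."""
    def __init__(self):
        self._requests = {}

    def create_request(self, sample_id, method):
        rid = f"REQ-{len(self._requests) + 1}"
        self._requests[rid] = {"sample": sample_id, "method": method,
                               "status": "queued", "result": None}
        return rid

    def complete(self, rid, result):  # simulates the LIMS finishing the work
        self._requests[rid].update(status="complete", result=result)

    def get(self, rid):
        return self._requests[rid]

def sync_step(lims, rid, workflow):
    """Poll the LIMS; when the request completes, pull the result data and
    progress the corresponding workflow step (as EKB's monitor would)."""
    req = lims.get(rid)
    if req["status"] == "complete":
        workflow[req["sample"]] = {"step": req["method"], "result": req["result"]}
        return True
    return False

lims, workflow = FakeLims(), {}
rid = lims.create_request("S-1-2-3", "ICP")
assert not sync_step(lims, rid, workflow)   # still queued, nothing to pull yet
lims.complete(rid, {"Pt wt%": 4.7})
assert sync_step(lims, rid, workflow)       # result pulled, workflow progressed
```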
Automated and Semi-automated Data workup (Characterization, Testing)
One of the special features of catalysis covered earlier is that the nature of data is very broad when
compared to other areas of research.
One particularly tricky area is experimental steps that produce high content data, meaning a large amount of rich raw data which must then be interpreted to create final result data.
Commonly, this kind of “data workup” activity is manual and time-consuming and can consume
a large proportion of a technician’s or scientist’s time.
EKB helps in this regard by adding a custom “validation wizard” through which users and data are
funnelled at the end of the step (after the raw data is uploaded into EKB).
Validation wizards are always user-centric and customer-specific. Because they are authored
using BIOVIA Pipeline Pilot they can be quick to implement and very powerful as they enable
users to leverage the BIOVIA component collections including:
• Reporting Collection: For creating tools that present the data to the user in the most
appropriate format (tables, plots, charts)
• Interactive Reporting Collection: Enabling the user to interact with tables, plots and charts
to select, highlight or annotate data
• Analytical Instrument Collection: Ideal for processing XRD and NMR spectra
• Imaging Collection: For automatically evaluating a wide range of image types, including SEM
and TEM
• Data Modelling Collection: Providing many common statistical components for automatic
determination of aggregated or calculated result data
• Integration Collection: Connecting and presenting data from separate files or databases so
the user doesn’t need to manually look it up
Validation wizards are a unique feature of EKB that enable a very high level of workflow
integration and automation. BIOVIA’s wealth of tools and prior experience mean most validation
wizards can be assembled quickly.
Time Series Visualization (Testing)
Performance testing of a catalyst almost always involves running the catalyst in some kind of reactor. Research reactors exist at a number of different scales (nano-tube, micro-tube, bench scale) and come in low-throughput and high-throughput (8, 16, 32 reactor) varieties. In all cases,
the reactor rigs will produce a wealth of “time-series data” through a collection of sensors which
report values at given intervals.
As discussed in the earlier “What Makes Catalysis Special” section, the nature of data in catalysis
is a particular challenge. It is especially challenging when it comes to time-series data because:
• It is voluminous (always high content, sometimes also high throughput)
• Commonly there is “offline” data (analytical subsamples) to be merged
• Often there are “online” analyses happening on separate instruments whose data needs to
be merged (for example GC, GCMS, LCMS of product or feed)
• The data requires interpretation to be useful
• Interpretation is objective-specific, i.e., interpreting for activity is different from interpreting for selectivity.
In practice, dealing with online and offline data collection and aggregation from rigs can be very
time consuming, easily occupying more than 50% of a testing technician’s time.
EKB helps the technician in a number of regards:
• Automatically collecting online data from source (historian, rig database, file system)
• Automatically collecting offline data (from LIMS or EKB sub-steps)
• Merging online and offline data
• Doing the above periodically during the test run, so as to provide a “monitor” function
• Permanently storing the collected raw data (because most historian systems do not)
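The merge step above can be sketched as a nearest-timestamp join. The data layout and the 30-second tolerance below are illustrative assumptions, not EKB internals:

```python
from bisect import bisect_left

def merge_offline(online, offline, tolerance=30.0):
    """Attach each offline analysis to the nearest online reading (by timestamp).

    online:  time-sorted list of (t, {sensor: value}) from the rig historian
    offline: list of (t, {analysis: value}) from subsample characterization
    """
    times = [t for t, _ in online]
    merged = [dict(row, t=t) for t, row in online]
    for t, analysis in offline:
        i = bisect_left(times, t)
        # pick the closer of the two neighbouring online points
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        j = min(candidates, key=lambda k: abs(times[k] - t))
        if abs(times[j] - t) <= tolerance:
            merged[j].update(analysis)
    return merged

online = [(0, {"T": 350}), (60, {"T": 351}), (120, {"T": 352})]
offline = [(65, {"conversion": 0.93})]
rows = merge_offline(online, offline)
# the offline conversion value lands on the t=60 online row
```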
Once the data is merged, the next job is to interpret it. This is normally done by the scientist, using
a tool like Microsoft Excel. Because the task is largely manual, there is usually not time to interpret
the data beyond the needs of the current project. As a result, the data may only be interpreted
for one performance measure.
As discussed earlier, one of the special issues in catalysis is the shifting of performance objectives
over time. This makes it vital that performance tests are interpreted for multiple performance
measures so that there is a greater chance of reusing results in future projects.
EKB steps in at this point by making it easy to analyse for multiple performance objectives
through its Time Series Data Visualization Applet.
EKB’s Time Series Visualization Applet showing values of 2 activity measures during an 8-condition reactor run (where flow rate is progressively decreased).
EKB’s Time Series Visualization Applet showing selectivity to 2 products during the same 8-condition run.
The example screenshots above show charts that help the scientist interpret activity and
selectivity. The charts are pre-defined as part of the configuration of EKB according to the
requirements of the scientists. When using the tool, the scientist can toggle between these views
quickly and easily—in a fraction of the time that would otherwise be taken to paste data into an Excel spreadsheet and adjust chart settings and data ranges to suit the run.
Also shown on the plot are the offline data points from periodic characterizations of the reaction
product (larger crosses and triangles on the activity plot).
The visualization applet supports the visualization of multiple concurrent reactors and provides
the ability to quickly zoom and pan through the data. The applet also marks data points as
“invalid” if they appear to be misreported.
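A simple local-median spike test illustrates the kind of rule such invalid-point marking might apply; the applet's actual criteria are configurable and site-specific, so this is only a sketch:

```python
def flag_invalid(values, window=5, threshold=3.0):
    """Mark points invalid when they sit far from the local median (a simple
    spike test). Returns a list of booleans, True = looks misreported."""
    flags = []
    for i, v in enumerate(values):
        lo, hi = max(0, i - window), min(len(values), i + window + 1)
        neighbours = sorted(values[lo:hi])
        median = neighbours[len(neighbours) // 2]
        # mean absolute deviation of the window, guarded against zero
        spread = sum(abs(x - median) for x in neighbours) / len(neighbours) or 1e-9
        flags.append(abs(v - median) > threshold * spread)
    return flags

readings = [10.1, 10.2, 10.0, 99.9, 10.1, 10.3]  # one obvious sensor spike
flags = flag_invalid(readings)
# only the 99.9 reading is flagged
```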
Per-Condition data aggregation (Testing)
Given that reactor tests are expensive to set up and time-consuming to run, it is common practice
in catalyst performance tests to vary the conditions of the reactor during the run. This practice
considerably increases the amount of information available from a single run. However, the
resulting datasets can be more difficult to interpret.
Consider a typical testing plan:
• Condition -1: Warm up the reactor
• Condition 0: Spike the feed to prime the catalyst for 4 hours
• Condition 1: Low temperature, low flow rate for 3 days
• Condition 2: Low temperature, high flow rate for 3 days
• Condition 3: High temperature, high flow rate for 3 days
• Condition 4: High temperature, low flow rate for 3 days
Assuming we are only scientifically interested in the “testing” conditions (1-4) of the above,
then there are really 4 almost separate experimental results here, one from each condition. At
the beginning of each condition, we can expect a certain amount of “settling” before the catalyst
resumes performing at a steady state.
The EKB Time Series Visualization applet provides the scientist with a point-and-drag means of
visually defining the “stable phases” of the above data. EKB can then automatically calculate
the performance characteristics for multiple performance objectives on a per-phase basis, for
example:
• Activity for Condition 1, 2, 3 and 4
• Selectivity A for Condition 1, 2, 3 and 4
• Selectivity B for Condition 1, 2, 3 and 4
• Longevity for Conditions 1, 2, 3 and 4
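The per-phase aggregation amounts to averaging each measure over the user-defined stable windows. A minimal sketch with hypothetical data (EKB's stored descriptors may of course be richer than simple means):

```python
def per_phase_means(series, phases):
    """Aggregate a time series over user-defined stable phases.

    series: list of (t, {measure: value})
    phases: {condition_name: (t_start, t_end)} as dragged out in the applet
    Returns {condition: {measure: mean value over the phase}}
    """
    out = {}
    for name, (t0, t1) in phases.items():
        rows = [vals for t, vals in series if t0 <= t <= t1]
        measures = rows[0].keys()
        out[name] = {m: sum(r[m] for r in rows) / len(rows) for m in measures}
    return out

series = [(t, {"activity": a, "selectivity_A": s})
          for t, a, s in [(0, 0.80, 0.90), (1, 0.82, 0.91),
                          (10, 0.70, 0.95), (11, 0.72, 0.94)]]
phases = {"Condition 1": (0, 5), "Condition 2": (9, 12)}
results = per_phase_means(series, phases)
# results["Condition 1"]["activity"] is approximately 0.81
```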
The separate per-phase descriptors of performance stored against the catalyst in the EKB
database are available for searching and for use as part of data visualization and mining.
EKB’s validation wizard and EKB’s time series visualization features provide a unique way for
scientists to quickly and consistently process the complex data coming out of their reactor tests.
Not only does this save time over the traditional Excel methodology, but it also results in a larger
number of performance descriptors being calculated—making the test data far more likely to be
reusable in the future.
Data Analysis: Full Featured Catalyst Search
Because EKB spans all planning, execution and data collection activities across preparation,
characterization and testing, it eventually contains a vast wealth of connected and contextualized
data about catalysts.
The breadth of the data allows users to ask and answer questions that previously would have
required a huge forensic effort in piecing together data from different systems. For example:
Data Area(s) | Example Query
Preparation | “Find all catalysts prepared with more than 5% platinum and heated to more than 1200°C.”
Preparation & Characterization | “Find all catalysts prepared with more than 5% platinum that had less than 4.5% platinum when measured by ICP.”
Preparation & Characterization | “Find all catalysts heated to more than 1200°C which have more than 20% micropores.”
Characterization & Testing | “Find all catalysts with less than 25% micropores that had an activity of over 95%.”
Preparation & Testing | “Find all catalysts prepared with less than 5% by volume of active metal that had an activity of over 80%.”
Testing | “Find all catalysts that had selectivity over 90% and an activity over 90% in a condition where the temperature was less than 350°C.”
Preparation, Characterization & Testing | “Find all catalysts not prepared with Additive B that had a crush strength of over 100 lb and a performance drop-off of less than 1% per month.”
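In code terms, such queries are conjunctions of predicates over the three data areas. A toy sketch with a hypothetical record layout (EKB's query builder generates this kind of filter from the user's form entries):

```python
def find_catalysts(catalysts, prep=None, char=None, test=None):
    """Return IDs of catalyst records whose preparation, characterization and
    testing data all satisfy the given predicates (None = no constraint)."""
    checks = [(k, p) for k, p in (("prep", prep), ("char", char), ("test", test)) if p]
    return [c["id"] for c in catalysts if all(p(c[k]) for k, p in checks)]

catalysts = [
    {"id": "CAT-001", "prep": {"Pt_wt": 6.0, "max_temp": 1250},
     "char": {"micropores_pct": 18}, "test": {"activity": 0.96}},
    {"id": "CAT-002", "prep": {"Pt_wt": 4.0, "max_temp": 1100},
     "char": {"micropores_pct": 30}, "test": {"activity": 0.91}},
]

# "Catalysts with more than 5% Pt, heated above 1200C, with activity over 95%"
hits = find_catalysts(
    catalysts,
    prep=lambda p: p["Pt_wt"] > 5 and p["max_temp"] > 1200,
    test=lambda t: t["activity"] > 0.95,
)
# hits == ["CAT-001"]
```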
This capability allows users to ask the “have we ever made...” questions they have always wanted
to ask. No other knowledge management system can provide this breadth of data to search on.
Getting the answers to questions such as this gives project teams using EKB a big advantage
when starting a new project because they can effectively use and review past results to give them
a head start.
Screenshot of EKB’s search query
builder. The example query joins
data from preparation,
characterization and testing
Data Analysis: Catalyst Data Visualization and Data Mining
As well as providing a very full featured search engine, EKB also offers a visualization engine that
can present catalyst preparation, characterization and testing data in almost any way.
As with validation wizards, EKB leverages BIOVIA Pipeline Pilot to author visualization plugins
for EKB. BIOVIA Pipeline Pilot has a wealth of relevant component collections which makes
assembling these visualization plugins very cost effective:
• Reporting Collection: Scatter plots, histograms, bar charts, radar plots, pivot tables and more
• Interactive Reporting Collection: Linked plots and tables for highlighting and data exploration
• Advanced Data Modelling Collection: Builds predictive models from experimental data using
a number of statistical approaches, identifying clusters and similarities in experimental data
• R-Statistics: Access to all the features of the popular statistics package, R, including
visualizations such as correlation matrices and pairs plots
Visualization examples of EKB
catalysis data. Left is a
correlation Matrix generated by
the R-Statistics Collection. Right
is a Scatter Plot generated for
one of the highlighted
correlations using the BIOVIA
Enterprise Platform’s Reporting
Collection.
EKB’s flexible visualization model brings the best of two worlds together: easy access to data
visualization for the average user, and the possibility of advanced custom visualization for
advanced users.
Conclusion: Special Features of EKB for Catalysis
Catalysis presents special informatics challenges that current approaches do not support. BIOVIA
is committed to providing the catalysis market with knowledge management solutions that
address these special challenges. EKB is already in production at 3 of the top 10 catalyst research
companies, and their users are already reaping the benefits of a strong and future-proofed
knowledge management strategy.
APPENDIX
List of catalysis-specific methods
BIOVIA already has a wealth of experience that is useful for configuring EKB to suit the needs of a catalysis research group. The BIOVIA Professional Services team has implemented many “methods” including:
Preparation
[Catalyst Synthesis]
• Mix
• Impregnate
• Precipitation
• HT Impregnate (e.g. Chemspeed)
• Extrude
• Pellet / Compact
• Mill / Grind
• Sieve / Filter
• Dry / Calcine / Heat Treat
• High Throughput Precipitation
[Local Characterizations]
• Loss on Ignition
• Water Pore Volume
Characterization
[Catalyst Characteristics]
• Mercury Porosimetry (Micromeritics)
• Nitrogen Porosimetry (Micromeritics)
• ICP
• XRD (including 2D XRD)
• SEM, TEM
• Pellet Dimensions (Manual / Automated)
• Crush Strength
• TGA / DSC
• Particle Characterization
• Chemisorption
• Physisorption
• Densitometry
• Zeta Potential analysis
[Product Characteristics]
• Density
• Viscosity
• Cloud Point, Flash Point
• High Temp Simulated Distillation
[Product Composition]
• Sulphur and Nitrogen Analysis
• GCMS
• IR/NIR
• Total Acid Number
• Asphaltenes
• Aromatics
• PIANO
• HCONS
• MCR
Testing
[Low Throughput]
• Custom Benchscale (1, 2, 4 reactors)
• PI connected rigs
• Aspen connected rigs
[High Throughput]
• Screening (Freeslate)
• Avantium Florence
• HTE
• Amtech
• Custom High Throughput rigs (8, 16, 32)
[Online Analysis]
• GC
• GCMS
• LCMS
…and many more
List of instrument vendors
In prior EKB projects, BIOVIA has worked with a wide range of vendors including:
Equipment Vendors
HTE
Avantium
Amtech
Freeslate
Chemspeed
Zinsser
Agilent
Micromeritics
Bruker
Horiba
Dillon
Leco
Microtrac
Netzsch
EKB ROI for Catalysis
The diagram below shows the expected benefits to a typical catalysis research operation using EKB over time. The biggest benefits come from the presence of a well-filled knowledge management system, which can take 2-4 years to achieve.
The diagram also illustrates many other benefits that arrive more immediately and which, when taken together, add up considerably. EKB’s catalyst-specific features provide a number of benefits like this. The value of even just these near-term benefits is normally more than sufficient to justify the cost of purchasing, installing and configuring an EKB system.
Typical EKB ROI model

Tangibles | Time Saving with EKB (hours) | How is the time/money saved?
1. Searching for a sample’s data | 1 | Searching for a single sample in a paper-based system takes about this time on average. In EKB this takes no time at all.
2. “Repeating” a sample | 4 | If the average sample takes X hours, then X is the time saved per sample not repeated.
3. Communication about samples/experiments inside team per day | 0.5 | Time spent and saved in a day by a team member communicating about samples and experiments.
4. Data upload | 0.25 | Time taken to manually transcribe data from a data set.
5. Data workup | 0.3 | Time taken to manually work up the data from a piece of equipment (normally manual Excel).
6. Creating cross-experiment reports/visualizations per week | 4 | Time that would otherwise have been used manually creating reports.
7. Creating cross-project reports/visualizations per week | 6 | Time that would otherwise have been used manually gathering data and manually creating reports.
In the example above, very conservative estimates are placed on the time saved for each of the
tangible benefits. However, when multiplied through for this 50-user system, the benefits add
up greatly:
Model Inputs | Value
Fully loaded FTE cost per hour | $100 [Example]
Number of samples executed per week | 50 [Example]
Average raw material cost of a single sample | $10 [Example]
Number of EKB users | 50 [Example]

Tangibles | Answer | Benefit Per Year ($)
1. Number of sample searches per week | 10 | 52,000
2. Estimated “repeated” samples per week | 5 | 106,600
3. Time spent communicating about samples/experiments inside team per day | 1 | 650,000
4. Number of equipment data sets to upload per sample | 3 | 195,000
5. Number of equipment data sets to analyse per sample | 3 | 234,000
6. Number of canned cross-experiment reports/visualizations per week | 5 | 104,000
7. Number of cross-project searches + reports or visualizations per week | 2 | 62,400
TOTAL | | $1.4M

©2014 Dassault Systèmes. All rights reserved. 3DEXPERIENCE, the Compass icon and the 3DS logo, CATIA, SOLIDWORKS, ENOVIA, DELMIA, SIMULIA, GEOVIA, EXALEAD, 3D VIA, BIOVIA and NETVIBES are commercial trademarks
or registered trademarks of Dassault Systèmes or its subsidiaries in the U.S. and/or other countries. All other trademarks are owned by their respective owners. Use of any Dassault Systèmes or its subsidiaries trademarks is subject to their express written approval.
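One plausible reading of how the example inputs multiply through to the per-year figures, assuming 52 weeks and 5 working days per week (the repeated-sample line also recovers the $10 raw-material cost per sample):

```python
RATE = 100           # fully loaded FTE cost per hour ($)
USERS = 50           # EKB users
SAMPLES_PER_WEEK = 50
MATERIAL_COST = 10   # raw material cost per sample ($)
WEEKS, DAYS = 52, 5  # working weeks per year, working days per week

benefits = {
    "sample searches":       10 * 1.0 * RATE * WEEKS,                     #  52,000
    "repeated samples":      5 * (4.0 * RATE + MATERIAL_COST) * WEEKS,    # 106,600
    "team communication":    0.5 * USERS * RATE * DAYS * WEEKS,           # 650,000
    "data upload":           SAMPLES_PER_WEEK * 3 * 0.25 * RATE * WEEKS,  # 195,000
    "data workup":           SAMPLES_PER_WEEK * 3 * 0.30 * RATE * WEEKS,  # 234,000
    "experiment reports":    5 * 4.0 * RATE * WEEKS,                      # 104,000
    "cross-project reports": 2 * 6.0 * RATE * WEEKS,                      #  62,400
}
total = sum(benefits.values())  # 1,404,000 -> the "$1.4M" total in the table
```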
The ROI model and inputs for each catalyst department will obviously vary, but this illustration
shows how the savings from an EKB system can add up significantly even when only considering
some of the immediate-term benefits of the system.
Our 3DEXPERIENCE Platform powers our brand applications, serving 12 industries, and provides a rich
portfolio of industry solution experiences.
Dassault Systèmes, the 3DEXPERIENCE Company, provides business and people with virtual universes to imagine sustainable innovations. Its world-leading
solutions transform the way products are designed, produced, and supported. Dassault Systèmes’ collaborative solutions foster social innovation, expanding
possibilities for the virtual world to improve the real world. The group brings value to over 170,000 customers of all sizes in all industries in more than 140
countries. For more information, visit www.3ds.com.
Dassault Systèmes Corporate
Dassault Systèmes
175 Wyman Street
Waltham, Massachusetts
02451-1223
USA
BIOVIA Corporate Americas
BIOVIA
5005 Wateridge Vista Drive,
San Diego, CA
92121
USA
BIOVIA Corporate Europe
BIOVIA
334 Cambridge Science Park,
Cambridge CB4 0WN
England
WP-5506-1114