BIOVIA EKB FOR CATALYSIS
KNOWLEDGE MANAGEMENT FOR THE CATALYST SCIENCES
WHITE PAPER

EXECUTIVE SUMMARY

Knowledge management issues in the catalyst sciences are as acute as they are unique. The largely unaddressed need for effective knowledge management runs so deep that many companies do not realise it is there until a key scientist retires, leaving behind a huge gap in corporate knowledge filled only by a set of incomprehensible and disconnected spreadsheets. Meanwhile, scientists elsewhere amass huge libraries of “single use” data dispersed in disconnected or inaccessible stores.

This document discusses what makes the knowledge management challenge of catalysis unique and why informatics is such an important part of the solution. We also explore how the right knowledge management solution can provide immediate benefits that not only streamline the execution of new product development but also reduce the number of experiments required to achieve new product innovation. We introduce a new informatics product, BIOVIA EKB, which includes a number of catalysis-specific features that have made it the knowledge management solution of choice for 3 out of the top 10 catalysis companies.

WHAT MAKES CATALYSIS SPECIAL?

Overwhelmingly large experimental space

On first inspection, the synthesis of a catalyst may look to an outsider like any other type of general formulation: a mixture of ingredients needing optimization of the relative amounts of those ingredients. The axioms of such a scenario resolve to a simple relationship between formulation and business value. For example:

• Increase the amount of active ingredient for a more active product.
• Replace one active ingredient with another to achieve a cheaper product with the same performance.
On closer inspection, though, this simple model breaks down in the case of catalysis because there are so many other factors that significantly impact overall product performance:

• Active Ingredients: The materials that make the catalyst active; how much of them is present, and where in the catalyst they are located.
• Supplemental Ingredients: Materials used in preparation but not present in the final product (which might nevertheless change the physical characteristics of the product).
• Process and Parameters: Temperatures, durations, die sizes, pressures, addition orders, addition rates, stir speeds, stir types: the list is very long.
• Reactor Conditions: Temperature, pressure and flow rate of materials into and out of the reaction. The requirement is to understand how different combinations of these impact catalyst performance.
• Feedstock Range: Many catalysis applications need to deal with variation in feedstocks, which may contain elements that degrade or damage the catalyst over time.
• Reactor Design: Geometry and size of target reactors, and the need to understand their impact on catalyst performance.

Taken together, these factors present a huge experimental space to cover (meaning that the number of combinations of factors is very large). There is also a very complex connection between these factors and their eventual impact on product performance. The simple axioms of the standard formulation problem do not apply.

This diversity of input factors presents both a big challenge and a big opportunity: any company able to properly understand how these factors interact to give final product performance characteristics gains a distinct competitive advantage and will be in a position to outperform its competitors by a considerable margin. The challenge this large “experimental space” presents means that catalyst science often feels more like “art” than science.
Indeed, many senior scientists, when asked why they designed an experiment a certain way, answer that “it felt right.”

Shifting optimization targets

Optimization is the process of finding the combination of preparation variables that achieves the maximum value for a given performance measure. As with preparation variables, in catalysis the number of potential performance variables for assessing a catalyst is also large. Here are some examples:

• Activity: The efficiency of desired product production, usually measured at a given temperature or energy level.
• Selectivity: The proportion of the “favoured” product in the output of the reaction versus other reaction products.
• Longevity: How long the catalyst is likely to last when run at given conditions (feed, pressure, temperature) and how performance is likely to drop off.
• Robustness: How sensitive the catalyst is to variations (feed, pressure, temperature, breaks in production).

In catalysis, scientists are typically interested in more than one of these performance characteristics over time. This is because the needs of the market shift continuously, driven by factors such as economic growth, raw material costs and regulation changes. In an ideal world, a scientist who has recently become interested in the selectivity of a certain group of catalysts previously prepared for activity tests would be able to trivially re-analyse the data already collected. In practice, data collected for activity tests commonly includes only activity data, making it necessary to repeat experiments. To understand the statistical connections between these performance characteristics and the preparation variables, one needs to maintain a consistently measured and complete set of preparation variables and performance measures.

Interdisciplinary and Interdepartmental nature of work

Catalysis research typically involves three or more distinct groups of people working together.
• A preparation team typically creates relatively large numbers of research samples.
• These are then passed to a characterization team, which performs standard tests to measure or confirm the physical and chemical characteristics of the samples.
• The samples are then passed to a testing team responsible for determining the performance characteristics of the catalysts.

In many cases the picture is made more complex by other factors:

• Preparation teams often undertake some of their own characterizations.
• Commonly, multiple characterization teams exist with different capabilities (and perhaps at different sites).
• Testing teams are sometimes split according to the “scale” of the test (screen / high throughput / low throughput / pilot).
• A common distinction between “scientists” and “technicians” within preparation and testing teams creates a further division within each group.

The number of organizational boundaries present in catalysis environments is unusually high when compared with other research environments. In most cases, there is no unified system of coordination (other than email and instant messages) between all of these silos, resulting in huge information loss between departments. As we have already seen, the need to consistently record everything throughout the catalyst’s life-cycle is critical to the success of any knowledge management strategy. Yet a single catalyst sample may well pass through three or more organizational boundaries in its lifetime.

Nature of Catalysis data

Catalysis provides a unique challenge in terms of the diversity of its data.
It produces:

• Low throughput, low content data: e.g., manual operations done by hand, such as measuring LOI or the dimensions of a sample.
• Low throughput, high content data: e.g., a bench-scale reactor logging 25-75 sensor readings every 30 seconds, or an XRD instrument producing high-resolution spectral data.
• High throughput, low content data: e.g., a screening reactor measuring pure catalyst activity for 24-96 samples in parallel.
• High throughput, high content data: e.g., a high throughput rig with 16-32 parallel reactors logging 25 or more sensor readings every 30 seconds, often with online product characterization instruments (LCMS, GCMS) recording data at a lower frequency.

Outside of catalysis, it is normal to find one or two of these; it would be very unusual to see all four combinations at once. Since each type of data requires a different approach to data collection and storage, there is normally no single system which adequately deals with all four. In catalysis, however, it is critical to be able to piece all of these different types of data together in order to get the most complete picture of the catalysts being made and tested in the lab.

Diversity of Instrumentation

Related to the nature of its data, catalysis also involves highly diverse instrumentation in the making and testing of products. Unlike many other industries, there is no standard “set” of instruments that all companies use. Part of the reason for this is that there is still quite a lot of innovation going on in instruments that can prepare and test catalysts, particularly in the high throughput area. This high rate of innovation means that the instrumentation set a company has is largely driven by the budget available to purchase instrumentation and the availability of instruments at the time. The net result is that the instrument-driven working practices at each company are quite different.
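To give a rough sense of the scale behind the four data classes listed at the start of this section, the figures quoted for the "high throughput, high content" case can be turned into a back-of-the-envelope calculation (a sketch only; the rig size, sensor count and logging interval are illustrative values taken from the ranges above):

```python
# Illustrative scale of "high throughput, high content" data: a rig with
# 32 parallel reactors, each logging 25 sensor readings every 30 seconds.
reactors = 32
sensors_per_reactor = 25
interval_s = 30

readings_per_day = reactors * sensors_per_reactor * (24 * 60 * 60 // interval_s)
print(readings_per_day)  # over two million readings per day from one rig
```

Even before online characterization data (LCMS, GCMS) is added, a single rig at this rate produces millions of data points per day, which is why no one storage approach comfortably serves all four quadrants.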
There is complexity at every level, and often very little industry-wide consensus on the best way to run certain types of instruments. It also means that thus far it has been impossible for any information systems vendor to create a definitive software suite that meets the needs of all catalysis companies.

Conclusion: Why is catalysis special?

Catalyst science presents a special challenge to knowledge management because the experimental space is very large and the optimization targets change over time. Probably as a result of these factors, companies engaged in catalyst development have created departmental boundaries that facilitate “single task efficiency” but impede knowledge sharing. Whereas in other research areas BIOVIA Laboratory Information Management System and BIOVIA Enterprise Lab Notebook systems have been able to step in and fill the knowledge sharing gap, in catalysis success with these types of systems has been very limited because the nature of catalysis data and instrumentation is highly variable.

WHY KNOWLEDGE MANAGEMENT IS IMPORTANT IN CATALYSIS

Sustainability

Knowledge management is of particular importance to a catalysis company because, up until now, so much of the work of a catalyst scientist has been more like “art” than science. Among other things, this means that it is not possible to reconstitute the knowledge of a senior scientist if he or she retires or leaves the company. Knowledge management is therefore a sustainability issue as well as a business continuity issue. Most knowledge-driven companies are intrinsically aware of this issue, but don’t realise the detrimental and long-term impact knowledge loss has until it happens.

The role of Informatics in Knowledge Management

Something can be done about this issue, though knowledge-driven companies are generally unaware of that; it requires a concerted strategic effort to bring about the changes needed to move from “art” back into the realms of science.
The diagram above shows how the interconnectedness of data makes it applicable to more situations and, as a result, more valuable to the organization. The understanding of relationships, patterns and principles is what moves an organization up the knowledge ladder. These advances happen inside the minds of scientists over years of experimentation, becoming manifest as “tacit knowledge.” This tacit knowledge is typically referred to as the “art” part of their jobs and is very hard to transfer to others.

So what role can informatics play? It is critical to note that the upward transitions in the diagram above rely first on having the data and then on having the tools to connect the data together in ever more abstract ways. This combination of storage and tools is something that can be delivered by an informatics system. Importantly, if an informatics system is helping scientists understand patterns and principles, it is doing so in a sustainable way, because the analyses and connections made within the system are reproducible and can be recorded. An informatics system also has the potential to be much faster at spotting patterns (and to do so with less data) than an individual scientist. A complete knowledge management strategy can move a research organization quickly up the “knowledge ladder.” In the next section, we look at the current approaches to knowledge management that are common in catalysis companies today.

Speed of innovation

For our purposes, we will define “innovation” as the generation of new knowledge within the corporation. This new knowledge is of course intended to lead directly to product innovation: getting a new product to market first gives a competitive advantage. Most research companies describe their innovation pipeline using some variation of the following graphic:

In catalysis, a very large amount of effort goes into the “Development” part of the pipeline.
This is where teams produce hundreds (if not thousands) of research samples per year, which are then characterized and tested. Typically, the results of the characterization and testing feed back into another cycle of preparation—and so the loop goes on until the project goals are met. Clearly, the number of experimental cycles is a key driver of the overall length of time required for a project to meet its goals. Reducing the number of cycles therefore has a positive impact on the speed of innovation. Critically, a proper knowledge management strategy can help the project team reduce the number of cycles by providing:

• A head start: Easy access to data from previous projects (assuming the data has been recorded with enough context to be useful in the new project) enables a project lead to reuse knowledge from previous experiments when designing current project plans.
• Richer datasets for analysis: Teams can draw in relevant data from other projects (past and present) to help support decision making about the next cycle of experimentation. More data normally leads to better decisions.
• Better designed experiments: Teams can learn from past mistakes (what didn’t work before) or by investigating things noticed in previous generations (promising leads).

An ideal knowledge management strategy will provide scientists with tools that help them make use of statistical design of experiments (DoE) to optimize the amount of new knowledge they gain with each round of experimentation.

Efficiency of innovation

As well as reducing the number of innovation cycles, a knowledge management system can also help reduce the time taken by each cycle. While this is not strictly the direct intent of a knowledge management strategy, it is a very welcome additional benefit of deploying one.
A Knowledge Management (KM) system can accelerate experiment cycle execution by providing:

• All data in one place: A project team no longer has to spend time searching in different places for information.
• Clear, consistent plans: If the KM system encompasses the planning of experiments, then technicians get a clear and consistent picture of what is expected, regardless of which scientist or department the requests come from. Time is saved through less back-and-forth communication.
• Team coordination: If the KM system encompasses the tracking of experiment execution, then everyone on the project team knows exactly what has been done and what needs to be done next. This saves time because everyone knows what they are doing on a day-to-day basis.
• Automated data analysis: Turning raw data into results for a given experiment step often involves many manual steps. A KM system that encompasses data workup will provide tools that speed this up, with the added benefit of ensuring consistency in how the data is treated after an experimental step.
• Improved collaboration: If the KM system includes “social” features, then team members can write and share thoughts and ideas in a way likely to result in smarter experimentation.

These are benefits that arrive early in the implementation of a KM strategy and often have justification value of their own. Such short-term benefits are important milestones and proof points in the execution of any long-term strategy. Indeed, they alone commonly form the basis of the tangibles in Return on Investment (ROI) calculations made by companies when making purchasing decisions.

CURRENT APPROACHES TO KNOWLEDGE MANAGEMENT

Data collection and analysis systems at catalysis companies are commonly a patchwork that contains large gaps in any catalyst’s end-to-end workflow.
This is partly due to the large number of organizational boundaries and the natural tendency for each department to have a separate approach to data collection and management. Common approaches are:

• Microsoft Excel
• Microsoft Access
• Laboratory Notebooks
• Laboratory Information Management Systems (LIMS)
• Instrument Vendor Systems

We discuss each of these in turn.

Microsoft Excel

Excel usage in its simplest manifestation involves users keeping worksheets on their own personal drives in a way that is organized and formatted for them, but not for any other person. The accessibility and understandability of such data is questionable indeed. As a result, organizations typically attempt to standardize on a location and format for spreadsheet data. Locations used are typically shared network drives or document management systems like Microsoft SharePoint.

In practice, standardization and centralization of Microsoft Excel is only a small improvement on individual file management, and the relative payback on the additional maintenance effort is small. This is because:

• Spreadsheet formats change over time. This makes it impossible to compare past data with present.
• The flexibility of Excel means ‘inventive’ users can easily introduce their own columns, coloring and organization.
• The repository grows rapidly, resulting in an overwhelming number of files and directories. This commonly leads individuals to take private copies of the data back to their personal drives so they can find it easily—of course defeating the purpose of centralization.
• There is no effective way to search Microsoft Excel data when it is distributed across many files. Since Excel files are normally organized along the lines of the projects or experiments the data originally came from, it becomes difficult to re-use data from multiple experiments or projects because they are almost always contained in different workbooks.

The above four points are major data management issues in themselves.
However, these are only the issues internal to one department, and they do not describe an even bigger issue: each department will likely have a different approach to data management with Microsoft Excel, leaving very little possibility of sharing or combining data between silos.

Microsoft Access

Often seen as a step up from Excel, Access is sometimes chosen by project teams to build a local database of their project data. Typically the database schema is designed by a single IT-minded scientist, who then provides data entry forms for the rest of the team to use. The Access approach has a definite advantage over Excel when the project team wants to analyse its data: SQL queries are possible, making it easy to join data across what would previously have been multiple Excel sheets.

There are also a number of disadvantages. The first is that the level of IT knowledge required to set up and maintain such a system is quite high. As a result, the project team has less time to do science because they are spending time maintaining and extending the Access system (not part of their job role). This combination of time constraints and limited IT knowledge means that such systems rarely grow beyond their original project intent, and fall into disuse. The long-term legacy of such an approach can be a disjointed trail of defunct “project databases” with inconsistent data structures that no one in the company understands any more.

Laboratory Notebooks

Use of laboratory notebooks within catalyst labs is very common. These are typically paper based, but the use of electronic laboratory notebooks (ELNs) is increasing. It helps to remember that IP protection is a primary benefit of the ELN: notebooks have long been used as primary evidence in patent disputes and applications.
Therefore, whether on paper or “on glass,” the notebook’s document-centric focus makes it a good repository for documentary evidence of invention, but a poor repository for catalyst experiment data. In some cases, research departments in materials science companies have attempted to make an ELN their primary knowledge management solution. Success at these companies has been limited because of the ELN’s shortfalls when it comes to understanding the “structure” of data related to materials. The shortfalls are:

• Searching an ELN yields a hit list of “documents.” Each document could include data from anywhere between 1 and 100 materials, depending on how the author scoped the experiment. There is normally no way of searching for the materials themselves.
• Any data pasted into an ELN typically gets translated to text, and its context is lost. As a result, it becomes a manual effort to extract data back out into an analysable format (e.g., Excel). This effort becomes even greater if the data needs to be extracted from multiple ELN entries, especially when the data has not been recorded consistently.
• Because of the textual nature of the medium, search is commonly limited to text terms, or to a small number of metadata fields chosen at the time the ELN was commissioned. Searches like “catalysts with activity over 90% that have less than 5% platinum” are not possible.

Therefore, while switching from paper to an ELN can provide a quantum leap in data visibility and IP protection, it will not on its own provide a complete knowledge management solution.

Laboratory Information Management Systems (LIMS)

Typically the domain of characterization groups, LIMS are used to manage the throughput of samples and (in some cases) the registration of analysed data values.
Since many characterization departments operate as a service department within the wider company (with a pseudo-commercial charging system), the LIMS has an important role in managing the monthly “billing” of customer departments. In many cases, analytical instruments produce very detailed data about the catalyst samples. It is, however, not typical for all of this data to be loaded directly into the LIMS. Instead, the raw data is kept on a shared or local drive and only the key values are manually entered into the LIMS. As a result, there are two tiers of information: the information available to the characterization group and the information available to its customers.

In practice, traditional LIMS have rarely been successful outside the bounds of the catalyst characterization department, for the following reasons:

• Lack of suitably designed experiments: Preparation typically involves the creation of multiple similar samples, each with a large set of process variables. LIMS are designed for the requesting of set methods with very little variation each time.
• Missing multi-step workflow features: Preparation normally involves a number of steps before a catalyst is created (e.g., Mix, Extrude, Dry, Calcine). LIMS operate with single-step requests for each method. This makes them cumbersome, as the user must make multiple requests for each workflow, and the LIMS won’t typically connect them.
• Inability to store high-throughput and high-content data: LIMS are typically designed to store requests and results; the full data needs to be stored elsewhere. This means that original data is lost over time, or disconnected and hard to find. This has a particular impact on the usefulness of LIMS for testing data.
• Limited data analysis and reporting options: LIMS are typically optimized for producing “reports per request”—meaning that the requester gets a canned report when their sample is processed.
Reporting across requests (across experiments or projects) is usually not possible because the LIMS isn’t designed to understand experiment and project relationships.

Therefore, while a LIMS can satisfy the needs of many characterization departments, its shortfalls make it a poor choice for preparation and testing departments. By extension, this makes it a poor choice as an overarching knowledge management platform.

Instrument Vendor Systems

Instrument vendors such as Perkin Elmer, Chemspeed and Freeslate commonly develop the software that controls their hardware in such a way that the components share common elements and form a “platform.” Typically, this means there is a central data store (on the file system or in a database) and each connected instrument has a “driver” that transfers data to and from the central data store. Increasingly, these instrument vendors are being asked to provide drivers for third-party instruments as well, so that a single instrument vendor’s system can provide a central data system for a given laboratory. This approach makes a lot of sense for instrument vendors, who know they cannot provide 100% of the hardware solution for a given customer but can partner with another vendor and provide an integrated software solution.

As a knowledge management system, however, these platforms have a number of drawbacks:

• Motivation: Instrument vendors’ key motivation is to sell hardware, not software. Therefore, software updates (including those necessary to keep up with versions of operating systems) are commonly not available or require costly re-configuration.
• Data Structure: The data structure of the central repository is typically aimed only at data capture and storage, since that is the instrument hardware’s purpose, and is typically not ideal for extraction and analysis.
In many cases, the data model is designed around a particular type of instrument the vendor sells, and everything else is force-fitted to this model.

• Architecture: Since instrument control is normally only possible via software physically installed on client PCs, it is common to find elements of an instrument vendor’s solution that are not web based, which in turn creates an IT maintenance issue.

In conclusion, while an instrument vendor’s system has a definite role in the capture and collection of raw instrument data, as a candidate for an enterprise-wide knowledge management solution it has a number of key drawbacks.

Conclusion: Current Approaches

A typical catalyst company will have some or all of the above approaches in use. The most common setup looks like this:

• Preparation: Notebooks + Excel
• Characterization: LIMS + Shared Drives + Vendor Tools (for raw data analysis)
• Testing: Instrument Vendor system + Excel

From preparation through characterization to testing, a catalyst may appear in multiple different systems, typically with different IDs. This makes it hard to piece together and understand any end-to-end workflow in full. To conclude, current approaches to informatics in catalysis companies do not provide a viable knowledge management solution, and none of the systems already in place could be expanded to meet all knowledge management needs in a single system.

HOW BIOVIA HAS HELPED CATALYSIS CUSTOMERS

BIOVIA has a long history of working with companies engaged in scientific research, including a good number of catalysis companies. The solutions we have provided cover both the informatics and the modelling and simulation spaces. Over the years, BIOVIA has acquired knowledge of what makes the catalyst sciences special and has built a number of individual informatics solutions to meet catalysis needs.
Introducing BIOVIA EKB

Beginning in 2009, BIOVIA launched a concerted effort to build an experiment knowledge management software framework that was generic enough to be re-usable but configurable enough to be adapted to specific customer needs. The result was a framework called BIOVIA Experiment Knowledge Base (EKB), used by BIOVIA Professional Services as a means of accelerating the implementation of knowledge management systems. In 2012 BIOVIA decided to make EKB a fully-fledged software product, and BIOVIA EKB was born. BIOVIA EKB is now a central part of the BIOVIA Materials Science and Engineering software suite, along with BIOVIA Notebook, BIOVIA Materials Studio and BIOVIA CISPro for chemical inventory management.

Special features of BIOVIA EKB for catalysis

The early customers of BIOVIA EKB were catalysis organizations, and as a result, the needs of the catalysis sector have heavily shaped the out-of-the-box capabilities of the product. In this section we explore what those capabilities are.

Experiment Design (Preparation, Characterization, Testing)

BIOVIA EKB allows a project team to define an experiment’s design before executing it. This is key to driving the best practice of thoughtful, peer-reviewed experiment design before committing resources to execution. A BIOVIA EKB experiment is unique in the breadth of its scope because it encompasses:

• The steps of the experiment (preparation, characterization, testing)
• The process parameters of the steps
• The number and naming of the samples to be created
• The people to be involved in the experiment
• The equipment to be involved in the experiment
• The ingredients to be involved in the experiment

The EKB experiment plan defines the materials, resources and workflow of what is to follow.
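The breadth of scope listed above can be pictured with a minimal data-structure sketch. To be clear, this is a hypothetical illustration, not the actual EKB data model; all class and field names are invented:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of an experiment plan covering the elements listed
# above (steps, parameters, samples, people, equipment, ingredients).
@dataclass
class Step:
    name: str          # e.g. "Mix", "Extrude", "Dry", "Calcine"
    parameters: dict   # process parameters, e.g. {"temp_C": 250}
    assignee: str = "" # person responsible for executing the step
    equipment: str = "" # instrument or rig to be used

@dataclass
class ExperimentPlan:
    name: str
    samples: list = field(default_factory=list)      # planned sample IDs
    ingredients: dict = field(default_factory=dict)  # ingredient -> amount
    steps: list = field(default_factory=list)

plan = ExperimentPlan(
    name="CAT-0042",
    samples=[f"CAT-0042-{i:02d}" for i in range(1, 5)],
    ingredients={"alumina support": "95%", "Pt precursor": "5%"},
    steps=[Step("Mix", {"stir_rpm": 300}), Step("Dry", {"temp_C": 250})],
)
print(len(plan.samples))  # 4 planned samples
```

The point of the sketch is that one record ties together what LIMS, notebooks and vendor systems normally hold separately: who does what, with which equipment and ingredients, to which named samples.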
Having the capability to plan so many aspects of experiments is important for catalysis because it means the experiment can bring together departments and disciplines, thereby addressing one of the special issues discussed in the “What Makes Catalysis Special?” section above. BIOVIA customers using EKB experiment planning see much greater collaboration and understanding between disciplines (technicians and scientists) and departmental groups (preparation, characterization, testing).

Another special feature of the catalysis domain is the very large experimental space. The EKB experiment plan allows the project team to decide which parameters to vary (the factors) and which to keep fixed for all samples. The user interface presents scientists with a familiar “design grid” of the chosen factors vs. the samples in the experiment. The grid can be filled in manually or, as a best practice, by a DoE algorithm. EKB provides Factorial, Central Composite and Taguchi designs as out-of-the-box options. Designs from external design programs such as JMP, Minitab, Statistica and others can also be imported.

Diagram: a “Nested Design” tree in which a Mix step branches through Extrude and Dry steps into multiple Calcine steps.

The ability to plan and design such broad experiments is unique to EKB and is essential in the multidisciplinary, multi-department catalysis environment.

Branching Workflows (Preparation)

A “branching workflow” is common in catalysis preparation. This means that the preparation team may make a large number of samples but start with a single mix, gradually splitting the material throughout the workflow and treating each subsample differently, to finally arrive at the large number of planned samples with minimal effort.
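The factors-vs-samples “design grid” described above can be illustrated with a simple full factorial expansion, the most basic of the design types mentioned. This is a sketch only; the factor names and levels are invented for illustration and do not come from EKB:

```python
from itertools import product

# Hypothetical factors for a catalyst preparation design. A full
# factorial design takes every combination of the chosen levels.
factors = {
    "Pt_loading_pct": [1.0, 2.5, 5.0],
    "calcine_temp_C": [450, 550],
    "calcine_time_min": [30, 60],
}

# Each row of the design grid is one planned sample.
names = list(factors)
grid = [dict(zip(names, levels)) for levels in product(*factors.values())]

print(len(grid))  # 3 * 2 * 2 = 12 planned samples
```

Fractional, Central Composite or Taguchi designs exist precisely because this grid grows multiplicatively: adding one more three-level factor would triple the sample count, which is the “overwhelmingly large experimental space” problem in miniature.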
Example of a branching workflow:

• Mix a big batch (1 sample)
• Extrude half with Die #1 and the other half with Die #2 (2 samples)
• Dry at 250, 260, 270 and 280°C (8 samples)
• Calcine for 30, 40, 50 and 60 minutes (32 samples)

EKB supports branching workflows as part of the experiment plan by allowing users to define “split steps” between the steps of the experiment. A visual representation of the branching workflow as a tree is provided to help users understand the plan:

Screenshot of an EKB nested design tree for a typical catalyst preparation workflow starting with 1 sample and finishing with 32

Importantly, EKB keeps track of the unique identifiers of each of the samples in the plan, providing label printing facilities where needed. It also understands the parent/child nature of each split: data properties recorded for parent samples are relevant to their children and grandchildren. The ability to plan branching workflows and to capture the genealogy of samples as they are split is unique to EKB. It is a vital piece needed to ensure the usability of the system in catalysis.

Technician Instructions (Preparation, Testing)

The existence of an experiment plan at the outset of experimentation also enables another very important feature for catalysis groups: the translation of the plan into actionable instructions. When part of an experiment plan is ready to execute, EKB produces a printable instruction sheet for the technician to reference in the lab when performing the work. The clarity and consistency of these instructions are a major improvement for many technicians BIOVIA has worked with, saving them time and drastically reducing instances of misunderstanding. When technicians have specific needs, EKB can also be configured to provide customized instructions.

Screenshot of an example EKB instruction sheet—this one to assist the technician with a fixed reactor bed layout.
For example, if scientists like to plan experiments in terms of ingredient proportions (percentages) and technicians like to execute experiments in terms of absolute amounts (grams), then the instruction sheet can perform the necessary calculations to convert from one to the other. By extension, this means that EKB can also automate more complex calculations that are normally the domain of the technician’s private Microsoft Excel worksheets. Common examples include determining impregnation concentrations to achieve a specific metal loading, or calculating the times of day at which reactor conditions need to be changed for a given reactor run. Experiment Execution and Tracking The EKB experiment plan becomes the execution plan. At a glance, the project team can see exactly where each sample is and what needs to be done next. Screenshot of a typical EKB experiment during execution. This experiment has 4 samples and 6 steps. The dashboard shows that currently the first 4 preparation steps are complete and the 5th step is partially complete. This has a very positive impact on team efficiency, especially when the experiment execution spans multiple departments or sites. Time is saved through reduced communication time and fewer chase-up phone calls and emails. This kind of end-to-end workflow planning and execution capability is unique to EKB. As discussed in the earlier LIMS section, a LIMS is designed around the idea of individual requests and therefore lacks the ability to connect requests into this kind of multi-step, multi-sample workflow. Of course, not everyone involved is interested in the end-to-end workflow. For example, characterization technicians prefer to see a view of the world through a per-instrument, per-method or per-laboratory lens. For these users, EKB provides different views showing the queue of work appropriately filtered. EKB’s “laboratory view,” showing activity as a filtered list of actionable tasks. 
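The proportions-to-amounts conversion described in the instruction-sheet discussion above amounts to simple arithmetic. A minimal sketch follows; it is illustrative only, and the function name, ingredient names and batch size are hypothetical.

```python
def to_absolute_amounts(recipe_pct, batch_size_g):
    """Convert ingredient percentages into gram amounts for one batch."""
    total = sum(recipe_pct.values())
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"Percentages sum to {total}, expected 100")
    return {name: batch_size_g * pct / 100.0 for name, pct in recipe_pct.items()}

# The scientist plans in percentages; the technician works in grams.
recipe = {"alumina support": 92.0, "nickel nitrate": 6.5, "binder": 1.5}
amounts = to_absolute_amounts(recipe, batch_size_g=500)
print(amounts)  # {'alumina support': 460.0, 'nickel nitrate': 32.5, 'binder': 7.5}
```

The validation that percentages sum to 100 is exactly the kind of consistency check that gets lost when each technician maintains a private spreadsheet.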
Integration to LIMS (Characterization) Through its “laboratory view,” EKB can play the part of a traditional LIMS well. However, it is commonly undesirable to replace a characterization group’s existing LIMS with EKB. This is particularly the case in catalysis environments where the characterization team may be serving many other types of customers not using EKB. In these cases, EKB can be integrated with an existing LIMS so that EKB injects requests into the LIMS on behalf of its users. EKB can then monitor the LIMS and pull result data out of the LIMS when the request is completed (automatically progressing the EKB workflow view). This arrangement provides the best of both worlds. The characterization team can continue as before, and EKB will still contain the complete end-to-end data of the samples—all without the user having to make manual uploads or interventions. Automated and Semi-automated Data workup (Characterization, Testing) One of the special features of catalysis covered earlier is that the nature of data is very broad when compared to other areas of research. One particularly tricky area is experimental steps that produce high content data, meaning a rich amount of raw data is produced which must then be interpreted to create final result data. Commonly, this kind of “data workup” activity is manual and time-consuming and can consume a large proportion of a technician’s or scientist’s time. EKB helps in this regard by adding a custom “validation wizard” through which users and data are funnelled at the end of the step (after the raw data is uploaded into EKB). Validation wizards are always user-centric and customer-specific. 
Because they are authored using BIOVIA Pipeline Pilot, they can be quick to implement and very powerful, as they enable users to leverage the BIOVIA component collections, including: • Reporting Collection: For creating tools that present the data to the user in the most appropriate format (tables, plots, charts) • Interactive Reporting Collection: Enabling the user to interact with tables, plots and charts to select, highlight or annotate data • Analytical Instrument Collection: Ideal for processing XRD and NMR spectra • Imaging Collection: For automatically evaluating a wide range of image types, including SEM and TEM • Data Modelling Collection: Providing many common statistical components for automatic determination of aggregated or calculated result data • Integration Collection: Connecting and presenting data from separate files or databases so the user doesn’t need to look it up manually Validation wizards are a unique feature of EKB that enable a very high level of workflow integration and automation. BIOVIA’s wealth of tools and prior experience mean most validation wizards can be assembled quickly. Time Series Visualization (Testing) Performance testing of catalysts almost always involves running the catalyst in some kind of reactor. Research reactors exist at a number of different scales (nano-tube, micro-tube, bench scale) and come in low-throughput and high-throughput (8, 16, 32 reactor) varieties. In all cases, the reactor rigs produce a wealth of “time-series data” through a collection of sensors which report values at given intervals. As discussed in the earlier “What Makes Catalysis Special” section, the nature of data in catalysis is a particular challenge. 
It is especially challenging when it comes to time-series data because: • It is voluminous (always high content, sometimes also high throughput) • Commonly there is “offline” data (analytical subsamples) to be merged • Often there are “online” analyses happening on separate instruments whose data needs to be merged (for example GC, GCMS, LCMS of product or feed) • The data requires interpretation to be useful • Interpretation is objective-specific, i.e., interpreting for activity is different from interpreting for selectivity. In practice, dealing with online and offline data collection and aggregation from rigs can be very time consuming, easily occupying more than 50% of a testing technician’s time. EKB helps the technician in a number of regards: • Automatically collecting online data from source (historian, rig database, file system) • Automatically collecting offline data (from LIMS or EKB sub-steps) • Merging online and offline data • Doing the above periodically during the test run, so as to provide a “monitor” function • Permanently storing the collected raw data (because most historian systems do not) Once the data is merged, the next job is to interpret it. This is normally done by the scientist, using a tool like Microsoft Excel. Because the task is largely manual, there is usually not time to interpret the data beyond the needs of the current project. As a result, the data may only be interpreted for one performance measure. As discussed earlier, one of the special issues in catalysis is the shifting of performance objectives over time. This makes it vital that performance tests are interpreted for multiple performance measures so that there is a greater chance of reusing results in future projects. EKB steps in at this point by making it easy to analyse for multiple performance objectives through its Time Series Data Visualization Applet. 
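The merge of online sensor readings with offline analytical results described above can be sketched as a nearest-timestamp join. This is a simplified illustration only, not EKB’s implementation; timestamps are in seconds and the tolerance value is an assumption.

```python
from bisect import bisect_left

def merge_offline(online, offline, tolerance=30.0):
    """Attach each offline analysis to the nearest online reading in time.

    online:  list of (timestamp_s, {sensor: value}), sorted by timestamp
    offline: list of (timestamp_s, {analysis: value})
    """
    times = [t for t, _ in online]
    merged = [dict(row, time=t) for t, row in online]
    for t_off, analyses in offline:
        i = bisect_left(times, t_off)
        # Pick the closer of the two neighbouring online readings.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        j = min(candidates, key=lambda k: abs(times[k] - t_off))
        if abs(times[j] - t_off) <= tolerance:
            merged[j].update(analyses)
    return merged

online = [(0, {"T": 350.1}), (60, {"T": 350.4}), (120, {"T": 350.2})]
offline = [(65, {"conversion_pct": 94.2})]  # e.g. a periodic GC analysis
rows = merge_offline(online, offline)
print(rows[1])  # the 60 s reading now carries the offline conversion value
```

A production system would also have to handle clock skew between instruments and readings arriving out of order, which is part of why this chore consumes so much technician time when done by hand.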
EKB’s Time Series Visualization Applet showing values of 2 activity measures during an 8-condition reactor run (where flow rate is progressively decreased). EKB’s Time Series Visualization Applet showing selectivity to 2 products during the same 8-condition run. The example screenshots above show charts that help the scientist interpret activity and selectivity. The charts are pre-defined as part of the configuration of EKB according to the requirements of the scientists. When using the tool, the scientist can toggle between these views quickly and easily, in a fraction of the time that would otherwise be taken to paste data into an Excel spreadsheet and adjust Excel chart settings and data ranges to suit the run. Also shown on the plot are the offline data points from periodic characterizations of the reaction product (larger crosses and triangles on the activity plot). The visualization applet supports the visualization of multiple concurrent reactors and provides the ability to quickly zoom and pan through the data. The applet also marks data points as “invalid” if they appear to be misreported. Per-Condition Data Aggregation (Testing) Given that reactor tests are expensive to set up and time-consuming to run, it is common practice in catalyst performance tests to vary the conditions of the reactor during the run. This practice considerably increases the amount of information available from a single run. However, the resulting datasets can be more difficult to interpret. 
Consider a typical testing plan: • Condition -1: Warm up the reactor • Condition 0: Spike the feed to prime the catalyst for 4 hours • Condition 1: Low temperature, low flow rate for 3 days • Condition 2: Low temperature, high flow rate for 3 days • Condition 3: High temperature, high flow rate for 3 days • Condition 4: High temperature, low flow rate for 3 days Assuming we are only scientifically interested in the “testing” conditions (1-4) of the above, then there are really 4 almost separate experimental results here, one from each condition. At the beginning of each condition, we can expect a certain amount of “settling” before the catalyst resumes performing at a steady state. The EKB Time Series Visualization applet provides the scientist with a point-and-drag means of visually defining the “stable phases” of the above data. EKB can then automatically calculate the performance characteristics for multiple performance objectives on a per-phase basis, for example: • Activity for Condition 1, 2, 3 and 4 • Selectivity A for Condition 1, 2, 3 and 4 • Selectivity B for Condition 1, 2, 3 and 4 • Longevity for Conditions 1, 2, 3 and 4 The separate per-phase descriptors of performance stored against the catalyst in the EKB database are available for searching and for use as part of data visualization and mining. EKB’s validation wizard and EKB’s time series visualization features provide a unique way for scientists to quickly and consistently process the complex data coming out of their reactor tests. Not only does this save time over the traditional Excel methodology, but it also results in a larger number of performance descriptors being calculated—making the test data far more likely to be reusable in the future. 
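The per-phase calculation described above amounts to aggregating each performance measure over the user-selected stable windows. A minimal sketch follows, assuming the phase boundaries have already been chosen via the point-and-drag selection; all names and data are hypothetical.

```python
def per_phase_means(times, values, phases):
    """Average a performance measure over each user-defined stable phase.

    phases: dict mapping condition name -> (start_time, end_time)
    """
    results = {}
    for condition, (start, end) in phases.items():
        in_phase = [v for t, v in zip(times, values) if start <= t <= end]
        results[condition] = sum(in_phase) / len(in_phase)
    return results

times = [1, 2, 3, 4, 5, 6, 7, 8]             # hours on stream
activity = [80, 91, 92, 93, 70, 85, 86, 87]  # dips while settling at each new condition
phases = {"Condition 1": (2, 4), "Condition 2": (6, 8)}
print(per_phase_means(times, activity, phases))
# {'Condition 1': 92.0, 'Condition 2': 86.0}
```

The same phase boundaries can then be reused to aggregate selectivity, longevity or any other measure, which is how a single reactor run yields a full set of per-condition descriptors.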
Data Analysis: Full-Featured Catalyst Search Because EKB spans all planning, execution and data collection activities across preparation, characterization and testing, it eventually contains a vast wealth of connected and contextualized data about catalysts. The breadth of the data allows users to ask and answer questions that previously would have required a huge forensic effort in piecing together data from different systems. For example:

Data Area(s) | Example Query
Preparation | “Find all catalysts prepared with more than 5% platinum and heated to more than 1200°C.”
Preparation & Characterization | “Find all catalysts prepared with more than 5% platinum that had less than 4.5% platinum when measured by ICP.” “Find all catalysts heated to more than 1200°C which have more than 20% micropores.”
Characterization & Testing | “Find all catalysts with less than 25% micropores that had an activity of over 95%.”
Preparation & Testing | “Find all catalysts prepared with less than 5% by volume of active metal that had an activity of over 80%.”
Testing | “Find all catalysts that had selectivity over 90% and an activity over 90% in a condition where the temperature was less than 350°C.”
Preparation, Characterization & Testing | “Find all catalysts not prepared with Additive B that had a crush strength of over 100 lb and a performance drop-off of less than 1% per month.”

This capability allows users to ask the “have we ever made...” questions they have always wanted to ask. No other knowledge management system can provide this breadth of data to search on. Getting the answers to questions such as these gives project teams using EKB a big advantage when starting a new project, because they can effectively use and review past results to give themselves a head start. Screenshot of EKB’s search query builder. 
The example query joins data from preparation, characterization and testing. Data Analysis: Catalyst Data Visualization and Data Mining As well as providing a very full-featured search engine, EKB also offers a visualization engine that can present catalyst preparation, characterization and testing data in almost any way. As with validation wizards, EKB leverages BIOVIA Pipeline Pilot to author visualization plugins for EKB. BIOVIA Pipeline Pilot has a wealth of relevant component collections which makes assembling these visualization plugins very cost-effective: • Reporting Collection: Scatter plots, histograms, bar charts, radar plots, pivot tables and more • Interactive Reporting Collection: Linked plots and tables for highlighting and data exploration • Advanced Data Modelling Collection: Builds predictive models from experimental data using a number of statistical approaches, identifying clusters and similarities in experimental data • R-Statistics: Access to all the features of the popular statistics package, R, including visualizations such as correlation matrices and pairs plots Visualization examples of EKB catalysis data. Left: a correlation matrix generated by the R-Statistics Collection. Right: a scatter plot generated for one of the highlighted correlations using the BIOVIA Enterprise Platform’s Reporting Collection. EKB’s flexible visualization model brings the best of two worlds together: easy access to data visualization for the average user, and the possibility of advanced custom visualization for advanced users. Conclusion: Special Features of EKB for Catalysis Catalysis presents special informatics challenges that current approaches do not support. BIOVIA is committed to providing the catalysis market with knowledge management solutions that address these special challenges. 
EKB is already in production at 3 of the top 10 catalyst research companies, and their users are already reaping the benefits of a strong and future-proofed knowledge management strategy. APPENDIX List of catalysis-specific methods BIOVIA already has a wealth of experience that is useful for configuring EKB to suit the needs of a catalysis research group. The BIOVIA Professional Services team has implemented many “methods” including:

Preparation [Catalyst Synthesis]: Mix; Impregnate; Precipitation; HT Impregnate (e.g. Chemspeed); Extrude; Pellet / Compact; Mill / Grind; Sieve / Filter; Dry / Calcine / Heat Treat; High Throughput Precipitation. [Local Characterizations]: Loss on Ignition; Water Pore Volume.

Characterization [Catalyst Characteristics]: Mercury Porosimetry (Micromeritics); Nitrogen Porosimetry (Micromeritics); ICP; XRD (including 2D XRD); SEM, TEM; Pellet Dimensions (Manual / Automated); Crush Strength; TGA / DSC; Particle Characterization; Chemisorption; Physisorption; Densitometry; Zeta Potential Analysis. [Product Characteristics]: Density; Viscosity; Cloud Point, Flash Point; High Temp Simulated Distillation. [Product Composition]: Sulphur and Nitrogen Analysis; GCMS; IR/NIR; Total Acid Number; Asphaltenes; Aromatics; PIANO; HCONS; MCR.

Testing [Low Throughput]: Custom Benchscale (1, 2, 4 reactors); Pi connected rigs; Aspen connected rigs. [Online Analysis]: GC; GCMS; LCMS. [High Throughput]: Screening (Freeslate); Avantium Florence; HTE; Amtech; Custom high-throughput rigs (8, 16, 32).

..and many more

List of instrument vendors In prior EKB projects, BIOVIA has worked with a wide range of equipment vendors including: HTE, Avantium, Amtech, Freeslate, Chemspeed, Zinsser, Agilent, Micromeritics, Bruker, Horiba, Dillon, Leco, Microtrac and Netzsch.

EKB ROI for Catalysis The diagram below shows the expected benefits to a typical catalysis research operation using EKB over time. The biggest benefits come from the presence of a well-filled knowledge management system, which can take 2-4 years to achieve. 
The diagram also illustrates many other benefits that arrive more immediately and which, taken together, add up considerably. EKB’s catalyst-specific features provide a number of benefits like this. The value of even just these near-term benefits is normally more than sufficient to justify the cost of purchasing, installing and configuring an EKB system.

Typical EKB ROI model

Tangibles | Time saving with EKB (hours) | How is the time/money saved?
1. Searching for a sample’s data | 1 | Searching for a single sample in a paper-based system takes about this time on average. In EKB this takes no time at all.
2. “Repeating” a sample | 4 | If the average sample takes X hours, then X is the time saved per sample not repeated.
3. Communication about samples/experiments inside team per day | 0.5 | Time spent and saved in a day by a team member communicating about samples and experiments.
4. Data upload | 0.25 | Time taken to manually transcribe data from a data set.
5. Data workup | 0.3 | Time taken to manually work up the data from a piece of equipment (normally manual Excel).
6. Creating cross-experiment reports/visualizations per week | 4 | Time that would otherwise have been used manually creating reports.
7. Creating cross-project reports/visualizations per week | 6 | Time that would otherwise have been used manually gathering data and manually creating reports.

In the example above, very conservative estimates are placed on the time saved for each of the tangible benefits. However, when multiplied through for this 50-user system, the benefits add up greatly.

Assumptions: Fully loaded FTE cost per hour: $100 [Example]. Number of samples executed per week: 50 [Example]. Average raw material cost of a single sample: $10 [Example]. Number of EKB users: 50 [Example].

Tangibles | Answer | Benefit per year ($)
1. Number of sample searches per week | 10 | 52,000
2. Estimated “repeated” samples per week | 5 | 106,600
3. Time spent communicating about samples/experiments inside team per day | 1 | 650,000
4. Number of equipment data sets to upload per sample | 3 | 195,000
5. Number of equipment data sets to analyse per sample | 3 | 234,000
6. Number of canned cross-experiment reports/visualizations per week | 5 | 104,000
7. Number of cross-project searches + reports or visualizations per week | 2 | 62,400
TOTAL | | 1,404,000 (approx. $1.4M)

The ROI model and inputs for each catalyst department will obviously vary, but this illustration shows how the savings from an EKB system can add up significantly even when only considering some of the immediate-term benefits of the system.

Dassault Systèmes, the 3DEXPERIENCE Company, provides business and people with virtual universes to imagine sustainable innovations. Its world-leading solutions transform the way products are designed, produced, and supported. Dassault Systèmes’ collaborative solutions foster social innovation, expanding possibilities for the virtual world to improve the real world. The group brings value to over 170,000 customers of all sizes in all industries in more than 140 countries. Our 3DEXPERIENCE Platform powers our brand applications, serving 12 industries, and provides a rich portfolio of industry solution experiences. For more information, visit www.3ds.com.

©2014 Dassault Systèmes. All rights reserved. 3DEXPERIENCE, the Compass icon and the 3DS logo, CATIA, SOLIDWORKS, ENOVIA, DELMIA, SIMULIA, GEOVIA, EXALEAD, 3DVIA, BIOVIA and NETVIBES are commercial trademarks or registered trademarks of Dassault Systèmes or its subsidiaries in the U.S. and/or other countries. All other trademarks are owned by their respective owners. Use of any Dassault Systèmes or its subsidiaries trademarks is subject to their express written approval.

Dassault Systèmes Corporate: Dassault Systèmes, 175 Wyman Street, Waltham, Massachusetts 02451-1223, USA. BIOVIA Corporate Americas: BIOVIA, 5005 Wateridge Vista Drive, San Diego, CA 92121, USA. BIOVIA Corporate Europe: BIOVIA, 334 Cambridge Science Park, Cambridge CB4 0WN, England. WP-5506-1114