artificial intelligence for your daily business

Complimentary eBook offered by
ARTIFICIAL INTELLIGENCE FOR
YOUR DAILY BUSINESS
Understanding the Spectrum of AI-technologies for
Review, Investigation and Contract Analytics in eDiscovery
“ARTIFICIAL INTELLIGENCE IS NO MATCH FOR NATURAL STUPIDITY”
(Anonymous)

Techniques from the world of Artificial Intelligence (AI) are rapidly finding their way into today’s business practices. They are being used to accelerate the speed and efficiency of an
organization’s internal processes. The main reason for this success is that after several decades of research, AI techniques not only allow us to process enormous amounts of
data 24/7 at staggering speed, but also perform on par with (and often
better and more consistently than) humans. This results in revolutionary productivity
gains.
Although it is clear there is high value in Artificial Intelligence for review, investigation, contract analytics, eDiscovery and other legal fact-finding missions, organizations still struggle
to understand the different techniques involved. In this eBook we will explain different techniques and illustrate these with practical business-related examples.
AI - WHAT’S IN IT FOR ME?
Machine Learning, Natural Language Processing (NLP) and similar techniques such as data or text mining, big data analysis, predictive coding, Technology Assisted Review (TAR), concept search, topic modeling, clustering, audio search and machine translation are all Artificial Intelligence techniques and can be used to identify specific document categories and to
search for relevant information in documents. These techniques are being used to enhance
the speed and efficiency of eDiscovery practices and can also be used to accelerate other
legal processes.
Machine Learning
Machine Learning is the process by which software
recognizes patterns and relationships within large
datasets. A classification system first learns using
“training data”. New pieces of data are then classified based on the (latent) patterns that have been
learnt in the training data. After sufficient training,
the behavior of new data can be predicted, and it is
even possible to distill information from previously
unknown patterns and semantic relationships.
ZyLAB’s Machine Learning uses the most advanced
machine learning algorithms in combination with
advanced statistical and semantic methods to represent the content of a document.
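The train-then-classify cycle described above can be illustrated with a small sketch. This is not ZyLAB's implementation: it is a minimal multinomial naive Bayes classifier written for illustration only, and the class name, the labels ("contract", "invoice") and the sample sentences are all assumptions.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep only purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


class NaiveBayesClassifier:
    """Illustrative multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def train(self, labeled_docs):
        """Learn word frequencies per label from (text, label) pairs."""
        for text, label in labeled_docs:
            words = tokenize(text)
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocabulary.update(words)

    def classify(self, text):
        """Return the label with the highest log-probability for the text."""
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word] + 1  # add-one smoothing
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: a handful of labeled sentences.
training_data = [
    ("the parties agree to the terms of this agreement", "contract"),
    ("this agreement is governed by the laws of the state", "contract"),
    ("please find the invoice for last month attached", "invoice"),
    ("payment of the invoice is due within thirty days", "invoice"),
]

classifier = NaiveBayesClassifier()
classifier.train(training_data)
prediction = classifier.classify("the agreement between the parties")
print(prediction)
```

A real system would train on far more documents and richer features; the point here is only that new data is classified based on patterns learned from the training data.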
Natural Language Processing
Natural Language Processing refers to the ability
of a computer program to understand human language, both spoken and written. NLP is also based on Machine Learning and
uses text processing techniques that do not treat text
as a random sequence of symbols, but instead
consider the hierarchical structure of language:
words form a phrase, phrases make a sentence
and sentences convey a message.
Text Mining
Text Mining, also known as Text Analysis, refers to
the use of varied techniques to automatically enrich data in large data volumes and then search
for hidden patterns and relationships. Once identified, this data can be filtered, sorted, and visualized; and discovered topics and categories can
be prioritized. Text mining identifies and highlights
information from patterns and semantic relationships which were previously unknown.
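As a rough illustration of enriching documents and then surfacing patterns across them, the sketch below extracts a few kinds of structured information with regular expressions and counts which email addresses recur. The pattern set, the date and currency formats, and the sample documents are assumptions chosen for the example, not a description of any product feature.

```python
import re
from collections import Counter

# Hypothetical extraction patterns; real text mining uses many more.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "amount": re.compile(r"\b(?:USD|EUR)\s?\d+(?:,\d{3})*(?:\.\d{2})?\b"),
}


def enrich(document):
    """Return the matches found in one document, grouped by pattern label."""
    return {label: pattern.findall(document) for label, pattern in PATTERNS.items()}


documents = [
    "On 2019-03-14 j.doe@example.com approved EUR 12,500.00 for the vendor.",
    "Reminder sent to j.doe@example.com and a.lee@example.com on 2019-03-20.",
]

enriched = [enrich(d) for d in documents]

# Once enriched, the data can be filtered, sorted and counted.
email_counts = Counter(e for doc in enriched for e in doc["email"])
print(email_counts.most_common(1))
```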
Technology Assisted Review
Technology Assisted Review (TAR), also known
as Computer Assisted Review (CAR) or Predictive
Coding, uses a series of algorithms to search and
sort documents relevant for data investigation or
eDiscovery. TAR also utilizes Machine Learning.
ZyLAB uses a variety of methods for automatic
document classification to support Technology
Assisted Review (TAR). These patented methods
vary from straightforward search-based, regular
expressions and gazetteers (dictionaries), to advanced methods using NLP and Machine Learning.
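To make the rules-based end of this range concrete, here is a minimal sketch of classification with gazetteers and regular expressions. It is an illustration under assumptions, not ZyLAB's patented method: the categories, dictionary terms, patterns and the function name `classify_by_rules` are all hypothetical.

```python
import re

# Hypothetical gazetteer: a dictionary of terms per document category.
GAZETTEER = {
    "contract": {"agreement", "indemnification", "warranty", "termination"},
    "financial": {"invoice", "payment", "balance", "remittance"},
}

# Hypothetical regular expressions per category.
PATTERNS = {
    "contract": [re.compile(r"\bhereinafter\b", re.IGNORECASE)],
    "financial": [re.compile(r"\bIBAN\b"), re.compile(r"[$€]\s?\d[\d,]*(\.\d{2})?")],
}


def classify_by_rules(text):
    """Return each matched category together with the rules that fired."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    results = {}
    for category in GAZETTEER:
        reasons = sorted(f"term:{t}" for t in GAZETTEER[category] & words)
        reasons += [f"regex:{p.pattern}" for p in PATTERNS[category] if p.search(text)]
        if reasons:
            results[category] = reasons
    return results


result = classify_by_rules("Payment of $1,250.00 is due under the Agreement.")
print(result)
```

Because every category comes with the rules that triggered it, a reviewer can see why a document was classified the way it was and adjust the dictionaries or patterns accordingly.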
[Figure: recall, on a scale from 0% to 100%, across search and classification techniques: Traditional Boolean Search; Fuzzy, Wildcard, Quorum, Proximity and Relevance Ranking; Search on Extracted Metadata (document properties, file properties, forensics); OCR on Bitmaps, Visual Classification, Text-Mining, Audio Search & Machine Translation; Machine Learning for Automatic Document Classification. The figure distinguishes ZyLAB Rules-based TAR from ZyLAB Machine Learning TAR.]
Topic Modeling / Clustering
Topic Modeling and Cluster Analysis are two other approaches to text mining. A topic model is used to statistically explore abstract concepts
(topics) that occur within a set of documents.
Cluster analysis uses perceived relationships between various groups of objects to create new subgroups (clusters). The resulting document groups are ideal for
use with Machine Learning.
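The idea of grouping documents by perceived similarity can be sketched as follows. This example uses a deliberately simple single-pass ("leader") clustering over bag-of-words vectors with cosine similarity; it stands in for whatever algorithm a production system would use, and the threshold value and sample documents are assumptions.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent a document as a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_documents(docs, threshold=0.3):
    """Put each document in the first cluster whose leader is similar enough."""
    clusters = []  # lists of document indices; the first index is the leader
    vectors = [vectorize(d) for d in docs]
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no similar leader found: start a new cluster
    return clusters


documents = [
    "lease agreement for office space",
    "quarterly sales report figures",
    "office lease renewal agreement",
    "sales figures for the quarter",
]
clusters = cluster_documents(documents)
print(clusters)
```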
COMBINING DIFFERENT APPROACHES
The advantage of full-text search and text-mining techniques is that they are transparent,
and that every contract lawyer knows how to use full-text search and how to combine different search techniques. The problem of an incorrectly classified document can be fixed
by the lawyer simply changing the query. The effort of writing queries can be combined in
libraries of full-text queries, which can be shared and re-used. The queries can also be easily
translated into other languages.
This is not always the case when using Machine Learning, which is more of a black box that
either works or does not and, in the latter case, is hard to fix. Furthermore, Machine Learning is
not transparent enough for users to directly understand why a document is classified into
a specific category. Because Machine Learning uses specific document sets as “training
data”, the learned patterns are not always relevant for documents that differ too much from
that training data.
As each technique clearly has its own advantages and disadvantages, it is best to allow the
user to combine the different methods to achieve the highest possible recall and precision.
This is exactly what ZyLAB does: it starts with simple, straightforward and transparent techniques and expands into more advanced methods when needed.
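Recall and precision can be made concrete with a short calculation. In the sketch below, precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The document identifiers and the two result sets are invented for illustration; they simply show how taking the union of two methods can raise recall, possibly at some cost in precision.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical review set: documents 1 to 8 are truly relevant.
relevant = {1, 2, 3, 4, 5, 6, 7, 8}
keyword_hits = {1, 2, 3, 4, 9}    # found by a full-text query
ml_hits = {3, 4, 5, 6, 7, 10}     # found by a trained classifier
combined = keyword_hits | ml_hits  # union of both methods

for name, hits in [("keyword", keyword_hits), ("ml", ml_hits), ("combined", combined)]:
    p, r = precision_recall(hits, relevant)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```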
PRACTICAL USE CASE
AI IN LITIGATION & ARBITRATION (EDISCOVERY)
ZyLAB eDiscovery is a complete end-to-end solution for all your discovery and regulatory
needs. AI-techniques are used for:
• Automatic identification of relevant documents for litigation and arbitration (eDiscovery)
using sample documents;
• Automatic clustering and classification of documents into relevant groups and sub-groups;
• Searching the content of images and videos without the need to add textual descriptions;
• Automated machine translation technology to quickly translate all information up front:
this can then be tagged and reviewed in ZyLAB’s highly intuitive review platform. This way
relevant data is quickly uncovered and critical information can be routed for specialized
human translation if needed.
PRACTICAL USE CASE
MERGERS & ACQUISITIONS (M&A) AND LARGE CORPORATE TRANSACTIONS: AI FOR
CONTRACT DISCOVERY, REVIEW AND ANALYSIS
Many organizations keep track of their agreements and other relevant documents in a contract management system. In addition to monitoring deadlines, notice periods, warranties and
guarantees, these systems are also used to generate the relevant documents for
a data room.
ZyLAB’s eDiscovery technology helps to identify contracts from live data locations such as
email boxes, SharePoint or file shares. During processing, all documents are analyzed for
additional metadata, specific content, email threads, duplicates, privileged information and
much more. The outcome of this process can be used to fill a
data room with the relevant documents.
Get better insight into your data without having to search and review the actual data itself. Text
analysis helps you find entities such as organizations, persons and more. Code words and
other patterns like sentiments, requests and travel activities can be extracted and can guide
you straight to the relevant information.
PRACTICAL USE CASE
AI FOR LEGAL FACT FINDING, FRAUD AND INTERNAL INVESTIGATIONS
Legal fact finding is key in all data investigations, whether conducted in relation to a crime,
an internal fraud case or a request for disclosure of government documents.
ZyLAB’s own indexing engine can index terabytes of data per day and supports access to
over 750 different file formats. ZyLAB has been a leader in legal and investigative full-text
search since 1983, offering not only industry-standard search functionality, but also unique
operations such as our fast and world-famous fuzzy, quorum, wildcard, proximity, phrase
and regular expression searches.
In addition, ZyLAB allows users to search numeric ranges, dates and file names, and to use
text delimiters to define key fields and text ranges on the fly. These extensive search capabilities, combined with our fast multi-threaded and distributed indexes, help in finding relevant
information faster than any other tool on the market. Hits from your search are highlighted
on every document, even if these were originally image based.
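For readers unfamiliar with the search operators named above, the sketch below shows what quorum, proximity and fuzzy matching mean in practice. It is a toy illustration on a single string, not ZyLAB's indexing engine or query syntax; the function names, the similarity threshold and the sample sentence are assumptions. Fuzzy matching here relies on the similarity ratio from Python's standard `difflib` module.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def quorum_match(text, terms, minimum):
    """True if at least `minimum` of the given terms occur in the text."""
    present = set(tokens(text))
    return sum(1 for t in terms if t.lower() in present) >= minimum


def proximity_match(text, first, second, max_distance):
    """True if the two terms occur within `max_distance` words of each other."""
    words = tokens(text)
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_distance
               for i in first_positions for j in second_positions)


def fuzzy_match(text, term, min_ratio=0.8):
    """Return the words whose similarity to the term is at least `min_ratio`."""
    return [w for w in tokens(text)
            if SequenceMatcher(None, w, term.lower()).ratio() >= min_ratio]


# Sample text with a deliberate misspelling ("acount").
doc = "The wire transfer to the offshore acount was approved by the director."

print(quorum_match(doc, ["wire", "offshore", "bribe"], 2))
print(proximity_match(doc, "wire", "offshore", 4))
print(fuzzy_match(doc, "account"))
```

Fuzzy matching is what allows a misspelled or poorly scanned word to be found with the correctly spelled search term.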
PRACTICAL USE CASE - AI FOR REDACTION FOR DATA PROTECTION
Identification of any personal data which must be deleted, redacted or anonymized.
[Figure: The Automated Redaction Process (steps 1 to 3)]
Unique pseudonyms
Identified names can also be replaced by a unique pseudonym. This way the Personally
Identifiable Information (PII) is redacted and protected, but the relationship between the
persons or companies is maintained by the pseudonyms. Reviewers can review or adjust the
automatic redactions by using sampling or manual review.
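The pseudonym approach can be sketched in a few lines. The example assumes the names have already been identified (for instance by entity extraction) and only shows the replacement step: each distinct name receives one stable pseudonym, so relationships between parties remain visible. The function name `pseudonymize` and the `PERSON_n` label format are hypothetical.

```python
import re


def pseudonymize(text, names):
    """Replace each known name with a stable pseudonym such as PERSON_1."""
    if not names:
        return text, {}
    mapping = {}
    # Try longer names first so a full name wins over a name it contains.
    alternatives = "|".join(
        re.escape(n) for n in sorted(names, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")

    def replace(match):
        name = match.group(0)
        if name not in mapping:
            mapping[name] = f"PERSON_{len(mapping) + 1}"
        return mapping[name]

    return pattern.sub(replace, text), mapping


text = "John Smith emailed Mary Jones. Mary Jones replied to John Smith."
redacted, mapping = pseudonymize(text, ["John Smith", "Mary Jones"])
print(redacted)
```

The returned mapping would need to be stored securely, since it is the key that links pseudonyms back to real identities.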
PRACTICAL USE CASE - AI FOR FOIA AND PUBLIC RECORDS DISCLOSURES
As the number of information requests has increased exponentially over the past years, organizations worldwide can no longer process all information requests in time. When handling
public records requests, there are many possible levels of automation which can optimize the
process, making it possible to use resources more effectively and to deal with ever increasing
data volumes.
ZyLAB implements automation for collection, processing, deduplication, data enrichment,
translation, categorization, data visualization, disclosure cost reporting, keyword hit highlighting, search and tagging, audio and video search, Vaughn Index Creation and bulk redaction.
ZyLAB ONE eDiscovery
ZyLAB ONE eDiscovery uses the latest Artificial Intelligence and Data Science
tools to accelerate truth-finding missions along the typical dimensions: Who, When,
Where, Why, What, How, and How Much.
ZyLAB is positioned as a “leader” in Gartner’s latest Magic Quadrant for eDiscovery
Software, is ranked #1 for complete EDRM eDiscovery in Gartner’s “Critical Capabilities for E-Discovery Software” report, and has received numerous other industry
accolades over the last three decades.