Anti-Money Laundering with Text Analytics
Name Matching Strategies for Compliance, Risk Reduction and Business Growth

INTRODUCTION

Vigorous enforcement of Anti-Money Laundering (AML) regulations has dramatically impacted financial institutions around the globe. HSBC’s $1.9 billion settlement [1] is the biggest fine on record, and numerous other banks have faced stiff fines for failing to adhere to AML regulations. According to Reuters, “U.S. and European banks have now agreed to settlements with U.S. regulators totaling some $5 billion in recent years on charges they violated U.S. sanctions and failed to police potentially illicit transactions.” [2]

These fines are not only being imposed by U.S. regulators. Recently the Reserve Bank of India penalized six banks for violation of know-your-customer (KYC) and AML rules. In light of this risk, Citigroup and JPMorgan have pulled back on operations in profitable emerging markets [3] in the Middle East, Africa and Asia. Additionally, Barclays has chosen to stop doing business with over 250 money transfer companies [4], used largely by Somali diaspora communities to send remittances, as much as $2 billion annually, to friends and family.

At the same time that these banks and financial institutions are scaling back their riskier operations, they are also being forced by regulators [5] to improve their money laundering controls. The result is higher costs and a substantial loss in potential revenue.

Central to AML and KYC regulations is the mandate to match new and existing customers against the latest watchlists from OFAC [6], FinCEN [7], FATCA [8], the United Nations [9], the European Union [10], the World Bank [11], and other regulatory bodies around the globe. Fundamentally, to be compliant, a financial organization needs to ensure that systems and procedures are in place to correctly identify and flag the names of listed individuals, taking into account variations in spelling, word order, languages, writing systems, and other factors (e.g., source, quality, completeness, accuracy, consistency). If you can do this well, you can reduce your risk and expand your potential markets across the globe.

© 2015 Basis Technology Corporation. “Basis Technology Corporation”, “Rosette” and “Highlight” are registered trademarks of Basis Technology Corporation. “Big Text Analytics” is a trademark of Basis Technology Corporation. All other trademarks, service marks, and logos used in this document are the property of their respective owners. (2014-12-18-AML)

THE COMMON ELEMENT: NAMES

From wire recipients to check payees, the one factor KYC systems can rely on having is a name. But a name can appear in many different ways: as a nickname, as a foreign spelling, as a “bad” spelling, and in multiple writing systems. Comparing a name to the OFAC list, as well as other government lists, is no trivial task. Would you be able to find the following people on watchlists if they applied to be your customer?

-- An attorney for the Colombian Medellín cartel accused of laundering drug money
-- A Saudi national denied entry into the United Kingdom for financing terrorist groups
-- A British businessman fighting extradition to the U.S. to face a bank fraud case
-- The brother of a Latin American politician tied to Mexico’s Zambada crime syndicate

The answer, and its consequences, depends on the type of name matching you use. Employ the wrong type and one of two outcomes may occur:

1. You will fail to match the applicant with a name on a list (a false negative); or
2. You will incorrectly match the applicant with a name on a list (a false positive).

A false negative means that you will say “yes” to someone you don’t want as a customer. A false positive means you will say “no” to valuable business, or you will waste time and effort manually “clearing their name”. Either outcome can harm your reputation, expose you to legal liability, and result in lost business.

Matching names against watchlists would be much easier if everyone spelled names consistently. But even people who speak the same language often spell the same name in different ways (Sean vs. Shawn vs. Shaun), use a common nickname (Charles vs. Charlie vs. Charley), include different words in a name (First Middle Last vs. First Last) or change the order of the words in a name (First Last vs. Last, First).

The problem is compounded for names in non-Latin scripts. Consider Mohamed Mustafa ElBaradei, the Director General of the International Atomic Energy Agency. A review of recent news articles yields at least four different spelling variations:

-- Mohammad Al Baradei
-- Mohamad al Baradi
-- Mohamed ElBaradei
-- Mohamed Elbaradei

None of the above spellings is “right”, because his name is properly written in Arabic as محمد مصطفى البرادعي. How his name appears in English depends on how it sounds to different people.

To complicate things further, each watchlist may use a different convention for writing names.
In fact, the Arabic name محمد is commonly transliterated into English up to 11 different ways:

-- Mahammad
-- Mohamad
-- Mohammad
-- Mohamed
-- Mohammed
-- Muhamad
-- Muhammad
-- Muhamed
-- Muhammed
-- Muhamet
-- Muhammet

This does not include the numerous ways in which Mohammad might be “misspelled”. On top of that, محمد might be transliterated into another language entirely, such as Chinese: 穆罕默德. For example, the Japanese Financial Intelligence Center maintains a Taliban/Al-Qaeda watchlist [12] with names written in English and Japanese, all of which are transliterations of the original Arabic name.

In the age of global commerce, you need a name matching system that understands linguistics, so that the right matches occur and the wrong matches (false positives) do not, even if name spelling varies.

NAÏVE VS LINGUISTIC METHODOLOGIES

Traditional methods of name matching do not use linguistic knowledge. They apply what might be called the naïve or “brute force” approach, which treats each name simply as a sequence of letters. If two names have the same letters in the same order (or a simple approximation), they are considered a match.

Naïve methods attempt to handle name variability (as in the Mohammed example) by maintaining an exhaustive list of possible name options. The downside is that this requires the continuous collection and storage of name variations. In addition, it relies on enormous computing power to continuously check every name against a massive and ever-growing list. This approach is also particularly vulnerable to new transliterations of non-English names, exposing you to an unacceptable risk of not matching a new name variation to an existing person on a watchlist.

A knowledge-based approach is significantly better.
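
To make the contrast concrete, the following is a minimal sketch in Python, not a description of how any commercial product works. It compares the naïve exact-match approach with a deliberately crude, hypothetical sound-based key (`crude_key`, invented here for illustration) on the 11 transliterations listed above. The key's rules (treating t and d alike, dropping vowels, collapsing doubled letters) are assumptions chosen for this one example, not a general phonetic algorithm.

```python
import re

# The 11 English transliterations of the Arabic name listed above.
VARIANTS = [
    "Mahammad", "Mohamad", "Mohammad", "Mohamed", "Mohammed", "Muhamad",
    "Muhammad", "Muhamed", "Muhammed", "Muhamet", "Muhammet",
]


def naive_match(candidate, watchlist):
    """Naive approach: exact, case-insensitive comparison of letter sequences."""
    return candidate.lower() in {entry.lower() for entry in watchlist}


def crude_key(name):
    """Hypothetical sound-based key (illustration only).

    Treats t and d alike, drops vowels, and collapses doubled letters.
    """
    s = re.sub(r"[^a-z]", "", name.lower())
    s = s.replace("t", "d")
    s = re.sub(r"[aeiou]", "", s)
    return re.sub(r"(.)\1+", r"\1", s)


def key_match(candidate, watchlist):
    """Match when the crude keys of candidate and a watchlist entry agree."""
    return crude_key(candidate) in {crude_key(entry) for entry in watchlist}


# A watchlist that happens to store only one spelling.
watchlist = ["Muhammad"]
naive_hits = [v for v in VARIANTS if naive_match(v, watchlist)]
key_hits = [v for v in VARIANTS if key_match(v, watchlist)]
print(len(naive_hits), len(key_hits))  # prints: 1 11
```

With a single stored spelling, exact matching finds only that spelling, so the naïve method needs all 11 variants on file. The crude key finds all 11, but it also gives `key_match("Mahmoud", watchlist) == True` even though Mahmoud is a different name: a rule this blunt trades false negatives for false positives, which is why richer linguistic knowledge is needed.
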
Rather than try to specify every possible name variation, a computer is “taught” to identify variations based on automated linguistic methods, the same linguistic patterns people use to construct such variations in real life. This often involves applying phonology (how words sound) to orthography (how words are written). In the “Charley versus Charlie” example, a linguistic engine knows that “ley” and “lie” sound the same here, and that both forms are reasonable variations of the same name. Other types of knowledge also apply, such as formal vs. informal names, where the “ee”-sounding suffix indicates a nickname for the formal name “Charles.” Linguistic algorithms can recognize that the many different potential spellings of Mohammed “sound” the same and in fact refer to the name محمد, without the need to maintain an exhaustive list containing each variation.

CULTURAL KNOWLEDGE

If a name is on a watchlist, you are responsible for matching it, regardless of your staff’s language competency. What happens when a Saudi person of interest applies for an account in London using an English spelling of his name, based on his dialect, that differs significantly (but predictably) from his name on the watchlist? Or what happens when a bank employee in Chicago puts a Mexican national’s paternal surname into his system’s middle name field, not knowing that naming customs place it before the maternal surname? Without cultural awareness, cases like these can easily generate both false negatives and false positives.

Fortunately, these types of name variations do not occur at random. Rather, they follow established patterns of human practice depending on the specific language and cultural context.
By identifying the cultural origin of a name, the patterns can be “reverse engineered” to identify the many ways that different words are used to represent similar names.

ADVANTAGES OF THE LINGUISTIC MODEL

The benefits of a knowledge-based approach are clear. Because a computer can apply the knowledge humans use to derive name variations, you don’t need actual humans to specify as many variations as they can. Nor does the computer have to make potentially billions of naïve comparisons. Results are more accurate, they take less time to achieve, and the systems that produce these results can be much more scalable. The linguistic approach therefore has several key advantages:

-- Fewer false positives: Computers employ the same knowledge humans would
-- Fewer false negatives: Humans do not have to know all possible variations
-- Faster checking: Not all variations are compared explicitly
-- Greater scalability: Fewer (or smaller) machines can check more names

As linguistics experts with deep understanding at the intersection of language and technology, Basis Technology has developed Rosette Name Indexer (RNI), a knowledge-based solution to the challenge of name matching.

[Figure: Rosette product family. REX, Rosette Entity Extractor (tagged entities); RES, Rosette Entity Resolver (real identities); RNI, Rosette Name Indexer (matched names); RNT, Rosette Name Translator (translated names); RCA, Rosette Categorizer (sorted content); RSA, Rosette Sentiment Analyzer (actionable insights).]

RNI performs an intelligent comparison based on linguistic, orthographic, and phonological algorithms.
By working on the name in its original script, as opposed to translating the name into English, RNI takes advantage of all the available contextual information to match names to a target list without the errors that transliteration inevitably introduces.

Example: U.N. Al-Qaida Sanctions Data [13]

Name (original script): فهد محمد عبد العزيز الخشيبان
Transliterated name: FAHD MUHAMMAD 'ABD AL-'AZIZ AL-KHASHIBAN
Identified aliases:

-- Fahad H. A. Khashayban
-- Fahad Mohammad Abdulaziz Alkhoshiban
-- Fahad H. A. al-Khashiban
-- Fahad H. A. Khasiban
-- Fahd Muhammad ‘Abd al-‘Aziz al-Khushayban
-- Fahad al-Khashiban
-- Fahd Khushaiban
-- Fahad Muhammad A. al-Khoshiban

Since the U.N. list provides the original Arabic script, name identification becomes significantly less ambiguous than it is with the multiple spelling variations of the transliterated name. RNI matches all these documented AKAs along with the numerous variants that are not listed.

RNI AND ARABIC NAMES

Arabic names may be written with honorifics, given name, family name, patronymics (son of x, father of y), tribal affiliation, city of birth, and more. All of this information can provide valuable clues when matching one name to another. Take the following example:

TITLE        GIVEN NAME    PATRONYMIC    FAMILY NAME
Al-Sheikh    Abdullah      Bin Hassan    Al-Ashqar
الشيخ        عبد الله      بن حسن        الأشقر

This name may also appear as:

-- Al-Sheikh Abdullah Al-Ashqar (no patronymic)
-- Abdullah Al-Ashqar (no title, no patronymic)
-- Al-Sheikh Abdullah Bin Hassan Bin Mohammad AlAshqar (with grandfather’s patronymic)

It’s the same name, but the variations are many and complex. RNI understands the structures of names in each language, so instead of generating countless variations to look up, it does an intelligent comparison of names based on linguistic, orthographic, and phonological algorithms.

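
The idea of comparing names by structure rather than by enumerating variants can be sketched in a few lines of Python. This is only an illustration under strong assumptions, and it does not describe how RNI works internally: the title list, the patronymic markers, and the helpers `parse_name` and `compatible` are all hypothetical, and the sketch handles only romanized names shaped like the example above.

```python
from dataclasses import dataclass, field

# Hypothetical, deliberately tiny knowledge base (illustration only).
TITLES = {"alsheikh", "sheikh"}
PATRONYMIC_MARKERS = {"bin", "ibn"}


@dataclass
class ParsedName:
    title: str = ""
    given: str = ""
    patronymics: list = field(default_factory=list)
    family: str = ""


def normalize(token):
    """Lowercase and drop hyphens/apostrophes: 'Al-Ashqar' -> 'alashqar'."""
    return "".join(ch for ch in token.lower() if ch.isalpha())


def parse_name(name):
    """Split a romanized name into title, given name, patronymics, family."""
    tokens = [t for t in (normalize(raw) for raw in name.split()) if t]
    parsed = ParsedName()
    i = 0
    if tokens and tokens[0] in TITLES:
        parsed.title = tokens[0]
        i = 1
    if i < len(tokens):
        parsed.given = tokens[i]
        i += 1
    # Each marker ("bin") introduces one ancestor's name.
    while i + 1 < len(tokens) and tokens[i] in PATRONYMIC_MARKERS:
        parsed.patronymics.append(tokens[i + 1])
        i += 2
    parsed.family = "".join(tokens[i:])
    return parsed


def compatible(a, b):
    """Same given and family name, and no conflict between patronymic chains.

    Titles are ignored; a missing patronymic is treated as unknown, not as
    a mismatch.
    """
    pa, pb = parse_name(a), parse_name(b)
    if (pa.given, pa.family) != (pb.given, pb.family):
        return False
    shorter, longer = sorted([pa.patronymics, pb.patronymics], key=len)
    return longer[:len(shorter)] == shorter


full = "Al-Sheikh Abdullah Bin Hassan Al-Ashqar"
variants = [
    "Al-Sheikh Abdullah Al-Ashqar",
    "Abdullah Al-Ashqar",
    "Al-Sheikh Abdullah Bin Hassan Bin Mohammad AlAshqar",
]
results = [compatible(full, v) for v in variants]
```

In this sketch all three variants are judged compatible with the full name, while a name with a conflicting patronymic such as "Abdullah Bin Omar Al-Ashqar" is not. A production system would also need to handle spelling variation within each component, other scripts, and many more naming conventions.
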
INTEGRATING RNI

RNI’s API provides output to optimize your decision-making process. Matching names are returned with a similarity score from 0 to 100%, and minimum match thresholds can be set to constrain the quality of the results returned and to balance speed and accuracy. Specific rules can also be applied to ignore particular words, force certain names to match, and adjust the score of other kinds of matches to align match results to business rules.

RNI has two integration options to fit your use case. The first is a looser integration, in which RNI runs as an independent index of names, parallel to your existing data repository, with application logic for joining results. The second is a tighter integration, in which RNI runs as a plug-in to that repository (e.g., Apache Solr), storing names alongside non-name data so that queries can span both.

BASIC RNI WORKFLOW

[Figure: 1. Name input; 2. Watchlist matching (RNI); 3. Output: Deny, Approve, or Evaluate.]

When a name is processed by RNI, it is compared to a series of watchlists to determine a match. Based on the similarity score returned by RNI, the system can trigger different next steps in your workflow.

UNDERSTANDING WHO YOUR CUSTOMERS ARE PROTECTS YOU AND THEM

AML and KYC compliance is a highly complex challenge with intricacies that are constantly in flux. Associating a person’s name with all of its different variations is not a trivial task, and it must be done with careful consideration and accuracy. RNI provides a highly accurate means of resolving names across languages and scripts.
The efficient name matching technology in RNI handles spelling variations and errors, non-standard Romanization, and the cultural vagaries of how names are written in each language. In addition, RNI understands the structures of names in each language, so instead of generating countless variations to look up, it does an intelligent comparison of names based on linguistic, orthographic, and phonological algorithms. The results are also ranked by relevancy with a match score so that further analysis can be done more efficiently. With RNI, organizations can increase the coverage and accuracy of searching for foreign names and more readily become compliant within the labyrinth of the regulatory environment.

1. http://www.reuters.com/article/2012/12/11/us-hsbc-probe-idUSBRE8BA05M20121211
2. http://www.reuters.com/article/2012/12/11/us-hsbc-probe-idUSBRE8BA05M20121211
3. http://www.ft.com/intl/cms/s/0/47c3432a-aa5d-11e2-9a38-00144feabdc0.html?siteedition=intl#axzz2lIhpkzmo
4. http://www.ft.com/intl/cms/s/0/f0eb197e-ff4d-11e2-8a07-00144feabdc0.html#axzz2lIhpkzmo
5. http://www.huffingtonpost.com/2013/03/26/citigroup-money-laundering_n_2956270.html
6. http://www.treasury.gov/resource-center/sanctions/SDN-List/Pages/default.aspx
7. http://www.fincen.gov/
8. http://www.irs.gov/Businesses/Corporations/Foreign-Account-Tax-Compliance-Act-(FATCA)
9. http://www.un.org/sc/committees/1267/aq_sanctions_list.shtml
10. http://eeas.europa.eu/cfsp/sanctions/consol-list_en.htm
11. http://web.worldbank.org/external/default/main?contentMDK=64069844&menuPK=116730&pagePK=64148989&piPK=64148984&querycontentMDK=64069700&theSitePK=84266
12. http://www.npa.go.jp/sosikihanzai/jafic/todoke/list/list.pdf
13. http://www.un.org/sc/committees/1267/AQList.htm