United States Patent [19]
Tsuzuki

[11] Patent Number: 5,469,355
[45] Date of Patent: Nov. 21, 1995

[54] NEAR-SYNONYM GENERATING METHOD
[75] Inventor: Kouichi Tsuzuki, Kawasaki, Japan
[73] Assignee: Fujitsu Limited, Kawasaki, Japan
[21] Appl. No.: 115,327
[22] Filed: Sep. 2, 1993
[30] Foreign Application Priority Data
     Nov. 24, 1992  Japan  4-312531
[51] Int. Cl.6: G06F 19/00
[52] U.S. Cl.: 364/419.04; 364/419.14
[58] Field of Search: 364/419...; 395/...

[56] References Cited

U.S. PATENT DOCUMENTS
4,384,329   5/1983   R... et al.          364/419.13
4,773,039   9/1989   ... et al.           364/419.13
4,839,853   6/1989   Deerwester et al.    364/419.13
5,168,533  12/1992   Kato et al.          382/54
5,297,039   3/1994   Kanaegami et al.     364/419.13

FOREIGN PATENT DOCUMENTS
2-129756   5/1990   Japan
3-015980   1/1991   Japan

Primary Examiner: Gopal C. Ray
Assistant Examiner: Xuong M. Chung-Trans
Attorney, Agent, or Firm: Staas & Halsey

[57] ABSTRACT

A near-synonym generating method generates near-synonyms of a target character string by retrieving a near-synonym file based on the target character string, where the near-synonym file defines near-synonyms for one or a plurality of words. The near-synonym generating method includes the steps of (a) retrieving the near-synonym file using words which form the target character string as keys, and extracting near-synonyms which are defined for each of the words used as the keys as the near-synonyms for each of the words forming the target character string, (b) forming a near-synonym group from each of the words forming the target character string and the corresponding near-synonyms so as to form a plurality of such near-synonym groups, and selecting the words or near-synonyms from each of the near-synonym groups, and (c) generating the near-synonyms of the target character string by combining the selected words or near-synonyms obtained by the step (b) in an order which is different from the order of the words forming the target character string.

12 Claims, 18 Drawing Sheets

Front-page drawing (block diagram; block labels as recoverable): near-synonym generator; character string divider; replacing near-synonym processor; missing character near-synonym processor; adding near-synonym processor; near-synonym hierarchical definition processor; retrieval processor; near-synonym file; target character string.

U.S. Patent, Nov. 21, 1995, 5,469,355: drawing sheets (recoverable text only)

Sheet 4 of 18 (block diagram; block labels as recoverable): data library; processing unit (CPU/memory); character string divider; retrieval processor; replacing near-synonym processor; missing character near-synonym processor; adding near-synonym processor; near-synonym file; near-synonym generator; target character string; generated result; retrieval processing; retrieval result.

Sheet 5 of 18
FIG. 5A, data item management file (item names):
  HOLIDAY/APPLICATION/MEMBER/NUMBER
  HOLIDAY/REPORT/EMPLOYEE/NUMERAL
  VACATION/NOTICE/MEMBER/NUMBER
  VACATION/REPORT/MEMBER/No.
  EMPLOYEE/NUMBER
  EMPLOYEE/NAME
FIG. 5B, different sounding synonym candidate list (two orderings of the same six entries):
  First list: HOLIDAY/APPLICATION/MEMBER/NUMBER; HOLIDAY/REPORT/EMPLOYEE/NUMERAL; VACATION/NOTICE/MEMBER/NUMBER; VACATION/REPORT/MEMBER/No.; EMPLOYEE/NUMBER; MEMBER/HOLIDAY/APPLICATION/NUMBER
  Second list: HOLIDAY/REPORT/EMPLOYEE/NUMERAL; HOLIDAY/APPLICATION/MEMBER/NUMBER; VACATION/NOTICE/MEMBER/NUMBER; VACATION/REPORT/MEMBER/No.; EMPLOYEE/NUMBER; MEMBER/HOLIDAY/APPLICATION/NUMBER

Sheet 6 of 18
FIG. 6A and FIG. 6B: diagrams whose elements are labeled "permitted".
FIG. 6C and FIG. 6D: diagrams whose elements are labeled "not permitted".

Sheet 7 of 18
FIG. 7 (flowchart; step text as recoverable, in the order extracted):
  Specify target character string
  Retrieve keywords defined in near-synonym file
  Divide target character string into keywords
  Divide given target character string depending on end symbols
  Decision box, text only partly legible: "keywords retrieved ... near synonym"
  Retrieve near-synonyms from near-synonym file for each of divided keywords
  Regard divided keywords and retrieved near-synonyms as 1 near-synonym group

Sheet 8 of 18
FIG. 8 (flowchart; step text as recoverable):
  Extract word from each near-synonym group as 1 candidate and combine such words
  Decision: combination should have distance?
  Retrieve combined character string as character string in which keywords or their near-synonyms continue
  Retrieve combined character string as character string with keywords or their near-synonyms and other words in-between (text partly illegible)
  Pick up near-synonyms with respect to target character string
  Display in order from near-synonym most similar to target character string
  End

Sheet 9 of 18
FIG. 9A, target character string: U.S./PRESIDENT/CANDIDATE (annotated "Japanese Kanji character for 'rice'")
FIG. 9B, near-synonym file:
  RICE -> SASANISHIKI, KOSHIHIKARI
  AMERICA -> U.S., U.S.A., UNITED STATES, STATE OF TEXAS
  U.S. PRESIDENT -> REAGAN, BUSH
  CANDIDATE -> RECOMMENDED (partly illegible), RECOMMENDATION, SELF-RECOMMENDATION, ELECTION, RUN, ...
  PRESIDENTIAL CANDIDATE -> BUSH, CLINTON

FIG. 10, retrieval result: CANDIDATE RECOMMENDED BY PRIME MINISTER MIYAZAWA, APPEALED STIMULATING DEMANDS FOR KOSHIHIKARI, AND VIEWING THE ELECTION OF THE UPPER HOUSE FOR THE NEXT TERM

Sheet 11 of 18 (only the label FIG. 12C is legible)
Target character string: SEASONINGS/SET
Near-synonym file:
  SEASONING -> SALAD OIL / FLOUR / DRIED BONITO / SOY SAUCE
  SET -> COMBINATION / PACKAGE / GIFT / PACKED / BAG
FIG. 12C, retrieval of similar goods: COMBINATION OF SEASONINGS; SET OF SALAD OILS; GIFT PACKAGE OF FLOURS; PACKAGE OF DRIED BONITO; COMBINATION OF SOY SAUCES

Sheet 13 of 18
FIG. 14A, target character string: YY COMPANY/C BUILDING
FIG. 14B, destination information definition file:
  YY COMPANY -> YY COMPANY LIMITED
  1 MIN. WALK FROM STATION A -> 1ST BRANCH
  OPPOSITE B BUILDING -> 2ND BRANCH
  NEXT TO B BUILDING -> 3RD BRANCH
  NEAR B BUILDING -> 2ND/3RD BRANCH
  5TH FLOOR OF C BUILDING -> 3RD BRANCH
  C BUILDING -> 3RD BRANCH
  NEAR SCHOOL D -> 4TH BRANCH
FIG. 14C, generated result: YY COMPANY LIMITED 3RD BRANCH
  Building management data base entry: YY COMPANY LIMITED 3RD BRANCH; A CITY C AVENUE 1-1-1; NEXT TO B BUILDING; 5TH FLOOR OF C BUILDING; 5 MIN. WALK FROM STATION; TEL. No.

Sheet 15 of 18
FIG. 16A, target character string: DATA COMMUNICATION
FIG. 16B, similar document definition file:
  DATA -> PERSONAL COMPUTER, ANALOG, COMMAND, AUDIO, ...
  COMMUNICATION -> TRANSMIT, NETWORK, PROTOCOL, MAIL, ...
FIG. 16C, document list (entry boundaries partly uncertain): QUESTIONS & ANSWERS TO DATA TRANSMISSION; INTRODUCTION TO TELEPHONES; DATA COMMUNICATION HANDBOOK; DIGITAL SWITCHING; DATA SWITCHING NETWORK CATALOG; DATA COMMUNICATION PROTOCOL; RS-232C COMMUNICATION HANDBOOK; OSI; MULTI-HYPER; AT COMMANDS & APPLICATIONS; INFORMATION COMMUNICATION PROTOCOL; ACCOUNTING SYSTEM OF VAN; AUDIO MAIL SERVICE; COMPUTER COMMUNICATION; DATA COMMUNICATION TECHNIQUE SEMINAR

Sheet 16 of 18
FIG. 17A, target character string: JU GYOU IN BANGOU
FIG. 17B, near-synonym definition table (contents not recoverable)
FIG. 17C, generated result (contents not recoverable)

Sheet 17 of 18
FIG. 18A, target character string: Atsugi-Shi glass repair
FIG. 18B, near-synonym definition file:
  Atsugi-Shi -> Atsugi
  Glass -> stained glass, pane, window frame, window
  Repair -> work, industry, factory, material, [illegible]
FIG. 18C, list of retrieved shop names with (TEL. NO.) and (ADDRESS) columns, whose values appear only as masked placeholders: ATSUGI GLASS; ATSUGI STAINED GLASS STUDIO; ATSUGI WINDOW FRAME INC.; ATSUGI AA GLASS; AA GLASS SHOP; AAA PANE SHOP; AA PANE MATERIALS; AA PANE SHOP (LTD.); AA SPECIAL GLASS INDUSTRIES; WINDOW FRAME INDUSTRIES AAA

5,469,355

NEAR-SYNONYM GENERATING METHOD

BACKGROUND OF THE INVENTION

The present invention generally relates to near-synonym generating methods, and more particularly to a near-synonym generating method which divides a character string which is to be retrieved into words and generates near-synonyms of the character string by combining near-synonyms which are extracted for each of the words.

The generation of near-synonyms is essential when retrieving various electronic documents with a high accuracy. The “near-synonym” is sometimes also referred to as a “quasi-synonym”. The near-synonym of a certain word refers to a word which has the same or similar meaning as the certain word. The generation of near-synonyms is particularly effective when matters related to a certain theme are to be retrieved from a large scale database without omission.

An example of a conventional document retrieval using near-synonyms will be described with reference to FIG. 1. In FIG. 1, it is assumed for the sake of convenience that a character string which is to be retrieved (hereinafter simply referred to as a “target character string”) is “holiday/application/member/number”. A predetermined electronic document is retrieved using this target character string, and near-synonyms of the target character string within the electronic document are extracted. In the electronic document, a plurality of near-synonyms are included in the target character string as shown in FIG. 1.

In this case, the near-synonyms which are extracted as a result of the retrieval were conventionally the same character string as the target character string and the character string “holiday application member number” having a head which matches that of the target character string.

There is also another known document retrieval method which carries out the retrieval as follows. That is, the target character string “holiday application member number” is divided into words “holiday”, “application”, “member” and “number” which form this target character string, and the near-synonyms are extracted for each of these words. For example, near-synonyms “report”, “employee” and “numeral” are respectively extracted as the near-synonyms of the words “application”, “member” and “number”. Such near-synonyms are defined in advance for each word. The electronic document is retrieved using a character string “holiday report employee numeral” which is obtained by combining the extracted near-synonyms, and a character string which is the same as this character string is extracted as the target character string of the near-synonyms.

Each element forming the target character string is called a “word”, and a character string which is made up of a plurality of words is called a “compound word”.

According to the conventional document retrieval methods described above, a character string which is the same as the target character string and a character string having a head which matches that of the target character string were extractable as near-synonyms.

However, there was a problem in that it was impossible to extract a character string having a head (or a part of the head) which does not match that of the target character string, a character string (different sounding synonyms) having completely different words and phrases (sounds) from those of the target character string but having the same meaning as the target character string and the like. For example, in the example shown in FIG. 1, it was impossible to extract “vacation notice member number” and “vacation report member No.”.

On the other hand, according to the conventional document retrieval method which uses a combination of the near-synonyms that are extracted for each of the words obtained by dividing the target character string, the extraction was satisfactory to a certain extent depending on the definition of the near-synonyms for each of the words.

However, there was a problem in that it was impossible to extract a character string in which the words and a part of their near-synonyms of the target character string are missing, a character string which is added with one or more words unrelated to the target character string, a character string having the words or near-synonyms arranged in a different order from that of the target character string and the like. For example, in the example shown in FIG. 1, it was impossible to extract “employee number” which is missing a part corresponding to “holiday” and “application”, and “member holiday application number” in which a part corresponding to “holiday”, “application” and “member” is ordered differently from the target character string.

In addition, according to the conventional document retrieval method which uses the combination of the near-synonyms of the words, there was a problem in that the operator must manually insert in the target character string an end symbol (character) “/” at the end of each word when the target character string is divided into the words. In other words, the operator must have a knowledge related to the words and the near-synonyms. In addition, if the target character string is long and contains a large number of words, or a large number of target character strings need to be input, the process of inserting the end symbol is troublesome and a big burden on the operator.

Therefore, the conventional document retrieval method realized a satisfactory document retrieval only to a certain extent, and the result of the extraction often omitted the necessary near-synonyms, as may be seen from the examples given above. For this reason, a highly accurate document retrieval could not be achieved by the conventional document retrieval methods. In other words, the conventional generation of the near-synonyms was unsuited or insufficient for the purposes of carrying out a retrieval with a high accuracy.

SUMMARY OF THE INVENTION

Accordingly, it is a general object of the present invention to provide a novel and useful near-synonym generating method in which the problems described above are eliminated.

Another and more specific object of the present invention is to provide a near-synonym generating method for generating near-synonyms of a target character string by retrieving a near-synonym file based on the target character string, where the near-synonym file defines near-synonyms for one or a plurality of words and the near-synonym generating method comprises the steps of (a) retrieving the near-synonym file using words which form the target character string as keys, and extracting near-synonyms which are defined for each of the words used as the keys as the near-synonyms for each of the words forming the target character string, (b) forming a near-synonym group from each of the words forming the target character string and the corresponding near-synonyms so as to form a plurality of such near-synonym groups, and selecting the words or near-synonyms from each of the near-synonym groups, and (c) generating the near-synonyms of the target character
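Steps (a) to (c) of the method can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the implementation disclosed in the patent: the dictionary NEAR_SYNONYM_FILE stands in for the near-synonym file, its entries are loosely drawn from the FIG. 1 example, and the function names build_groups and generate_reordered are hypothetical.

```python
from itertools import permutations, product

# Hypothetical stand-in for the near-synonym file. The entries are loosely
# drawn from the "holiday/application/member/number" example in the text;
# the dict layout itself is an assumption made for illustration.
NEAR_SYNONYM_FILE = {
    "holiday": ["vacation"],
    "application": ["report", "notice"],
    "member": ["employee"],
    "number": ["numeral", "No."],
}


def build_groups(target, synonym_file, sep="/"):
    """Steps (a) and (b): use each word as a key and form one group per
    word, consisting of the word itself plus its defined near-synonyms."""
    words = target.split(sep)
    return [[w] + synonym_file.get(w, []) for w in words]


def generate_reordered(target, synonym_file, sep="/"):
    """Step (c): select one entry from each group and join the selections
    in every order that differs from the word order of the target."""
    groups = build_groups(target, synonym_file, sep)
    original_order = tuple(range(len(groups)))
    results = []
    for order in permutations(range(len(groups))):
        if order == original_order:
            continue  # same word order as the target string, so skip it
        for choice in product(*groups):
            results.append(sep.join(choice[i] for i in order))
    return results


candidates = generate_reordered(
    "holiday/application/member/number", NEAR_SYNONYM_FILE
)
print(len(candidates))
print("member/holiday/application/number" in candidates)
```

Enumerating every permutation grows factorially with the number of words, so a practical system would presumably restrict or rank the orderings; the sketch only shows how a reordered string such as "member/holiday/application/number" becomes reachable once each word is expanded into a group.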