US005761666A

United States Patent [19]
Sakai et al.

[11] Patent Number: 5,761,666
[45] Date of Patent: Jun. 2, 1998

[54] DOCUMENT RETRIEVAL SYSTEM

[75] Inventors: Tetsuya Sakai, Tokyo; Seiji Miike; Kazuo Sumita, both of Yokohama, all of Japan

[73] Assignee: Kabushiki Kaisha Toshiba, Kawasaki, Japan

[21] Appl. No.: [illegible]

[22] Filed: Mar. 4, 1996

[30] Foreign Application Priority Data
Mar. 16, 1995 [JP] Japan 7-083458

[52] U.S. Cl.: ... 707/5

[58] Field of Search: 395/605, 601; 395/759; 707/1, 3, 5, 100

[56] References Cited

U.S. PATENT DOCUMENTS
4,290,115  9/1981  Pin et al.     395/605
4,382,277  5/1983  Glaser et al.  395/605
5,123,103  6/1992  Ohtaki et al.  395/605

Primary Examiner: Paul R. Lintz
Attorney, Agent, or Firm: Oblon, Spivak, McClelland, Maier & Neustadt, P.C.

[57] ABSTRACT

A document retrieval system is provided with an original sentence processing unit which sets a plurality of sentence types for identifying the contents of sentences, such as "OPINION" and "PROPOSAL," prepares sentence-unit excerpt sentence data classified according to the sentence types from an original sentence database storing original sentence data constituting documents, and stores the excerpt sentence data as an excerpt sentence database. The original sentence processing unit comprises a type determination section for finding excerpt sentence data corresponding to a designated sentence type, and a shaping section for shaping the excerpt sentence data in a predetermined format, e.g., in such a format that a conjunctive is deleted.

15 Claims, 29 Drawing Sheets

[Representative drawing: block diagram with an interface unit (retrieval request, type request, pattern), a retrieval unit, an excerpt unit, an excerpt sentence database, an original sentence database, a word index, a pattern determination dictionary, and an original sentence processing unit.]

U.S. Patent    Jun. 2, 1998    5,761,666

Sheet 1 of 29
[FIG. 1: hardware block diagram with a CPU, display unit, display controller, storage unit, input controller, input unit (keyboard, mouse), excerpt sentence database, original sentence database, word index, type determination dictionary, and conjunctive dictionary.]

Sheet 3 of 29
[Figure content illegible in source.]

Sheet 4 of 29
[FIG. 4: flowchart, ORIGINAL SENTENCE PROCESSING.
S1: Read each sentence of document from original sentence database.
S2: Pattern determination section determines pattern corresponding to each sentence and adds tag to each sentence (uses pattern determination dictionary 8).
S3: Shaping section deletes conjunctive of each sentence (uses conjunctive dictionary 9).
S4: Store the result as excerpt sentence data.]

Sheet 5 of 29
[FIG. 5A: table of types and patterns.
OPINION/PROPOSAL: "It is considered that ...", "Should it not be that ...", "It is proposed that ...", "I think that ..."
POSING OF PROBLEM: "The ... is a problem ...", "Consideration will now be given ..."
EXPECTATION/GUESS: "Maybe ...", "I think that ..."]
[FIG. 5B: selection screen, TYPE OF EXCERPT (plural excerpt types may be designated), with check boxes for OPINION/PROPOSAL, POSING OF PROBLEM, EXPECTATION/GUESS, PAST/BACKGROUND, STATUS QUO, and CONCLUSION.]

[Unlabeled figure residue: ALL SENTENCES; OPINION/PROPOSAL; SENTENCE NOT INCLUDED IN DESIGNATED TYPE.]

Sheet 7 of 29
[FIG. 7: flowchart, EXCERPT PROCESS.
S10: Read out excerpt sentence data corresponding to selected document (from excerpt sentence database 5).
S11: Extract from excerpt sentence data all sentences provided with tag of selected type.
S12: Store sentence No. of each extracted sentence.
S13: Display all excerpts on excerpt sentence screen in itemized format.
S14: Read out original sentence data of selected document (from original sentence database).
S15: Display original sentence on original sentence screen and emphasize sentence corresponding to stored sentence No.]

Sheet 8 of 29
[FIG. 8A: flowchart, PATTERN DISPLAY.
S20: Select type.
S21: Extract all patterns corresponding to selected type (from type determination dictionary 8).
S22: Display patterns corresponding to selected type in table format.]
[FIG. 8B: display example. TYPE: OPINION/PROPOSAL. PATTERN: "It is considered that ...", "Should it not be that ...", "It is proposed that ...", "I think that ..."]

Sheet 9 of 29
[FIG. 9: flowchart (title illegible). Read out excerpt sentence data of selected document (from excerpt sentence database 5); extract all sentences provided with tag of selected type; S32: extract important sentence including retrieval keyword; S33: display all excerpts on excerpt screen in itemized format.]

Sheet 11 of 29
[FIG. 11: flowchart, DISPLAY OF SPECIFIC SENTENCE.
S40: Select type.
S41: Read out all patterns corresponding to selected type (from type determination dictionary 8).
S42: Display each pattern and number of sentences matching with each pattern.
S43: Select pattern.
S44: Read out excerpt sentence data corresponding to selected type and pattern (from excerpt sentence database 5).
S45: Display excerpt sentence data in excerpt screen in itemized format.]

Sheet 13 of 29
[Flowchart, CONTROL OF AMOUNT (figure label and later steps partly illegible).
S50: Designate amount n of sentences (number of sentences) extracted and displayed.
Extract patterns corresponding to selected type and priority data (from type determination dictionary).
S52: i = 1.
S53: Compare amount of sentences matching with patterns up to priority i with n.
Remaining steps through S56 illegible.]

Sheet 14 of 29
[FIG. 14A: type determination dictionary with priority order. OPINION/PROPOSAL: 1 "It is considered that ...", 2 "Should it not be that ...", 3 "It is proposed that ...", 4 "I think that ..."]
[FIG. 14B: number of sentences matching with each pattern in document (counts illegible).]
[FIG. 14C: excerpt sentence screen with a number-of-sentences control. Sample excerpts: "The intention of AAA is considered to be ...", "Rather, ... should resign.", "As for OOO, it is considered that ..."]

Sheet 15 of 29
[FIG. 15: flowchart, UPDATING PROCESS (steps illegible).]
[FIG. 18: flowchart, CORRECTION OF DEGREE OF PRIORITY. S70: Display excerpt sentence data accompanied with priority order. Remaining steps illegible.]

Sheet 16 of 29
[FIG. 16A: TYPE: OPINION/PROPOSAL, with additional pattern "should ..." and the patterns "It is considered that ...", "Should it not be that ...", "It is proposed that ...", "I think that ..."]
[FIG. 16B: pattern list after UPDATING OF TYPE DETERMINATION DICTIONARY.]

Sheet 17 of 29
[FIG. 17A: original sentence screen showing sentences.]
[FIG. 17B: dictionary registration screen. REGISTERED WORD: "should ...". Prompt: IN WHICH TYPE OF PATTERN WILL THE WORD BE REGISTERED? Check boxes: OPINION/PROPOSAL, POSING OF PROBLEM, EXPECTATION/GUESS.]

Sheet 18 of 29
[FIG. 19A: OPINION/PROPOSAL patterns with priority order 1 "It is considered that ...", 2 "Should it not be that ...", 3 "It is proposed that ...", 4 "I think that ..."]
[FIG. 19B: excerpts "The intention of AAA is considered to be ...", "Rather, ... should resign.", "As for OOO, it is considered that ..."]
[FIG. 19C: OPINION/PROPOSAL patterns with priority order 2 "It is considered that ...", 1 "Should it not be that ...", 3 "It is proposed that ...", 4 "I think that ..."]

Sheet 19 of 29
[FIG. 20: flowchart, DOCUMENT RETRIEVAL.
S80: Select type and pattern.
S81: Retrieve plural documents.
S82: Sort retrieval result in the order beginning with one including many relevant sentences on the basis of selected type and pattern.
S83: Display retrieved result in the sorted order.]
[FIG. 23: flowchart, SELECTION OF ALL SENTENCES. S90: Select type. A branch (condition illegible) leads either to S92: extract excerpt sentence data, S93: extract sentences with tag of type, S94: display in itemized format; or to S95: extract original data, S96: extract all, display data.]
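The flow of FIG. 4 (tag each sentence with a type, delete its conjunctive, store the result) and the type-based extraction of FIG. 7 can be illustrated with a minimal Python sketch. It is not the patented implementation. The pattern strings come from the FIG. 5A table, but the case-insensitive substring matching, the contents of the conjunctive list, and all names (TYPE_PATTERNS, CONJUNCTIVES, Excerpt, determine_types, shape, process_document, extract) are assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Patterns copied from the FIG. 5A table. Matching them as case-insensitive
# substrings is an assumption, not the method stated in the patent.
TYPE_PATTERNS = {
    "OPINION/PROPOSAL": [
        "it is considered that",
        "should it not be that",
        "it is proposed that",
        "i think that",
    ],
    "POSING OF PROBLEM": ["consideration will now be given"],
    "EXPECTATION/GUESS": ["maybe", "i think that"],
}

# Hypothetical conjunctive dictionary; the source does not list its entries.
CONJUNCTIVES = ["however", "therefore", "moreover"]


@dataclass
class Excerpt:
    sentence_no: int          # position of the sentence in the original document
    text: str                 # shaped sentence (conjunctive removed)
    tags: list = field(default_factory=list)  # sentence types that matched


def determine_types(sentence):
    """Step S2: return every type having at least one pattern in the sentence."""
    lowered = sentence.lower()
    return [t for t, pats in TYPE_PATTERNS.items()
            if any(p in lowered for p in pats)]


def shape(sentence):
    """Step S3: delete a leading conjunctive and re-capitalize the sentence."""
    alternatives = "|".join(re.escape(c) for c in CONJUNCTIVES)
    pattern = r"^\s*(?:" + alternatives + r")\b[,\s]*"
    shaped = re.sub(pattern, "", sentence, flags=re.IGNORECASE)
    return shaped[:1].upper() + shaped[1:]


def process_document(sentences):
    """Steps S1 to S4: build excerpt sentence data for one document."""
    return [Excerpt(no, shape(s), determine_types(s))
            for no, s in enumerate(sentences, start=1)]


def extract(excerpts, selected_type):
    """Steps S11 and S12: keep the excerpts tagged with the selected type."""
    return [e for e in excerpts if selected_type in e.tags]
```

For example, calling process_document on a three-sentence document and then extract(excerpts, "OPINION/PROPOSAL") returns only the sentences containing one of the four OPINION/PROPOSAL patterns. Their sentence_no values can then be used to emphasize the corresponding sentences on the original sentence screen, as in step S15. A sentence may carry several tags, because "I think that" appears under two types in FIG. 5A.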