Construction of Japanese Patent Database and Preliminary Findings on Patenting Activities in Japan
Akira Goto and Kazuyuki Motohashi
RCAST, University of Tokyo
EPIP Conference, Milan, 24-25 February 2006

Outline
1. Introduction
2. Methodology and Basic Features of Japanese Patent Database
3. Overview of innovation activities by Japanese Patent Database
4. Comparing Japanese citation data with those of Europe and the U.S.
5. Citation Analysis by Japanese Patent Database
6. Conclusion

1. Introduction
- “Nothing exists until it is measured” (Niels Bohr)
- Knowledge remains difficult to measure, despite the talk of a “knowledge society”
- Patents as a source of information on technological knowledge
- Great contribution of the NBER patent citation database to innovation research

- Need for a patent database using JPO patents
- IIP PATENT DATABASE (IIP: Institute of Intellectual Property)
  http://database.iip.or.jp/patentdb/index.html (Japanese only)
- An English version will be available at http://www.iip.or.jp/

2. Methodology and Basic Features of Japanese Patent Database
Original data source: JPO Seiri Hyojunka Data (literally, “Arranged and Standardized Data”; hereafter, JPO Patent Database)
- It contains the information generated by the JPO from acceptance of an application through the examination process.
- About 50 boxes of half-inch tape.

- From this original data, we chose the variables important for innovation research, using the NBER database as a benchmark.
- It covers patents applied for from 1964 onward.
- The number of patent applications included is 9,027,486.
- It consists of the following five files.

① Patent application file (9,027,486 records)
- Application number
- Application date
- Examination request date
- Applicant number
- Number of claims at patent application
- Lead IPC code at patent application
- Aggregated technology category

Aggregated Technology Category

Tech No  Title                                       Corresponding IPC           NBER
1        Agriculture                                 A01 (except A01N)           6
2        Food Stuffs                                 A21 ~ A24                   6
3        Personal and Domestic Articles              A41 ~ A47                   6
4        Health and Amusement                        A61 ~ A63 (except A61K)     3
5        Drugs                                       A61K                        3
6        Separating, Mixing                          B01 ~ B09                   1
7        Machine tools, Metalworking                 B21 ~ B23                   5
8        Casting, Grinding, Layered Product          B24 ~ B32 (except B31)      5
9        Printing                                    B41 ~ B44                   6
10       Transporting                                B60 ~ B64                   5
11       Packing, Lifting                            B65 ~ B68                   5
12       Non organic chemistry, Fertilizer           C01 ~ C05                   1
13       Organic chemistry, Pesticides               C07, A01N                   1
14       Organic molecule compounds                  C08                         1
15       Dyes, Petroleum                             C09 ~ C11                   1
16       Biotechnology, Beer, Fermentation           C12 ~ C14                   3
17       Genetic Engineering                         C12N 15/                    3
18       Metallurgy, Coating metals                  C21 ~ C30                   5
19       Textile                                     D01 ~ D07                   6
20       Paper                                       D21, B31                    6
21       Construction                                E01 ~ E06                   6
22       Mining, Drilling                            E21                         6
23       Engine, Pump                                F01 ~ F04, F15              5
24       Engineering elements                        F16 ~ F17                   5
25       Lighting, Steam generation, Heating         F21 ~ F28                   6
26       Weapons, Blasting                           F41 ~ F42, C06              6
27       Measurement, Optics, Photography            G01 ~ G03                   4
28       Clock, Controlling, Computer                G04 ~ G08                   2
29       Display, Information Storage, Instruments   G09 ~ G12                   2
30       Nuclear physics                             G21                         4
31       Electronics components, semiconductor       H01 ~ H02                   4
32       Electronics circuit, communication tech.    H03 ~ H04                   2
33       Others                                      B81, B82, H05               6

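As a rough illustration of how a lead IPC code might be assigned to an aggregated technology category, the Python sketch below applies longest-prefix matching to a few rows of the category table above. The function name `tech_category`, the dictionary name, and the rule that a more specific prefix (A61K, C12N 15/) overrides a shorter one are assumptions made for this example, not the documented IIP procedure. Only a handful of rows are included, so codes outside this excerpt return `None`.

```python
# Hypothetical excerpt of the IPC-to-category table (IPC prefix -> Tech No).
# The real table has 33 categories; only a few rows are reproduced here.
IPC_PREFIX_TO_TECH = {
    "A61K": 5,      # Drugs
    "A61": 4,       # Health and Amusement (A61K is carved out above)
    "C12N15/": 17,  # Genetic Engineering
    "C08": 14,      # Organic molecule compounds
    "E21": 22,      # Mining, Drilling
    "G21": 30,      # Nuclear physics
}


def tech_category(ipc):
    """Return the Tech No for an IPC code, preferring the longest matching prefix.

    Returns None when the code is not covered by this excerpt.
    """
    code = ipc.replace(" ", "").upper()
    for prefix in sorted(IPC_PREFIX_TO_TECH, key=len, reverse=True):
        if code.startswith(prefix):
            return IPC_PREFIX_TO_TECH[prefix]
    return None


drug_cat = tech_category("A61K 31/00")      # matched by the A61K row
health_cat = tech_category("A61B 5/00")     # falls back to the A61 row
genetic_cat = tech_category("C12N 15/09")   # matched by the C12N 15/ row
```

Checking the longest prefix first is what lets the "except" clauses of the table work without a separate exclusion list.
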
② Patent registration file (2,618,699 records)
- Application number
- Registration number
- Registration date
- Rights expiration date
- Rights holder number
- Number of claims at patent registration
- Lead IPC code at patent registration
- Aggregated technology category

③ Applicant file (626,708 records)
- Applicant number
- Applicant name
- Applicant type (individual, corporation or government)
- Country and prefecture code
- JPO applicant code

④ Rights holder file (204,622 records)
- Rights holder number
- Rights holder name

⑤ Citation information file (5,318,225 records)
- Citing patent application number
- Cited patent application number
- Citation type

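The five files are linked through shared keys (application number, applicant number, rights holder number). The Python sketch below shows one way such a link could be used: joining the application file to the applicant file to count applications by applicant type. The class and field names are hypothetical English renderings of the variables listed above, the date format is assumed, and the sample records are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Application:          # subset of file (1), patent application file
    application_number: str
    application_date: str   # assumed ISO format "YYYY-MM-DD"
    applicant_number: str


@dataclass(frozen=True)
class Applicant:            # subset of file (3), applicant file
    applicant_number: str
    applicant_name: str
    applicant_type: str     # "individual", "corporation" or "government"


def applications_by_type(applications, applicants):
    """Join applications to applicants on applicant number and count by type."""
    type_of = {a.applicant_number: a.applicant_type for a in applicants}
    return Counter(
        type_of.get(app.applicant_number, "unknown") for app in applications
    )


# Invented sample records, for illustration only.
apps = [
    Application("1995-000001", "1995-01-10", "A1"),
    Application("1995-000002", "1995-02-03", "A1"),
    Application("1996-000001", "1996-05-20", "A2"),
]
owners = [
    Applicant("A1", "Example Corp.", "corporation"),
    Applicant("A2", "Example Person", "individual"),
]
counts = applications_by_type(apps, owners)
```

The same pattern would apply to the registration and citation files, which carry the application number as their key.
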
3. Overview of innovation in Japan by IIP Patent Database
- 9,027,486 applications, 4,427,840 requests for examination and 2,594,044 grants from 1964 through 2003.
- Information on patent right termination is also available.
- Data truncation problem

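The three totals above imply simple pooled ratios, worked out in the short Python sketch below. The variable names are illustrative. Because of the truncation problem noted on the slide, these pooled figures should be read with caution: recent application cohorts may not yet have had time to request examination or reach grant, so the ratios probably understate the eventual rates.

```python
# Totals reported for 1964 through 2003.
applications = 9_027_486
exam_requests = 4_427_840
grants = 2_594_044

# Pooled ratios over the whole period (not cohort-specific rates).
request_rate = exam_requests / applications
grant_rate_of_requests = grants / exam_requests
grant_rate_of_applications = grants / applications

print(f"{request_rate:.1%} {grant_rate_of_requests:.1%} {grant_rate_of_applications:.1%}")
# prints: 49.0% 58.6% 28.7%
```
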
Patent application by technology
[Figure: patent applications by year, 1964 to 2003; vertical axis 0 to 500,000. Series: Chemical, ICT, Medical & Drugs, Electronics, Machinery, Other, Unclassified.]

Patent application by applicant type
[Figure: patent applications by year, 1964 to 2002; vertical axis 0 to 450,000. Series: Foreign, Individual, Private Enterprise, Non Profit Org.]

Patent registration by technology
[Figure: patent registrations by year, 1973 to 2003; vertical axis 0 to 250,000. Series: Chemical, ICT, Medical & Drugs, Electronics, Machinery, Other.]

Number of claims by technology
[Figure: number of claims by year, 1971 to 2003; vertical axis 0 to 16. Series: Chemical, ICT, Medical & Drugs, Electronics, Machinery, Other.]

Patent Life Length by Applicant Type
[Figure: horizontal axis, patent life length from 1 to 20; vertical axis 0% to 100%. Series: Foreign, Individual, Private Enterprise, Non Profit Org.]

Patent Life Length by Application Date
[Figure: horizontal axis, patent life length from 1 to 20; vertical axis 0% to 100%. Series: Before 1970, 1970's, 1980's, After 1990.]

3. Citation in IIP Patent Database
Two types of citation information; both are given by the patent examiner, not the inventor:
(1) Prior art cited to reject an application: subject to change with examination practices.
(2) References to previous patents in the Patent Gazette, made occasionally: notable related earlier patents. Available only for granted patents, and only after 1985.

Patent counts with citation made
[Figure: patent counts by year, 1964 to 2003; vertical axis 0 to 140,000. Series: No Citation, Examiner Rejection, Patent Publication, Both.]

4. Comparing JP Citation Data with those of Europe and US
- JP citation data: (1) prior art cited by the examiner to reject an application and (2) references in the Patent Gazette (again, examiner citations)
- EP citation data: a minimum number of important reference documents
- US citation data: applicant citations (but over 40% of citations are made by examiners; Alcacer and Gittelman, 2004); citation inflation?

Correction of patent data biases by OECD patent family data
[Diagram: example patents JP-1, JP-2, JP-4, JP-5, EP-1 and US-1, and JP-101, JP-102, JP-103, JP-104 and EP-101, grouped into patent families labelled Family A, Family B, Family AA and Family BB.]

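The idea of the family-based correction can be sketched in Python as follows: patent-level citation pairs from each office are mapped to family-level pairs, so that citations made through different national filings of the same invention become comparable. The function name, the family membership in the sample data, and the choice to drop within-family citations and patents without a known family are all assumptions of this sketch rather than details taken from the slide.

```python
def family_level_citations(citations, family_of):
    """Map patent-level citation pairs to distinct family-level pairs.

    citations: iterable of (citing_patent, cited_patent) identifiers
    family_of: dict mapping a patent identifier to its family identifier

    Pairs involving a patent with no known family, and citations within a
    single family, are dropped (both are assumptions of this sketch).
    """
    pairs = set()
    for citing, cited in citations:
        fam_citing = family_of.get(citing)
        fam_cited = family_of.get(cited)
        if fam_citing is None or fam_cited is None or fam_citing == fam_cited:
            continue
        pairs.add((fam_citing, fam_cited))
    return pairs


# Hypothetical membership: the grouping below is invented for the example.
family_of = {"JP-1": "A", "EP-1": "A", "JP-101": "AA", "EP-101": "AA"}
citations = [("JP-1", "JP-101"), ("EP-1", "EP-101"), ("JP-1", "JP-999")]
family_pairs = family_level_citations(citations, family_of)
```

In this toy case the JP and EP citations collapse into the single family pair (A, AA), which is what makes the offices' citation data comparable.
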
Data
- JP Citation: IIP Patent Database
- EP Citation: EP Citation Database
- US Citation: NBER Patent Database + B. Hall Extension
- OECD Patent Family: 1978-2002 Data

      Total Family #   Citation Number   Average Family #
JP    117,700          77,045            1.53
EP    113,490          80,991            1.40
US    566,756          205,974           2.75

Overlapping Citing-Cited Pairs?
[Venn diagram: overlap of citing-cited pairs among JP, EP and US patents. Region counts: 83,149; 21,295; 6,437; 2,609; 531,320; 11,532; 97,122.]

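Counting overlapping pairs of this kind reduces to set operations once each office's citations are expressed as (citing family, cited family) pairs. The Python sketch below splits three such sets into the seven regions of a Venn diagram. The function name, region labels and sample pairs are invented for illustration.

```python
def venn_regions(jp, ep, us):
    """Split three sets of (citing_family, cited_family) pairs into seven regions."""
    return {
        "JP only": len(jp - ep - us),
        "EP only": len(ep - jp - us),
        "US only": len(us - jp - ep),
        "JP and EP only": len((jp & ep) - us),
        "JP and US only": len((jp & us) - ep),
        "EP and US only": len((ep & us) - jp),
        "JP, EP and US": len(jp & ep & us),
    }


# Invented family-level pairs, for illustration only.
jp = {("A", "AA"), ("B", "BB"), ("C", "CC")}
ep = {("A", "AA"), ("D", "DD")}
us = {("A", "AA"), ("B", "BB"), ("E", "EE"), ("F", "FF")}
regions = venn_regions(jp, ep, us)
```

The seven region counts always sum to the number of distinct pairs in the union, which gives a simple consistency check on any such diagram.
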
What about citation lags?
[Figure: distribution of citation lags from 0 to 20; vertical axis 0% to 18%. Series: JP, EP, US.]

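A lag distribution of this kind can be computed from citing-cited pairs as in the Python sketch below. It assumes that the lag is the difference in years between the citing and the cited patent and that lags outside 0 to 20 are ignored. Neither the date used (application, priority or grant) nor the treatment of out-of-range lags is stated on the slide, so both are assumptions here.

```python
from collections import Counter


def lag_distribution(pairs, max_lag=20):
    """Share of citations at each lag in years, from 0 to max_lag.

    pairs: iterable of (citing_year, cited_year).
    Lags outside 0..max_lag are ignored (an assumption of this sketch).
    """
    lags = Counter(
        citing - cited
        for citing, cited in pairs
        if 0 <= citing - cited <= max_lag
    )
    total = sum(lags.values())
    if total == 0:
        return {}
    return {lag: lags.get(lag, 0) / total for lag in range(max_lag + 1)}


# Invented example: lags of 2, 2 and 5 years, plus one 29-year lag that is dropped.
shares = lag_distribution([(2000, 1998), (2000, 1998), (2001, 1996), (1999, 1970)])
```
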
Correlation of citation count

Backward citation count (citation made)
      EP      US      JP
EP    -       0.22    0.23
US            -       0.24
JP                    -

Forward citation count (citation received)
      EP      US      JP
EP    -       0.42    0.42
US            -       0.36
JP                    -

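Correlations like these can be obtained by lining up citation counts for the same patent families under two offices' data and computing a correlation coefficient. The slide does not say which coefficient was used. The Python sketch below assumes the ordinary Pearson coefficient and uses invented counts.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented forward citation counts for the same five families in two offices.
jp_counts = [0, 1, 2, 0, 3]
us_counts = [1, 2, 5, 0, 4]
r = pearson(jp_counts, us_counts)
```
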
5. Citation Analysis
- Selecting granted patents from all citing-cited pairs to capture the cumulative process of innovation
- 5,318,225 pairs -> 1,602,130 pairs for the 2,618,699 granted patents
- Comparing patent counts and the generality and originality indices across applicant types and technology fields

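The slides do not define the originality and generality indices. The Python sketch below assumes the definition used with the NBER patent citation data: one minus the sum of squared shares of citations across technology classes, computed over citations made (originality) or citations received (generality). The function name and the use of the 33 aggregated categories as classes are assumptions for illustration.

```python
from collections import Counter


def one_minus_herfindahl(classes):
    """Return 1 - sum of squared class shares, or None if there are no citations.

    classes: technology classes of the cited patents (for originality)
             or of the citing patents (for generality).
    """
    if not classes:
        return None
    n = len(classes)
    return 1.0 - sum((count / n) ** 2 for count in Counter(classes).values())


# Originality: classes of the patents that a focal patent cites.
originality = one_minus_herfindahl([31, 31, 32, 28])   # shares 0.5, 0.25, 0.25
# Generality: classes of the patents that cite the focal patent.
generality = one_minus_herfindahl([5, 5, 5])           # all in a single class
```

Under this definition a patent whose citations all fall in one class scores 0, and the index rises toward 1 as citations spread over more classes.
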
Share of patents with citations made and citations received
[Figure: share of patents by year, 1964 to 2002; vertical axis 0% to 70%. Series: Citing, Cited.]

Indicators for backward citation (citation made)
[Figure: by year, 1964 to 2002. Series: Number of citing (left scale, 0.0 to 2.5), Originality (right scale, 0.00 to 0.18).]

Indicators for forward citation (citation received)
[Figure: by year, 1964 to 2002. Series: Number of cited (left scale, 0.0 to 2.5), Generality (right scale, 0.00 to 0.12).]

Descriptive regressions of patent indicators (1)

                              Share        Mean #       Mean          Share        Mean #       Mean
                              Citing (%)   Citing       Originality   Cited (%)    Cited        Generality
1970's                        0.000        0.000        0.000         0.000        0.614        0.086
                              (.)          (.)          (.)           (.)          (11.53)**    (9.85)**
1980's                        0.144        0.197        0.021         0.037        0.626        0.096
                              (8.07)**     (4.63)**     (1.28)        (2.95)**     (11.80)**    (11.11)**
1990's                        0.299        0.416        0.048         -0.128       0.353        0.060
                              (16.82)**    (9.81)**     (2.89)**      (10.27)**    (6.66)**     (6.89)**
2000's                        0.376        0.453        0.059         -0.319       0.000        0.000
                              (21.00)**    (10.43)**    (3.50)**      (25.34)**    (.)          (.)
Firms                         0.057        0.118        -0.012        0.036        0.152        -0.013
                              (3.75)**     (3.26)**     (0.82)        (3.33)**     (3.83)**     (1.95)
Public Organization           0.005        -0.021       -0.001        0.026        -0.007       -0.001
                              (0.33)       (0.56)       (0.06)        (2.38)*      (0.17)       (0.18)
33 technology field dummies   …………
R-squared                     0.67         0.45         0.19          0.77         0.48         0.48

Descriptive regressions of patent indicators (2)
- High citing patent counts: printing, machine tools, non-organic chemistry, drug, organic molecule, dyes, metallurgy, textile, optics, communication technology
- High cited patent counts: printing, paper, drug, organic molecule, display and information storage
- High generality index: separating and mixing, organic molecule, metallurgy

6. Conclusion
- Construction of the IIP Patent Database: fills a gap in the patent database initiatives of the triad
- Opens up opportunities for further research on innovation and for further data development: for example, the IIP patent database can be linked with the JPO's Survey on Intellectual Property Activities
- Citation data analysis
  - Substantial differences: not only by patent system (JP-EP vs US), but also home biases
  - On the other hand, substantial cross-correlation in citations received (an indicator of patents important for cumulative innovation)
  - Preliminary citation analysis conducted

Future work
- Data cleaning of applicant names
  - Using Japanese characters makes things better
  - Standardized names and JPO applicant codes after 1992
  - International coordination: match to Derwent codes?
- Using patent family information
- Adding other variables such as inventor information, post-grant opposition, etc.
- Linkage with firm-level data such as financial report data