PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM Materials for IPSC Presentation Aug. 11-12, 2006 Katherine J. Strandburg Kinetics of the Patent Citation Network: A Physics Approach to Understanding the Patent System I have not yet prepared a law review style article about this research. For purposes of this presentation, I have attached: 1. An NSF proposal that gives results of the research so far and some ideas for future work. (Non-technical) 2. A draft paper that we will be submitting to the Proceedings of the National Academy of Sciences (fairly technical). I am very interested in your reactions to and suggestions for this work! 1 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM Collaborative Research: A Statistical Physics Approach to Patent Citation Networks A Proposal to the National Science Foundation Principal Investigators: Katherine J. Strandburg, Jan Tobochnik, Peter Erdi 1. Introduction We propose to apply network analysis approaches from statistical physics to a study of the patent citation network. The goal of our study is to gain a better understanding of the structure and dynamics of the United States patent system and to investigate the implications of current patenting behavior for innovation policy and law. To our knowledge, our study is the first to address the current groundswell of concern about the effects of increased patenting on innovation by combining the recent availability of extensive and detailed computerized data about United States patents with recent advances in the statistical physics study of complex networks. The project will not only advance our understanding of the patent system -- a matter of critical significance for the long term health of United States innovation and economic growth -- but will also help to demonstrate the important role that network theory and other techniques of statistical physics might play in furthering the interdisciplinary approach to legal theory that is exemplified by the extremely influential law and economics line of attack. The study proposed here thus stands at a particularly propitious crossroads of societal and technical events: recently available data, recently developed methodology, and currently pressing legal concerns. The principal investigators are also singularly equipped for this research: Dr. Strandburg is a patent law professor and former litigator with a background as a statistical physicist performing numerical investigations of critical phenomena; Dr. Tobochnik is a statistical physicist presently working in the area of network theory and an expert on the application of numerical techniques to statistical physics problems; and Dr. Erdi is an expert on networks and complex systems. 2. Background Recent years have seen a major upsurge in patenting, an expansion of the range of innovations which are eligible for patent protection, and a perception that the United States economy relies more and more heavily on knowledge and innovation for its success. (See, e.g., Merrill, 2004; Federal Trade Commission, 2003; Cohen, 2003) At the same time, developments in the law, including the establishment of a single appellate court that hears the vast majority of patent appeals in the United States have led to suggestions that, in one way or another, the legal system is becoming increasingly “patent-friendly;” that patents are being issued for lower quality innovations; and that the legal rights awarded to patentees are becoming stronger. (For discussion of these issues see, e.g., Allison, 2002; Burk, 2003; Burk, 2005; Cohen, 2003; Dreyfuss, 1989; Dreyfuss, 2004; Federal Trade Commission, 2003; Hall, 2005; Heller, 1998; Jaffe, 2004; Landes, 2004; Litmann, 1997; Lunney, 2000/2001; Merges, 1990; Merges, 2000; Merrill, 2004; Rai, 2003; Shapiro, 2001; Thomas, 2003; Turner, 2005.) Along with empirical evidence that increasingly calls into question the extent to which patents actually provide incentives for research and development, (see, e.g., Bessen, 2005; Cohen, 2003; Hall, 2 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM 2001; Heald, 2005; Long, 2002; Malchup, 1958; Wagner, 2005), these trends have converged to raise growing concerns among academics and policymakers about whether patent law and policy are adequately designed to “promote the Progress of . . . useful Arts.” U.S. Const., Art. I, sec. 8. To accomplish this constitutionally mandated objective, the benefits of patent protection must be carefully balanced against the costs that patents impose on attempts to improve upon existing technology. United States patents are issued by the Patent and Trademark Office (USPTO) after applications are examined to determine, among other things, whether the patent “claims” meet the legal requirements of novelty and non-obviousness. (Patent claims are specific statements of the scope of the legal coverage of a patent.) In the course of this examination, claims are compared against potential “prior art,” consisting in large part of prior patents and other publications in relevant technical fields. See 35 U.S.C. §§ 102, 103, and 112 for the main statutory requirements for patentability; see also the USPTO Manual of Patent Examining Procedures. Potential prior art is identified by applicants and their patent attorneys and by the official patent examiners. References relevant to the examination are cited in the issued patent document. See MPEP § 707.05. Historically, empirical analyses of the effects of patent protection have been very difficult historically due to a lack of usable data. Though important information about the economic value of patents (such as most records of royalties and licensing) is proprietary, in some respects the United States patent system is extremely transparent. Complete records of patent prosecution, patent citations, and patent litigation over many years are publicly available. In the past, however, the form in which the data were available made quantitative analysis a difficult and painstaking process. (For examples of past uses of patent citation data and other patent statistics, see Griliches, 1990; Trajtenberg, 1990.) Recent advances in computer technology, along with the efforts of empirical economists (see especially, Hall, 2003), have rendered a wealth of publicly available patent data suitable for large-scale quantitative analysis. A surge of attempts to capitalize on this new availability by economists and economically-inclined legal academics has resulted. (See, e.g., Allison, 2004; Bessen, 2005; Duguet, 2005; Hagedoorn, 2003; Hall, 2005; Harhoff, 1998; Harhoff, 2003; Jaffe, 2003; Lanjouw, 2004; Marco, 2005; Maurseth, 2005; Moore, 2005; Ziedonis, 2004.) This work has applied econometric techniques to connect patent characteristics -- such as the number of citations made and received, the number of patent claims and whether the patent was renewed; inventor characteristics -- such as geographical location and employment context; and, most recently, patent examiner characteristics -- such as length of experience -- to financial, economic and technological indicators. In this way, patent data has been used to investigate knowledge flows and spillovers; to attempt to determine the characteristics of valuable patents; and to try to understand the role innovation plays in the behavior of various types of institutions, from universities to small and large firms. Our approach is different from, but complementary to, these econometric studies. We apply techniques from statistical physics to the analysis of patent citation data. Our proposed study is particularly timely and promising for several reasons. First, one way to view the patent system is as a network of nodes (patents) connected by citation links. While the precise interpretation of a particular citation of one patent by a later patent varies, on average citations convey valuable information about the relationships between the technologies covered by the citing and cited patent. Current econometric approaches 3 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM probe the connections between citation counts and various quantities of interest, such as probability of licensing, probability of renewal, and probability of litigation. However, because of its connectivity, the network of patents and citations is a much richer source of information about the patent system and the associated technological development than is captured by these techniques. Recently, statistical physicists have undertaken the study of a wide variety of networks with great vigor. (See, e.g., Albert, 2002; Barabasi, 2002; Barabasi, 1999; Dorogovtsev, 1999; Dorogovtsev, 2003; Klemm, 2002; Krapivsky, 2000; Newman, 2003; Pastor-Satorras, 2004; Redner, 2005; Watts, 1998; Watts, 2002; Zhu, 2003.) These studies have looked at the structural properties of networks and the dynamics of their growth both empirically and theoretically, cataloguing and exploring the similarities and differences between various types of networks. Applying network analysis to the patent citation network has great potential both to complement existing econometric studies of patent citations and to advance the understanding of statistical networks in general because of the fact that the patent citation network is one of the largest and most completely characterized networks available for study. A second reason to take a statistical physics approach to analyzing the patent system is the preponderance of highly skewed, power-law-like statistical distributions that characterize both the citation network and other measures of patent system attributes, such as the distribution of innovation values. (See, e.g., Astebro, 2003; Lemley, 2005; Marsili, 2005; Scherer, 2000a; Scherer, 2000b.) Statistical physics, as a discipline, has a long history of dealing with such highly nonlinear behaviors in the contexts of critical phenomena and self-organized criticality, (see, e.g., Bak, 1999; Hohenberg, 1977; Jensen, 1998; Ma, 2000; Stanley, 1971; Stanley, 1999), and has developed both intuitions and analytical techniques tailored to such phenomena that may be extremely useful in understanding the patent system. To the extent that the law must deal with skewed and heavy-tailed probability distributions in other contexts, this particular application of insights from physics may prove exemplary. Finally, our work is part of a broader movement attempting to apply physics-style analysis to economic questions. Such approaches have begun to bear fruit in the study of finance, but application of physics-like analysis to the study of innovation is in its infancy. (See, e.g., Chakrabarti, 2005; Cowan, 2004; Frenken, 2005; Mandelbrot, 2004; Mantegna, 1999; Silverberg, 2005; Wille, 2002.) 3. Initial Results We have already begun the studies proposed here and achieved interesting results. After a preliminary investigation of some basic network properties, our initial study has focused on a kinetic description of the growing patent network. A short technical paper outlining the results of this initial work has been submitted for publication. Preliminary results were presented at the March 2005 meeting of the American Physical Society and at the 2005 Intellectual Property Scholars Conference at Cardozo Law School. In the analysis we have used the Hall, Jaffe, and Trajtenberg dataset (Hall, 2003) which includes the approximately 16 million citations made by the more than two million patents issued by the USPTO between 1975 and 1999. Our results show that, to a good approximation, the growth of the patent system can be modeled by a random attachment model in which the probability that an existing 4 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM patent will be the target of a particular citation is given by a separable function of its current number of citations, its age, and the time at which it issued. Thus, P(l,k,t) A(k,l)/S(t), where l is the age of the patent as expressed in patent numbers, k is its current number of citations received (also called in the literature “forward citations”), and t is the time of issue, also expressed in patent numbers. S(t) is an overall scale factor determined by the sum of A(k,l) over all existing patents at time t. To analyze the data we have developed a novel iterative technique which we have used to extract A(k,l) and S(t) from the citation data. We do not assume a particular functional form for these functions, but find empirically that A(k,l) is approximately given by a product of two functions, which we denote Ak(k) and Al(l). See Figures 1 and 2. Each of these functions exhibits a power-law decay for large values of its argument. 3.1 Dependence on Citations Received The dependence on citations received, Ak(k) ~ k, where 1.19, indicates that the patent system exhibits the well-known “preferential attachment” phenomenon in which highly cited patents are more likely to be cited in the future. From network studies, this type of kinetics is widely understood to produce a highly skewed node degree distribution with a relatively long tail (see Albert, 2002; Newman, 2003) (corresponding in our case to the well-known skewed distribution of citations received per patent). When = 1, preferential attachment can lead to a power law, or Pareto, node degree distribution. Theoretical models suggest that growing networks with super-linear preferential attachment ( > 1) undergo a “condensation” in which nearly all nodes in the system have very low connectivity and are connected to a small (finite even in an infinite network) number of very highly connected nodes. (Krapivsky, 2000.) The fact that > 1 in the patent network suggests that citations are highly concentrated -- in other words, a relatively small number of patents receive most of the citations. In the patent system, the accumulation of citations by highly cited nodes is mitigated by a decay of citation probability with age, so the sharp condensation will probably not occur, but the relatively high value of still means a highly unequal distribution of patent “citability.” 3.2 Dependence on Patent Age As noted, the age dependence of the citation probability is approximately independent of k, especially at large l and k, as shown in Figure 2 for several values of k. The observed Al(l) has a peak for relatively small l and decays as l- for large l, where 1.6. This observed aging function highlights the complicated dynamics of technical innovation. Since, presumably, the citation probability function reflects underlying substantive differences between patents, one way to interpret the functional form of Al(l) is that there are “typical” patents, which are most likely to be cited relatively soon after they issue, along with a long tail of “pioneer”-type patents that continue to influence innovation over long periods of time. The age dependence appears to be described most consistently by using the patent number as a measure of age. Given that patents have been issuing at a rapidly accelerating rate, (see, e.g., Jaffe, 2003), the fact that patent number provides a meaningful measure of age is significant, suggesting, for example, that the twenty-year 5 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM patent term is becoming effectively longer as measured by the pace of innovation. Similarly, the peak age for citation is approximately 200,000 patent numbers, corresponding in 1999 to just over a year. The accelerating issuance of patents means that this peak age is becoming shorter and shorter in real time. Nonetheless, though the peak age for citation is relatively short (and getting shorter), patents have a nonnegligible chance of being cited at any age. Moreover, while highly cited patents are more likely to be cited, the form of Al(l) is relatively independent of in-degree, i.e., citation probability decays slowly even for low k. This means that, at least on average, patents are never entirely “dead.” There are “sleeper” patents that go without citation for long periods of time, only to re-emerge at a later time. If one makes the reasonable assumption that receiving a citation is some indication of social value, this observation suggests the difficulty of predicting the social value of innovations in the long term. 3.3 Time Dependence 3.3.1 Time Dependence of Scale Factor S(t) Independently of the approximate decomposition of citation probability into separate k and l dependent components, our analysis yields a time-dependent scale factor, S(t), which is normalized such that the time-dependent probability that a brand new patent (with k = 0 and l = 1) will be cited by a given citation is 1/S(t). (See Figure 3.) Over time, this scale factor increases (and the probability that a new patent will be the target of any particular citation decreases) because the number of patents increases while older patents continue to be citable. This increase dilutes the chance that a given patent will receive any particular citation. However, we find that this decay has been compensated by an increasing number of citations made by each patent so that the probability of a brand new patent being cited by a given patent has been slowly growing over time -- patents do not become “lost in the crowd” of new patents. (See Figure 4.) These results might suggest that, despite the notoriety of dubious patents on crustless peanut butter and jelly sandwiches and so forth, (see, e.g., Crouch, 2005; Varian, 2005) there has not been an average increase of truly “outlier” patents that are far beyond the technological mainstream. In other words, patent applicants and examiners have on the average felt that the number of “material” patents to be cited by any given patent has been increasing at least as fast as the number of patents. The observed time dependence is consistent, however, with a picture of increasing patent “density”, in which patents constitute smaller and smaller technical advances, leading to a crowded “thicket” of closely related patents, all of which must be cited by each new patent in the vicinity, even though none may gather citations for long. The interpretation of these average results is complicated, however, by the highly skewed distributions of citations made and received and by the possibility that citation practices may have changed over time due, for example, to the increasing availability of electronically searchable databases. Thus, a more detailed understanding of the behavior underlying the average increase in the probability of being cited by a given patent is part of our ongoing research. 3.3.2 Time Dependence of Power Law Decay 6 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM While it is well known that there has been a significant increase in the number of US patents granted per year since 1984 (see, e.g., Jaffe, 2003), the underlying reason for this increase is not clear. Has there simply been an acceleration of technological development in the last twenty years or has there been a more fundamental change in the patent system, perhaps as a result of increased leniency in the legal standard for obtaining a patent? Our study may provide some insight into this question. Our kinetic model permits us to measure whether there has been any deep change in the growth kinetics of the patent citation network. Because we measure time in units of patent number, a mere acceleration of technological progress should leave A(k,l) unchanged in patent number “time.” A change in A(k,l) over time would indicate that something more is going on. To begin to investigate the possible time dependence of A(k,l), we allowed and to vary with time. To perform time-dependent fits, we averaged over a 500,000-patent sliding time window and calculated the parameters after every 100,000 patents. There is a significant variation in over time. No significant time dependence was observed in to within the statistical errors. There are two regimes for . Prior to about 1991, decreases with time, while there is a significant increase afterward. As noted earlier, the parameter has some very important consequences for the growth of the network: the higher , the more “condensed” the network will be. An increasing indicates increasing stratification -- a smaller and smaller fraction of the patents is receiving a larger and larger fraction of the citations. The turn-around from decreasing to increasing suggests a more fundamental change in the characteristics of patents that are being issued than a simple acceleration of technological progress. 4. Proposed Plan of Research In the near term, we plan to focus on three specific directions: 1) further research on patent citation network growth kinetics and dynamics, focusing on possible time dependence of the growth behavior and on investigating variation with category of technology; 2) use of path length measures from statistical network theory to investigate relationships between different technical areas, particularly focusing on whether the currently used patent classification schemes reflect the observed relationships between patented innovations; and 3) use of clustering coefficients and generalized correlation functions to develop new measures of patent “generality” and “originality,” (discussed below) that do not rely on patent office classification schemes and to explore whether there are different structural “signatures” for citations that reflect complementary and substitute technologies. 4.1 Citation Network Growth Kinetics 4.1.1. Technological Differences in Patent System Kinetics We plan to investigate the behavior of A(k,l) for particular subsets of patents. In our initial study we will determine the behavior of Al(l) and Ak(k) for six different technology classes defined by Hall, Jaffe and Trajtenberg based on a classification scheme employed by the USPTO. (Hall, 2003) Very preliminary results suggest that the decay exponents vary from class to class in an interesting way. In particular, the category 7 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM of “Computers and Communication” seems to have a larger value of than the others. A larger value of for this category is consistent with the general intuition that the computer industry is “fast-paced.” We plan to pursue a more detailed investigation of the technology dependence of A(k,l). If the apparent distinctiveness of the Computers and Communications category holds up upon further study, we would like to perform a more thorough investigation with the goal of exploring the behavior of software patents. Software patents have been the subject of considerable controversy and analysis, (see, e.g., Bessen, 2003; Burk, 2003; Graham, 2003; Lunney, 2001/2001; Mann, 2005) and the proposed research should contribute valuable insights to this debate. We also plan to explore possible variations in time dependence among technological categories. Depending on data availability, it may also be possible in future work to compare the time dependent evolution of the United States patent network with others internationally. 4.1.2 Modeling the Underlying “Dynamics” of the Patent System So far we have only a probabilistic understanding of the evolution of the network. Especially because citations are contributed by inventors, their attorneys, and patent examiners, and because patents clearly have quite heterogeneous intrinsic technical merit, it is not at all obvious how the observed dependence on number of citations previously received and on patent age arises from the “microscopic mechanisms” of patent citation. Indeed, it is quite interesting -- and perhaps surprising -- that a very recent extensive study of physics journal citations has uncovered rather similar kinetics despite the very different processes by which journal and patent citations are made. (Redner, 2005.) We also have no theoretical explanation at present for the degree distribution for citations made, which, according to our initial studies, has a power law tail unlike the previously reported exponentially decaying distributions of citations made for scientific journals. (Vazquez, 2001.) Since there is no obvious “preferential attachment” or multiplicative mechanism to explain this power law distribution (citations are “made” in one fell swoop), the observed distribution of citations made is surprising. One way to attempt to gain some understanding of these underlying mechanisms is through simulations of “microscopic” models that incorporate factors such as heterogeneous patent quality. As our understanding of the network improves based upon the studies already described we hope to develop models that will provide more “microscopic” understanding of the evolution of the patent citation network. 4.2 Statistical Network Measures of Technological Relationships Patent citations are a source of information about the technological relationships between patents and may serve, indirectly, as a source of insight into technological relationships more generally. Previous economic studies have used patent citation data as a proxy for knowledge flow. (See, e.g., Jaffe, 2003.) Because many patent citations are provided not by patent applicants, but by their attorneys and by patent examiners, the usefulness of patent citations for this purpose must be carefully assessed. Data about which citations have been provided by patent examiners and which by patentees have only recently been made available by the USPTO. Regardless of the extent to which citations reflect knowledge flows, citations do reflect technological relationships. The 8 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM techniques of statistical network theory, which are focused on understanding the complex structure of networks, should be useful in exploring these relationships. One particularly promising direction for such exploration is the potential to compare technological relationships embedded in the citation network with the technical categorization of patents by patent examiners. Each patent application, when received by the USPTO, is categorized according to a classification scheme which has been developed in an ad hoc manner over the years to assist examiners and patent applicants in searching for relevant prior art. The examiner-assigned patent classifications have also been used by scholars of the patent system to assess such things as the “generality” of a patent (evaluated by the extent to which it is cited by patents from different technological classes), the “originality” of a patent (evaluated by the extent to which it cites patents from different technological classes) (Hall 2003), and the “technological closeness” of firms involved in patent litigation (evaluated by the extent to which the patents assigned to those firms are in the same technological classes). These quantities have limited explanatory power, however. For example, one recent study determined that patent litigation often involves firms that do not have significant “technological closeness” as measured by the classifications. (Meurer, 2005.) Statistical network analysis provides another way to measure technological relationships, which can be used both to evaluate the existing technological classifications and as an alternative to measures based on those classifications. Of course, the citation network is not entirely independent of the classification scheme, since both examiners and applicants derive some of their citations from searches based upon it. However, the universe of citations is broader than those found using the classification scheme. Moreover, as explained in somewhat more detail below, statistical network theory provides a metric for measuring relationships between patents that is more finegrained and quantitative than comparing classifications. We propose two specific projects that will use citation data to assess technological relationships. The first will use path length measures from statistical network theory to measure technological closeness. The second will build on the studies of technological closeness to develop and apply network clustering algorithms to the patent citation network. 4.2.1 Network Path Length as a Measure of Technological Closeness Physicists studying the properties of networks have employed various measures of network distance based essentially on counting the number of steps or “links” between network nodes. (See, e.g., Albert, 2002; Newman, 2003.) Thus, one may define the average path length between two nodes, the shortest path length between two nodes, and distributions of path lengths between nodes. In the patent citation network, one may use these measures to explore the “technological distance” between patented technologies. Direct citations will on average indicate closer technical relationships than second, third, and later generation citations. (Of course, not all citations indicate equal technical relationship and it is certainly possible to imagine “weighting” different citations according to other properties, ranging from numbers of citations in common between citing and cited patents to textual overlap of the underlying patent documents. For numerical tractability and conceptual simplicity, we propose to begin with a simple treatment in which each link is given equal weight.) 9 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM As an initial matter, we have measured the “size” of the patent citation network by looking at the shortest path lengths (sometimes called “geodesic distances”) between patent nodes. We determined that the network contains one “giant” connected component, and some very small components. Thus, there is a “small world effect” indicating that most patents are connected to most other patents by a relatively small number of steps. (Albert, 2002; Newman, 2003.) The longest path length between nodes in the “giant” component (of nearly 4 million patents) is only 24 steps. The “technology space” encompassed by the patent system is perhaps surprisingly closely connected. This calculation will be expanded to study the distribution of path lengths in more detail. An understanding of how closely connected the patent space is could have significant implications for the interpretation of important concepts in patent law. For example, many legal questions depend on the concept of the “person having ordinary skill in the art,” where the “art” is the technical field of the invention. These legal inquiries assume that the “art” to which a patent pertains may be sensibly defined. Studies of the connectedness of the patent network may provide insight into the extent to which such an assumption is meaningful. The network path length may also be used to evaluate and explore existing classification schemes. Distances between patents that are classified in the same category may be compared with distances between patents in different categories to evaluate the accuracy of the classification systems. Moreover, average distances between different categories can be computed. These path length measures may provide a more finegrained (and complementary) measure of technological closeness. They may also permit generalized measures of generality and originality to be devised that account for the citation distance between different categories. 4.2.2 Network Clustering Approach to Patent Categorization After exploring the path length statistics, we will attempt to apply network clustering techniques to explore the structure of the patent citation network. The development of numerical techniques to partition large complex networks into meaningful “communities” is an evolving area of research. (See, e.g., Newman, 2004a; Newman, 2004b; Palla, 2005; Song, 2005; Wu, 2004; Zhou, 2003.) Clustering is difficult in such networks, in part because their large size necessitates efficient algorithms, but more fundamentally because of the conceptual difficulty in defining meaningful clusters in such highly connected and often hierarchically structured systems. The study of clustering in the patent network may have practical implications for patent classification, but, on a more fundamental level, the attempt to devise a sensible clustering approach may yield insights into the structure of the underlying “technological space.” 4.3 Clustering Coefficients and Network Correlations and the Study of Patent Context Among the fundamental tools of statistical network theory are clustering coefficients (sometimes called “transitivity”) and degree-degree correlations. (Albert, 2002; Newman, 2003.) The basic clustering coefficient measures the likelihood that two nodes that are connected to a specific third node are also connected to one another. We can also imagine defining clustering coefficients for higher-order neighbors, e.g., the 10 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM likelihood that two nodes two “steps” away from a particular node are connected to one another. The study of clustering coefficients and correlations in the patent network is likely to provide interesting insights in its own right as well as to identify quantities that may be useful inputs to more traditional econometric analysis. For example, clustering coefficients are a promising alternative to measures of “generality” or “originality” based on the patent classification systems. Like previously defined measures of generality and originality, clustering coefficients are sensitive to the relationships between the patents that cite or are cited by a particular patent. Moreover, clustering coefficients and correlation functions are much more nuanced and fine-grained than those previously defined measures and are less dependent on ad hoc classification schemes. Because patent citation networks are “directed networks” (i.e., the citation link between two patents has a defined direction), several distinct varieties of local clustering coefficient may be defined. Thus, a highly cited patent with a small incidence of citation among its nearest neighbors is likely to be more general than a highly cited patent with a highly connected set of citing patents. We propose to define a “generality” clustering coefficient by looking at the fraction of patents citing a given patent that also cite one another. (See Fig. 5a) Similarly, a patent that cites a number of patents which tend to cite one another may be less “original” than a patent that cites patents that do not tend to cite one another. An “originality” clustering coefficient might be defined by looking at the fraction of patents cited by a given patent that cite one another. (See Fig. 5b.) Clustering coefficients may also be useful in distinguishing “lovely” citations (which cite to an earlier patent because they build on its technology) from “dangerous” citations (which cite to an earlier patent because they replace or distinguish its technology). (See Maurseth, 2005, for a discussion of this distinction.) A patent that cites a patent which makes a “lovely” citation to an earlier patent may be more likely also to cite the earlier patent than a patent that cites a patent which makes a “dangerous” citation to the earlier patent. (See Fig. 5c.) It is thus possible that clustering coefficients will be able to provide insight into the extent to which patents in a given technical field tend to be competing substitutes or “blocking patent” complements. 5. Broader Impact of the Proposed Research This project addresses a subject of great importance to the technical and economic health of society -- the functioning of the patent system in promoting innovative activity. The proposed research will help to provide a stronger empirical basis for the current debate among legal academics and policymakers about the legal design of the system. In addition, the research will contribute to understanding the fundamental science of networks, a topic that has far-reaching implications in many areas of physical, biological, and social science. Because the Principal Investigators for this proposal come from diverse scholarly communities, we are well positioned to disseminate the results of this research broadly. Our initial results have already been presented at conferences dealing with physics, applied mathematics, law, and complexity theory; we anticipate continuing to publish and present our results in broad-based fora and to facilitate inter-disciplinary communication on subjects related to the research. We also anticipate involving a wide range of students -- including Ph.D. students from the KFKI Research Institute of the 11 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM Hungarian Academy of Sciences, undergraduates from Kalamazoo College, and law students from DePaul University College of Law, in the project. 12 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 Figure 1: Ak(k) as a function of in-degree, k, for various values of patent age, l. 13 12:33 AM PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM Figure 2: Top: A(k,l) as a function of patent age l, in units of patent number, for various values of in-degree, k. Bottom: The decaying portion of Al(l) as a function of patent age l, for various values of k. 14 PLEASE DO NOT QUOTE OR CITE!! Figure 3: 1/S(t) as a function of time in patent number units. 15 7/29/2017 12:33 AM PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM Figure 4: Top: Average number of citations made per patent, E(t), as a function of time in patent number units. Bottom: E(t)/S(t) (probability that a new patent will be cited by a given patent) as a function of time in patent number units. 16 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM (a) (b) (c) Figure 5: Illustrations of definitions of clustering coefficients relevant to (a) generality; (b) originality; (c) “loveliness” or “dangerousness” of citation from B to C 17 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM REFERENCES Albert, Reka and Albert-Laszlo Barabasi, Statistical Mechanics of Complex Networks, 74 REVIEWS OF MODERN PHYSICS 47 (2002) Allison, John R. and Mark A. Lemley, The Growing Complexity of the United States Patent System, 82 BOSTON UNIVERSITY LAW REVIEW 77 (2002) Allison, John R., Mark A. Lemley, Kimberly A. Moore and R. Derek Trunkey, Valuable Patents, 92 GEORGETOWN LAW JOURNAL 435 (2004) Astebro, Thomas, The Return to Independent Invention: Evidence of Unrealistic Optimism, Risk Seeking or Skewness Loving?, 113 THE ECONOMIC JOURNAL 226-39 (2003) Bak, Per, HOW NATURE WORKS: THE SCIENCE OF SELF-ORGANIZED CRITICALITY, Springer-Verlag Telos (1999) Barabasi, Albert-Laszlo and Reka Albert, Emergence of Scaling in Random Networks, 286 SCIENCE 509-512 (1999) Barabasi, Albert-Laszlo, LINKED: THE NEW SCIENCE OF NETWORKS, Perseus Publishing (2002) Bessen, James and Robert M. Hunt, The Software Patent Experiment, PROCEEDINGS OF OECD CONFERENCE ON PATENTS, INNOVATION AND ECONOMIC PERFORMANCE, OECD, (April 2003) Bessen, James and Michael J. Meurer, Lessons For Patent Policy From Empirical Research On Patent Litigation, 9 LEWIS & CLARK LAW REVIEW 1 (2005) Burk, Dan L. and Mark A. Lemley, Policy Levers in Patent Law, 79 VIRGINIA LAW REVIEW 101 (2003). Burk, Dan L. and Mark A. Lemley, Designing Optimal Software Patents, in INTELLECTUAL PROPERTY RIGHTS IN FRONTIER INDUSTRIES: SOFTWARE AND BIOTECHNOLOGY, Robert Hahn, ed., (2005) Chakrabarti, Bikas K., Arnab Chatterjee, and Sudhakar Yarlagadda, eds., ECONOPHYSICS OF WEALTH DISTRIBUTIONS (NEW ECONOMIC WINDOWS), Springer (forthcoming 2005) Cohen, Wesley M. and Stephen A. Merrill, Eds., PATENTS IN THE KNOWLEDGE-BASED ECONOMY, National Research Council of the National Academies (2003) 18 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM Cowan, Robin, Network Models of Innovation and Knowledge Diffusion, Technical Report 016, Maastricht: MERIT, Maastricht Economic Research Institute on Innovation and Technology (2004), available at http://ideas.repec.org/p/dgr/umamer/2004016.html Crouch, Dennis, Children Rejoice -- Peanut Butter and Jelly Patent Rejected on Appeal, Patently-O Blog,http://patentlaw.typepad.com (April 8, 2005) Dorogovtsev, S. N. and J. F. F. Mendes, Evolution of Networks with Aging of Sites, 62 PHYSICAL REVIEW E 1842-1845 (2000) Dorogovtsev, S. N. and J. F. F. Mendes EVOLUTION OF NETWORKS. FROM BIOLOGICAL NETS TO THE INTERNET AND WWW, Oxford University Press (2003) Dreyfuss, Rochelle Cooper, The Federal Circuit: A Case Study in Specialized Courts, 64 New York University Law Revew 1 (1989) Dreyfuss, Rochelle Cooper, The Federal Circuit: A Continuing Experiment in Specialization, 54 CASE WESTERN RESERVE LAW REVIEW 769 (2004) Duguet, Emmanuel and Megan MacGarvie, How Well Do Patent Citations Measure Flows of Technology? Evidence from French Innovation Surveys, 14 ECONOMICS OF INNOVATION AND NEW TECHNOLOGY 374-93 (2005) Federal Trade Commission, To Promote Innovation: The Proper Balance of Competition and Patent Law and Policy, Report (October 2003) Frenken, Koen, Technological Innovation and Complexity Theory, ECONOMICS OF INNOVATION AND NEW TECHNOLOGY, (forthcoming 2005) Graham, Stuart J. H. and David C. Mowery, Intellectual Property Protection in the U.S. Software Industry, in Cohen, Wesley M. and Stephen A. Merrill, eds., PATENTS IN THE KNOWLEDGE-BASED ECONOMY, National Research Council of the National Academies (2003) Griliches, Zvi, Patent Statistics as Economic Indicators: A Survey, 28 JOURNAL OF ECONOMIC LITERATURE 1661-1707 (1990) Hagedoorn, John and Myriam Cloodt, Measuring Innovative Performance: Is There an Advantage in Using Multiple Indicators?, 32 RESEARCH POLICY 1365-1379 (2003) Hall, Bronwyn & Rosemarie Ziedonis, The Patent Paradox Revisited: An Empirical Study of Patenting in the U.S. Semiconductor Industry, 1979-1995, 32 RAND JOURNAL OF ECONOMICS 101 (2001) Hall, Bronwyn H., Adam B. Jaffe and Manuel Trajtenberg, The NBER Patent Citation Data File: Lessons, Insights and Methodological Tools, in Adam B. Jaffe and Manuel 19 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM Trajtenberg, eds., PATENTS, CITATIONS, & INNOVATIONS: A WINDOW ON THE KNOWLEDGE ECONOMY, MIT Press (2003) Hall, Bronwyn H., Exploring the Patent Explosion, 30 JOURNAL OF TECHNOLOGY TRANSFER 35-48 (2005) Harhoff, D., F. Narin, F.M. Scherer, and K. Vopel, Citation Frequency and the Value of Patented Inventions, 81 REVIEW OF ECONOMICS AND STATISTICS 511-15 (1998) Harhoff, Dietmar and Scherer, Frederic M. & Vopel, Katrin, Citations, Family Size, Opposition and the Value of Patent Rights, 32 RESEARCH POLICY 1343-63, Elsevier (2003). Heald, Paul J., A Transaction Costs Theory of Patent Law, 66 OHIO STATE LAW JOURNAL ___ (forthcoming 2005) Heller, Michael and Rebecca S. Eisenberg, Can Patents Deter Innovation? The Anticommons in Biomedical Research, 280 SCIENCE 698-701 (May 1998) Hohenberg, P.C. and B. I. Halperin, Theory of Dynamic Critical Phenomena, 49 REVIEWS OF MODERN PHYSICS 435 (1977) Jaffe, Adam B. and Manuel Trajtenberg, PATENTS, CITATIONS, & INNOVATIONS: A WINDOW ON THE KNOWLEDGE ECONOMY, MIT Press (2003) Jaffe, Adam B. and Josh Lerner, INNOVATION AND ITS DISCONTENTS: HOW OUR BROKEN PATENT SYSTEM IS ENDANGERING INNOVATION AND PROGRESS AND WHAT TO DO ABOUT IT, Princeton University Press (2004) Jensen, Henrik Jeldtoft, SELF-ORGANIZED CRITICALITY: EMERGENT COMPLEX BEHAVIOR IN PHYSICAL AND BIOLOGICAL SYSTEMS, Cambridge Lecture Notes in Physics (No. 10) (1998) Klemm, Konstantin and Victor M. Eguiluz, Highly Clustered Scale-Free Networks, 65 PHYSICAL REVIEW E 036123 (2002) Krapivsky, P.L., S. Redner, and F. Leyvraz, Connectivity of Growing Random Networks, 85 PHYSICAL REVIEW LETTERS 4629-32 (2000) Landes, William M. and Richard A. Posner, An Empirical Analysis of the Patent Court, 71 UNIVERSITY OF CHICAGO LAW REVIEW 111 (2004) Lanjouw, Jean O. and Mark Schankerman, Patent Quality And Research Productivity: Measuring Innovation With Multiple Indicators, 114 ECONOMIC JOURNAL 441–65, Royal Economic Society (2004) 20 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM Lemley, Mark A. and Carl Shapiro, Probabilistic Patents, 19 JOURNAL OF ECONOMIC PERSPECTIVES 75-98 (2005) Litmann, Allan N., Restoring the Balance of Our Patent System, 37 IDEA 545 (1997) Long, Clarisa, Patent Signals, 69 UNIVERSITY OF CHICAGO LAW REVIEW 625 (2002) Lunney, Glynn S., Jr., E-Commerce And Equivalence: Definining The Proper Scope Of Internet Patents Symposium: E-Obviousness, 7 MICHIGAN TELECOMMUNICATIONS AND TECHNOLOGY LAW REVIEW 363 (2000 / 2001) Ma, Shanggeng and Shang-Keng Ma, MODERN THEORY OF CRITICAL PHENOMENA, Perseus (2000) Malchup, Fritz, An Economic Review of the Patent System, Study No. 15 of the Subcomm. On Patents , Trademarks, and Copyrights of the Committee on the Judiciary, 85th Cong., 2d sess. (1958) Mandelbrot, Benoit, and Richard L. Hudson, THE (MIS)BEHAVIOR OF MARKETS, Basic Books (2004) Mann, Ronald J., The Myth of the Software Patent Thicket: An Empirical Investigation of the Relationship Between Intellectual Property and Innovation in Software Firms, 83 TEXAS LAW REVIEW 961 (2005) Mantegna, Rosario N. and H. Eugene Stanley, AN INTRODUCTION TO ECONOPHYSICS: CORRELATIONS AND COMPLEXITY IN FINANCE, Cambridge University Press (1999) Marco, Alan C., The Option Value of Patent Litigation: Theory and Evidence, REVIEW OF FINANCIAL ECONOMICS (forthcoming 2005), Vassar College Department of Economics Working Paper Series No. 52 Marsili, Orietta and Ammon Salter, ‘Inequality” of Innovation: Skewed Distributions and the Returns to Innovation in Dutch Manufacturing, 14 ECONOMICS OF INNOVATION AND NEW TECHNOLOGIES 83-102 (2005) Maurseth, Per Botolf, Lovely but Dangerous: The Impact of Patent Citations on Patent Renewal, 14 ECONOMICS OF INNOVATION AND NEW TECHNOLOGIES 351-74 (2005) Merges, Robert P. and Richard R. Nelson, On the Complex Economics of Patent Scope, 90 COLUMBIA LAW REVIEW 839 (1990) Merges, Robert P., One Hundred Years of Solicitude: Intellectual Property Law, 19002000, 88 CALIFORNIA LAW REVIEW 2187 (2000) Merrill, Stephen A., Richard C. Levin, and Mark B. Myers, eds., A PATENT SYSTEM FOR THE 21ST CENTURY, National Research Council of the National Academies (2004) 21 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM Meurer, Michael and James Bessen, The Patent Litigation Explosion, Presentation at the Annual Meeting of the American Law and Economics Association (May 6-7, 2005) Moore, Kimberly A., Worthless Patents, 20 BERKELEY TECHNOLOGY LAW JOURNAL (forthcoming 2005) Newman, M. E. J., The Structure and Function of Complex Networks, 45 SIAM REVIEW 167-256 (2003) Newman, M. E. J., Fast Algorithm for Detecting Community Structure in Networks, 69 PHYSICAL REVIEW E 066133 (2004) Newman, M. E. J. and M. Girvan, Finding and Evaluating Community Structure in Networks, 69 PHYSICAL REVIEW E 026113 (2004) Palla, Gergely, Imre Der´enyi, Ill´es Farkas, and Tam´as Vicsek, Uncovering the Overlapping Community Structure of Complex Networks in Nature and Society, 435 NATURE 814-18 (Jun 9, 2005) Pastor-Satorras, Romualdo and Alessandro Vespignani, EVOLUTION AND STRUCTURE OF THE INTERNET : A STATISTICAL PHYSICS APPROACH, Cambridge University Press (2004) Rai , Arti K., Engaging Facts And Policy: A Multi-Institutional Approach To Patent System Reform, 103 COLUMBIA LAW REVIEW 1035 (2003) Redner, S., Citation Statistics from 110 Years of Physical Review, 58 PHYSICS TODAY 49 (2005) Scherer, F.M. and Dietmar Harhoff, Technology Policy for a World of Skew-Distributed Outcomes, 29 RESEARCH POLICY 559-66 (2000) Scherer, F. M., Dietmar Harhoff, and Jorg Kukies, Uncertainty and the Size Distribution of Rewards from Innovation, 10 JOURNAL OF EVOLUTIONARY ECONOMICS 175-200 (2000) Shapiro, Carl, Navigating the Patent Thicket: Cross Licenses, Patent Pools, and Standard-Setting in INNOVATION POLICY AND THE ECONOMY, VOLUME I, Adam Jaffe, Joshua Lerner, and Scott Stern, eds., MIT Press (2001) Silverberg, Gerald and Bart Verspagen, A Percolation Model of Innovation in Complex Technology Spaces, 29 JOURNAL OF ECONOMIC DYNAMICS AND CONTROL 225-44 (2005) Song, Chaoming, Shlomo Havlin, and Hern´an A. Makse, Self-similarity of Complex Networks, 433 NATURE 392-95 (Jan. 27, 2005) Stanley, H. Eugene, INTRODUCTION TO PHASE TRANSITIONS AND CRITICAL PHENOMENA (1971) 22 PLEASE DO NOT QUOTE OR CITE!! 7/29/2017 12:33 AM Stanley, H. Eugene, Scaling, Universality, and Renormalization: Three Pillars of Modern Critical Phenomena, 71 REVIEWS OF MODERN PHYSICS S358 (1999) Thomas, John R., Formalism At The Federal Circuit, 52 AMERICAN UNIVERSITY LAW REVIEW 771 (2003) Trajtenberg, Manuel, A Penny for your Quotes: Patent Citations and the Value of Innovations, 21 RAND JOURNAL OF ECONOMICS 172-87 (1990) Turner, John L., In Defense of the Patent Friendly Court Hypothesis: Theory and Evidence (under review 2005), http://www.terry.uga.edu/~jlturner/PatentFCH.pdf Varian, Hal R., Patent Protection Gone Awry, NEW YORK TIMES, Economic Scene (October 21, 2004) Vazquez, Alexei, Citations Networks, http://arxiv.org/PS_cache/condmat/pdf/0105/0105031.pdf (2001) Wagner, R. Polk, and Gideon Parchomovsky, Patent Portfolios, 154 UNIVERSITY OF PENNSYLVANIA LAW REVIEW __ (forthcoming 2005) Watts, Duncan J. and Steven H. Strogatz, Collective dynamics of small world networks, 393 NATURE 440- 42 (1998). Watts, Duncan J., SIX DEGREES: THE SCIENCE OF A CONNECTED AGE, W.W. Norton & Co. (2002) Wille, Luc T., NEW DIRECTIONS IN STATISTICAL PHYSICS: ECONOPHYSICS, BIOINFORMATICS, AND PATTERN RECOGNITION, Springer (2002) Wu, Fang and Bernardo A. Huberman, Finding Communities in Linear Time: A Physics Approach, 38 EUROPEAN PHYSICS JOURNAL B 331-38 (2004) Zhu, Han, Xinran Wang, and Jian-Yang Zhu, Effect of Aging on Network Structure, 68 PHYSICAL REVIEW E 056121 (2003) Zhou, Haijun, Distance, Dissimilarity Index, and Network Community Structure, 67 PHYSICAL REVIEW E 061901 (2003) Ziedonis, Arvids and Bhaven N. Sampat, Patent Citations and the Economic Value of Patents: A Preliminary Assessment, in HANDBOOK OF QUANTITATIVE SCIENCE AND TECHNOLOGY RESEARCH, Henk Moed, Wolfgang Glänzel, and Ulrich Schmoch, eds., Kluwer Academic Publishers (2004). 23 PLEASE DO NOT QUOTE OR CITE!! 24 7/29/2017 12:33 AM
© Copyright 2026 Paperzz