Probabilistic Approaches to Computational Problems

A. Introduction

Area: Over the past half century, computers have become ubiquitous in almost every aspect of human life. Computer Science as an academic discipline, starting from the early 1950s, has been studying the mathematical theory and practical aspects of computing. As automation and computers have entered more and more areas, it has become critical to understand, analyze, and solve computational problems from a mathematical and algorithmic point of view. In the past two to three decades, computer science has experienced another paradigm shift as computing and communications have seamlessly merged and the resulting technology has become easily available to an unprecedented number of non-expert users. This change has been supported by new techniques in theoretical as well as applied areas, expanding the need for mathematics for computer-science problems, ranging from discrete mathematics to wider parts of probability, statistics, numerical methods, and geometrical analysis. The shift in the use of computation by the general populace, as well as the availability of cheap and easy methods for generating, collecting, and storing complex data, has resulted in the need to process extraordinary amounts of data that most existing algorithms are unable to cope with. This data can be the whole genome of a species, streaming sound or video or text, multidimensional data with missing values, sales records, astrophysical observations, etc. Meanwhile, the computation and communication capabilities of existing electronics have been unable to keep up. It is now increasingly clear that we will not be able to catch up with the increase in the amount and complexity of data using technology that roughly doubles its resources every 18 months (Moore's Law); we need dramatic changes in our thinking about methodology.
In particular, techniques that use probability theory and randomization, approximation algorithms that return very quick and reasonably correct answers to complicated optimization questions, and algorithms that process real-time streams as they are generated using statistical tools have all proved useful, and they represent our hopes for dealing with otherwise unwieldy data. To develop such techniques, we need collaboration among researchers from discrete mathematics, algorithms, machine learning, probability theory, statistics, numerical methods, and geometrical analysis. Indiana University Bloomington is already home to an outstanding group of researchers working on different aspects of probabilistic approaches to computational problems. The objectives of this proposal are to hire colleagues who will bridge the Departments of Mathematics, Computer Science, and Statistics and facilitate collaborations, to bring these new hires and the existing researchers together in order to encourage the free flow of ideas, and to tackle large and practical problems in the area that can lead to patentable algorithms and commercializable software.

Rationale: The unifying theme of Big Data concerns a very real problem. There are various reasons for the generation of huge amounts of data. First, over the past three decades, it has become cheap and easy to generate and store data. From the individual who keeps innumerable hours of digital home movies to the telecommunications company that never throws out an IP address that arrives at a router, data gets stored, with the hope of one day using it. Second, there are now many more and varied ways of generating data than just a few years ago. Devices that used to serve simple purposes, such as telephones, have now become means of data generation and storage. Household appliances or industrial machines that used to be mostly mechanical now have embedded electronics in them, and they communicate with one another.
Sensor networks can have many small, cheap nodes generating, storing, and communicating data in the wild. Large telescope arrays and particle accelerators break records every year in the amount of data they generate and store. Unfortunately, our analytical capabilities, in hardware or software, do not expand nearly as fast as our data; we are essentially helpless in the face of the uncontrolled increase and the heterogeneous and complex nature of the data that we generate. While data analytics and big data have become buzzwords across the world, leading to many doubtless necessary and useful research and teaching programs (including at Indiana University), the data crisis remains unresolved, and can be overcome only through a deep understanding and merging of the mathematical and computational principles underlying the notion of algorithms for big data.

The emphasis of this proposal is on the notion of a bridge. We have excellent people in the Department of Mathematics and the Department of Statistics whose research is connected with questions and applications in computational science, and equally excellent but relatively new hires in Computer Science who use mathematical tools and who also explore mathematical questions, many related to Big Data, in their research. Our goals are to improve communication across departments with overlapping research and teaching programs and to increase collaboration. To this end, we need individuals who can cross the bridge, communicating problems from Computer Science to Mathematics and Statistics and communicating mathematical and statistical models and tools to Computer Science. While Big Data has been an area of priority (including a Big Data graduate degree within the School of Informatics and Computing) at Indiana University Bloomington, the push has mostly been in an applied direction.
Though practical improvements are on the surface what the layman sees and appreciates, these improvements must be born of mathematical foundations and new theoretical methods; in the absence of developments in theory, applied progress will eventually stagnate. Within the context of big data, this is the sense in which our proposed area, merging big-data algorithms with mathematical foundations, needs to emerge on the Indiana University Bloomington campus. The three departments are committed to nurturing this area by creating an interdisciplinary seminar, regardless of the funding of this proposal. The success of this cross-disciplinary effort would be magnified by hiring additional faculty who are closer to the middle of the existing gap in faculty expertise.

Objective: The main objective of this proposal is to bring together the three departments in order to investigate, from a mathematical and algorithmic angle, problems related to big data. While the details of the research agenda will no doubt depend to some extent on future hires into the departments, the research questions to be investigated initially include the development of probabilistic methods for analyzing and discovering trends in noisy data, for communication-efficient algorithms for distributed optimization across a network of computers, for solving optimization problems approximately, for metric embeddings and manifold learning (reducing a problem in a complicated space to a problem in a simpler space in order to facilitate its solution or visualization), and for generating random graphs that more accurately model real-world networks for use in computational algorithms. These are practical problems in which Indiana University Bloomington already has expertise, and the campus would benefit from expanding that expertise through new faculty addressing these problems.
The longer term goals will develop through the establishment of a cross-disciplinary seminar series that will bring faculty and students from Mathematics, Statistics, and Computer Science together to learn about the mathematical problems that lie within the computational ones and about the mathematical tools that can be used to solve those problems. Additional hires who can help close the gap between the departments, as well as expand the research agenda, will enable effective communication between existing world-class faculty, leading to new mathematical theories, grant applications, patentable algorithms, and commercializable software.

Current state of knowledge: The theoretical study of algorithms usually focuses on the optimal time or memory required by an algorithm to solve a particular class of problems. The computational hardness of a problem (where "hardness" is a technical, quantifiable term) refers to its inherent difficulty: a computationally hard problem cannot be solved quickly and effectively no matter how good the researcher or how fast the computer running the algorithm. This notion of hardness is determined by producing upper and lower bounds on the amount of resources (typically time and memory) required to solve the problem. To exhibit an upper bound on the time requirements of a problem, it suffices to produce a "fast" algorithm that solves the problem. When a problem is inherently hard to solve, it might be necessary to show a lower bound, proving the impossibility of designing an efficient (e.g., fast) solution. When a real-life problem is complicated and hard (as they typically tend to be), or when the input data is extremely large (an increasingly problematic trend), one might be facing algorithms whose runtimes span millions of years, or whose memory requirements cannot be met with existing technology for another few generations.
In that case, one would be willing to compromise and trade certainty or accuracy for efficiency, settling for a randomized algorithm that solves the problem only with high probability, or an approximation algorithm that returns an answer that might have a tiny error in it. It is important to realize that these slight "imperfections" are not hugely significant: one can typically guarantee that an algorithm will work "with high probability," which, in practice, might be indistinguishable from absolute certainty (e.g., the failure probability can typically be made less than that of a lightning strike on the computer running the algorithm). Likewise, the output of an "approximate" computation might have a tenth of a percent error; this type of error is typically dwarfed by the noise in real-life inputs, which introduces significantly higher variation into the output, making efforts at overly accurate computation superfluous. Randomized and/or approximation algorithms have become, over time, a big part of the computer science research endeavor, and they borrow heavily from mathematics and statistics for tools and techniques. In addition, during the past two decades, as the amount of data has exploded and creation/storage paradigms have changed, the field has turned to different models of computation such as streaming and sampling, which borrow techniques from probability theory, statistics, optimization, and information theory. Mathematical and statistical models underlie many of the new data generation and structural models; the backbone of the analysis of computation involving big data is also provided by these fundamental disciplines.

Funda Ergun and Qin Zhang from Computer Science work on algorithms for streaming data; their techniques borrow significantly from the above-mentioned ideas of randomization, sampling, and data generation.
In particular, they are interested in devising small-space algorithms, that is, algorithms that do not store all of their input, for solving problems related to massive data arriving as a sequential stream. To achieve this, one can form small summaries called "sketches" and store these in memory rather than the entire data set. Most computations on these summaries are necessarily approximate, as the shrinking likely leads to a loss of information, but it can be shown using probabilistic techniques that the results are, with high probability, very close to the most accurate answers. When such a computation is provably impossible, they resort to showing lower bounds using techniques from information theory and communication complexity. Qin Zhang has lately focused on a more distributed notion of streams, while Funda Ergun has worked on string problems. Martha White, also in Computer Science, works on streams in a machine-learning setting, employing combinatorial and probabilistic techniques. A new hire in Computer Science, Yuan Zhou, is interested in approximation algorithms, some of which employ randomized techniques. He is also interested, when approximations fail, in finding lower bounds.

Often algorithmic problems and other problems regarding computation and communications networks are formulated in terms of properties of metric graphs. Alternatively, it is often proven that large classes of problems are of the same computational complexity, and one of the problems in the class is described in terms of metric properties of graphs. Once a problem is formulated in terms of properties of a metric graph, many mathematical techniques can be brought to bear, including combinatorics, analysis, probability, group theory, and geometry. We give a few examples below that highlight connections among researchers already at Indiana University Bloomington.
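Before turning to those examples, the sketching idea mentioned above can be made concrete with a toy count-min sketch, a standard construction of this kind (illustrative only; not the specific sketches used by the researchers named above):

```python
import hashlib

class CountMinSketch:
    """Minimal count-min sketch: approximate stream counts in fixed space."""

    def __init__(self, width=256, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One hash-derived bucket per row; md5 is just a convenient
        # deterministic hash here, not a cryptographic requirement.
        for row in range(self.depth):
            digest = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
            yield row, int(digest, 16) % self.width

    def add(self, item):
        for row, col in self._cells(item):
            self.table[row][col] += 1

    def estimate(self, item):
        # Each cell holds the true count plus collision noise, so the
        # minimum over rows never underestimates and is rarely far off.
        return min(self.table[row][col] for row, col in self._cells(item))

# Process a stream item by item; memory stays at width * depth counters.
sketch = CountMinSketch()
stream = ["a"] * 100 + ["b"] * 10 + ["c"]
for item in stream:
    sketch.add(item)
```

The estimate for "a" is guaranteed to be at least the true count of 100, and a probabilistic analysis of hash collisions bounds how far above it can land; this overestimate-only guarantee is exactly the kind of statement that the probabilistic techniques described above are used to prove.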
If one models a communications network via a graph, then in one sense a network is optimal if it has relatively few connections but one would still need to remove a large number of edges to disconnect the graph. This can be made precise asymptotically by studying families of graphs called expanders. The original proofs that expanders exist were probabilistic (by Mark Pinsker, a Russian mathematician working in the Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow). In a major breakthrough, Grigory Margulis, currently the Erastus L. DeForest Professor of Mathematics at Yale, used group theory and analysis, particularly property (T), to give explicit constructions of expanders. For the interested reader, property (T) has its own Wikipedia page: https://en.wikipedia.org/wiki/Kazhdan’s_property_(T). David Fisher, in the Department of Mathematics at Indiana University, has worked on property (T) and its generalizations and strengthenings, in part with Margulis. For several decades, all explicit constructions of expanders depended either on some variant of property (T) or on other results from group-representation theory related to number theory. More recently, ideas from arithmetic combinatorics, particularly work of Jean Bourgain, Nets Katz, and Terry Tao, have been used to give many additional explicit constructions of expander families. These constructions are more robust than their predecessors, and it may be that these new constructions will be useful to computer scientists. Yuan Zhou, from Computer Science, is interested in the new notion of probabilistic graph expanders for optimization problems. A related area of research is the study of sets with small doubling in general groups. This is motivated in part by constructions of expanders.
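The expansion property being discussed admits a simple quantitative check on small graphs. The brute-force computation of edge expansion below is purely illustrative: the whole point of expander theory is families of large graphs, where such enumeration is hopeless and the explicit constructions above are needed.

```python
from itertools import combinations

def edge_boundary(adj, S):
    """Count edges with exactly one endpoint in the vertex set S."""
    return sum(1 for u in S for v in adj[u] if v not in S)

def edge_expansion(adj):
    """h(G): the minimum of |boundary(S)| / |S| over sets with |S| <= n/2.

    Exhaustive search, feasible only for tiny graphs; an expander family
    keeps h(G) bounded away from zero as the graphs grow."""
    n = len(adj)
    return min(edge_boundary(adj, set(S)) / k
               for k in range(1, n // 2 + 1)
               for S in combinations(range(n), k))

# The complete graph K4 is hard to disconnect: every cut crosses many edges.
k4 = {i: [j for j in range(4) if j != i] for i in range(4)}
# A path 0-1-2-3 is cheap to disconnect at its middle edge.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

Here `edge_expansion(k4)` is 2.0 while `edge_expansion(path)` is 0.5, quantifying the sense in which the path makes a poor communication network.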
Other recent constructions of expanders come from a zig-zag construction due to Avi Wigderson and collaborators and from operator-algebra techniques due to Laurent Lafforgue. Michael Larsen, also in the Indiana University Mathematics Department, has analyzed a well-known expander and developed an efficient algorithm for finding good but not-quite-optimal routes through the resulting network.

Embedding metric spaces into standard metric spaces, such as Euclidean space, is an area of interest to faculty in Computer Science, Mathematics, and Statistics. The powerful techniques underlying this research find numerous applications in the design and analysis of algorithms, including nearest-neighbor search, semidefinite programming and sparsest cuts, combinatorial optimization problems in network design, etc. The general idea is to convert a problem born in a complex setting to one lying in a simpler setting. Formally, given a problem, we try to embed its underlying metric space (the original space) into a simpler one (the host space) and then solve the problem in the host space, where algorithm design is much easier. One goal is to make the "distortion" of the embedding (whose precise definition depends on the application) as small as possible. In addition, a special property of Euclidean space, known as negative type and defined via an embedding property, is useful not only in computer science but also in statistics, where it leads to powerful algorithms for clustering and for testing independence, among others. Lyons, in the Department of Mathematics at Indiana University, has recent work on this topic motivated by statistics. Qin Zhang in the Department of Computer Science at Indiana University also has recent work on this topic, motivated by the desire to find better algorithms for string-similarity search/join, a fundamental problem in database optimization.
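A classical instance of a low-distortion embedding is the Johnson-Lindenstrauss-style random projection, which maps high-dimensional Euclidean points into a much lower dimension while approximately preserving all pairwise distances with high probability. The sketch below is illustrative only; the embeddings discussed above concern more general metrics, such as negative-type and string metrics.

```python
import math
import random

random.seed(0)  # fixed seed so the illustration is reproducible

def random_projection(points, d_out):
    """Map points to d_out dimensions by a random Gaussian linear map.

    Entries are N(0, 1/d_out), so squared lengths are preserved in
    expectation; concentration keeps the distortion small for all pairs
    with high probability."""
    d_in = len(points[0])
    rows = [[random.gauss(0, 1 / math.sqrt(d_out)) for _ in range(d_in)]
            for _ in range(d_out)]
    return [[sum(r[i] * p[i] for i in range(d_in)) for r in rows]
            for p in points]

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# 20 random points in R^400, embedded into R^100.
pts = [[random.gauss(0, 1) for _ in range(400)] for _ in range(20)]
emb = random_projection(pts, 100)

# Distortion: ratio of embedded to original distance, over all pairs.
ratios = [dist(emb[i], emb[j]) / dist(pts[i], pts[j])
          for i in range(20) for j in range(i + 1, 20)]
```

With these parameters every ratio lands close to 1, even though the dimension has dropped by a factor of four; shrinking `d_out` further trades dimension for distortion.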
Funda Ergun, also in the Department of Computer Science at Indiana University, has work on embeddings motivated by her study of strings and their patterns and edit distances.

Manifold learning (also called "nonlinear dimension reduction") posits that high-dimensional feature vectors lie on (or near) a low-dimensional manifold. Many popular techniques for learning an unknown data manifold begin by constructing a locally connected graph whose nodes correspond to feature vectors, then embedding the graph to construct a low-dimensional Euclidean representation of the presumed manifold. One such representation uses eigenfunctions of the Laplacian on the graph, giving what is known as a spectral embedding, an extremely popular technique in machine learning. Lyons has recent work with a computer scientist at the University of Washington on spectral embedding of graphs, and is currently working with Judge (in the Indiana University Bloomington Department of Mathematics) on spectral embeddings of manifolds. However, Lyons' work is in a mathematical direction; bridge hires would facilitate cross-disciplinary fertilization and applications.

A large portion of the research program of Professor Michael Trosset, in the Indiana University Bloomington Department of Statistics, involves constructing approximate representations (usually in low-dimensional Euclidean space) of objects from information about pairwise relationships between those objects. In the context of this proposal, these are problems best described under the heading of graph embedding. Here are some examples:

1. Determine the 3-dimensional structure of a molecule from information about bond lengths, bond angles, and other information (often determined by NMR spectroscopy) about pairwise interatomic distances.

2. Construct Euclidean representations of conceptual spaces, either for visualization or for subsequent statistical analysis.
Such applications are common in psychology and statistics, in which disciplines "graph embedding" is known as "multidimensional scaling." Rob Nosofsky in the Indiana University Bloomington Department of Psychological and Brain Sciences also has work of this type.

3. Latent space approaches to social network analysis were first proposed in 2002. These approaches model networks as random graphs and posit that nodes correspond to positions in an unobserved space. The probability of an edge between two nodes is a function of their positions in the latent space, e.g., the distance or the inner product between them. Recently, several researchers have argued that hyperbolic manifolds generate more realistic random graphs than do flat manifolds. While some things are known about embedding graphs in hyperbolic manifolds, this territory is largely unexplored.

The sparsest cut problem in combinatorial optimization is to divide the nodes of a graph into two sets so as to minimize the ratio of the number of edges that go between the two sets to the number of nodes in the smaller half of the partition. This objective function favors solutions that are both sparse (few edges crossing the cut) and balanced (close to a bisection), which is the opposite of the goal with expander graphs. That is, graphs with optimal sparsest cuts are easily disconnected and make bad communication networks. Thus, it is important to determine the sparsest cut for a given graph. In this context, we have a problem where exact solutions are known to be NP-hard but where results exist showing that certain kinds of approximate solutions are possible in polynomial time. Finding the optimal degree of approximation in this context turns out to be related to finding embeddings with small distortion of metric graphs of negative type into L1.
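For concreteness, the sparsest-cut objective can be evaluated by exhaustive search on a tiny graph; this is a toy illustration of the NP-hard objective, since realistic instances require the polynomial-time approximation machinery just mentioned.

```python
from itertools import combinations

def sparsest_cut(adj):
    """Exhaustive sparsest cut: minimize (# crossing edges) / |smaller side|.

    Exponential in the graph size, so purely a toy illustration of the
    NP-hard objective; real instances need approximation algorithms."""
    n = len(adj)
    best_ratio, best_side = float("inf"), None
    for k in range(1, n // 2 + 1):
        for side in combinations(range(n), k):
            S = set(side)
            crossing = sum(1 for u in S for v in adj[u] if v not in S)
            ratio = crossing / len(S)  # |S| <= n/2, so S is the smaller side
            if ratio < best_ratio:
                best_ratio, best_side = ratio, S
    return best_ratio, best_side

# Two triangles joined by the single bridge edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
ratio, side = sparsest_cut(adj)
```

The optimum separates the two triangles: one crossing edge over a side of three nodes, for a ratio of 1/3. Exactly as described above, the graph with the sparsest cut is the one that is cheapest to disconnect.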
Motivated by this problem, Michel Goemans in the Department of Mathematics at the Massachusetts Institute of Technology and Nati Linial in the School of Computer Science and Engineering at the Hebrew University of Jerusalem conjectured that every metric graph of negative type admits an embedding of bounded distortion in L1. This conjecture was disproven by Khot and Vishnoi. Soon thereafter, another, more natural and stronger, counterexample was given by Cheeger and Kleiner using a new form of differentiation theory. The exact result of Cheeger and Kleiner, and in particular the fact that the graphs they studied do not admit bounded-distortion embeddings in L1, had been conjectured by Lee and Naor. A key aspect of their use of differentiation theorems is that they reduce the problem to a geometric one, where one can rescale the embedding in both domain and range, so that differentiation as an infinitesimal operation makes sense. For other problems in this area, it remains important to study embeddings of graphs where one cannot reduce the problem to a geometric one that admits rescalings. For these other problems, it seems likely that the correct approach is to use the idea of coarse differentiation and coarse analysis that was introduced by Eskin, Fisher (of Indiana University), and Whyte in their work on geometric group theory. Naor has proposed a particular problem in this direction to Eskin and Fisher.

In addition, there are many uses of random walks on graphs, effective resistance, and random spanning trees in Theoretical Computer Science; all of these mathematical topics are central themes in Lyons' work. To cite just one example, a recent award-winning paper provides a probabilistic algorithm that does better than all previous ones for solving the asymmetric traveling salesman problem. A crucial feature of this new work is the use of random spanning trees coming from electrical networks.
Some of the properties they use are also crucial to the work of Lyons and others in their own study of random spanning trees and their analogues on infinite graphs. This recent paper cites the book Lyons has just finished; in turn, the book has now incorporated some of the new results of this paper. (Incidentally, this new book has been and remains available online for free. It has been the most, or almost the most, popular download on IUB's domain pages.iu.edu, formerly known as mypage.iu.edu, for several years. For example, it had 164,460 downloads in 2014 and 161,593 downloads in 2015. Unfortunately, IU no longer provides this download data. Among its endorsements at www.cambridge.org/9781107160156 is one from Daniel Spielman, a Nevanlinna Prize winner in the Computer Science Department at Yale University, who said, "It is one of my favorite references for probability on finite graphs. If you want to understand random walks, isoperimetry, random trees, or percolation, this is where you should start.")

These particular problems and areas demonstrate close connections between ideas in theoretical computer science related to graph theory and certain areas of mathematics and statistics. With the additional resources a successful proposal would bring, these connections would be strengthened and solidified, making Indiana University Bloomington a world-class leader in this relatively new cross-disciplinary area.

B. Specific Aims

The objective is to become a world leader in the development of probabilistic tools for computational problems, challenging the paradigm that hard computational problems are too difficult to solve by providing tools that allow them to be solved with high probability or solved more easily by cleverly incorporating probabilistic methods. In order to attract faculty, postdoctoral, and student talent, the proposal has specific aims that illuminate several major problems in the area.
The over-arching problem is to discover trends in massive structures with bounded resources.

Specific Aim 1: Finding trends in large, imperfect data

Problems that involve analyzing text find many applications in computational biology, text processing, and data mining. Typically the text tends to be very long (often larger than a computer can hold in main memory) and to contain some errors (noise). An example of noise is typos in a given text; another is "insertions or deletions" in the human genome. The problem of identifying trends in long data streams is central to many fields; these trends can be the existence of a pattern (e.g., the "search" function in text), repetitive trends that the data exhibit (do Earth's temperature readings display a periodic trend?), or certain types of correlation between elements (e.g., "tandem repeats," of the type defabcdabcdklm, in a genome sequence). When the data contain noise, as is the norm with real-life data, it becomes significantly harder to search for trends in the input. We then have to develop an approximation algorithm where not only the output but also the input may be approximate. Typically, algorithms for problems related to long, imperfect inputs resort to randomization, which helps not only with minimizing the resources needed but also with "smoothing over" the imperfections in the data, similar to polling a large population before an election. However, unlike polling, the trends sought are quite complex and specific, the input massive, and the resources minuscule. One relatively new area that deals with large data and small resources is that of "sublinear algorithms"; these use less space or time than even a basic reading of the input requires, and they produce approximate, probabilistic results. Techniques such as Fourier analysis, while helpful, may require time or memory that we cannot afford for such problems.
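A minimal example of this sublinear flavor is estimating the frequency of a symbol in a very long string by sampling a fixed number of positions, with a Chernoff-type guarantee on the additive error (a hypothetical illustration, far simpler than the trend-detection problems above):

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

def estimate_fraction(text, symbol, samples=2000):
    """Estimate the fraction of positions equal to `symbol` by sampling.

    Reads only `samples` characters no matter how long `text` is; Chernoff
    bounds make the additive error small with high probability."""
    hits = sum(text[random.randrange(len(text))] == symbol
               for _ in range(samples))
    return hits / samples

# A ten-million-character input; the estimate touches only 2,000 positions.
text = "a" * 7_000_000 + "b" * 3_000_000
est = estimate_fraction(text, "a")  # true fraction is 0.7
```

With 2,000 samples the standard error is about 0.01, so the estimate is within a few percentage points of 0.7 with overwhelming probability, despite reading a 0.02% fraction of the input.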
While the algorithms that eventually work tend to be deceptively simple, it takes involved probabilistic tools to prove that they work and, thus, to be confident of success. In the case of noisy data, the existing knowledge needs to be significantly improved through the cooperation of algorithms researchers from Computer Science, probability theorists, and statisticians. Solutions to these types of problems are of interest to industry, as they represent real-life questions and applications. Industrial colleagues have repeatedly asked whether we can compose trends (for instance, a period that grows steadily over time) as well as how we can deal with different kinds of noise (for instance, rather than having a few wildly wrong data points, what if all data points are slightly off?). We expect to collaborate with industry researchers on many of our Big Data theory problems.

Specific Aim 2: Probabilistic methods for communication-efficient algorithms for distributed optimization across a network of computers

As datasets become increasingly large, a common practice in computer science is to store big graph/geometric data on different machines connected by a network. When processing a query (e.g., find a good clustering), we need to perform a distributed optimization across the network, whose running time is typically dominated by the total communication between the machines. An emerging research area in theoretical computer science is the design of communication-efficient algorithms for distributed optimization. An important question is to understand the communication limits, that is, how much communication must be transmitted through the network in order to solve the problem. This is called "multi-party communication complexity" in theoretical computer science and is closely related to information theory.

Specific Aim 3: Inference with random graphs

Many times, graphs are structures imposed by researchers on data in order to discover information.
Other times, real-world networks exist already, such as various social networks or industrial networks. These real-world networks usually have a structure quite different from the simple nearest-neighbor graphs that one can draw from geometric data. In particular, they often have the so-called "small-world" property, whereby one can get from any node to any other by traversing only a few edges. This property is shared by many common models of random networks, which is a major reason for applied interest in random graphs. However, classical statistical techniques were not designed to handle the essential sorts of dependencies inherent in network data. This makes inference from network data quite challenging; indeed, this is the central problem in the field today. We aim to provide useful tools for such popular tasks, together with theoretical justification for those tools.

Specific Aim 4: Computing with massive graphs

Today one regularly encounters graphs of massive size, the internet and its connections being the most common example. Because of their size, such graphs do not permit computation that requires the entire graph to be known. Instead, local algorithms that work near a given node of interest, or near several randomly chosen nodes, have become important. For example, one such problem is to find, given a node, a cluster of nodes that contains that node but that has "few" connections to the rest of the graph. This is known as the local graph clustering problem. Practical applications have included understanding the community structure of social and information networks. Solutions to this problem are also used as a subroutine for other partitioning problems. One popular method of dealing with massive graphs involves such "local" algorithms.
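The local flavor can be conveyed by a toy sketch (hypothetical; real local clustering methods, such as those based on personalized PageRank, are considerably more refined): run short random walks from a seed node and take the most-visited nodes as the candidate cluster, touching only the seed's neighborhood rather than the whole graph.

```python
import random
from collections import Counter

random.seed(2)  # fixed seed so the illustration is reproducible

def local_cluster(adj, seed, size, walk_len=8, walks=500):
    """Toy local clustering: the `size` nodes most visited by short random
    walks from `seed` (including `seed` itself) form the candidate cluster.
    Only the neighborhood of `seed` is ever examined, never the whole graph."""
    visits = Counter()
    for _ in range(walks):
        node = seed
        for _ in range(walk_len):
            node = random.choice(adj[node])
            visits[node] += 1
    ranked = [v for v, _ in visits.most_common() if v != seed]
    return {seed} | set(ranked[:size - 1])

# Two 4-cliques {0,1,2,3} and {4,5,6,7} joined by the single edge 3-4.
adj = {i: [j for j in range(4) if j != i] for i in range(4)}
adj.update({i: [j for j in range(4, 8) if j != i] for i in range(4, 8)})
adj[3].append(4)
adj[4].append(3)

cluster = local_cluster(adj, seed=0, size=4)
```

Short walks from node 0 rarely cross the single bridge edge, so the recovered cluster is the seed's clique: a set containing the seed with few connections to the rest of the graph, exactly the objective described above.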
Lyons has written a paper (in press) that presents the best algorithm to date for a similar problem, that of partitioning the node set into two equal parts while minimizing (or maximizing) the number of connections between them; his algorithm is a local one and works on graphs of large girth. Lyons was drawn to the problem for purely mathematical reasons, as the technique of local algorithms turns out to be related to a field in mathematics called ergodic theory. Practical applications include understanding disease spread, transportation and evacuation models, business and computer security intelligence, systems biology applications, the power grid, and full-scale modeling on massive networks.

Specific Aim 5: k-SAT

One computationally difficult problem that has been intensively studied is called k-SAT. It is a mathematical abstraction known to be computationally equivalent to questions ranging from protein folding to aircraft-crew scheduling, in the sense that an efficient algorithm able to decide whether any instance of k-SAT has a solution would immediately be translatable to an algorithm for all the other problems. In k-SAT, there are n Boolean variables and m clauses, where each clause names k of the n variables together with a desired value for each; a clause is satisfied if at least one of its k variables takes its desired value. The question is whether there is an assignment of values to the variables that satisfies all m clauses. In general, one is interested in m growing proportionally to n as n → ∞. Although, as for the other such problems, it is believed that no efficient algorithm exists, it might still be the case that most instances can indeed be solved quickly. Thus, much effort has gone into studying random instances of k-SAT. Random k-SAT is connected to statistical physics, and tools from that area have brought new light to this computational problem. A major conjecture in the area is that there are phase transitions in the (apparent) difficulty of solving k-SAT, depending on the ratio m/n.
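A random k-SAT instance at a given ratio m/n is easy to generate and, at toy sizes, to check by brute force; the sketch below (with the satisfiability threshold for k = 3 believed to be near 4.27) illustrates the two regimes at ratios far below and far above it:

```python
import random
from itertools import product

random.seed(3)  # fixed seed so the illustration is reproducible

def random_ksat(n, m, k=3):
    """m random k-clauses over variables 1..n; a negative literal -v
    means variable v is required to be False."""
    return [[v if random.random() < 0.5 else -v
             for v in random.sample(range(1, n + 1), k)]
            for _ in range(m)]

def satisfiable(n, clauses):
    """Brute force over all 2^n assignments -- only sensible for tiny n."""
    return any(
        all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses)
        for bits in product([False, True], repeat=n)
    )

n = 12
easy = random_ksat(n, m=1 * n)   # m/n = 1, far below the ~4.27 threshold
hard = random_ksat(n, m=15 * n)  # m/n = 15, far above it
```

At these extreme ratios, the sparse instance is satisfiable and the dense one unsatisfiable with overwhelming probability; the interesting, conjectural behavior happens near the threshold, where brute force gives way to the statistical-physics tools mentioned above.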
In other words, for large m/n, there is unlikely to be any solution, whereas for small m/n, not only is there likely to be a solution, but an algorithm is known that is likely to find one. The conjecture is that there is a sharp cut-off between these two possibilities that does not depend on m and n separately but only on their ratio. Other problems in the area concern the phenomenon that solutions cluster in groups determined by many of the variables being “frozen,” i.e., taking only one value. Survey Propagation is an algorithmic method for finding values that satisfy given requirements; it works by assuming that marginals over cluster projections essentially factorize. Determining the validity of this assumption is an open problem in the area. Phase transitions and complexity were the topics of a special semester at the Simons Institute for the Theory of Computing at UC Berkeley (https://simons.berkeley.edu/programs/counting2016) that just concluded. Techniques in this field involve, besides computer science, various elements of discrete probability theory and statistical physics. Complexity theory also involves the analysis of Boolean functions, which is a form of discrete harmonic analysis. This was also a main topic of another special semester at the Simons Institute (https://simons.berkeley.edu/programs/realanalysis2013). Lyons has expertise in discrete probability and statistical physics, as well as harmonic analysis.

C. Design and Methods
Much of the proposed research is theoretical, not involving experiments except for limited application to real data of the theoretical techniques that have already been developed. For a few of the proposed projects, there are natural ways of evaluating and interpreting the data. For generating graphs that mimic real-world networks, graphs produced by a proposed algorithm will be measured for their features and compared against values that come from real-world networks.
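As a concrete instance of such a feature comparison, summary statistics like the degree distribution and the average local clustering coefficient can be computed for any generated graph and set against the values reported for the target real-world network. The sketch below is minimal and illustrative; the toy graph and the particular feature set are placeholders, not the proposal's actual evaluation suite.

```python
from collections import Counter

def avg_clustering(neighbors):
    """Average local clustering coefficient of an undirected graph given as
    {node: set_of_neighbors}; a standard 'small-world' summary statistic.
    Nodes of degree < 2 contribute 0, as is conventional."""
    total = 0.0
    for v, nbrs in neighbors.items():
        d = len(nbrs)
        if d < 2:
            continue
        # Count edges among v's neighbors (each unordered pair once).
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in neighbors[a])
        total += 2.0 * links / (d * (d - 1))
    return total / len(neighbors)

def degree_histogram(neighbors):
    """Histogram of node degrees: {degree: number_of_nodes}."""
    return Counter(len(nbrs) for nbrs in neighbors.values())

# Toy "generated" graph: a triangle with one pendant node attached.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(avg_clustering(g), degree_histogram(g))
```

In an actual evaluation, the same statistics would be computed on graphs produced by the proposed generator and compared with the published values for the real-world network being mimicked.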
For problems such as metric embeddings and manifold learning, the standard way to evaluate techniques is to use data where the answer is known and measure the accuracy of the algorithms or of the predictions from the model. These types of techniques will be used to evaluate the models and algorithms developed in this proposal. For some problems involving long streams, it is possible to compare results theoretically to the performance of existing algorithms. Much of this proposal is about establishing connections on the Indiana University Bloomington campus that would allow the expertise of mathematics and statistics faculty to be brought to bear on practical problems in Computer Science. The PIs are intrinsically motivated to pursue this cross-disciplinary research, which would be facilitated by additional faculty whose research already crosses these disciplines or comes closer to doing so.

D. Timetable
Year 1: 2016-17
• Begin an interdisciplinary seminar devoted to Probabilistic Approaches to Computational Problems
• Hire one interdisciplinary postdoctoral fellow for a 3-year appointment to be housed in the Department of Mathematics
• Draft the ad for three interdisciplinary faculty hires
Year 2: 2017-18
• Hire a cluster of three faculty members in Mathematics, Theoretical Computer Science, and/or Statistics
• Hire one interdisciplinary postdoctoral fellow for a 3-year appointment to be housed in the Department of Mathematics
• Hire one interdisciplinary postdoctoral fellow for a 2-year appointment to be housed in the Department of Computer Science
• Continue and ramp up the interdisciplinary seminar devoted to Probabilistic Approaches to Computational Problems
• Submit a Simons “Targeted Grants to Institutes” proposal
Year 3: 2018-19
• Hire one interdisciplinary postdoctoral fellow for a 2-year appointment to be housed in the Department of Statistics
• Hire one interdisciplinary postdoctoral fellow for a 2-year appointment to be housed in the Department
of Computer Science
• Submit a Simons Investigator Grant proposal
• Submit a collaborative National Science Foundation research proposal
• Continue the interdisciplinary seminar devoted to Probabilistic Approaches to Computational Problems
Year 4: 2019-20
• Hire one interdisciplinary postdoctoral fellow for a 1-year appointment to be housed in the Department of Computer Science
• Continue the interdisciplinary seminar devoted to Probabilistic Approaches to Computational Problems
• Apply for a W. M. Keck Foundation grant

E. Significance and Impact
While each part is significant, and the significance and impact of each is described below, this proposal is more than the sum of its projects, each of which would by itself be a major advance in some aspect of computation. The significance of this proposal lies in its vision to connect the work done on campus in mathematics and statistics to the work done on campus in computer science, and to solidify that connection using postdoctoral fellows and strategic new faculty hires.
Specific Aim 1: Discovering trends in the presence of noisy data. From Netflix recommendations to modern information-technology security algorithms to precision medicine, researchers and industry need to be able to efficiently and accurately identify trends in massive noisy datasets.
Specific Aim 2: Probabilistic methods for communication-efficient algorithms for distributed optimization across a network of computers. Distributed optimization is important for analyzing massive data sets that are stored on multiple machines, for instance where data is collected from multiple sensors. It is also important for robotics, cognitive-radio networks, and mobile-device networks.
Specific Aim 3: Inference with random graphs. Examples of the scientific applications include inferring the structure of regulatory networks in biological systems and mapping the connections in the brain.
In order to understand and predict the behavior of epidemics, with the possibility of controlling their virulence, one needs to infer the structure of the network along which disease spreads.
Specific Aim 4: Computing with massive graphs. Practical applications include understanding disease spread, transportation and evacuation models, business and computer security intelligence, systems biology applications, the power grid, and full-scale modeling on massive networks.
Specific Aim 5: k-SAT. The problem of proving the random k-SAT threshold conjecture is of theoretical importance; the conjecture is 30 years old. Algorithms for solving k-SAT problems with high probability have many applications, such as combinatorial equivalence checking, automated theorem proving, and software verification. Algorithms have been developed that work well in the regime below the critical ratio where clustering is not an issue. Algorithms that work well in regimes below the critical ratio where clustering is an issue still need to be developed.

F. Future Funding/Sustainability
The standard source of funding for mathematics and computing is the National Science Foundation. If successful, this proposal would open up new sources of funding and create patentable algorithms and commercializable software. For example, the NSF has a program called “Critical Techniques, Technologies and Methodologies for Advancing Foundations and Applications of Big Data Sciences and Engineering (BIGDATA)”. It particularly solicits projects that are “collaborative, involving researchers from domain disciplines and one or more methodological disciplines, e.g., computer science, statistics, mathematics, simulation and modeling, etc.” Potential private foundation sources of funding include the Simons Foundation, the Clay Institute, and the W. M. Keck Foundation.
The Simons Foundation’s Investigators program funds researchers in Mathematics and Theoretical Computer Science during their most productive years at a level of $100,000/year. The program also includes a special Math+X award to encourage novel collaborations between mathematics and other fields. While we are not proposing to solve the famous P vs. NP problem, which would lead to a Millennium Prize from the Clay Institute, faculty hired under this proposal may be eligible for Clay Research Fellowships, and research generated from this proposal may lead to a Clay Research Award. The W. M. Keck Foundation’s stated priorities are funding “projects in research that
• Focus on important and emerging areas of research
• Have the potential to develop breakthrough technologies, instrumentation or methodologies
• Are innovative, distinctive and interdisciplinary
• Demonstrate a high level of risk due to unconventional approaches, or by challenging the prevailing paradigm
• Have the potential for transformative impact, such as the founding of a new field of research, the enabling of observations not previously possible, or the altered perception of a previously intractable problem
• Does not focus on clinical or translational research, treatment trials or research for the sole purpose of drug development
• Fall outside the mission of public funding agencies
• Demonstrate that private philanthropy generally, and the W. M. Keck Foundation in particular, is essential to the project’s success”
Aspects of this project may meet those criteria. Additionally, the proposed research broadens the public funding sources we can apply to, adding military and national-security funding agencies to the more standard funding through the National Science Foundation. We will also apply for funding in teams, enhancing the chances that an award will be made, increasing the size of the award, and increasing administrative efficiency.

G.
New Positions Proposed
We propose to hire three new tenured or tenure-track faculty members in probability, analysis, and theoretical computer science, each of whom would have close ties to Computer Science in the School of Informatics and Computing and to the Department of Mathematics and the Department of Statistics in the College of Arts and Sciences. These faculty would close the gap between the existing theoretical expertise in the Department of Mathematics and the practical needs of faculty in Computer Science for mathematical tools. The departments and the two schools are committed to structuring these hires in whatever way makes most sense for the individuals hired; junior faculty are often better served by having a clear tenure home, for instance. However, the goal is to bring in faculty who can teach and train students in multiple departments, thereby stabilizing the bridge between disciplines being built in this proposal. Specifically, we envision faculty whose teaching is split between schools, including team-teaching where appropriate, while their tenure home is in one unit. There is precedent for this sort of appointment: Elizabeth Housworth has her full FTE in the Department of Mathematics, but her teaching is split between Mathematics and Biology as described in her appointment letter. We will hire faculty with expertise in algorithms, complexity, and the theory of computation; combinatorial statistics; geometry and analysis at the interface between the continuous and the discrete; and probability and stochastic processes. These areas overlap, so a carefully chosen selection of three new hires could cover all of them. There are set hiring procedures within the Department of Mathematics, the Department of Computer Science, and the Department of Statistics. Only the last has a formal mechanism to include faculty outside the department in its internal hiring decisions.
The Department of Mathematics and the Department of Computer Science are committed to extensive consultation during the hiring process absent a formal mechanism. We expect that at least one faculty member will be hired in the Department of Mathematics and at least one in the Department of Computer Science. The third faculty member would be in whichever of the three departments suits her best. The hiring will be done as a cluster. Cluster hires can break down silos and increase interdisciplinary work, key goals of this proposal. Cluster hires are also known to increase diversity (see https://www.insidehighered.com/news/2015/05/01/new-report-says-cluster-hiring-can-lead-increased-faculty-diversity), even when increasing diversity is not the stated goal of the cluster hire. Cluster hires work best when institutions create structures that support the hires, facilitating interactions and valuing the work produced. The PIs on the proposal will ensure that there are frequent opportunities for interaction by establishing an interdisciplinary colloquium series and facilitating research discussion groups for the new faculty and postdoctoral scholars. Teaching across departments, including team-teaching courses, is another established way of supporting cluster hires. Finally, all the departments involved have mechanisms for evaluating interdisciplinary work, so that it is valued fairly during the tenure and promotion process.

H. IU and Collaborative Arrangements
We have letters of support from the Chairs of Mathematics, Computer Science, and Statistics, the Dean of the College of Arts and Sciences, and the Dean of the School of Informatics and Computing. We have external letters from Robin Pemantle, the Merriam Term Professor of Mathematics at the University of Pennsylvania, and Yury Makarychev, an Associate Professor at the Toyota Technological Institute at Chicago.
The main collaborative arrangements are between the PIs on this proposal and the new hires this proposal would fund, with additional collaboration with Professor Makarychev expected as the work progresses.

I. Metrics and Deliverables
Metrics, deliverables, and assessment of the progress and impact of this proposal:
• We will apply for new funding each year of the proposal.
• We will increase the interdisciplinary work between Mathematics, Statistics, and Computer Science, measurable by the increase in co-authored papers that bridge disciplines.
• We will hold a successful interdisciplinary seminar with weekly talks attracting faculty, postdoctoral fellows, and students from multiple disciplines.
• We will hire faculty who can cross disciplines, as evidenced by their participation in the teaching missions of multiple departments. These faculty will also help existing faculty in mathematics, statistics, and computer science cross disciplinary boundaries.
Metrics and assessment of the enhanced reputation of Indiana University Bloomington:
• We will attract postdoctoral and graduate-student applicants of the highest caliber.
• We will attract world-renowned speakers to our lecture series.
• Work conducted under this proposal will win international recognition and awards.
• Graduate students who join the PIs in this work will obtain excellent positions in academia and industry.

Biographical Sketch: Russell Lyons
Professional Preparation
Case Western Reserve University, Cleveland, OH: B.A. summa cum laude with departmental honors, May 1979, Mathematics
University of Michigan, Ann Arbor, MI: Ph.D., August 1983, Mathematics; Specialization: Harmonic Analysis
Université de Paris-Sud, Orsay, France: Postdoctoral work, 1983–1985; Specialization: Harmonic Analysis
Appointments
Indiana University, Bloomington, IN: James H. Rudy Professor of Mathematics, 2014–present.
Indiana University, Bloomington, IN: Adjunct Professor of Statistics, 2006–present.
Indiana University, Bloomington, IN: Professor of Mathematics, 1994–2014.
University of Calif., Berkeley: Visiting Miller Research Professor, Spring 2001.
Georgia Institute of Technology, Atlanta, GA: Professor, 2000–2003.
Microsoft Research: Visiting Researcher, Jan.–Mar. 2000, May–June 2004, July 2006, Jan.–June 2007, July 2008–June 2009, Sep.–Dec. 2010, Aug.–Oct. 2011, July–Oct. 2012, May–July 2013, Jun.–Oct. 2014, Jun.–Aug. 2015, Jun.–Aug. 2016.
Weizmann Institute of Science, Rehovot, Israel: Rosi and Max Varon Visiting Professor, Fall 1997.
Institute for Advanced Studies, Hebrew University of Jerusalem, Israel: Winston Fellow, 1996–97.
Université de Lyon, France: Visiting Professor, May 1996.
University of Wisconsin, Madison, WI: Visiting Associate Professor, Winter 1994.
Indiana University, Bloomington, IN: Associate Professor, 1990–94.
Stanford University, Stanford, CA: Assistant Professor, 1985–90.
External Funding
NATO Postdoctoral Fellowship in Science, Université de Paris-Sud, 1983–84.
AMS Postdoctoral Fellowship, Université de Paris-Sud, 1984–85 and Stanford University, 1985–86.
NSF Mathematical Sciences Postdoctoral Research Fellowship, Stanford University, 1986–89.
Alfred P. Sloan Foundation Research Fellowship, Indiana University, 1990–93.
NSF, Division of Mathematical Sciences, Statistics and Probability Program, $60,000, 1993–96.
NSF, Division of Mathematical Sciences, Statistics and Probability Program, $74,000, 1998–2001.
NSF, Division of Mathematical Sciences, Statistics and Probability Program, $100,000, 2001–04.
NSF, Division of Mathematical Sciences, Statistics and Probability Program, $61,020, 2002–04.
NSF, Division of Mathematical Sciences, Statistics and Probability Program, $258,000, 2004–07.
NSF, Division of Mathematical Sciences, Statistics and Probability Program, $285,000, 2007–10.
NSF, Division of Mathematical Sciences, Statistics and Probability Program, $303,161, 2010–16.
NSF, Division of Mathematical Sciences, Statistics and Probability Program, $15,000, 2015.
NSF, Division of Mathematical Sciences, Statistics and Probability Program, $59,307, 2015–16.
NSF, Division of Mathematical Sciences, Statistics and Probability Program, $150,000, 2016–19.
Selected Publications
Lyons, R. Distance covariance in metric spaces, Ann. Probab. 41, no. 5 (2013), 3284–3305. http://pages.iu.edu/~rdlyons/pdf/dcov.pdf
Lyons, R. Hyperbolic space has strong negative type, Illinois J. Math. 58, no. 4 (2014), 1009–1013. http://pages.iu.edu/~rdlyons/pdf/hypneg.pdf
Lyons, R. The spread of evidence-poor medicine via flawed social-network analysis, Stat., Politics, Policy 2, 1 (2011), Article 2 (27 pp.). DOI: 10.2202/2151-7509.1024. http://pages.iu.edu/~rdlyons/pdf/CF-pub-erratum.pdf
Lyons, R. Factors of IID on trees, Combin. Probab. Comput., to appear. http://pages.iu.edu/~rdlyons/pdf/fiid.pdf
Oveis Gharan, S. and Lyons, R. Sharp bounds on random walk eigenvalues via spectral embedding, preprint. http://pages.iu.edu/~rdlyons/pdf/peigs.pdf
Lyons, R. Determinantal probability measures, Publ. Math. Inst. Hautes Études Sci. 98 (2003), 167–212. http://pages.iu.edu/~rdlyons/pdf/bases.pdf
Lyons, R. Determinantal probability: basic properties and conjectures, Proc. Intl. Congress Math., 2014, vol. IV, 137–161. http://pages.iu.edu/~rdlyons/pdf/icm.pdf
Lyons, R. Fourier-Stieltjes coefficients and asymptotic distribution modulo 1, Ann. of Math. 122 (1985), 155–170.
Lyons, R. The measure of non-normal sets, Invent. Math. 83 (1986), 605–616.
Lyons, R. A new type of sets of uniqueness, Duke Math. J. 57 (1988), 431–458.
Gaboriau, D. and Lyons, R. A measurable-group-theoretic solution to von Neumann’s problem, Invent. Math. 177, no. 3 (2009), 533–540. http://pages.iu.edu/~rdlyons/pdf/subr.pdf
Angel, O., Kechris, A.S. and Lyons, R. Random orderings and unique ergodicity of automorphism groups, J. Europ. Math. Soc. 16 (2014), 2059–2095.
http://pages.iu.edu/~rdlyons/pdf/order.pdf
Synergistic Activities
Lecture Series:
Aug. 1999: International Summer School (20 lectures), Jyväskylä, Finland
March 2005: Minicourse at EURANDOM (4 hours), Netherlands
July 2005: Cornell Summer School in Probability (9 hours)
Nov. 2008: Courant Research Centre, Göttingen (Distinguished Lecture Series, 3 hours)
March 2012: Conference at Vanderbilt (3 hours)
Editing:
Annals of Probability, Associate Editor, 2003–2008
Annals of Applied Probability, Associate Editor, 2003–2008
J. Topology Analysis, Associate Editor, 2007–
Tbilisi Mathematical J., Managing Editor, 2009–2014
Journal of Fractal Geometry, Associate Editor, 2013–
15 conferences organized since 1999
130 invited talks at other institutions, conferences, and workshops, Jan. 2000–Dec. 2015
10 presentations on misuse of applied statistics in a variety of venues, 2011–2014
Recent and Current Collaborators
Omer Angel, U. British Columbia, Vancouver, Canada; Itai Benjamini, Weizmann Institute of Science; Damien Gaboriau, ENS-Lyon, France; Alexander Kechris, Caltech; Shayan Oveis Gharan, U. Washington; Yuval Peres, Microsoft Research, Redmond, WA; Oded Schramm, Microsoft Research, Redmond, WA; Xin Sun, MIT; Andreas Thom, Technische U., Dresden, Germany; Kevin Zumbrun, IU
Graduate Advisers and Postdoctoral Sponsors (2)
Thesis Advisers: Hugh L. Montgomery, Allen L. Shields, University of Michigan, Ann Arbor
Postdoctoral Adviser: Jean-Pierre Kahane, Université de Paris-Sud, Orsay, France
Thesis Adviser and Postgraduate-Scholar Sponsor (8)
Ádám Timár, Rényi Institute, Budapest, Hungary; Serdar Altok, Boğaziçi University, Istanbul, Turkey; Peter Mester, J.P. Morgan, Budapest, Hungary; Sandeep Bhupatiraju, Indiana U.; Justin Cyr, Indiana U.; Minwoo Park, Indiana U.; Pengfei Tang, Indiana U.; Liviu Ilinca, G-Research, London, UK

Biosketch: Michael W. Trosset
See http://mypage.iu.edu/~mtrosset/Personal/cv.pdf for a complete curriculum vitae.
Education and Current Employment
• B.A. (summa cum laude), Mathematics and Mathematical Sciences, Rice University, May 1978.
• Ph.D. (Fannie & John Hertz Foundation Fellow), Statistics, University of California, Berkeley, December 1983.
• Professor (from August 2006) and Chair (from August 2012) of Statistics, Executive Director of the Indiana Statistical Consulting Center, Indiana University, Bloomington.
Selected Research Grants
1. Principal investigator, Global Optimization for Multidimensional Scaling (University of Arizona), National Science Foundation, $59,527.35, July 1996 to June 1999.
2. Principal investigator, Statistical Decision-Theoretic Methods for Robust Design Optimization, National Science Foundation, $126,000, August 2004 to July 2008.
3. Principal investigator, Embedding Method for Disparate Data, Office of Naval Research, $300,000, January 2007 to December 2010.
4. Principal investigator, IU subcontract to Virginia Polytechnic Institute & State University (L. T. Watson), Parallel Deterministic and Stochastic Global Optimization Algorithms, Air Force Office of Scientific Research, $210,333, January 2009 to December 2011.
5. Principal investigator, IU subcontract to Johns Hopkins University (C. E. Priebe), Fusion and Inference from Multiple and Massive Disparate Data Sources, Department of Defense, $231,116, January 2009 to December 2013.
Books
1. An Introduction to Statistical Inference and Its Applications with R, Chapman & Hall/CRC, Taylor & Francis Group, June 23, 2009. Supplementary materials are provided on an accompanying web page: http://mypage.iu.edu/~mtrosset/StatInfeR.html
Selected Articles in Professional Journals
1. Biotic and abiotic influences on foraging of Heterotermes aureus. Environmental Entomology, 16:791–795, 1987. (S.C. Jones, M.W. Trosset, W.L. Nutting)
2. Nesting-habitat relationships of riparian birds along the Colorado River in Grand Canyon, Arizona. The Southwestern Naturalist, 34:260–270, 1989. (B.T. Brown, M.W.
Trosset)
3. Alzheimer’s disease effects on semantic memory: loss of structure or impaired processing? Journal of Cognitive Neuropsychology, 3:166–182, 1991. (K.A. Bayles, C.K. Tomoeda, A.W. Kaszniak, M.W. Trosset)
4. Interference competition in desert subterranean termites. Entomologia Experimentalis et Applicata, 61:83–90, 1991. (S.C. Jones, M.W. Trosset)
5. Relation of linguistic communication abilities of Alzheimer’s patients to stage of disease. Brain and Language, 42:454–472, 1992. (K.A. Bayles, C.K. Tomoeda, M.W. Trosset)
6. Optimal shapes for kernel density estimation. Communications in Statistics—Theory and Methods, 22(2):375–391, February 1993.
7. Alzheimer’s disease: effects on language. Developmental Neuropsychology, 9(2):131–160, 1993. (K.A. Bayles, C.K. Tomoeda, M.W. Trosset)
8. An extension of the Karush-Kuhn-Tucker necessity conditions to infinite programming. SIAM Review, 36(1):1–17, March 1994. (R.A. Tapia, M.W. Trosset)
9. Measures of deficit unawareness for predicted performance experiments. Journal of the International Neuropsychological Society, 2:315–322, 1996. (M.W. Trosset, A.W. Kaszniak)
10. A new formulation of the nonmetric STRAIN problem in multidimensional scaling. Journal of Classification, 15:15–35, 1998.
11. The solution of the metric STRESS and SSTRESS problems in multidimensional scaling by Newton’s method. Computational Statistics, 13(3):369–396, 1998. (A.J. Kearsley, R.A. Tapia, M.W. Trosset)
12. A rigorous framework for optimization of expensive functions by surrogates. Structural Optimization, 17(1):1–13, 1999. (A.J. Booker, J.E. Dennis, P.D. Frank, D.B. Serafini, V. Torczon, M.W. Trosset)
13. Distance matrix completion by numerical optimization. Computational Optimization and Applications, 17:11–22, 2000.
14. Recursive Bayesian inference for hydrologic models. Water Resources Research, 37(10):2521–2535, 2001. (M. Thiemann, M.W. Trosset, H. Gupta, S. Sorooshian)
15.
Extensions of classical multidimensional scaling via variable reduction. Computational Statistics, 17(2):147–162, 2002.
16. Better initial configurations for metric multidimensional scaling. Computational Statistics and Data Analysis, 41(1):143–156, 2002. (S.W. Malone, P. Tarazaga, M.W. Trosset)
17. Visualizing correlation. Journal of Computational and Statistical Graphics, 14(1):1–19, 2005.
18. On the diagonal scaling of Euclidean distance matrices to doubly stochastic matrices. Linear Algebra and Its Applications, 397:253–264, 2005. (C.R. Johnson, R.D. Masson, M.W. Trosset)
19. Approximate solutions of continuous dispersion problems. Annals of Operations Research, 136:65–80, 2005. (A. Dimnaku, R. Kincaid, M.W. Trosset)
20. Sensitivity analysis of the strain criterion for multidimensional scaling. Computational Statistics and Data Analysis, 50:135–153, 2006. (R.M. Lewis, M.W. Trosset)
21. The out-of-sample problem for classical multidimensional scaling. Computational Statistics & Data Analysis, 52(10):4635–4642, June 2008. (M.W. Trosset, C.E. Priebe)
22. Semisupervised learning from dissimilarity data. Computational Statistics & Data Analysis, 52(10):4643–4657, June 2008. (M.W. Trosset, C.E. Priebe, Y. Park, M.I. Miller)
23. Iterative denoising. Computational Statistics, 23(4):497–517, October 2008. (K.E. Giles, M.W. Trosset, D.J. Marchette, C.E. Priebe)
24. Molecular embedding via a second-order dissimilarity parameterized approach. SIAM Journal on Scientific Computing, 31(4):2733–2756, 2009. (I.G. Grooms, R.M. Lewis, M.W. Trosset)
25. Euclidean and circum-Euclidean distance matrices: characterizations and linear preservers. Electronic Journal of Linear Algebra, 20:739–752, 2010. (C.-K. Li, T. Milligan, M.W. Trosset)
26. Parallel deterministic and stochastic global minimization of functions with very many minima. Computational Optimization and Applications, 57:469–492, 2014. (D.R. Easterling, L.T. Watson, M.L. Madigan, B.S. Castle, M.W. Trosset)
27.
Algorithm XXX: QNSTOP—quasi-Newton algorithm for stochastic optimization. To appear in ACM Transactions on Mathematical Software, 2017. (B.D. Amos, D.R. Easterling, L.T. Watson, W.I. Thacker, B.S. Castle, M.W. Trosset)
28. Fast embedding for JOFC using the raw stress criterion. arXiv:1502.03391, 2015. In revision. (V. Lyzinski, Y. Park, C.E. Priebe, M.W. Trosset)
29. On the power of likelihood ratio tests in dimension-restricted submodels. arXiv:1608.00032, 2016. Submitted. (M.W. Trosset, M. Gao, C.E. Priebe)
Senior Theses Directed (College of William & Mary)
Anthony Padula, Interpolation and Pseudorandom Function Generation, 2000; Paul Goger, Computational Experiments with Stochastic Approximation, 2001; Michael Levy, Computational Experiments with Two Response Surface Methods for Stochastic Optimization, 2003; Kristina Hofmann, Computational Experiments with Nearest Neighbor Classification, 2005.
Summer REU Students (Matrix Analysis and Applications, College of William & Mary)
Samuel Malone, A Study of the Stationary Configurations of the SStress Criterion for Metric Multidimensional Scaling, 1999; Robert Masson, On the Diagonal Scaling of Euclidean Distance Matrices to Doubly Stochastic Matrices, 2000.
Ph.D. Students (Indiana University)
Minh Tang (Computer Science) defended his dissertation on Graph Metrics and Dimensionality Reduction in October 2010. He is currently an Assistant Research Professor at Johns Hopkins University. Brent Castle (Computer Science) defended his dissertation on Quasi-Newton Methods for Stochastic Optimization with Application to Simulation-Based Parameter Estimation in July 2012. He currently works for the Department of Defense.
Biographical Sketch: Funda Ergun
School of Informatics and Computing, Indiana University, Bloomington
150 South Woodlawn Avenue, Bloomington, IN 47405, USA
Phone: (812)-369-3793; E-mail: [email protected]; Web: www.informatics.indiana.edu/fergun
Professional Preparation
Bilkent University, Ankara, Turkey: Computer Engineering and Information Science, B.S. 1990
The Ohio State University, Columbus, OH: Computer Science, M.S. 1992
Cornell University, Ithaca, NY: Computer Science, Ph.D. 1998
University of Pennsylvania, Philadelphia: Computer Science, Postdoctoral Fellow 1999
Appointments
Indiana University, Bloomington, IN: Professor, 2015 - present
Indiana University, Bloomington, IN: Associate Professor, 2013 - 2015
Simon Fraser University, Burnaby, BC: Professor, 2013
Simon Fraser University, Burnaby, BC: Associate Professor, 2006 - 2013
Simon Fraser University, Burnaby, BC: Assistant Professor, 2003 - 2006
NEC Research, Princeton, NJ: Visiting Research Scientist, 2001 - 2002
Case Western Reserve University, Cleveland, OH: Schroeder Assistant Professor, 1999 - 2003
Bell Laboratories, Murray Hill, NJ: Member of Technical Staff, 1998 - 1999
University of Pennsylvania: Postdoctoral Researcher, 1997 - 1998
Five Most Relevant Products
Y. Le, J.C. Liu, F. Ergun, D. Wang. Online Load Balancing for MapReduce with Skewed Data Input. Proceedings of the 33rd Annual IEEE International Conference on Computer Communications (INFOCOM), Toronto, ON, April 2014.
F. Ergun, H. Jowhari. On Distance to Monotonicity and Longest Increasing Subsequence of a Data Stream. Combinatorica (Conf. Version: ACM/SIAM Symposium on Discrete Algorithms, SODA'08), 10.1007/s00493-014-3035-1, 2014.
P. Berenbrink, F. Ergun, F. Mallmann-Trenn, E. Sadeqi-Azer. Palindrome Recognition in the Streaming Model. Proceedings of the 31st Annual Symposium on Theoretical Aspects of Computer Science (STACS), Lyon, France, March 2014.
F. Ergun, S. Muthukrishnan, S.C. Sahinalp. Periodicity testing with sublinear samples and space.
ACM Transactions on Algorithms 6(2), pp. 1–14, 2010.
F. Ergun, H. Jowhari, M. Saglam. Periodicity in Streams. Proceedings of the 14th Intl. Workshop on Randomization and Computation (RANDOM), Barcelona, Spain, September 2010.
Five Other Significant Products
Y. Le, F. Wang, J.C. Liu, F. Ergun. On Datacenter-Network-Aware Load Balancing in MapReduce. Proceedings of the 8th IEEE CLOUD, New York, NY, June 2015.
T. Batu, F. Ergun, C. Sahinalp. Oblivious String Embeddings and Edit Distance Approximations. Proceedings of the 17th ACM/SIAM Symposium on Discrete Algorithms (SODA), Miami, Florida, January 2006.
A. Czumaj, F. Ergun, L. Fortnow, A. Magen, I. Newman, R. Rubinfeld, C. Sohler. Sublinear Approximation of Euclidean Minimum Spanning Tree. SIAM Journal on Computing, 35(1):91–109, 2005.
P. Berenbrink, F. Ergun, T. Friedetzky. Finding Frequent Patterns in a String in Sublinear Time. Proceedings of the 13th Annual European Symposium on Algorithms (ESA), LNCS, pp. 747–757, Mallorca, Spain, October 2005.
F. Ergun, S. R. Kumar, R. Rubinfeld. Fast Approximate Probabilistically Checkable Proofs. Information and Computation, 189(2):135–159, 2004.
Synergistic Activities
Leader, PIMS Collaborative Research Group on Algorithmic Theory of Networks.
Co-organizer, Summer School on Randomized Algorithms, Vancouver, BC, 8/14.
Co-organizer, BIRS Workshop on Communication Complexity, Banff, AB, 8/14.
Co-organizer, Workshop on Streaming Algorithms, Dortmund, Germany, 7/12.
PC Member, Symposium on Discrete Algorithms (SODA), Kyoto, Japan, 1/12.
Collaborators in the past 48 months (16)
Petra Berenbrink (Simon Fraser U.), Will Evans (U. of British Columbia), Nick Harvey (U. of British Columbia), Lisa Higham (U. of Calgary), Hossein Jowhari (U. of Warwick), Bruce Kapron (U. of Victoria), David Kirkpatrick (U. of British Columbia), Valerie King (U. of Victoria), J.C. Liu (Simon Fraser U.), Kostas Oikonomou (AT&T Research), Mert Saglam (U.
of Washington), Rakesh Sinha (AT&T Research), Venkatesh Srinivasan (U. of Victoria), D. Wang (Hong Kong Polytechnic U.), F. Wang (U. of Mississippi), Philipp Woelfel (U. of Calgary)
Current Advisees (2)
Erfan Sadeqi-Azer (one Ph.D. student), Peter Kling (one postdoctoral researcher).
Past Advisees (4)
Hossein Jowhari, Erfan Sadeqi-Azer, Yanfang Le (three graduate students: one Ph.D., two M.S.), Christiane Lammersen (one postdoctoral researcher) in the past 48 months. Before that, an additional two postdocs and six graduate students.
Advisors (2)
Ph.D. advisor: Ronitt Rubinfeld (Massachusetts Institute of Technology, Cambridge, MA)
Postdoctoral advisor: Sampath Kannan (University of Pennsylvania, Philadelphia, PA)

Biographical Sketch for Michael Larsen
Professional Preparation
Harvard College, A.B. in Mathematics, 1984.
Princeton University, Ph.D. in Mathematics, 1988.
Institute for Advanced Study, School of Mathematics, 1988–1990.
Appointments
Indiana University, 1997–present.
University of Missouri, 1997–1998.
University of Pennsylvania, 1990–1997.
Publications
Most relevant
• Kazhdan, David; Larsen, Michael; Varshavsky, Yakov: The Tannakian Formalism and the Langlands Conjectures, Algebra Number Theory 8 (2014), no. 1, 243–256.
• Kollár, János; Larsen, Michael: Quotients of Calabi-Yau varieties. Algebra, arithmetic, and geometry: in honor of Yu. I. Manin. Vol. II, 179–211, Progr. Math., 270, Birkhäuser Boston, Inc., Boston, MA, 2009.
• Larsen, Michael; Lubotzky, Alexander; Marion, Claude: Deformation theory and finite simple quotients of triangle groups I, J. Eur. Math. Soc. (JEMS) 16 (2014), no. 7, 1349–1375.
• Larsen, Michael; Pink, Richard: Finite subgroups of algebraic groups, J. Amer. Math. Soc. 24 (2011), 1105–1158.
• Larsen, Michael; Shalev, Aner; Tiep, Pham Huu: Waring Problem for Finite Simple Groups, Annals of Math. (2) 174 (2011), no. 3, 1885–1950.
Representative
• Elkies, Noam; Kuperberg, Greg; Larsen, Michael; Propp, James: Alternating-sign matrices and domino tilings. I. J. Algebraic Combin. 1 (1992), no. 2, 111–132.
• Freedman, Michael H.; Kitaev, Alexei; Larsen, Michael J.; Wang, Zhenghan: Topological quantum computation. Mathematical challenges of the 21st century. Bull. Amer. Math. Soc. (N.S.) 40 (2003), no. 1, 31–38.
• Larsen, Michael: Maximality of Galois actions for compatible systems. Duke Math. J. 80 (1995), no. 3, 601–630.
• Larsen, Michael; Lunts, Valery A.: Rationality criteria for motivic zeta functions. Compos. Math. 140 (2004), no. 6, 1537–1560.
• Larsen, Michael; Pink, Richard: On the ℓ-independence of algebraic monodromy in compatible systems of representations, Invent. Math. 107 (1992), 603–636.
Synergistic Activities
• The proposer is Chair of the prize committee for the Frank Nelson Cole Prize in Number Theory.
• The proposer serves on the following editorial boards: Journal of the American Mathematical Society, Transactions of the American Mathematical Society, Memoirs of the American Mathematical Society, Indiana University Mathematics Journal.
• The proposer served for three years on the Putnam Committee of the Mathematical Association of America; he contributed sixteen problems which appeared on Putnam Examinations over a four-year period.
• The proposer developed software for the Gutenberg Project that was solicited by and subsequently contributed to the optical character recognition project of the Free Software Foundation and System 4 (which was later folded into Mac OS X) of NeXT Computer.
• The proposer consulted for E-Systems, a subsidiary of Raytheon, on signal processing issues related to a robotics program.
Collaborators
Khalid Bou-Rabee (CUNY), Jean Bourgain (Institute for Advanced Study), Emmanuel Breuillard (Université Paris-Sud), Jordan Ellenberg (University of Wisconsin), David Fisher (Indiana University), Robert Guralnick (USC), Bo-Hae Im (KAIST), David Kazhdan (Hebrew University), Chandrashekhar Khare (UCLA), János Kollár (Princeton University), Ayelet Lindenstrauss (Indiana University), Alexander Lubotzky (Hebrew University), Valery Lunts (Indiana University), Justin Malestein (Hebrew University), Claude Marion (Universität Freiburg), Gunter Malle (Universität Kaiserslautern), Barry Mazur (Harvard University), Karl Rubin (UC Irvine), Gordan Savin (University of Utah), Aner Shalev (Hebrew University), Ralf Spatzier (University of Michigan), Matthew Stover (Temple University), Pham Tiep (University of Arizona), Yakov Varshavsky (Hebrew University)
Graduate and Postdoctoral Advisors
Gerd Faltings (Max Planck Institute Bonn), Robert Langlands (Institute for Advanced Study)
Thesis Advisor and Postgraduate-Scholar Sponsor
Brad Emmons - Utica College (graduate), Arthur Gershon (graduate), Chun Yin Hui - VU Amsterdam (graduate), Bo-hae Im - KAIST (graduate), Daniel Jordan - Columbia College (graduate), Neeraj Kashyap (graduate), Eugene Kushnirski - Northwestern University (postdoctoral), Corey Manack (graduate), Michael Movshev - SUNY Stony Brook (graduate), Christopher Thornhill - Wayland Baptist University (graduate), Krishna Venkata (graduate), Erik Wallace - University of Connecticut (graduate)
The proposer has supervised eleven Ph.D. theses and is currently supervising six doctoral students (including two who have not yet qualified to begin work on their theses). He has also supervised one postdoctoral associate.

Yuan Zhou
E-mail: [email protected]; Homepage: http://homes.soic.indiana.edu/yzhoucs/
Professional Preparation
B.Eng.
in Computer Science, Tsinghua University, Beijing, China, 2005 – 2009. Student in the Tsinghua University – Microsoft CS Pilot Class; GPA 94.1/100, Rank 1/130.
M.Sc. in Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA, 2009 – 2013.
Ph.D. in Theoretical Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA, 2009 – 2014. Advisors: Prof. Venkatesan Guruswami and Prof. Ryan O’Donnell.
Appointments
Assistant Professor, Computer Science Department, Indiana University at Bloomington, 2016.08 – current.
Instructor in Applied Mathematics, Mathematics Department, Massachusetts Institute of Technology, 2014.08 – 2016.06.
Related Publications
Optimal Sparse Designs for Process Flexibility via Probabilistic Expanders. Xi Chen, Jiawei Zhang, Yuan Zhou. Operations Research 63(5), pp. 1159–1176 (2015).
Satisfiability of Ordering CSPs Above Average Is Fixed-Parameter Tractable. Konstantin Makarychev, Yury Makarychev, Yuan Zhou. FOCS 2015, Proceedings of the 56th Annual Symposium on Foundations of Computer Science.
Constant Factor Lasserre Gaps for Graph Partitioning Problems. Venkatesan Guruswami, Ali Kemal Sinop, Yuan Zhou. SIAM Journal on Optimization 24(4) (2014), pp.
1698–1717.
Optimal PAC Multiple Arm Identification with Applications to Crowdsourcing. Yuan Zhou, Xi Chen, Jian Li. ICML 2014, the 30th International Conference on Machine Learning.
Hardness of Robust Graph Isomorphism, Lasserre Gaps, and Asymmetry of Random Graphs. Ryan O’Donnell, John Wright, Chenggang Wu, Yuan Zhou. SODA 2014, Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms.
Additional Publications
Hypercontractive inequalities via SOS, with an application to Vertex-Cover. Manuel Kauers, Ryan O’Donnell, Li-Yang Tan, Yuan Zhou. SODA 2014, Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms.
Approximability and proof complexity. Ryan O’Donnell, Yuan Zhou. SODA 2013, Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms.
Hypercontractivity, Sum-of-Squares Proofs, and their Applications. Boaz Barak, Fernando Brandão, Aram Harrow, Jonathan Kelner, David Steurer, Yuan Zhou. STOC 2012, Proceedings of the 44th Annual ACM Symposium on Theory of Computing.
Polynomial integrality gaps for strong SDP relaxations of Densest k-Subgraph. Aditya Bhaskara, Moses Charikar, Venkatesan Guruswami, Aravindan Vijayaraghavan, Yuan Zhou. SODA 2012, Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms.
Approximation Algorithms and Hardness of the k-Route Cut Problem. Julia Chuzhoy, Yury Makarychev, Aravindan Vijayaraghavan, Yuan Zhou. SODA 2012, Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms. Invited to ACM Transactions on Algorithms.
Recent Collaborators
Xi Chen, New York University; Xue Chen, University of Texas at Austin; Parikshit Gopalan, Microsoft Research; Venkatesan Guruswami, Carnegie Mellon University; Manuel Kauers, Johannes Kepler Universität; Jian Li, Tsinghua University; Konstantin Makarychev, Microsoft Research; Yury Makarychev, Toyota Technological Institute at Chicago; Raghu Meka, University of California at Los Angeles; Ryan O’Donnell, Carnegie Mellon University; Omer
Reingold, Samsung Research America; Ali Kemal Sinop, Institute for Advanced Study; Li-Yang Tan, Toyota Technological Institute at Chicago; Madhur Tulsiani, Toyota Technological Institute at Chicago; John Wright, Carnegie Mellon University; Chenggang Wu, Tsinghua University; Salil Vadhan, Harvard University; Yuichi Yoshida, National Institute of Informatics (Japan); Jiawei Zhang, New York University
Recent Co-editors: None
Graduate Advisors and Postdoctoral Sponsors: None
Thesis Advisor and Postgraduate-Scholar Sponsor: None

Martha White
Department of Computer Science and Informatics, Indiana University, Bloomington
150 South Woodlawn Avenue, Bloomington, IN 47405, USA
E-mail: [email protected]; Web: www.informatics.indiana.edu/martha
1. Professional preparation
• B.S., Mathematics, University of Alberta, Edmonton, Canada, 2008.
• B.S., Computing Science, University of Alberta, Edmonton, Canada, 2008.
• M.S., Computing Science, University of Alberta, Edmonton, Canada, 2010.
• Ph.D., Computing Science, University of Alberta, Edmonton, Canada, 2015.
2. Appointments
01/2015 - present: Assistant Professor, Department of Computer Science, School of Informatics and Computing, Indiana University, Bloomington
3. Products
Selected relevant publications
• S. Jain, M. White, P. Radivojac. Estimating the class prior and posterior from noisy positives and unlabeled data. In Advances in Neural Information Processing Systems (NIPS), 2016.
• R. S. Sutton, A. R. Mahmood and M. White. An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning. Journal of Machine Learning Research (JMLR), 2016.
• M. White, J. Wen, M. Bowling and D. Schuurmans. Optimal Estimation of Multivariate ARMA Models. In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI), 2015.
• F. Mirzazadeh, M. White, A. Gyorgy and D. Schuurmans. Scalable Metric Learning for Co-embedding.
In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2015.
• M. White, Y. Yu, X. Zhang, D. Schuurmans. Convex Multiview Subspace Learning. In Advances in Neural Information Processing Systems (NIPS), 2012.
Other relevant publications
• C. Gehring, Y. Pan and M. White. Incremental Truncated LSTD. In Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence (IJCAI), 2016.
• A. White and M. White. Investigating practical, linear temporal difference learning. In Proceedings of the International Conference on Autonomous Agents and Multi-agent Systems (AAMAS), 2016.
• J. Veness, M. White, M. Bowling, and A. Gyorgy. Partition Tree Weighting. Data Compression Conference (DCC), 2013.
• M. White and D. Schuurmans. Generalized Optimal Reverse Prediction. In Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2012.
• L. Xu, M. White and D. Schuurmans. Optimal Reverse Prediction: A Unified Perspective on Supervised, Unsupervised and Semi-supervised Learning. In Proceedings of the Twenty-Sixth International Conference on Machine Learning (ICML), 2009. Honorable Mention for Best Paper.
4.
Synergistic activities
• Program Committee member for several machine learning conferences, including ICML (2015, 2016), NIPS (2015, 2016), AAAI (2015, 2016), IJCAI (2014, 2015, 2016)
• Reviewer for JMLR, ICML, NIPS, IJCAI, AAAI, AISTATS, Machine Learning Journal, Transactions on Image Processing, Journal of Autonomous Agents and Multi-agent Systems, Artificial Intelligence Journal, IEEE Transactions on Neural Networks and Learning Systems
• Served on panels for graduate and undergraduate students through the Center of Excellence for Women in Technology (CeWIT) at Indiana University
• Tutored Native American students through Frontier College, Edmonton, AB, Canada (2014)
• Workshops for youth, including workshops with Women in Scholarship, Engineering, Science and Technology (WISEST) and Women in Technology (WIT), promoting diversity in Computing Science (2011, 2007)
5. Collaborators (past 5 years, alphabetical by last name), total = 10
Bowling, Michael (U. of Alberta); Degris, Thomas (Google DeepMind); Gyorgy, Andras (U. of Alberta); Pestilli, Franco (Indiana U.); Radivojac, Predrag (Indiana U.); Schuurmans, Dale (U. of Alberta); Sutton, Richard (U. of Alberta); Trosset, Michael (Indiana U.); Veness, Joel (Google DeepMind); Zhang, Xinhua (NICTA)
6. Current advisees
Ph.D.: Tasneem Alowaisheq, Lei Le, Raksha Kumaraswamy, Yangchen Pan.
7. Ph.D. Advisors
Michael Bowling and Dale Schuurmans, University of Alberta

Biographical Sketch: Qin Zhang
School of Informatics and Computing, Indiana University, Bloomington
150 South Woodlawn Avenue, Bloomington, IN 47405, USA
Phone: (812)-855-2567; E-mail: [email protected]; Web: http://homes.soic.indiana.edu/qzhangcs/
1. Professional Preparation
B.S., Computer Science, Fudan University, Shanghai, China, 2006.
Ph.D., Computer Science, Hong Kong University of Science and Technology, Hong Kong, 2010.
Post-doctoral fellow, Computer Science, Aarhus University, Aarhus, Denmark, 2012.
Post-doctoral fellow, Computer Science, IBM Research Almaden, San Jose, CA, USA, 2013.
2. Appointments
08/2013 – present: Assistant Professor, School of Informatics and Computing, Indiana University, Bloomington
3. Products
Five Most Relevant Products
D. P. Woodruff, Q. Zhang. An Optimal Lower Bound for Distinct Elements in the Message Passing Model. Proceedings of the 25th ACM-SIAM Symposium on Discrete Algorithms (SODA 14), pages 718-733. Portland, OR, USA, January 2014.
K. Yi, Q. Zhang. Optimal Tracking of Distributed Heavy Hitters and Quantiles. Algorithmica, volume 65, issue 1, pages 206-223, January 2013.
D. P. Woodruff, Q. Zhang. Tight Bounds for Distributed Functional Monitoring. Proceedings of the 44th ACM Symposium on Theory of Computing (STOC 12), pages 941-960. New York, NY, USA, May 2012.
G. Cormode, S. Muthukrishnan, K. Yi, Q. Zhang. Continuous Sampling from Distributed Streams. Journal of the ACM (JACM), 59(2), Article 10, April 2012.
Z. Huang, K. Yi, Q. Zhang. Randomized Algorithms for Tracking Distributed Count, Frequencies, and Ranks. Proceedings of the 31st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 12), pages 295-306. Scottsdale, Arizona, USA, May 2012.
Five Other Significant Products
D. Belazzougui, Q. Zhang. Edit Distance: Sketching, Streaming and Document Exchange. Proceedings of the 57th IEEE Symposium on Foundations of Computer Science (FOCS 16), to appear. New Brunswick, NJ, October 2016.
J. M. Phillips, E. Verbin, Q. Zhang. Lower Bounds for Number-in-Hand Multiparty Communication Complexity, Made Easy. SIAM Journal on Computing (SICOMP), volume 45, issue 1, pages 174-196, February 2016.
Q. Zhang. Communication-Efficient Computation on Distributed Noisy Datasets. Proceedings of the 27th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 15), pages 313-322. Portland, Oregon, USA, June 2015.
D. Van Gucht, R. Williams, D. P. Woodruff and Q. Zhang. The Communication Complexity of Distributed Set-Joins with Applications to Matrix Multiplication.
Proceedings of the 34th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 15), pages 199-212. Melbourne, VIC, Australia, May-June 2015.
E. Verbin and Q. Zhang. The Limits of Buffering: A Tight Lower Bound for Dynamic Membership in the External Memory Model. SIAM Journal on Computing (SICOMP), volume 42, issue 1, pages 212-229, January 2013.
Note: All papers above use alphabetical ordering of authors, following the convention of theoretical computer science.
4. Synergistic Activities
Served on program committees of 11 conferences and workshops in theoretical computer science, databases, and data mining.
Ad-hoc reviewer for 11 journals and 17 conferences.
Currently supervising two Ph.D. students. Member of three Ph.D. student advisory committees.
Member of the committee developing a course curriculum for the Computational and Analytic Track of the Data Science Program at the School of Informatics and Computing, IUB.

David Fisher
Biographical Sketch
Education
B.A. in Mathematics, Columbia University, May 1993, summa cum laude.
M.S. in Mathematics, The University of Chicago, August 1994.
Ph.D. in Mathematics, The University of Chicago, June 1999.
Academic positions
Summer 2005 to present, Department of Mathematics, Indiana University at Bloomington: Assistant Professor, 2005-2008; Associate Professor, 2008-2010; Full Professor, since July 2010.
Spring 2004 to 2007, member of Doctoral Faculty, Department of Mathematics, CUNY Graduate Center.
Fall 2002 to Spring 2005, Assistant Professor, Department of Mathematics and Computer Science, Lehman College-CUNY.
1999 to 2002, Gibbs Instructor and NSF Postdoctoral Fellow, Department of Mathematics, Yale University.
Five most relevant publications
(1) Coarse differentiation of quasi-isometries II: Rigidity for Sol and Lamplighter groups, joint with A. Eskin and K. Whyte, Annals of Math. 177 (2013), no. 3, 869-910.
(2) Global rigidity of higher rank Anosov actions on tori and nilmanifolds, joint with B.
Kalinin and R. Spatzier, J. Amer. Math. Soc. 26 (2013), no. 1, 167-198.
(3) Coarse differentiation of quasi-isometries I: spaces not quasi-isometric to Cayley graphs, joint with A. Eskin and K. Whyte, Annals of Math. 176 (2012), 221-260.
(4) Local rigidity of affine actions of higher rank groups and lattices, joint with G. Margulis, Annals of Math. 107 (2009), no. 1, 67-122.
(5) Quasi-isometric embeddings of symmetric spaces, joint with Kevin Whyte, preprint: http://arxiv.org/abs/1407.0445.
Other publications
(1) Totally non-symplectic Anosov actions on tori and nilmanifolds, joint with Boris Kalinin and Ralf Spatzier, Geometry and Topology 15 (2011), no. 1, 191-216.
(2) Quasi-isometric rigidity of solvable groups, joint with A. Eskin, Proceedings of the International Congress of Mathematicians 2010 (ICM 2010), volume III.
(3) Quasi-isometric embeddings of non-uniform lattices, joint with Thang Nguyen, preprint.
(4) Almost isometric actions, property (T), and local rigidity, joint with G. Margulis, Invent. Math. 162 (2005), 19-80.
(5) Local rigidity for cocycles, joint with G. Margulis, in Surv. Diff. Geom. Vol. VIII, refereed volume in honor of Calabi, Lawson, Siu and Uhlenbeck, editor: S.T. Yau, 45 pages, 2003.
Synergistic Activities
(1) Co-organized 17 summer schools, workshops, and conferences.
(2) Invited to give mini-courses/lecture series at 9 international conferences and summer schools since 2006.
(3) Co-organizer of various seminars, lecture series, and colloquia, Indiana University, Fall 2005 to present.
(4) Mentor for graduate students and postdocs at IU and several other universities. Reader for theses at the University of Michigan and the Université de Valenciennes, France.
(5) Arranged a screening and panel discussion of the documentary Counting from Infinity, on the work of Yitang Zhang, at IU Cinema: a campus-wide, math-focused event.
Recent Collaborators (11)
Yves de Cornulier, Université de Paris, Orsay. Tullia Dymarz, University of Wisconsin, Madison.
A. Eskin, University of Chicago. T. J. Hitchman, University of Northern Iowa. Boris Kalinin, Penn State University. Neeraj Kashyap, Indiana University. G. A. Margulis, Yale University. Karin Melnick, University of Maryland, College Park. Thang Nguyen, Indiana University. Ralf Spatzier, University of Michigan. K. Whyte, University of Illinois at Chicago.
Thesis Advisor: R. J. Zimmer, University of Chicago.
Postdoctoral Senior Researcher: G. A. Margulis, Yale University.
Graduate Students Advised/Postdoctoral Scholars Sponsored:
Irine Peng (postdoctoral mentor, 2008-2011)
Ning Yang (doctoral student, finishing this year)
Thang Nguyen (doctoral student, finishing this year)

Biographical Sketch: Ciprian Demeter
Work address: Department of Mathematics, Indiana University, Bloomington, Rawles Hall, 831 East 3rd St., Bloomington, IN 47405
E-mail: [email protected]
Professional Preparation
Member (on leave from IU), 2007-2008, Institute for Advanced Study (Princeton)
Postdoctorate (Hedrick Assistant Professor), 2004-2007, University of California (Los Angeles)
Ph.D., 2004, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois (Mathematics)
M.S., 1999, Babes-Bolyai University, Cluj-Napoca, Romania (Mathematics)
B.A., 1998, Babes-Bolyai University, Cluj-Napoca, Romania (Mathematics)
Appointments
2016 - present, Indiana University (Bloomington), Full Professor
2011 - 2016, Indiana University (Bloomington), Associate Professor
2008 - 2011, Indiana University (Bloomington), Tenure-Track Assistant Professor
2007 - 2008, Institute for Advanced Study (Princeton), Member
2004 - 2007, University of California (Los Angeles), Hedrick Assistant Professor
Awards and honors
• Sloan Research Fellowship (2009-2011)
• Continuous NSF support, 2006-present
• Rothrock Teaching Award (2016)
Selected publications
• Proof of the main conjecture in Vinogradov’s mean value theorem for degrees higher than three (with Jean Bourgain and Larry Guth), to appear in Annals of Math.
• The proof of the l2 Decoupling Conjecture (with Jean Bourgain), Annals of Math. 182 (2015), no. 1, 351-389.
• Breaking the duality in the return times theorem (with Michael Lacey, Terence Tao and Christoph Thiele), Duke Math. J. 143 (2008), no. 2, 281-355.
• New bounds for the discrete Fourier restriction to the sphere in four and five dimensions (with Jean Bourgain), Int. Math. Res. Not. IMRN 2015, no. 11, 3150-3184.
• Linear independence of time frequency translates for special configurations, Math. Res. Lett. 17 (2010), no. 4, 761-779.
• Logarithmic Lp bounds for maximal directional singular integrals in the plane (with Francesco Di Plinio), J. Geom. Anal. 24 (2014), no. 1, 375-416.
• On the two dimensional Bilinear Hilbert Transform (with Christoph Thiele), American Journal of Mathematics, 132 (2010), no. 1, 201-256.
• Modulation invariant bilinear T(1) theorem (with Árpád Bényi, Andrea R. Nahmod, Christoph M. Thiele, Rodolfo H. Torres, Paco Villarroya), Journal d’Analyse Mathematique, 109 (2009), 279-352.
Synergistic Activities
1. In recent years, I have discovered surprising and interesting connections between diverse areas of Mathematics such as Harmonic Analysis, Ergodic Theory, Number Theory, PDEs, Incidence Geometry, and the theory of random Schrödinger operators. I have made my work public through thirty research papers and through numerous talks in various seminars and conferences.
2. In recent years, I have co-organized a section of an AMS meeting on Harmonic Analysis and Related Topics (Bloomington, April 2008), as well as a similar section at the international AMS meeting in Alba Iulia, Romania, in 2013. I have co-organized three summer schools with my collaborator Christoph Thiele, and another one with him and Michael Lacey.
3. I have taught a variety of Algebra, Geometry, Calculus, Dynamics, and Analysis classes at three different universities.
I have constantly improved and adapted my teaching to the specifics of each course, conveying to my students both motivation and a rigorous understanding of the material. I found it particularly interesting and challenging to teach Merit Workshop and small-group Active Learning classes, which gave me the opportunity to spend more time with students and better test their skills. I have recently taught a few graduate topics classes that served as training for, and facilitated the recruitment of, my first three graduate students: Francesco Di Plinio, Prabath Silva, and Fangye Shi.
4. I am currently co-organizing the Analysis seminar at Indiana University, Bloomington.
5. I have written a survey paper entitled A guide to Carleson’s Theorem, meant to be a gentle introduction for a broad audience to selected topics in time-frequency analysis.
Graduate students and postdoctoral fellows: I have supervised two successful graduate students, Francesco Di Plinio (currently on the tenure track at the University of Virginia) and Prabath Silva (former postdoc at Caltech). I am currently supervising two more graduate students (Fangye Shi and Dominique Kemp). I have mentored Zubin Gautam as a postdoctoral fellow and am currently mentoring Shaoming Guo as a postdoctoral fellow.

Probabilistic Approaches to Computational Problems
Personnel
Russell Lyons is the leading probabilist in the Department of Mathematics and, due to retirements and faculty retention issues, on the Indiana University Bloomington campus as a whole. His research has theoretical connections with the Algorithms and Theory groups in the Department of Computer Science. He holds many honors, including being a Rudy Professor, having recently given an invited talk at the International Congress of Mathematicians, and being a Fellow of the American Mathematical Society.
Michael Larsen is the leading algebraist in the Department of Mathematics; his work includes the study of expander graphs, a common tool in theoretical computer science.
Many advanced problems benefit from a variety of mathematical techniques, so it is important to include experts in many areas. He holds many honors, including being a Distinguished Professor at Indiana University Bloomington, being a Fellow of the American Mathematical Society, and having received the E. H. Moore Research Article Prize from the American Mathematical Society.
Ciprian Demeter is the leading harmonic analyst in the Department of Mathematics. He won a Sloan Foundation Fellowship. His recent work with Jean Bourgain on the nonlinear Schrödinger equation was one of four featured current events in mathematics at the yearly joint meetings of the American Mathematical Society in 2015.
David Fisher is the leading geometer in the Department of Mathematics. His research on coarse differentiation is important for understanding metric embeddings, a common theme in this proposal. He is a Fellow of the American Mathematical Society and has received a Simons Foundation Fellowship and a prestigious CAREER award from the National Science Foundation.
Michael Trosset is the leading statistician at Indiana University Bloomington, with expertise in statistical learning and computational statistics. As Director of the Indiana Statistical Consulting Center since 2006, he has years of experience fostering interdisciplinary research on campus.
Funda Ergun is the senior member of the Algorithms Group in the Department of Computer Science. She has experience with metric embeddings and pattern recognition, including in streaming data. Her synergistic activities include leadership experience organizing a Pacific Institute for the Mathematical Sciences Collaborative Research Group on Algorithmic Theory of Networks.
She brings a deep understanding of the mathematical needs of computer science research to this project.
Qin Zhang is an Assistant Professor in the Algorithms Group and the Theory Group in the Department of Computer Science. His research includes algorithms for streaming data and communication complexity. He has prior research experience in the Theory Group at IBM Almaden Research Center and at the Center for Massive Data Algorithmics at Aarhus University. His work is closest to Funda Ergun’s. Having junior faculty involved on the project is important both for the new ideas they contribute and for the environment they create for new faculty and postdoctoral hires on this proposal. They can serve as mentors close in age and experience to the postdoctoral fellows and cluster hire faculty members.
Yuan Zhou is an Assistant Professor in the Algorithms Group in the Department of Computer Science. His research includes approximation algorithms, complexity theory, and satisfiability theory. His work is closest to Russell Lyons’.
Martha White is an Assistant Professor in the Intelligent Systems Group in the Department of Computer Science. Her research includes work on temporal-difference learning algorithms and on metric learning for determining the best distance function to use for a target task. Her work is closest to Michael Trosset’s.