Probabilistic Programming

Miguel Lázaro-Gredilla ([email protected])
Machine Learning Group, http://www.tsc.uc3m.es/~miguel/MLG/
January 2013

Contents
- Towards a modern machine learning
- Probabilistic programming languages
- Infer.NET
  - The software stack: A digression
  - The modeling language
- References

Classical machine learning
- A large number of tools with diverse backgrounds:
  - k-means
  - Principal Component Analysis
  - Independent Component Analysis
  - Classical Neural Networks
  - Support Vector Machines
  - Density estimation via Parzen windows
  - Recursive Least Squares
  - Least Mean Squares
  - (just an arbitrary sample; we could go on and on...)
- Pragmatic, unsystematic, non-probabilistic

Third generation machine learning
- Data sets are described as instances of a probabilistic model:
  - Gaussian Process regression/classification
  - Latent Dirichlet Allocation
  - Bayes Point Machine
  - ...
- We can infer unknowns in a principled way
- Features:
  - Each tool can be expressed as a Bayesian network
  - Systematic, modular, standardized approach
  - Proposals are models, not algorithms
  - Inference is detached from the model

Updating classical machine learning (I/V)
- Should we just dump classical ML and jump on the Bayesian bandwagon?
- Most classical ML tools can be written as some type of inference on some Bayesian network
- So just update classical ML using a Bayesian interpretation:
  - Gain insight into the model behind the algorithm
  - See overlaps emerge
  - Use other types of inference
  - It may become obvious how to enhance them

Updating classical machine learning (II/V)
- Classical algorithm: k-means
- Bayesian model (assuming normalized data):
    p(xn | zn, {µk}) = N(xn | µ_zn, vI)
    p(v) = InvGamma(v | 1, 1)
    p(µk) = N(µk | 0, 10I)
    p(zn | w) = Discrete(zn | w)
    p(w) = Dirichlet(w | 1_{K×1})
- The classical algorithm corresponds to:
  - Maximum likelihood for p({xn} | {zn}, {µk}), obtained via hard-EM optimization
  - A particular case of the Gaussian mixture model

Updating classical machine learning (III/V)
- Classical algorithm: Extended Recursive Least Squares (adaptive)
- Bayesian model:
    p(xn | wn) = N(xn | wnᵀ un, v)
    p(wn | wn−1, α, β) = N(wn | (1 − β) wn−1, βαI)
    p(v) = InvGamma(v | 1, 1)
    p(α) = InvGamma(α | 1, 1)
    p(β/(1 − β)) = InvGamma(β/(1 − β) | 1, 1)
- The classical algorithm corresponds to:
  - The posterior mean for wn, for some magically selected α and v
  - A particular case of the Kalman filter
- Contrived assumptions are needed to represent exponentially weighted RLS in this framework
- This hints that it might not make sense; see [KRLST]

Updating classical machine learning (IV/V)
- Classical algorithm: Principal Component Analysis
- Bayesian model:
    p(xn | yn, W, v) = N(xn | Wᵀ yn, vI)
    p(yn) = N(yn | 0, I)
    p([W]d-th col) = N(wd | 0, diag([α1, ..., αD]))
    p(v) = InvGamma(v | 1, 1)
    p(αd) = InvGamma(αd | 1, 1)
- The classical algorithm corresponds to:
  - Maximum likelihood for p(xn | W, v) when v → 0, with W the product of an orthogonal matrix and an ordered diagonal matrix
  - The restrictions on W make it unique, but don't change the model
- New possibilities: what if we do maximum likelihood for p(xn | {yn}, v)?

Updating classical machine learning (V/V)
- Classical algorithm: Support Vector Machines
- Bayesian model: see Sollich, P. (2002). Bayesian Methods for Support Vector Machines: Evidence and Predictive Class Probabilities. Machine Learning, 46:21-52.
- An ingenious approach, but not very natural (it involves using three classes to solve a binary problem)

Some observations
According to the previous slides:
- Most ML tools have a Bayesian model description
- The full description takes only a few lines
- The type of inference (ML, MAP, point estimates, full Bayesian posterior) is independent of the model
  - Though the tractability of each does depend on the model
In the process of creating new ML tools:
- What makes a new ML tool worthwhile is: a novel model
- What we spend most of our time working on is: making inference tractable on the new model

The idea
Probabilistic programming:
- Define a language to describe Bayesian models ("programs")
- Create some "compiler" that understands those programs and generates inference engines for them
The new workflow (the emphasis is on model design):
- Spend more time thinking about the model
- Program it (just a few lines!)
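As a loose illustration of how little code "program it" can mean, here is a plain-Python sketch of the k-means correspondence from slide (II/V): running hard-EM on the isotropic Gaussian mixture model shown there reduces exactly to the classical k-means updates. Everything below (the synthetic 1-D data, the function name) is this sketch's own assumption, not Infer.NET code.

```python
import random

# Hard-EM for the isotropic Gaussian mixture on the k-means slide.
# With a shared variance v and hard (argmax) responsibilities, the
# E-step is exactly the k-means assignment step (v cancels out) and
# the M-step is the usual centroid update.

def hard_em_kmeans(xs, k, iters=20, seed=0):
    rng = random.Random(seed)
    mus = rng.sample(xs, k)                 # initial means mu_k
    for _ in range(iters):
        # E-step: z_n = argmax_k N(x_n | mu_k, v) = nearest centroid
        assign = [min(range(k), key=lambda j: (x - mus[j]) ** 2) for x in xs]
        # M-step: mu_k = mean of the points currently assigned to k
        for j in range(k):
            members = [x for x, z in zip(xs, assign) if z == j]
            if members:
                mus[j] = sum(members) / len(members)
    return sorted(mus)

data = [0.1, 0.2, -0.1, 5.0, 5.2, 4.9]      # two well-separated clusters
print(hard_em_kmeans(data, k=2))            # two centres, roughly [0.067, 5.033]
```

Switching from the argmax assignment to full posterior responsibilities would turn the same few lines into soft-EM for the mixture model, which is the kind of model/inference separation the slides advocate.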
- Optional: sample data from the model
- Feed data to the inference engine and assess the results
- If the model wasn't that good, do it over

Programming paradigms (I/II)
A random assortment of them:
- Imperative (Matlab)
- Procedural (C)
- Object oriented (C++)
- Declarative (SQL)
- Functional (OCaml, F#)
- Metaprogramming (LISP)
- Domain specific language (Spice)

Programming paradigms (II/II)
For probabilistic programming:
- A domain specific language may be simpler for the user, but:
  - It doesn't integrate well with existing codebases
  - It doesn't interface well with DB access, plotting capabilities, etc.
- Functional languages with metaprogramming can be useful to write a "guest probabilistic program" within a "host programming language" (we'll see F# examples)
- This can also be done in an imperative style, but it looks uglier (we'll see Python examples)

An incomplete list of probabilistic programming languages
- BUGS: Bayesian inference using Gibbs sampling
- HANSEI: extends OCaml, discrete distributions only
- Hierarchical Bayesian Compiler (HBC): large-scale models and non-parametric process priors
- PyMCMC: MCMC algorithms for Python classes
- Church: extends Scheme to describe Bayesian models
- Infer.NET: provides a probabilistic language within the .NET platform
See more at http://probabilistic-programming.org

The Java case
Three big operating systems:
- Linux, OSX, Windows
...and one language to rule them all:
- Sun Microsystems designed Java to run on the JVM
- ...and implemented the JVM to run on Linux, Mac, and Windows
- Platform independence: bliss for programmers

The .NET case
Microsoft reacted and created the JVM counterpart: the CLR.
Which languages were used to target the CLR?
- The .NET languages: VB.NET, C#, F#, IronPython...
- Different languages with a common set of libraries; it is easy to interface them
Which operating systems did the CLR run on?
- Microsoft released the specification and standardized it
- CLR-like implementations arose for Linux/Mac: Mono
- ...but the Windows version is always ahead: more complete, with additional tools, better tested, etc.
Microsoft tries to win in the cross-platform territory (a sweet spot for developers) while still favoring its flagship product, Windows: opposing objectives.

Infer.NET's interoperability
- Infer.NET targets all .NET languages, with a focus on C#, F#, and IronPython
- It can be used on Mono (Mac/Linux) or the CLR (Windows); the experience is better (less buggy) on the latter
- The .NET languages have a growing set of tools for scientific computing, but are nowhere near Matlab yet
- IronPython cannot use NumPy/SciPy/Matplotlib natively
- A new tool called Sho provides an IronPython shell with Matlab-like capabilities (but Windows only)

Using Infer.NET
Infer.NET provides:
- A probabilistic modeling language embedded in other languages
  - F# allows a more natural embedding
- Compilation to three inference engines:
  - EP: the approximation includes all non-zero probability points
  - VB: the approximation avoids zero probability points
  - MCMC: Gibbs sampling, slower

Let's browse the Microsoft Research examples and create a new model:
http://research.microsoft.com/en-us/um/cambridge/projects/infernet/

Examples (code shown as screenshots on the slides):
- Two coins (F#, IronPython)
- Learning a Gaussian (F#, IronPython)
- Truncated Gaussian (F#, IronPython)
- Gaussian Mixture (F#, IronPython)
- k-means (F#) [Demo]

Conclusions
- Probabilistic programming is in its infancy
- We might produce alternative language definitions/implementations
- We can leverage it to test new models faster
- We can build custom models on the fly

References
- [BayesSVM] Sollich, P. (2002). Bayesian Methods for Support Vector Machines: Evidence and Predictive Class Probabilities. Machine Learning, 46:21-52.
- [ProbProg] The probabilistic programming wiki, http://probabilistic-programming.org/wiki/Home
- [InferNET] T. Minka, J. Winn, J. Guiver, and D. Knowles. Infer.NET 2.5, Microsoft Research Cambridge, 2012. http://research.microsoft.com/infernet
- [KRLST] M. Lázaro-Gredilla, S. Van Vaerenbergh, and I. Santamaría. "A Bayesian approach to tracking with kernel recursive least-squares", MLSP 2011.
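Since the example code above survives only as slide screenshots, here is what the two-coins example computes, mimicked in plain Python by exact enumeration over the four outcomes (a stand-in sketch, not Infer.NET's API): two fair coins, a derived variable bothHeads, and the posterior for the first coin after observing bothHeads = False.

```python
from fractions import Fraction
from itertools import product

# Two-coins model by brute-force enumeration instead of message passing:
# firstCoin ~ Bernoulli(1/2), secondCoin ~ Bernoulli(1/2),
# bothHeads = firstCoin AND secondCoin.
# Query 1: P(bothHeads).  Query 2: P(firstCoin | bothHeads = False).

def enumerate_posterior():
    half = Fraction(1, 2)
    p_both = Fraction(0)              # P(both heads)
    p_evidence = Fraction(0)          # P(not both heads)
    p_first_and_evidence = Fraction(0)
    for first, second in product([True, False], repeat=2):
        w = half * half               # each outcome has prior weight 1/4
        if first and second:
            p_both += w
        else:                         # observation: bothHeads == False
            p_evidence += w
            if first:
                p_first_and_evidence += w
    return p_both, p_first_and_evidence / p_evidence

prior_both, posterior_first = enumerate_posterior()
print(prior_both)        # 1/4
print(posterior_first)   # 1/3
```

Infer.NET arrives at the same 1/4 and 1/3 by compiling the model into a message-passing schedule; enumeration is only feasible here because the model has four discrete outcomes.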