Friday, March 6, 2015

The Pseudo-Marginal Miracle

Over the last month I have been making some interesting connections between different areas of statistics.  Explaining these connections (or even the things being connected) is perhaps an overly ambitious aim for a blog post, but I think it is worth a try!

The three problems below are all interconnected.

MCMC for doubly intractable problems

This typically occurs when you want to use MCMC to estimate the parameters of a model whose normalization term is intractable and depends on the very parameters you are trying to estimate.

With the basic MCMC approach, the normalization term needs to be evaluated on every iteration of the algorithm.  Recent advances in MCMC (the Pseudo-Marginal approach) have led to algorithms where a suitable unbiased estimate of the intractable term can be used instead, and the algorithm still gives the same results (asymptotically) as the basic algorithm.
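To make this a little more concrete, here is a minimal sketch (in Python) of a pseudo-marginal Metropolis-Hastings loop.  The functions log_lik_hat and log_prior are hypothetical placeholders for whatever model is being fitted; the only real requirement is that exp(log_lik_hat(theta)) is a non-negative, unbiased estimate of the intractable likelihood.

```python
import numpy as np

def pseudo_marginal_mh(log_lik_hat, log_prior, theta0, n_iters, step=0.5, seed=0):
    """Pseudo-marginal Metropolis-Hastings with a random walk proposal.

    log_lik_hat(theta) must return the log of a non-negative, *unbiased*
    estimate of the intractable likelihood p(y | theta).  Replacing the
    exact likelihood with such an estimate leaves the stationary
    distribution of the chain unchanged.
    """
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    log_l = log_lik_hat(theta)                # estimate at the current state
    samples = np.empty((n_iters, theta.size))
    for i in range(n_iters):
        prop = theta + step * rng.standard_normal(theta.size)
        log_l_prop = log_lik_hat(prop)        # fresh estimate at the proposal only
        log_alpha = (log_l_prop + log_prior(prop)) - (log_l + log_prior(theta))
        if np.log(rng.uniform()) < log_alpha:
            theta, log_l = prop, log_l_prop   # the accepted estimate is recycled
        samples[i] = theta
    return samples
```

The detail that makes this work is that the likelihood estimate at the current state is stored and reused rather than re-estimated on every iteration; re-estimating it each time would break the exactness of the method.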

Reversible Jump MCMC

This occurs when one of the things you don't know is the number of parameters in your model (e.g. number of clusters or components in a mixture model).

With the basic MCMC approach it is difficult to make good proposals for the parameter values when your proposal changes the number of parameters in the model.  However, the Pseudo-Marginal approach can also be applied to Reversible Jump MCMC.  This results in parameter proposals that are more likely to be accepted, which makes the method much more computationally efficient: less time is wasted on proposals that are subsequently rejected.

MCMC parameter estimation for nonlinear stochastic processes

This is useful for some models in econometrics, epidemiology, and chemical kinetics.  Suppose you have some data (e.g. stock prices, infection numbers, chemical concentrations) and a nonlinear stochastic model for how these quantities evolve over time.  You may be interested in inferring posterior distributions for the parameter values in your model.

The Pseudo-Marginal approach was used to develop a method called particle MCMC that does this parameter estimation more efficiently than basic MCMC.
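As I understand it, particle MCMC amounts to plugging a particle filter into a pseudo-marginal sampler like the one sketched above: the particle filter supplies the unbiased likelihood estimate.  Here is a rough sketch of a bootstrap particle filter; sample_x0, propagate and log_obs_dens are hypothetical placeholders for the model's initial distribution, transition dynamics and observation density.

```python
import numpy as np

def bootstrap_log_lik(y, sample_x0, propagate, log_obs_dens, n_particles=500, seed=1):
    """Bootstrap particle filter returning an unbiased estimate (on the log
    scale) of the marginal likelihood p(y_1, ..., y_T | theta) of a
    state-space model.
    """
    rng = np.random.default_rng(seed)
    x = sample_x0(n_particles, rng)           # initial particle cloud
    log_lik = 0.0
    for y_t in y:
        x = propagate(x, rng)                 # propose from the state dynamics
        log_w = log_obs_dens(y_t, x)          # weight by the observation density
        m = log_w.max()
        w = np.exp(log_w - m)
        log_lik += m + np.log(w.mean())       # running log-likelihood estimate
        idx = rng.choice(len(x), size=len(x), p=w / w.sum())
        x = x[idx]                            # multinomial resampling
    return log_lik
```

Using this as the log_lik_hat argument of the pseudo-marginal sampler above gives (a simple version of) the particle marginal Metropolis-Hastings algorithm.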

In all three cases (doubly intractable, reversible jump, nonlinear stochastic processes), applying basic MCMC results in the need to evaluate the density of an intractable marginal probability distribution.  Somewhat miraculously, the Pseudo-Marginal approach obviates the need to evaluate this probability density exactly, while still preserving key theoretical properties of the algorithm related to convergence and accurate estimation.

Thursday, February 5, 2015

New Perspectives in Markov Chain Monte Carlo

In my last post I mentioned that I had been finding out about SQMC (Sequential Quasi Monte Carlo), a new method for sequential problems (such as time-series modelling).  I would like to revise some of what I said.  I implied that SQMC would be useful for Hidden Markov Models.  This is not entirely accurate: for Hidden Markov Models with a finite set of hidden states there are alternatives that are much more computationally efficient (such as the Viterbi algorithm for finding the most likely sequence of hidden states).  The benefit of SQMC (and of its main competitor, SMC) is that it can be applied to a very wide class of models (such as stochastic volatility models).

Since my last post I have been further expanding my toolbox of methods for statistical inference.  The aim of this exercise is to expand the class of models that I know how to do statistical inference for, with a view to doing statistical inference for realistic models of neural (brain) activity.

I have been concentrating my efforts in the following two areas.

Particle MCMC

This is an elegant approach to combining a particle filter / SMC with Markov Chain Monte Carlo for parameter estimation.  The main applications in the original paper (Andrieu et al. 2010) are to state space models (such as the stochastic volatility model).  However the method has since been applied to Markov Random Fields (e.g. the Ising model and random graph models) and to stochastic models for chemical kinetics.

The exciting thing about Particle MCMC is that parameter estimation for nonlinear stochastic models is much more computationally efficient than it is with standard MCMC methods.

More generally it is also exciting to be working in a field where major breakthroughs are still being made.

MCMC Methods for Functions

Particle MCMC is a solution for cases where the processes in the model are too complex for standard MCMC to be applied.

The work on MCMC methods for functions is a solution for cases where the dimension of the problem is very high.  The paper I have been looking at (Cotter et al. 2013) presents applications to nonparametric density estimation, inverse problems in fluid mechanics, and some diffusion processes.  In all these cases we want to infer something about a function.  Some functions are low-dimensional (a straight line has only 2 parameters), but the functions considered in the paper require a large number of parameters in order to be accurately approximated, hence the statistical inference problem is high-dimensional.

The methods in the paper appear to be much more computationally efficient than standard MCMC methods for high-dimensional problems, and it looks like they are quite easy to implement.
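To back up the "easy to implement" claim, here is a sketch of the simplest method in the paper as I understand it, the preconditioned Crank-Nicolson (pCN) sampler.  The functions neg_log_lik and sample_prior are hypothetical placeholders: the prior is assumed to be a mean-zero Gaussian, and neg_log_lik plays the role of the data-misfit functional usually written Phi(u).

```python
import numpy as np

def pcn_mcmc(neg_log_lik, sample_prior, u0, n_iters, beta=0.2, seed=0):
    """Preconditioned Crank-Nicolson MCMC for a posterior proportional to
    exp(-Phi(u)) times a mean-zero Gaussian prior.

    The proposal v = sqrt(1 - beta^2) * u + beta * xi, with xi drawn from
    the prior, preserves the prior exactly, so the acceptance ratio only
    involves the likelihood.  This is what keeps the acceptance rate
    stable as the discretisation of the function u is refined.
    """
    rng = np.random.default_rng(seed)
    u = np.asarray(u0, dtype=float)
    phi = neg_log_lik(u)
    samples = []
    for _ in range(n_iters):
        v = np.sqrt(1.0 - beta**2) * u + beta * sample_prior(rng)
        phi_v = neg_log_lik(v)
        if np.log(rng.uniform()) < phi - phi_v:   # accept with prob min(1, e^{Phi(u) - Phi(v)})
            u, phi = v, phi_v
        samples.append(u.copy())
    return np.array(samples)
```

The striking thing is that the dimension of u never appears anywhere in the algorithm, which is the intuition for why it behaves well on high-dimensional problems.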

However the methodology does not apply to as wide a range of models as Particle MCMC.

As far as I know there are not currently any methods that work well for high-dimensional complex models.  This means it is difficult to use Bayesian inference for weather prediction, where high-dimensional complex models are needed to make accurate predictions.  This is an active area of research at Reading (which has a very strong meteorology department), and I will be interested to see what progress is made in this area in the coming years.

I haven't got very far into the neuroscience modelling part of my project.  However I have a feeling I may be faced with a similar problem to the meteorologists, i.e. needing to use a high-dimensional complex model.

I will be continuing to develop my interest and knowledge in these areas by going to a summer school in Valladolid this summer (New Perspectives in Markov Chain Monte Carlo, June 8-12).

Monday, January 5, 2015

Autumn Term

Here are a few things I have found interesting from my first term as a PhD student at Reading University.

I started with the intention of quickly writing a short blog post, but that proved too difficult...

Parameter estimation for the Ising model

The Ising model comes from statistical physics, and describes how neighbouring atoms interact with each other to produce a lattice of spin states.  Each atom is a variable in the model, leading to a complex multivariate model.  For example, a 10x10 lattice is associated with a 100-variable joint probability mass function with a total of 2^100 states (assuming there are 2 spin states for each atom).  The combination of high dimension (number of variables) and dependencies between variables makes it challenging to analyse or compute anything of interest.  Monte Carlo methods are needed to approximate the integrals that are used in Bayesian inference for this model.
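As an aside, simulating from the Ising model at a fixed parameter value is straightforward with a Gibbs sampler, because each spin's conditional distribution given its four neighbours is tractable even though the joint normalizing constant (a sum over all 2^100 states) is not.  Here is a minimal sketch:

```python
import numpy as np

def ising_gibbs_sweep(spins, beta, rng):
    """One Gibbs sweep over a 2D lattice of +/-1 spins with coupling
    parameter beta and periodic boundary conditions.  Each site is updated
    from its exact conditional distribution given its neighbours.
    """
    n = spins.shape[0]
    for i in range(n):
        for j in range(n):
            s = (spins[(i - 1) % n, j] + spins[(i + 1) % n, j]
                 + spins[i, (j - 1) % n] + spins[i, (j + 1) % n])
            p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * s))  # P(spin = +1 | neighbours)
            spins[i, j] = 1 if rng.uniform() < p_up else -1
    return spins

rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=(10, 10))   # the 10x10 lattice from the text
for _ in range(1000):
    ising_gibbs_sweep(spins, beta=0.4, rng=rng)
```

The hard problem is estimating beta from an observed lattice: the normalizing constant depends on beta, which is exactly the doubly intractable situation described in the more recent post above.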

Meeting people working in Neuroscience at Reading University

I have found out more about research in the Systems Engineering department by meeting with Ingo Bojak and Slawek Nasuto.  Among other things, they work with neural field and neural mass models.  Their models include spatial interaction, stochasticity, and uncertain parameter values.  The combination of these model features means that the challenges of statistical inference may be the same as for the Ising model: high dimension and dependencies between variables.

Introduction to Neuroscience course

I am sitting in on an Introduction to Neuroscience undergraduate course to improve my subject knowledge in this area.  I have enjoyed learning random facts about the brain.  Just to pick out one area, the story of Phineas Gage is quite fascinating both at a superficial level (how could someone survive an iron bar going through their brain!?) and at a deeper level (how and why did Gage's mind change after the accident?)  The story is well told in Descartes' Error by Antonio Damasio.

As well as learning random facts, the good thing about taking a course is seeing how different things link together.  I have found the concepts of synaptic plasticity and receptive fields quite interesting.  Synaptic plasticity is the process by which connections between neurons change over time, and underpins learning.  Receptive fields are representations of how neurons respond to a range of different stimuli.  Receptive fields are important for understanding how physical stimuli are transformed into perception.  These concepts are fundamental for understanding a wide range of brain functions - hearing, vision, bodily sensation etc.

RSS Read Paper on SQMC

I travelled to London for an event held by the Royal Statistical Society on Sequential Quasi Monte Carlo, a new method developed by Mathieu Gerber and Nicolas Chopin (a relative of the great Frederic, or so I am told...).  Only a handful of papers each year are read before the society, so they are usually fairly significant advances.  This particular paper was presented as an advance from Sequential Monte Carlo (SMC).  SMC methods are useful for statistical inference of time series or dynamic models (e.g. Hidden Markov Models).  SQMC is also useful for statistical inference of time series / dynamic models, but its approximation error shrinks at a faster rate than SMC's as the number of particles grows, so it can reach the same accuracy at a much lower computational cost.

APTS

The Academy for PhD Training in Statistics (APTS) held a week long course in Cambridge where I learnt about Statistical Inference and Statistical Computing.  There were 50-100 other first year PhD students in Statistics there from all over the UK and Ireland.

It was nice to hear from some statisticians who work in different application areas to me.  I enjoyed finding out about statistical modelling of volcanoes and of the Antarctic ice sheet.

It was also interesting to hear how academic statisticians have worked with government, e.g. at the MRC Biostatistics Unit and on the Cabinet Office's National Risk Register.