Monday, March 6, 2017

Pushing the boundaries of what we know

I have recently been dipping into a book called 'What We Cannot Know' by Marcus du Sautoy.  Each chapter looks at a different area of physics.  The fall of a dice is used as a running example to explain things like probability, Newton's Laws, and chaos theory.  There are also chapters on quantum theory and cosmology.  It's quite a wide-ranging book, and I found myself wondering how the author had found time to research all these complex topics, which are quite different from each other.  That is related to one of the messages of the book - that one person cannot know everything that humans have discovered.  It seems like Marcus du Sautoy has had a go at learning everything, and found that even he has limits!

I think the main message of the book is that many (possibly all) scientific fields have some kind of knowledge barrier beyond which it is impossible to pass.  There are fundamental assumptions which, if taken to be true, explain empirical phenomena.  The ideal in science (at least for physicists) is to be able to explain a wide range of (or perhaps even all) empirical phenomena from a small set of underlying assumptions.  But science cannot explain why its most fundamental assumptions are true.  They just are.

This raises an obvious question: where is the knowledge barrier?  And how close are we to reaching it?  Unfortunately this is another example of something we probably cannot know.

In my own field of Bayesian computation, I think there are limits to knowledge of a different kind.  In Bayesian computation it is very easy to write down what we want to compute - the posterior distribution.  It is not even that difficult to suggest ways of computing the posterior with arbitrary accuracy.  The problem is that, for a wide range of interesting statistical models, all the methods that have so far been proposed for accurately computing the posterior are computationally intractable.
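To make 'easy to write down' concrete, here is the posterior in the usual form of Bayes' theorem.  The notation is my own assumption for illustration: θ stands for the model parameters and y for the observed data.

```latex
\[
p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)},
\qquad
p(y) = \int p(y \mid \theta)\, p(\theta)\, d\theta
\]
```

The numerator is usually straightforward to evaluate for a given θ.  The integral over all parameter values in the denominator is typically where the computational cost lies.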

Here are some questions that could (at least in principle) be answered using Bayesian analysis.   What will Earth's climate be like in 100 years' time?  Or, given someone's current pattern of brain activity (e.g. EEG or fMRI signal), how likely are they to develop dementia in 10-20 years' time?

These are both questions for which it is unreasonable to expect a precise answer.  There is considerable uncertainty.  I would go further and argue that we do not even know how uncertain we are.  In the case of climate we have a fairly good idea of what the underlying physics is.  The problem is in numerically solving physical models at a resolution that is high enough to be a good approximation to the continuous field equations.  In the case of neuroscience, I am not sure we even know enough about the physics.  For example, what is the wiring diagram (or connectome) for the human brain?  We know the wiring diagram for the nematode worm brain - a relatively recent discovery that required a lot of work.  The human brain is a lot harder!  And even if we do get to the point of understanding the physics well enough, we will come up against the same problem with numerical computation that we have for the climate models.

A different route to answering these questions is to simplify the model so that computation is tractable.  Some people think that global temperature trends are fitted quite well by a straight line (see Nate Silver's book 'The Signal and the Noise').  When it comes to brain disease, if you record brain activity in a large sample of people and then wait 10-20 years to see whether they get the disease, it may be possible to construct a simple statistical model that predicts people's likelihood of getting the disease given their pattern of brain activity.  I went to a talk by Will Penny last week, and he has made some progress in this area using an approach called Dynamic Causal Modelling.

I see this as a valuable approach, but somewhat limited.  For its success it relies on ignoring things that we know.  Surely by including more of what we know it should be possible to make better predictions?  I am sometimes surprised by how often the answer to this question is 'not really' or 'not by much'.

What is computable with Bayesian analysis is still an open question.  This is both frustrating and motivating.  Frustrating because a lot of things that people try don't work, and we have no guarantee that there are solutions to the problems we are working on.  Motivating because science as a whole has a good track record of making the seemingly unknowable known.


  1. This is a really interesting reflection, thanks for sharing it. I am particularly interested in the point you make about models that are not really improved by including all the current knowledge available. It is challenging to understand which variables really matter for the mechanisms one is trying to capture, and which can be discarded in relation to the specific research question. In my limited experience I employ sensitivity analysis, but this is only a partial answer. How do you proceed when selecting or discarding variables? Can you recommend further reading on this topic?

  2. Thanks for the question! One specific issue that can come up when you are deciding what variables to include in your model is that more complex models tend to fit data better than simpler ones. Sometimes this is because they more accurately describe the underlying physical process / signal, but sometimes they give a better fit because they are fitting the noise better.

    There are two different paths you can go down to address this issue.

    (1) Validate the model by calculating out-of-sample prediction errors.
    (2) Calculate / estimate marginal likelihoods and use Bayes factors to compare models.

    Both of these approaches would penalize an overly complex model.
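
    As a rough illustration of path (1), here is a minimal sketch that fits polynomials of increasing degree to half of a synthetic data set and measures the error on the other half. The data-generating curve, the noise level, the 30/30 split and the helper name `holdout_errors` are all assumptions made up for this example, not taken from any of the references.

```python
import numpy as np


def holdout_errors(x, y, degrees, n_train):
    """Fit polynomials on the first n_train points.

    Returns {degree: (train_mse, test_mse)}, where the test error is
    computed on the remaining held-out points.
    """
    x_tr, y_tr = x[:n_train], y[:n_train]
    x_te, y_te = x[n_train:], y[n_train:]
    errors = {}
    for d in degrees:
        coeffs = np.polyfit(x_tr, y_tr, d)  # least-squares polynomial fit
        train_mse = np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2)
        test_mse = np.mean((np.polyval(coeffs, x_te) - y_te) ** 2)
        errors[d] = (float(train_mse), float(test_mse))
    return errors


# Synthetic data (assumed for illustration): a quadratic signal plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=60)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=0.3, size=60)

errors = holdout_errors(x, y, degrees=range(1, 9), n_train=30)
for d, (tr, te) in errors.items():
    print(f"degree {d}: train MSE {tr:.3f}, test MSE {te:.3f}")
```

    The training error cannot increase as the degree goes up, because each polynomial contains the lower-degree ones as special cases, so it cannot flag overfitting by itself. The held-out error is the number to compare across models.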

    In terms of references, the following books may be useful:

    Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning (2nd ed.). Springer.

    Gelman, A., et al. (2014). Bayesian Data Analysis (3rd ed.). CRC Press.

    The following paper is a good example of the marginal likelihood approach applied to differential equation models. It uses fairly recently developed methods.

    Penny, W., & Sengupta, B. (2016). Annealed Importance Sampling for Neural Mass Models. PLoS Computational Biology.

  3. Thank you very much for the references, now I think I have a better idea of where to start studying!