Tuesday, February 2, 2016

A statistician's bookshelf

What should a statistician have on their bookshelf?  In an age where most academics have access to electronic journal archives and Wikipedia is a fairly reliable source of information for basic facts, some statisticians might say they don't need anything on their bookshelf.

I often find myself eyeing up other people's bookshelves.  My co-supervisor in Reading, Ingo Bojak, has quite a voluminous bookshelf covering advanced particle physics and neuroscience.  A large personal library can be quite imposing in a way, making you feel small.  But it can also inspire curiosity.  A couple of the conversations I have recently had with Ingo ended with him picking a book off his shelf and lending it to me.  It can feel like a rite of passage, being lent a book, and in this case it was also practically useful because Ingo has books that are not in the University of Reading library.

My main supervisor, Richard, has quite a different approach to his bookshelf.  As well as a handful of core Statistics books, mainly related to his undergraduate teaching, he has a lot of proceedings - from the Valencia meetings, a series of influential meetings in Bayesian Statistics that has now morphed into the ISBA conference, and from Read Papers at the Royal Statistical Society.  So occasionally when we are meeting, he will say, 'Oh yes, I remember somebody talking about something to do with that at a conference / Read Paper I went to 15 years ago' and then be able to look it up in the proceedings to remind himself of more of the details.

The other interesting thing I have seen on his bookshelf, and on other people's, is undergraduate / postgrad course material, which seems to provide useful reference material for some people, even several decades after the end of their course.  My dad, who is an engineer for Total, recently commented that he wished he had had his own MSc thesis from the late 1970s to hand at a meeting, because he wanted to look something up in it that was relevant to what they were discussing.

I think most of the books I have on my bookshelf have grown out of successful undergraduate / early graduate lecture courses (see photo below).  A good lecture course distills the lecturer's knowledge, bringing out the ideas and methods that they think are useful for practice or research in the subject.  A good textbook is one where you feel like you can converse with the author(s), asking 'How would you go about... ?', and getting a specific answer back.

I have recently been finding the Shumway & Stoffer book on time series tremendously useful.  It is freely available online, and I used it in that format for quite a while, but I am glad I have a hard copy of it now, as I find it much easier to read from paper.  They have a very skilful way of presenting many important results from the time series literature (which is vast) in a coherent way, tying together time-domain and frequency-domain approaches, which are often treated separately by different authors.

The Mathematical Foundations of Neuroscience is a book I am currently borrowing from the Reading library, and it has been quite useful for getting a better understanding of the model that I am doing statistical inference for.  I think this is something that statisticians should do more of - finding out more about the models and underlying science in whatever application area they are working in.  Anecdotally, I have found lots of scientists who need help from statisticians, but who can only be helped if a statistician is prepared to invest the time in actually learning something about the scientist's subject.

The other textbooks on my bookshelf cover a varied range of topics within statistics and machine learning.  They have proved practically useful on a number of occasions.  Here are a few specific examples (which are far from exhaustive in terms of what the books cover):

  • Monte Carlo algorithms (MacKay)
  • Kalman Filters (Bishop)
  • Regression analysis (Ramsey & Schafer)
  • Boosting (Hastie, Tibshirani & Friedman)
  • Basic hierarchical models (Gelman et al)
  • Dynamic hierarchical models (Cressie & Wikle)

I sometimes flick through the books and think that there is still an awful lot there that I haven't looked at.  Who knows what these books might still have to teach me?