Monday, May 7, 2018

Navigating the C++ forest

I have recently been working my way through 'C++ Primer', 5th edition by Lippman, Lajoie and Moo.  The front cover says the book has been a bestseller since 1986 and also that the most recent edition has been completely rewritten for the new C++11 standard.  The authors have worked in leading companies and laboratories such as Bell Laboratories, Pixar, Microsoft, IBM and AT&T.  And they have worked closely with the creator of C++, Bjarne Stroustrup.  The contents of the book live up to the billing on the cover!

I learnt some C++ a long time ago with the help of a book called 'C++: A Beginner's Guide', 2nd edition by Schildt and a university course based on the book 'Guide to Scientific Computing in C++' by Pitt-Francis and Whiteley.  These books are both fairly introductory.  I would recommend them, not least because they are both much shorter than C++ Primer. 

The advantages of C++ Primer, in my opinion, are that it gives a more comprehensive overview of the language and has a strong emphasis on Modern C++, e.g., features that were introduced in the C++11 standard.  Somehow C++ Primer achieves this without being dry!

Another popular C++ book is Bjarne Stroustrup's 'The C++ Programming Language.'  I borrowed this from my university library before I bought C++ Primer and didn't really like it.  I think it covers advanced material in more depth than C++ Primer, but it reads much more like a reference book than a tutorial.  It is longer than C++ Primer and I suspect it contains a lot of material that is not relevant to what I'm doing.

The main thing that I've learnt over the last few months is that C(++) is not really one programming language.

Although you can compile and run programs written in C, old-style C++, and Modern C++ using the same compiler, they look and feel very different from each other.  When I only knew C and old-style C++ I found it very difficult to understand programs written in Modern C++.

The C language first appeared in 1972.  If you write in C then you have to learn about pointers and use arrays if you want to do calculations with vectors and matrices.  Dynamic memory allocation is particularly awkward to implement.  There are not that many concepts to learn, and I think it is not a bad language to learn if you are prepared to learn some basic low-level programming and want to write code that runs quickly.  Because C is so simple it tends to be easier to interface with other languages, such as Fortran and Matlab.
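
To give a flavour of what this looks like, here is a minimal sketch of a dot product of two dynamically allocated arrays, written in C style.  It is my own illustration rather than an example from any of the books above, and the function name dot_product_c_style is hypothetical.  I have written it so that it compiles as C++ (hence the casts and nullptr), but the pattern of malloc, pointer indexing and free is the C one.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: fills two heap-allocated arrays of length n and
// returns their dot product.  Returns -1.0 if an allocation fails.
double dot_product_c_style(std::size_t n)
{
    // malloc returns void*, so C++ needs an explicit cast (C does not).
    double* x = static_cast<double*>(std::malloc(n * sizeof(double)));
    double* y = static_cast<double*>(std::malloc(n * sizeof(double)));
    if (x == nullptr || y == nullptr) {
        std::free(x);  // free(nullptr) is a harmless no-op
        std::free(y);
        return -1.0;
    }

    // The "vectors" are just raw memory accessed through pointers.
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = static_cast<double>(i + 1);  // 1, 2, 3, ...
        y[i] = 2.0;
    }

    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }

    // Every successful malloc must be paired with exactly one free.
    std::free(x);
    std::free(y);
    return sum;
}
```

Calling dot_product_c_style(3) gives 12, since (1 + 2 + 3) * 2 = 12.  The bookkeeping (checking for failed allocations, remembering to free on every path) is the awkward part.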

C++ introduces features that are not available in C such as function overloading, inheritance, and the idea of data abstraction.  It is designed to facilitate Object-Oriented Programming (OOP).  Code written in OOP separates interface from implementation.  It is designed to give full control to library developers and stop users from doing anything stupid.  I generally find code written in OOP difficult to read.  I like to work in an environment where I can implement algorithms myself, but also have access to a library of algorithms that I can call easily.  Old-style C++ does not really seem to be set up for this.
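
As a rough illustration of these features, here is a small sketch in old-style (pre-C++11) C++.  The names magnitude, Shape, Rectangle and total_area are hypothetical and chosen only for this example.  It shows function overloading, an abstract base class acting as the interface, and a derived class supplying the implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Function overloading: one name, two parameter types.
double magnitude(double x)
{
    return std::fabs(x);
}

double magnitude(const std::vector<double>& v)
{
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        sum_sq += v[i] * v[i];
    }
    return std::sqrt(sum_sq);  // Euclidean length
}

// Interface: users only see that a Shape has an area.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;  // pure virtual, no implementation here
};

// Implementation: inherits from Shape and hides its data members.
class Rectangle : public Shape {
public:
    Rectangle(double width, double height) : width_(width), height_(height) {}
    virtual double area() const { return width_ * height_; }

private:
    double width_;
    double height_;
};

// Works with any class derived from Shape, without knowing which one.
double total_area(const Shape& a, const Shape& b)
{
    return a.area() + b.area();
}
```

The user of total_area never touches width_ or height_ directly, which is the sense in which the library developer stays in control.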

In contrast, it is relatively easy to learn Modern C++, at least to the level where you can use it for simple programming tasks.  There doesn't seem to be a precise definition of Modern C++, but this page has a fairly good summary - https://docs.microsoft.com/en-gb/cpp/cpp/welcome-back-to-cpp-modern-cpp.  I think it is fair to say that I knew almost nothing about Modern C++ from the initial learning I did (which was around 10 years ago).

Although I am not generally a fan of new standards for programming languages, I have noticed that a lot of answers on Stack Overflow for C++ questions use C++11.  Often solutions for different standards are presented and the C++11 version looks nicer / more elegant.
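
Here is the kind of side-by-side comparison I have in mind, as a minimal sketch of my own (not taken from any particular Stack Overflow answer).  Both hypothetical functions sum the elements of a vector; the first spells out the iterator type as you had to before C++11, the second uses auto and a range-based for loop.

```cpp
#include <vector>

// Pre-C++11 style: the iterator type is written out in full.
double sum_old_style(const std::vector<double>& values)
{
    double total = 0.0;
    for (std::vector<double>::const_iterator it = values.begin();
         it != values.end(); ++it) {
        total += *it;
    }
    return total;
}

// C++11 style: range-based for loop, with auto deducing the element type.
double sum_cpp11(const std::vector<double>& values)
{
    double total = 0.0;
    for (auto value : values) {
        total += value;
    }
    return total;
}
```

The two functions return the same result for any input; the only difference is how much of the machinery the reader has to look at.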

If you want to write programs that run fast without having to learn much / anything about low-level programming, Modern C++ is probably the way to go.  The vector and string classes in the Standard Library implement dynamic memory allocation without the user having to know anything about how this is implemented.  So it is possible to write clean looking programs that are actually doing quite sophisticated operations.
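
A minimal sketch of what this looks like in practice, with hypothetical function names squares and join_words.  Neither function contains an explicit allocation or deallocation; the vector and the string grow as needed and release their memory automatically.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the squares 1, 4, 9, ..., n*n.
std::vector<double> squares(int n)
{
    std::vector<double> result;
    for (int i = 1; i <= n; ++i) {
        result.push_back(static_cast<double>(i) * i);  // vector grows as needed
    }
    return result;  // storage is managed by the vector itself
}

// Hypothetical helper: joins words into one space-separated string.
std::string join_words(const std::vector<std::string>& words)
{
    std::string sentence;
    for (const auto& word : words) {
        if (!sentence.empty()) {
            sentence += ' ';
        }
        sentence += word;  // string reallocates behind the scenes if required
    }
    return sentence;
}
```

For example, squares(3) holds 1, 4 and 9, and joining the words "Modern" and "C++" gives "Modern C++".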

Really understanding Modern C++, to the point where you can write high-quality libraries, requires a lot of effort.  To give you some idea, C++ Primer starts with a section called The Basics, which runs to about 300 pages.  It is long because there is a lot to cover, not because the writing is verbose!  Object-Oriented Programming (including concepts such as inheritance) is not discussed until p600.  And the section on Advanced Topics starts on p715.

In summary, the world of C++ is difficult to navigate, at least in my experience.  C++ Primer is the best guide that I have found.  Chapter 1 of C++ Primer on Getting Started (only 30 pages) is particularly impressive for the range of ideas that it introduces.  I would recommend C++ Primer to people who are literally just getting started with programming as well as to people who want to write high-quality libraries.


 

Thursday, March 29, 2018

an ancient annal of computer science

Over the last year I have been interested in developing my programming / coding, to get to the point where I can be more confident of sharing my code with other people.  And also to be able to contribute to general purpose numerical / statistical software.

As part of this effort I have dipped into The Art of Computer Programming (TAOCP) by Donald Knuth.  The cover says "this multivolume work is widely recognized as the definitive description of classical computer science."  American Scientist listed it as one of the 12 top physical-science monographs of the 20th century alongside monographs by the likes of Albert Einstein, Bertrand Russell, von Neumann and Wiener - http://web.mnstate.edu/schwartz/centurylist2.html.

I am sure there are many other books that cover similar material at a more introductory level, but I find something exciting about going back to the source and reading an author who was personally involved in fundamental discoveries and developments.

There are also probably more modern accounts of computer programming that better reflect more recent innovations.  Knuth himself encourages readers of TAOCP to look at his more recent work on Literate Programming.  But I also think it is worth dwelling on things that have proven to be useful to a wide range of people over an extended period of time.

I have Volume 1 in the Third Edition of TAOCP, published in 1997, which is already prehistoric in some senses - it is before Google was founded (1998) and way before Facebook was launched (2004).  However, parts of the book date a lot further back than that - Knuth's advice on how to write complex and lengthy programs was mostly written in 1964!

Here is a summary of that advice (p191-193 of TAOCP Volume 1):

Step 1 : develop a rough sketch of the main top-level program.  Make a list of subroutines / functions that you will need to write.  "It usually pays to extend the generality of each subroutine a little."
Step 2 : create a first working program starting from the lowest-level subroutines and working up to the main program.
Step 3 : Re-examine your code starting from the main program and working down studying for each subroutine all the calls made on it.  Refactor your program and subroutines.

Knuth suggests that at the end of Step 3 "it is often a good idea to scrap everything and start again".  He goes on to say "some of the best computer programs ever written owe much of the success to the fact that all the work was unintentionally lost, at about this stage, and the authors had to begin again." - quite a thought-provoking statement!

Step 4 : check that when you execute your program, everything is taking place as expected, i.e., debugging.  "Many of today's best programmers will devote nearly half their programs to facilitating the debugging process in the other half; the first half, which usually consists of fairly straightforward routines that display relevant information in a readable format, will eventually be thrown away, but the net result is a surprising gain in productivity."

I don't know whether today's best programmers still do this.  I know some pretty good programmers and have been surprised how much effort they devoted to the kind of activity that Knuth is describing.  Personally I now rely quite a lot on the debugger in Visual Studio, and (indirectly) on compilers to give me most of the debugging information I need for not much effort.
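
For what it is worth, here is a minimal sketch of the kind of "fairly straightforward routine that displays relevant information in a readable format" that Knuth describes.  This is my own illustration, not code from TAOCP, and the function name describe is hypothetical.  It formats a named vector as a single line of text that can be printed wherever you want to inspect intermediate results.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical debugging aid: formats a named vector as
// "name[size] = { a, b, c }" so it can be printed or logged.
std::string describe(const std::string& name, const std::vector<double>& values)
{
    std::ostringstream out;
    out << name << "[" << values.size() << "] = {";
    for (std::size_t i = 0; i < values.size(); ++i) {
        out << (i == 0 ? " " : ", ") << values[i];
    }
    out << " }";
    return out.str();
}
```

For a vector x holding 1.5, 2 and 3, describe("x", x) returns the text x[3] = { 1.5, 2, 3 }, which could be sent to std::cerr at each stage of a calculation and deleted once the program is working.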





Friday, March 2, 2018

Pensions for professors

It is not often that universities make front page news but the recent strike by university lecturers seems to have got quite a lot of media coverage.

On the surface it looks like quite a straightforward dispute about money.  University vice-chancellors (represented by a body called Universities UK) are proposing to reduce the pensions that university staff will receive in the future.  The reason they are doing this is that existing contributions to the pension fund for universities (the USS) are not expected to cover the cost of future pensions.

One political commentator, who I have a lot of respect for, Daniel Finkelstein, has said that lecturers are striking against themselves.  He argues that increased contributions from universities to the USS would have a damaging effect on university lecturers.  As a result of increasing contributions, universities would have to either pay lecturers a lower salary and/or employ fewer of them.

He also argues that it would be unfair for the government to increase funding to universities in order to pay generous pensions at a time when the NHS is strapped for cash, prisons seem to be nearing a state of anarchy and universities are already generously funded by students through expensive tuition fees.  A large chunk of these tuition fees may end up being paid by the government if students are unable to pay back their loans.

While I find this line of reasoning quite persuasive, it seems to be predicated on the assumption that there will be an indefinite squeeze on the nation's finances.  As a country we have had around 7 years of government austerity.  Recent news suggests that this austerity has been successful in reducing the government deficit from around £100bn a year to zero - https://www.ft.com/content/3f7db634-1cac-11e8-aaca-4574d7dabfb6.

So will the squeeze be indefinite or are we approaching the end of it?  Nobody really knows.  As of 12 months ago, the OBR, which produces official forecasts of the government deficit, was still forecasting a large deficit for 2018-19.  But tax receipts have been a lot stronger than expected.  Speaking from personal experience, these things are difficult to forecast!

My view is that economic growth and tax receipts will be stronger than they have been for much of the last 10 years.  As a result, the USS will probably not run out of money and if it does, the government should inject some extra cash to keep it afloat.  There are many competing spending priorities for the government, but I think that attracting and retaining bright people across the public sector is essential.  While there are many who are drawn to the public sector purely with a desire to contribute to society, generous public sector pensions do play a big role in encouraging people to stay.  I think these pensions should continue so that public services can flourish as they ought to.