Big Data and Working Lives

Letter carrier with dog, silhouette (watercolour). Courtesy of The Postal Museum, London.

In the last decade or so ‘big data’ has become an increasingly common phrase in all manner of academic fields, and British history has been no exception. Numerous projects have produced large datasets on issues including the legacies of slavery, British fertility decline, entrepreneurship in Britain, and crime and punishment in London and beyond. In many respects ‘big data’ is a new term for a long-running kind of historical scholarship: historians and demographers working on the census, civil registration and parish records have been assembling large databases of individual-level records for more than fifty years.

However, the current era of big data scholarship has some new features. First, it tends to be more easily accessible to researchers, as most outputs are deposited online with the UK Data Service or similar repositories. Secondly, the development of computing power and programming, especially geographic information systems (GIS), makes analysis of these data far easier than before. Thirdly, advances in digitisation techniques have increased the geographical scale on which such projects operate: where once scholars worked with case studies or samples of census data, we can now use the entire census. These advances have occurred throughout the world, so international comparative work can be done at a scale that was previously difficult or impossible. Such historical data can also be linked to contemporary data to investigate the long-run history of present-day economic, demographic and social issues.

Our project is in part a big data project. We will be gathering information on around 25,000 Post Office pensioners and using these data to investigate occupational health and morbidity. These pensioners will also be traced into other sources: the census and death certificates. En masse, these thousands of people can tell us a great deal about important historical topics. Big data, and the sources on which it draws, are often thought to be useful only for the aggregates they produce. However, these sources are valuable not just for what they show when combined into grand totals: they are often the only source of information on individuals who are otherwise absent from the archive and from many narratives of nineteenth-century history.

This reminds us that even remarkable individuals often leave little visible mark on the archival record. For example, in 1869 Richard Carroll, a rural messenger working for the Post Office in Llanbeblig, Wales, retired at the age of 87. His final salary was £31.5.8 and he was awarded a pension of £6.15.6. The reason given for his retirement is simply ‘worn out’. To have been working at this age already tells us something about the nature of old age in this period, but his pension record reveals that he had worked for the Post Office for just 13 years, meaning he started delivering mail at the age of 74. He can be found in the census, where in 1871 he is living in Llanbeblig and returns himself as a ‘Late Letter Carrier’ (hopefully ‘late’ referred to his age rather than the timeliness of his deliveries!). He was living with his wife, Elenor, two of their daughters and a granddaughter. They are likely to have lived there for some time, as both daughters, aged 32 and 29 in 1871, were born in Llanbeblig. Tracing him back in the census, we find Richard in Llanbeblig in 1861 and 1851. In 1861 he was working as a ‘Letter Carrier in a Post Office’, but in 1851 he was a ‘Waiter’, confirming the relatively short Post Office career given in his pension record. Perhaps most intriguingly, the census entries reveal that he was born in Jamaica. The pension records reveal that he died on 12 July 1872. More is to be done, but the pension records, combined with the census, allow us to begin to construct an account of a life that involved transatlantic migration, marriage and work late into old age. This information not only contributes to the larger dataset that underpins the statistical analysis of the historical geographies of morbidity that we will undertake, but also reveals a great deal about a life that is otherwise little touched upon by the historical record. It also provides context for current government initiatives encouraging people to adopt longer working lives.

This ability of big data to speak to both the micro and the macro aspects of history is what has attracted me to it as a means of historical scholarship, and what makes the Addressing Health project so interesting. This is the third project I have worked on that has involved the creation and analysis of substantial historical databases. The first was the Victorian Professions project, a collective biography of thousands of Victorian professionals and their families and descendants. The second was the Drivers of Entrepreneurship and Small Business project, which used the censuses to identify every business proprietor in Britain between 1851 and 1911. In both cases these data allowed us to produce new interpretations of British economic and social history, but equally they shed light on individuals who can otherwise be hard or impossible to study. The reconstruction of such lives is as important an aspect of big data as the number crunching.

Harry Smith