Charles Booth was a nineteenth-century social researcher who produced fantastically detailed maps of London that described poverty levels right down to individual properties. Recently, the BBC have been re-running The Secret History of Our Streets, which uses his maps as the backdrop to a number of London communities, looking at what shaped the areas and how they changed.

This got me wondering whether it was possible to produce something approaching his maps using today’s open data stores.

The Ordnance Survey recently released a new detailed vector dataset, OpenMap, which includes a layer showing buildings in almost as much detail as Charles Booth’s work. In addition, the ONS releases a huge range of census data at “output area” level, a statistically relevant area about the size of a large postcode, and they also make the GIS boundary files that describe the output areas freely available.

Combining these two data sources makes it relatively easy to produce maps that are akin to Charles Booth’s work.

A modern Charles Booth map

Charles Booth’s maps were simultaneously a fantastic piece of pioneering research and a work of art. I can’t match his aesthetic abilities, but UK open data allows anyone to get close with the research.

The main background map is made up of simply rendered OpenMap layers in QGIS, with the building layer adapted to include, in this case, the percentage of households described as being in social classes D and E from ONS data.

My data was held in a series of PostGIS tables so the building layer was made with a short SQL script:

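-- One row per (building, output area) pair, with the share of class D/E households
SELECT ROW_NUMBER() OVER (ORDER BY omb.geom) AS gid,  -- synthetic unique ID for mapping
       omb.geom,
       -- NULLIF avoids a division-by-zero error where an output area has no classified households
       social.class_de::real
         / NULLIF(social.class_ab + social.class_c1 + social.class_c2 + social.class_de, 0) AS pc_de
FROM   ons.oa_2011_polygon AS oap,
       os.openmap_building AS omb,
       os.boundary_district AS osd,
       ons.oa_social_class AS social
WHERE  social.ons_code = oap.oa_code
AND    osd.name LIKE 'West Oxfordshire %'
AND    ST_Within(oap.the_geom, osd.the_geom)
AND    ST_Intersects(omb.geom, oap.the_geom);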

The last line selects the buildings that intersect with a given ONS output area – I opted for intersection rather than those contained within an area to make sure buildings at the edges were included.

As this might lead to some buildings appearing in multiple areas, I generated a unique ID for mapping using ROW_NUMBER for each row returned by the query rather than simply adopting the table’s own gid column. In hindsight, I probably should have found each building’s centroid and located that in a single output area.
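For anyone wanting to try the centroid approach, a minimal sketch might look like the query below. It is untested against the real tables and assumes the same table and column names as the query above, that the building table's gid column is unique, and that all geometries share one coordinate reference system.

```sql
-- Sketch only: assign each building to the single output area that
-- contains its centroid, so no building is returned more than once.
SELECT omb.gid,   -- the table's own ID can be reused, assuming it is unique
       omb.geom,
       social.class_de::real
         / NULLIF(social.class_ab + social.class_c1 + social.class_c2 + social.class_de, 0) AS pc_de
FROM   os.openmap_building AS omb
JOIN   ons.oa_2011_polygon AS oap
       ON ST_Within(ST_Centroid(omb.geom), oap.the_geom)
JOIN   ons.oa_social_class AS social
       ON social.ons_code = oap.oa_code
JOIN   os.boundary_district AS osd
       ON ST_Within(oap.the_geom, osd.the_geom)
WHERE  osd.name LIKE 'West Oxfordshire %';
```

The trade-off is at the edges: a building whose centroid falls in an output area outside the district is dropped rather than drawn, and a centroid lying exactly on a boundary matches no area at all. For oddly shaped buildings the centroid can also fall outside the footprint; ST_PointOnSurface is a possible substitute if that matters.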

The datasets are also huge, so I limited the data to a single District Council area just to keep performance at a sensible level.

Once you have the model, it’s a relatively simple process to map different census datasets or to switch to something else altogether. The core of my work is in broadband, so the map below combines OS OpenMap data with Ofcom’s broadband performance data.

Charles Booth on Broadband

Personally, I find Charles Booth’s method of colouring buildings more potent and graspable than shading whole polygons in a standard choropleth map.

  1. Lorne Mitchell says:

    You are definitely onto something here, Adrian. The crude way that the EU, Ofcom and BDUK plan for the next wave of broadband coverage is twentieth century!

    It is interesting that the nineteenth century was more advanced than this – in that it plotted data by household.

    If you could work out the actual services offered (particularly by rural community), I am sure that Greg Clark (my local MP) would be most interested in his new role as Secretary of State for Communities and Local Government.

    By services, I don’t just mean broadband. I mean gas, electricity, water, sewage etc. etc.

    Big Data should deliver precise maps of what is possible and then allow us to work out the best way to deliver to communities – not just households – as well as break the national monopolies that are relied on by Westminster civil servants who like averaging things until they have no meaning.

    • Adrian Wooster says:

      Thanks Lorne. Most services are fairly deterministic so models can be created with relatively simple assumptions. The beauty of this approach is that the base data is now publicly available, so it’s now “simply” down to people finding novel uses for it. Combining OS, ONS and Ofcom data isn’t difficult but it can give a rich insight into broadband planning. There are any number of other datasets on transport (passenger volumes by station are publicly available, for example), floodplains, etc. that could be equally interesting.

      In the past this was something which could only be done by major companies or public bodies; now it’s something anyone can do. I helped my kids with some homework on modelling the sea level rises due to climate change and its impact on people living in central London, all with free data and free software.
