Friday, 1 July 2016

How far are Hedgehogs from a road?

My last hedgehog sighting in Britain: Elston, Nottinghamshire 2010.


One of my great joys with OpenStreetMap (and other (mainly) geographical Open Data) is that it provides a way into answering intriguing analytical questions.

A few weeks ago the query was from a Hedgehog ecologist: naturally I learnt of the query through OSM (via IRC to be precise).

The question was very simple:  

What proportion of Britain's land area is more than 100 m from a road?  

The reason it is germane for hedgehogs is that historically they have had a very high mortality from crossing roads. These days they are so rare that spotting a squashed hedgehog is itself a rarity. Certainly this cartoon would not have the same resonance it did when it first appeared in the 1970s.

Answering the query is fairly straightforward, provided one has either a GIS tool or a database to hand AND a full data set of British roads. QGIS and PostGIS were available & I also have a full set of OSM data for May 2015 in the latter.


The steps involved in answering the query were simple ones, as follows:
  • Select a relevant subclass of highway elements from OSM (in this case deliberately selecting the main types of motorable roads)
  • Re-project into British Grid (which enables more precise answers to this type of query)
  • Buffer all the roads by 100m
  • Eliminate overlaps
  • Sum the area covered 
  • Subtract that from the known land area of Great Britain.
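As a rough illustration of the steps above, here is a minimal Python sketch. It is not the PostGIS workflow: instead of buffering and merging geometries it uses a plain raster approximation, counting 10 m cells whose centre lies within the buffer distance of any road segment (which also deals with overlaps, as each cell is counted once). The function names, the cell size and the two test roads are all hypothetical, and coordinates are assumed to be already projected into metres.

```python
import math

def dist_point_segment(px, py, ax, ay, bx, by):
    """Distance in metres from point P to the segment A-B."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the closest point on the segment, clamped to its ends
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def buffered_area(segments, extent, buffer_m, cell_m=10.0):
    """Approximate area (m^2) inside extent lying within buffer_m of a segment.

    extent is (xmin, ymin, xmax, ymax); each raster cell is counted at most
    once, so overlapping buffers are not double counted.
    """
    xmin, ymin, xmax, ymax = extent
    nx = int(round((xmax - xmin) / cell_m))
    ny = int(round((ymax - ymin) / cell_m))
    hits = 0
    for i in range(nx):
        cx = xmin + (i + 0.5) * cell_m
        for j in range(ny):
            cy = ymin + (j + 0.5) * cell_m
            if any(dist_point_segment(cx, cy, *s) <= buffer_m for s in segments):
                hits += 1
    return hits * cell_m * cell_m

# Hypothetical 1 km square crossed by two straight roads
roads = [(0.0, 500.0, 1000.0, 500.0), (500.0, 0.0, 500.0, 1000.0)]
extent = (0.0, 0.0, 1000.0, 1000.0)
total = 1000.0 * 1000.0

near = buffered_area(roads, extent, 100.0)   # area within 100 m of a road
far_share = (total - near) / total           # share more than 100 m away
print(near, far_share)
```

For this toy square the two 200 m wide strips overlap in a 200 m by 200 m patch, so the area near a road is 360,000 square metres and 64% of the square is more than 100 m from a road.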
In practice for this type of work I nearly always grid the data so that costly queries relying on the GiST indexes on geometries have to do less work. So I clip roads using a 2 or 5 km grid. All roads in a given grid square can either be merged into a multiline structure & then buffered, or buffered and then merged into a single multipolygon. One way is probably more efficient than the other. Once buffered the buffer area

In my initial analysis, which I carried out very quickly, I made three errors:
  • Failing to include trunk roads in the initial road selection through missing quotes in the SQL predicate
  • Using st_within & not st_intersects to find roads crossing a grid square.
  • Clipping back to grid square boundaries in all cases. A road ending on the edge of one grid square still creates a 100 m semicircle in the adjacent square
As usual it was visualisation of the data which hinted at the things I needed to fix. Certain main roads were missing, there were islands of roads in central Wales, and the grid boundaries were too obvious when visualised.
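The third error can be illustrated with a small Python sketch. It works out which grid cells are overlapped by the 100 m disc around a road's end point; the function name and the coordinates are hypothetical, and a 2 km grid aligned on the origin is assumed.

```python
import math

def cells_touched(x, y, buffer_m, cell_m):
    """Grid cells (col, row) overlapped by a disc of radius buffer_m at (x, y)."""
    cells = []
    c0 = math.floor((x - buffer_m) / cell_m)
    c1 = math.floor((x + buffer_m) / cell_m)
    r0 = math.floor((y - buffer_m) / cell_m)
    r1 = math.floor((y + buffer_m) / cell_m)
    for c in range(c0, c1 + 1):
        for r in range(r0, r1 + 1):
            # Closest point of the cell rectangle to the disc centre
            nearest_x = min(max(x, c * cell_m), (c + 1) * cell_m)
            nearest_y = min(max(y, r * cell_m), (r + 1) * cell_m)
            if math.hypot(x - nearest_x, y - nearest_y) < buffer_m:
                cells.append((c, r))
    return cells

# A road ending exactly on the boundary between cells (0, 0) and (1, 0)
touched = cells_touched(2000.0, 1000.0, 100.0, 2000.0)
print(touched)

# Area of the semicircle which spills into the neighbouring cell
spill_m2 = math.pi * 100.0 ** 2 / 2
print(round(spill_m2))
```

Here the end point's buffer reaches both cells, so clipping the buffer back to the cell which holds the road loses roughly 15,700 square metres in the neighbouring one. One way round this is to select roads against the grid square expanded by the buffer distance and only clip after buffering.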

Initial visualisation. Note big gaps in Central Wales and absence of continuous lines on motorways.

Visualisation with fixed data. Compare Central Wales and motorways.
Area W of Birmingham enlarged. The main population centres are Shrewsbury (top centre) and Telford (right of top centre) with the West Midlands conurbation on right edge. Welsh hill country is top left.

The first I suspected anyway.  I'd written the selection rather hurriedly & there are 11 types of highway I include normally, so it's rather easy to mis-spell one, or as in this case not spot missing quotes. Easy to fix.

The second was less immediately obvious, but it's one which has caught me out in the recent past too: writing st_within in a query where st_intersects is required. I suppose I do point-in-polygon type queries more frequently so instinctively type st_within. However, st_within is only true when a geometry lies entirely inside another, so roads crossing a grid square boundary are dropped, whereas st_intersects is true for any geometry which touches or overlaps the square.
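The difference between the two predicates is easy to show with a toy example. The Python sketch below uses simplified stand-ins for st_within and st_intersects which only handle a straight segment against an axis-aligned grid square (the intersection test is a Liang-Barsky style clip), and which ignore the finer boundary rules of the real PostGIS functions. All names and coordinates are hypothetical.

```python
def segment_within_box(seg, box):
    """True if both ends of the segment lie inside the (convex) box."""
    ax, ay, bx, by = seg
    xmin, ymin, xmax, ymax = box
    return all(xmin <= x <= xmax and ymin <= y <= ymax
               for x, y in ((ax, ay), (bx, by)))

def segment_intersects_box(seg, box):
    """True if any part of the segment lies inside or on the box."""
    ax, ay, bx, by = seg
    xmin, ymin, xmax, ymax = box
    dx, dy = bx - ax, by - ay
    t0, t1 = 0.0, 1.0  # portion of the segment still inside the box
    for p, q in ((-dx, ax - xmin), (dx, xmax - ax),
                 (-dy, ay - ymin), (dy, ymax - ay)):
        if p == 0:
            if q < 0:
                return False  # parallel to this edge and outside it
        else:
            t = q / p
            if p < 0:
                t0 = max(t0, t)  # entering the box
            else:
                t1 = min(t1, t)  # leaving the box
            if t0 > t1:
                return False
    return True

cell = (0.0, 0.0, 2000.0, 2000.0)  # one 2 km grid square
roads = {
    "inside": (500.0, 500.0, 1500.0, 1500.0),
    "crossing": (1500.0, 1000.0, 2500.0, 1000.0),
    "elsewhere": (3000.0, 3000.0, 3500.0, 3500.0),
}

by_within = [n for n, s in roads.items() if segment_within_box(s, cell)]
by_intersects = [n for n, s in roads.items() if segment_intersects_box(s, cell)]
print(by_within)
print(by_intersects)
```

The "crossing" road is picked up by the intersects test but not by the within test. A road like that is not within any grid square at all, so it vanishes from the analysis entirely, which would be consistent with the gaps in the first visualisation.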

Either way these were easy errors to fix. To do the calculation I used an Ordnance Survey Open Data file I had to hand (one of local authority boundaries), together with the OSM outline for the Isle of Man (as this is not available in OSGB Open Data). This results in a slight overstatement of land area, as some LA boundaries include marine areas (notably in The Wash & in the Bristol Channel). The area for Great Britain and Isle of Man which I used was 238,905 square km, but this can be corrected to any other widely available figure, such as that given on Wikipedia of 229,519 square km. Equally the area of road buffers will include some sea for coastal roads. Ideally I need my grid cells to be land area only, something I'm thinking of doing for other purposes too.

I summed the area of the road buffers for 100 m, 200 m, 500 m and 1000 m. I was also asked for a 5 km buffer, which I calculated separately using a larger grid size. Note that gridding is purely something done to speed up the process. A recent talk (sorry, I can't find the link for the slide deck, but it was from a recent geo conference) highlighted that many standard geometry routines have poor computational complexity, so one often has to resort to divide-and-conquer approaches such as gridding the data.

So here are the approximate figures, given the caveats above:
  • Within 100 m of a road: 53,964 km2 or about 23.5% of land area of Great Britain
  • Within 200 m of a road: 87,971 km2 or about 38.3% of land area
  • Within 500 m of a road: 150,147 km2 or about 65.4% of land area
  • Within 1000 m of a road: 189,562 km2 or about 82.6% of land area
  • Within 5000 m of a road: 260,419 km2 or rather more than the land area concerned.
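The percentages in this list can be reproduced with a few lines of Python, assuming the Wikipedia land area of 229,519 square km is the denominator (the variable names are just for illustration):

```python
LAND_AREA_KM2 = 229_519  # Wikipedia figure for Great Britain quoted above

# Buffer distance in metres -> summed buffer area in square km
buffer_area_km2 = {
    100: 53_964,
    200: 87_971,
    500: 150_147,
    1000: 189_562,
    5000: 260_419,
}

share = {d: round(100 * a / LAND_AREA_KM2, 1) for d, a in buffer_area_km2.items()}
for d, pct in share.items():
    print(f"Within {d} m of a road: {pct}% of land area")

# The original question: land more than 100 m from a road
far_100 = round(100 - 100 * buffer_area_km2[100] / LAND_AREA_KM2, 1)
print(f"More than 100 m from a road: {far_100}%")
```

On these numbers about 76.5% of the land area is more than 100 m from a road, and the 5000 m buffer comes out at over 100% of the land area.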
The last figure made it abundantly clear that I needed to address the sources of approximation. Coastlines & other borders introduce edge effects which can seriously complicate this type of analysis and often require ensuring that one has appropriate data to handle it. So I did the following:
  • Created a shape file of the land area for Great Britain & Isle of Man
  • Clipped each buffered result set by that shape file
  • Used each of these inputs for the final calculation
In order to make the shape file I needed something based on the coastline. Fortunately I remembered Geolytix had used OSGB Open Data of coastlines to produce refined postal area polygons. This was immediately available to me and I therefore used this as the starting point. I created a separate Isle of Man shapefile from OSM data and merged the two together. So here are the revised figures:
  • Within 100 m of a road: 53,760 km2 or about 23.3% of land area of Great Britain
  • Within 200 m of a road: 87,353 km2 or about 37.9% of land area
  • Within 500 m of a road: 147,740 km2 or about 64.2% of land area
  • Within 1000 m of a road: 186,533 km2 or about 79.7% of land area
  • Within 5000 m of a road: 226,418 km2 or 98.3% of the land area. 
And just to enable checking here are some visualisations:

Distances for England & Wales
Mainland Scotland

English Peak District (Dark Peak)
Manchester is on the left (W), Sheffield on the right (E)

North Wales & Merseyside

Isle of Wight and South Coast of England

South West England

Central Belt & Southern Inner Hebrides, Scotland
Kintyre (bottom left of centre) shows forest tracks erroneously mapped as roads

Even now there are one or two problems: specifically forestry tracks on Kintyre mapped as unclassified roads (I think I caught most of these and corrected them some time ago). Some roads on Scottish Islands have a status which is unclear too. However the visualisations of the data do help to highlight such cases.

The other thing is that for much of the country the patches of land which are further than 1 km from any road are easily recognisable, to the extent that they can often be readily named. The two big patches in the Peak District are Bleaklow and Kinder Scout. So the visualisation also shows other things which may be of interest for other reasons.


The sad part of this post is that it wouldn't be necessary if hedgehogs were not in a steep decline across Britain.
