Is there going to be a new and complete Naptan / bus stop import?

Hi,

I am aware that there was a small Naptan import into OpenStreetMap some time ago, as detailed on this wiki page:

http://wiki.openstreetmap.org/wiki/NaPTAN

However, as the bus stop data on OSM is not accurate, are there plans to clear the bus stop data in the UK and replace it with a bulk import of the Naptan data now that it is open data (http://data.gov.uk/dataset/naptan)?

Here is a comparison between OSM and Google. Google’s stops are correct and match the Naptan data:

https://www.openstreetmap.org/node/21664801#map=17/53.74898/-2.46514
https://www.google.co.uk/maps/@53.7488507,-2.4650802,17z

Thanks,

K

Firstly, it would probably be best to discuss this on the talk-gb mailing list, as there are probably more interested people there.

Yes, quite a few areas did have the Naptan data imported a few years ago.
For accuracy of Naptan, it seems it can vary a lot by area. I think some council areas have very inaccurate positions, or contain stops which don’t actually exist etc. So I would not assume that it is always correct.
Quite a lot of work has been done checking the Naptan stops imported into OSM, and correcting and improving them where necessary. Deleting all this for a new import would be rather unhelpful.

Yes, it could be helpful to import data for the areas where it hasn’t been done.
It would also be good to check the latest version of Naptan, and see how it compares with previous versions and with stops already in OSM. I don’t know if there are any suitable tools to help with this.

OK can you point me towards the mailing list?

Re: Naptans, their location may not be 100% accurate in the Naptan DB, but every bus stop has to have a Naptan code (and therefore a description/placename, etc), otherwise the bus stop cannot be included in the tenders that the bus operators need to fulfil when pitching for routes with their local authority.

The problem at the moment is that most stops in OSM are not identified by Naptans, so when the map is edited additional stops are added, rather than the correct one being moved. This results in a haze of poorly placed stops which are unaccountable, as per the example in my first post.

Also note the Naptan database is used by Traveline, the main journey planning provider in the UK. They work on behalf of all bus companies across the UK.

all mailing lists : http://wiki.openstreetmap.org/wiki/Mailing_lists
UK mailing list: https://lists.openstreetmap.org/listinfo/talk-gb

I think we can basically answer: no, there will not be any new NAPTAN imports, whether of a district or the country as a whole. Essentially, the imported NAPTAN data was not of high enough quality to be used without resurvey. Surveying bus stops proved too labour intensive, so only data in one district was actually fully resurveyed. Originally NAPTAN data provided far more than just bus stops: in many cases it enabled naming of streets and plotting of missing roads.

If it is not realistic for mappers to check the data then it tends to erode in quality, and bus stops in particular tend to change a fair bit. Continual re-importing would mean that mappers wouldn’t bother with bus stops, which would also lead to progressive loss of quality.

A better solution is to a) make use of a mash-up of NaPTAN and OSM data in applications (which is what Traveline do); and b) highlight discrepancies in such a way that it helps mappers target these for surveying (see Post Hoc, a site which does this for post boxes).
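Point (b) can be sketched in a few lines of Python. This is only an illustration, not an existing tool: it assumes both datasets have already been reduced to dictionaries keyed by ATCO code, and the function names, the 20 m threshold and the sample codes and coordinates are all made up for the example.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def find_discrepancies(naptan_stops, osm_stops, threshold_m=20.0):
    """List (code, distance_m) for stops in both sets that lie further
    apart than threshold_m, largest gap first. Nothing is modified."""
    report = []
    for code, (nlat, nlon) in naptan_stops.items():
        if code in osm_stops:
            olat, olon = osm_stops[code]
            distance = haversine_m(nlat, nlon, olat, olon)
            if distance > threshold_m:
                report.append((code, round(distance, 1)))
    return sorted(report, key=lambda item: -item[1])


# Hypothetical sample data: code -> (latitude, longitude).
naptan = {"CODE1": (53.7490, -2.4650), "CODE2": (53.7500, -2.4660)}
osm = {"CODE1": (53.7490, -2.4650), "CODE2": (53.7510, -2.4660)}

for code, distance in find_discrepancies(naptan, osm):
    print(f"{code}: OSM and NaPTAN positions differ by {distance} m")
```

The output is just a list of candidates for a mapper to survey. It does not say which of the two positions is right.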

thanks @escada

Naptan data is improving all the time and is now much more reliable. The reason for this is that modern buses use GPS which means Naptans are used to determine fares. i.e. a bus pulls up at a stop and the bus driver does not press where he is on the ticket machine, it is set automatically by GPS, then the destination for each passenger is entered (if paying in cash). It also means that lat/lng of bus stops is surveyed by the buses running the routes. This is in addition to the demands put upon the Naptan database to be accurate when route tendering for local transport executives and online journey planners / mapping.

I have spoken to Traveline and they exclusively use Naptan data for their bus stop information, they are the organisation tasked with maintaining bus information in the UK for all operators and the source of information for all national journey planners.

If there is a mashup of data then the latest Naptan database should be the primary source. This could then be compared to hand-surveyed stops on OSM that have Naptan codes and, as you say, discrepancies reported and checked. However, stops on OSM which do not have a Naptan code have to be removed, otherwise they will remain on the map verbatim - they can never be corrected or merged with any other dataset as they have no reference. The sheer volume of duplicate and misappropriated stops on OSM makes a manual clean impossible.

However, I believe the more sensible approach is for mappers to contribute to the Naptan database so that changes are made at source, and the Naptan database is regularly imported into OSM. This prevents the datasets from forking once the full import is undertaken.

I have to say I don’t quite understand the problem here. If you want to use NaPTAN data, then you can - and just ignore the bus info in OSM altogether.

Any imported copy of any dataset that continually changes will be out of date with respect to the source data the moment that it’s been imported - IMHO any copy of that data in OSM will only ever be a convenience to those people who don’t want to fetch data from multiple sources*. However, all “official” data sources contain errors, and what I’m not seeing though is any way to get corrected data from OSM back into NaPTAN, so that where the stop has moved (e.g. http://www.openstreetmap.org/node/502386142/history ) the change can be fed back**.

The experience with NaPTAN last time was a bit of a curate’s egg - good in some areas, lousy in others. There’s definitely scope for “how does OSM compare with NaPTAN” tools though - which might be as simple as a bit of data wrangling and an OSM diary entry with a uMap link in it.

Cheers,

Andy

* though a quick scan through OSM lists shows that there are plenty of people who just want to have “all the things” in OSM, without thinking at all about data validity, maintenance etc.

** neither Google Maps nor Traveline (both of which use raw NaPTAN) have a stop there - they suggest that I walk from one up the road. It looks like they deleted it when it moved, but did not re-add the new location.

(edit: spelling)

Do you work for Naptan?
How many bus stops have you surveyed and checked whether the Naptan data is correct and accurate? And in what areas of the UK?
Maybe it is improving, but I don’t think many buses actually have GPS yet.

I’d agree that the Naptan database should be the primary source.
But how can mappers contribute to it? If I notice an error with a bus stop location, how do I report it?
Does Naptan have details of bus shelters? The original import into OSM didn’t have this.

The NaPTAN import is controversial, particularly because it was done with no mechanism for subsequently maintaining it. In fact, there seems to be a very strong no lobby for any mechanical import in the UK.

If there was a re-import of NaPTAN data, it would not be a clear and reload, it would be a merge with the existing data, and, in particular, I would not expect the existing data to take precedence as far as the location is concerned, although one might make an exception if the NaPTAN data had not been verified and the location had not been manually changed. NaPTAN names and actual names are fortunately distinct. Even if it turns out that a re-imported NaPTAN had names that were actually current, the name fields include distinctions between the major and minor part of the name which are not in the NaPTAN data, so one should not overwrite a name that could be plausibly derived from the new NaPTAN data.

One big problem with NaPTAN is stop areas. As these are not constructs that are verifiable on the ground, one can end up with a stop area name that no longer reflects any of the names of the stops it contains and likely refers to a pub or school that no longer exists (stop names change remarkably frequently). These stop areas are used on the transport map, when you zoom out.

Another problem is that, where a stop doesn’t seem to exist on the ground, or has moved a long way, people who don’t read the NaPTAN import rules will simply delete the node entirely. A lot of other nodes have vanished because of users doing things like cleaning up the map for personal mapping. Some people on talk-gb say you should remove any stop for which there is no evidence on the ground.

I’d note that Google Maps also seems to have problems with NaPTAN data, often having stops present multiple times, or in the case I’ve just checked, keeping the NaPTAN entry, which is in the wrong place, when correcting the duplication. This is also a case of a hail and ride section, so the stops are not really stops, but just lamp-posts with timetables attached, which results in some rather arbitrary stops being imported.

thank you all for your feedback.

@SomeoneElse

  • The problem I have is that all the standard tiles have the OSM bus stops on, so even though we highlight stops in their correct location it is confusing, as they do not match up with the blue stop markers on the map. Plus it looks bad for OSM if the bus stop data is so inaccurate.
  • I have spoken to Traveline about the stop you mention. The Naptan data I have (http://data.gov.uk/dataset/naptan) has the stop situated on the other side of the junction, and they agree the Naptan db is not correct in this instance, so they are raising this for amendment with the individual responsible for maintaining Naptan in the region.
  • The contact I have at Traveline is happy to receive amendments to Naptan data to ensure it is kept up to date

@vclaw

  • No I don’t work for Naptan, Naptan is maintained by public transport executives regionally. However, I have been developing transport software for nearly twenty years and have implemented a number of web sites and apps for bus, train, and tram operators in that time. The areas I have most experience in are the North, Midlands and London.
  • GPS is becoming essential in the transport industry. There is a big push for Realtime Information (RTI) for customers, and even when the information isn’t publicly available, operators’ internal systems now use GPS extensively for tracking vehicles and monitoring times, fuel efficiency, accidents, etc. A very small regional operator may have buses without GPS, but this is not the norm and will certainly change over time.
  • As above, I have spoken to Traveline and my contact there will accept corrections to the Naptan db, so maybe amendments could be collated and sent for processing in bulk.
  • Unfortunately the Naptan db does not have bus shelter information, which is a real shame. If there is a lot of OSM data for this it could be something we could share with them.

@hadw

  • The key here is to have a mechanical update WITH a means to maintain it. I totally agree there is little point otherwise.
  • If I were handling the merge I would clear OSM stops without a Naptan code and provide a difference report for where OSM data has been manually updated, so the best candidate can take precedence. Where OSM data is verified as being more accurate, this could be sent to Traveline for update in the Naptan db.
  • I totally agree with stop areas, and in fact stop attributes (e.g. landmarks), these change over time and can be inaccurate. However if OSM has more accurate (verified) information we can submit this to Traveline/Naptan
  • re: stops being cleared away, or moved for personal mapping, this is a real problem. My view is that in the UK that stop editing should not be direct on map, but via a submission/request which can then be sent to Traveline/Naptan if verified.
  • re: hail and ride, I totally agree, they can lead to a real mess. It would be great if they were a different colour (or different icon) on the maps so it is clearer that they are not actual stops, but an area in which a bus can be hailed.
  • I agree Naptan is not perfect, but it is by far the best source of information we have and its accuracy is improving as bus operators move to GPS ticketing machines. If we can contribute to the accuracy of Naptan then all the better.
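The difference report mentioned above could start from something like the sketch below. It is a rough illustration under stated assumptions: OSM nodes are taken to carry their code in a `naptan:AtcoCode` tag (the key used by the original import, as far as I know), the function name, bucket names and sample data are invented for the example, and the code only sorts stops into buckets. It deliberately does not delete or move anything.

```python
def classify_stops(naptan_codes, osm_nodes):
    """Sort OSM bus stop nodes into buckets relative to a set of NaPTAN codes.

    naptan_codes: set of ATCO code strings from the current NaPTAN release.
    osm_nodes: list of dicts shaped like {"id": int, "tags": {...}}.
    """
    by_code = {}   # code -> list of OSM node ids carrying that code
    no_code = []   # OSM node ids with no NaPTAN reference at all
    for node in osm_nodes:
        code = node["tags"].get("naptan:AtcoCode")  # assumed tag key
        if code:
            by_code.setdefault(code, []).append(node["id"])
        else:
            no_code.append(node["id"])
    return {
        "matched": sorted(c for c in by_code if c in naptan_codes),
        "unknown_code": sorted(c for c in by_code if c not in naptan_codes),
        "missing_from_osm": sorted(naptan_codes - set(by_code)),
        "no_code": no_code,
        "duplicates": {c: ids for c, ids in by_code.items() if len(ids) > 1},
    }


# Hypothetical sample data; the codes and node ids are made up.
naptan_codes = {"A1", "A2", "A3"}
osm_nodes = [
    {"id": 1, "tags": {"highway": "bus_stop", "naptan:AtcoCode": "A1"}},
    {"id": 2, "tags": {"highway": "bus_stop"}},
    {"id": 3, "tags": {"highway": "bus_stop", "naptan:AtcoCode": "A1"}},
    {"id": 4, "tags": {"highway": "bus_stop", "naptan:AtcoCode": "Z9"}},
]
report = classify_stops(naptan_codes, osm_nodes)
```

What happens to each bucket (survey, merge, report to Traveline) is a policy question for the community rather than something the script can decide.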

The general view on OSM is that the holy grail is data that is entered as the result of on-the-ground surveys by individual mappers; I think there is no chance of an import being allowed which mechanically removes manually entered data. Generally the way this is handled is by having a tool that highlights mismatches between the OSM data and an external source, as a hint for

In fact, given the way in which proposals for even quite small imports, with manual verification, get knocked down on talk-gb, I think it unlikely that any new NaPTAN import will be allowed.

Location data on the previous NaPTAN import was pretty poor for many London stops, often in the middle of buildings, or even on an underground line.

Anyone operating a service based on the OSM map should be doing their own rendering, and operating their own tile server, so should be able to use a local import of NaPTAN data and not render bus stops from the main map.

I totally agree that the holy grail would be a community-driven survey of all bus stops, but the bottom line is that, with the number of stops and how frequently they change (e.g. even if a bus stop position is unchanged it may be suspended due to a change of route), this is impossible. It would take hundreds of surveyors working full time to create and maintain the stop database. The current state of the stops on OSM demonstrates this is not happening and that the free-for-all approach is not working.

Re: quality of Naptan data, it is not perfect, but it has improved considerably. The reason for this is that most bus operators now have GPS on their buses so the location of stops can be surveyed automatically as buses traverse their routes.

I also take the point that we can operate our own tile server, and we are doing this, but why should public OSM data be plagued with very inaccurate bus stop information when there is a freely available, maintained data source that we can contribute to?

Actually, the use of GPS is probably one of the main reasons that locations are so inaccurate. GPS can result in errors of up to about 50m, whereas current mapping tends to be within about 3m. Most of the NaPTAN stop locations that I have fixed are within the GPS error bounds (although there are cases where stops have moved).

More generally though, I think you need to look at https://lists.openstreetmap.org/pipermail/talk-gb/ where you can get the general mood on mechanical imports. I think you will also find NaPTAN being used as evidence against allowing them.

The other thing that will be required is that someone volunteers their time to do the work. Things get done on OSM as a result of initiatives by volunteers, and, unless a relevant volunteer is reading this, no-one here will really know if anyone intends to do it. In this case, to even stand a chance, the volunteer needs to be able to guarantee long term support, not just a one off import.

Because of the automated ticket machines the GPS has to be far more accurate than 50m; 2-3m is typical. I have streams of GPS data that are very accurate. Think about Uber displaying the taxi on its way to you, and that is done with a mobile phone GPS, whereas the GPS units on buses are dedicated units. Having said that, I’m sure there is some ‘troubled’ data out there!

I will def look up the talk-gb list, but re: import and duplicate checking I could write a system that would automate a regular import and also a ‘submission of revisions’ process no problem. This would reduce the stress on volunteers.

I was confusing metres and feet, so my figures were about three times too high. I just took out my Galaxy GT-S5360 and got CEPs bouncing between 33 and 49 feet and starting at 115 feet, even though there were a lot of satellites in view. I don’t think it ever reports less than 15 feet.

If you are consistently getting 1 or 2 metre CEPs you are either using professional or military receivers, or the data is being augmented by copyright sources, e.g., unless you are careful, Apple devices will snap to the nearest road and Android devices will use their proprietary database of WiFi SSID locations.

If you are measuring bus stop locations from a bus, you are not going to time things for the best satellite coverage, considering local shadowing, so you can expect quite a few to have rather large CEPs. There would be no value in a bus using anything but the cheapest GPS module.

A quick search suggests that the GPS CEP is at the 50th percentile level, so half the measurements will exceed the CEP value.

Incidentally, even if they used OS maps, they would have errors of several metres in some places.
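To make the feet/metres figures and the CEP point above concrete, here is a small sketch. It treats CEP as the median radial error, which is the usual 50th percentile reading. The feet values are the ones quoted in the post; the list of radial errors is invented purely to show the arithmetic and is not measured data.

```python
import statistics

FEET_TO_METRES = 0.3048  # exact, by definition of the international foot


def feet_to_metres(feet):
    """Convert a length in feet to metres."""
    return feet * FEET_TO_METRES


def cep(radial_errors_m):
    """CEP taken as the median radial error: the radius of the circle
    that contains half of the position fixes."""
    return statistics.median(radial_errors_m)


# Figures quoted in the post, in feet.
for ft in (15, 33, 49, 115):
    print(f"{ft} ft = {feet_to_metres(ft):.1f} m")

# Made-up radial errors in metres, for illustration only.
errors = [2.0, 3.5, 4.0, 6.0, 7.5, 9.0, 12.0, 20.0]
radius = cep(errors)
outside = sum(1 for e in errors if e > radius)
print(f"CEP = {radius} m, {outside} of {len(errors)} fixes fall outside it")
```

So a reported CEP of roughly 10 to 15 m (33 to 49 ft) still leaves half of the individual fixes further out than that.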

Yes, for sure there is an accuracy issue with GPS full stop, because the military like to keep the very accurate stuff for themselves. The accuracy I have is in comparison to the Naptan lat/long, which is plotted using similar equipment (or sometimes the same equipment).

Note the data I have is from the GPS on the buses, it is more accurate than a phone. The reason the buses need a fairly decent (not expensive) GPS is that the ticket machines now use GPS to set the start stop automatically, so that only the destination needs to be entered by the driver. Stops can often be very close to each other.

Having said that, there definitely needs to be an acceptance that there is a limit to how accurate the data can be, and there will always be times when changes have to be made manually (I suggest they are sent to Traveline so the source data, Naptan, can be updated), e.g. it may be necessary to move a stop to the correct side of a junction.
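The point about ticket machines setting the start stop from GPS, with stops that can be very close together, can be illustrated with a nearest-stop lookup. This is not how any particular ticket machine works; it is a minimal sketch assuming the machine simply picks the closest stop to the current fix. The stop codes, coordinates and function names are hypothetical.

```python
import math


def approx_distance_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation of distance in metres,
    adequate over the few hundred metres that matter here."""
    r = 6371000.0  # mean Earth radius in metres
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return r * math.hypot(x, y)


def nearest_stop(fix, stops):
    """Return (code, distance_m) for the stop closest to a GPS fix."""
    lat, lon = fix
    return min(
        ((code, approx_distance_m(lat, lon, slat, slon))
         for code, (slat, slon) in stops.items()),
        key=lambda item: item[1],
    )


# Two hypothetical stops roughly 33 m apart on a north-south road.
stops = {"STOP_A": (53.74900, -2.46500), "STOP_B": (53.74930, -2.46500)}

# The bus is standing at STOP_A in both cases.
exact_fix = (53.74900, -2.46500)    # no GPS error
shifted_fix = (53.74918, -2.46500)  # fix displaced about 20 m north

print(nearest_stop(exact_fix, stops)[0])    # picks STOP_A
print(nearest_stop(shifted_fix, stops)[0])  # picks STOP_B
```

With stops this close, an error of about 20 m is enough to select the wrong stop, which is why both the accuracy of the fix and the accuracy of the stored stop position matter.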