Automated incremental Bus Stop (GTFS) updates

Version 2 first run: #59668811, #59670297, #59670983

@Safwat Appreciate the continued work on this. What’s to be done if the name in MoT’s database doesn’t match the name on the ground? (The station’s name is included in the black-ruled-yellow-background station sign and usually also in the in-station maps.) The wiki page says that the MoT database name wins, but I think we should map both names: the MoT name for interoperability with other services, the on-the-ground name due to https://wiki.openstreetmap.org/wiki/Good_practice#Map_what.27s_on_the_ground .

Hello dsh4,

I am glad you like my work :)

I wish you had brought up the issue earlier. This was decided after several messages and after I waited a long time for comments. Nevertheless, I am open to change. Let us discuss this. As of now, I believe it’s not worth the effort to handle this problem. Here are some reasons.

This decision is based on past experience. Nearly all bus stop edits by mappers are armchair fixes. They are not the “on the ground name”; the mapper sees a typo, or a stop bearing the name of a long-gone road, and changes it. Even if the change makes sense, it is not OK to change the name as long as the actual stop name has not changed on the sign and in the MOT data. It seems most mapping nowadays is armchair mapping, and that kind of mapping cannot possibly tell whether the physical name differs from the MOT name.

Experience also shows that we as a community are incapable of maintaining the stops. There are 27k stops, and they change rapidly, every day. Maintenance would be a full-time job for several people. The staleness of the stops prior to the introduction of the script, and the “gtfs:verified” tag, which I removed yesterday, are both testimony to this. That tag was effectively dead. We don’t have the manpower to physically survey the stops, verify them, and flip the gtfs:verified flag or update the stops. An “on_the_ground_name” tag would be the new stale “gtfs:verified”, consuming 27KiB multiplied by ~10 or more with little use.

The name physically printed on the bus stop sign is becoming less important. Digital screens, which are increasingly common right next to the stops, voice systems in buses, and other apps and services all use the digital MOT name. Consequently, the digital names are well maintained.

MOT has an active support email address. They are willing to change their data if it has mistakes. Both highly confusing mistakes and trivial ones can be fixed on their end. Since the data is heavily relied on, perhaps they are even willing to physically change the on-the-ground name for critical mistakes, though I have never tried this.

If this is a solely theoretical problem, then I think we ought to ignore it. A “third dataset” in addition to the two existing ones worsens everything - the code, mapper confusion, consumer confusion, the amount of warning logs I get, the debugging time required, the Israel map size, and more. However, if actual problems are arising because of this, then that’s something to consider.

So I think it should be added only if it’s really needed, and right now, I am not convinced it is. But I am open minded and would love to hear other opinions.

By the way, if there are specific examples of MOT-ground discrepancies, I would love to hear about them.

On further thought, I think there’s a way that already works: manually add a different name tag, such as alt_name, whenever a ground name differs from the MOT name. This doesn’t require two name tags per stop and doesn’t require any script changes, because the script would ignore alt_name or similar tags.
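To illustrate why alt_name would survive script runs, here is a minimal sketch. It assumes the script only overwrites a fixed set of keys it owns; the `SCRIPT_OWNED_KEYS` set, the function name, and the sample tag values are hypothetical and not taken from the actual script.

```python
# Hypothetical set of keys the update script is assumed to own.
# Any other key (alt_name, note, ...) is left exactly as mappers set it.
SCRIPT_OWNED_KEYS = {"name", "ref"}

def apply_gtfs_update(osm_tags, gtfs_tags):
    """Return a copy of osm_tags with only script-owned keys overwritten."""
    updated = dict(osm_tags)
    for key, value in gtfs_tags.items():
        if key in SCRIPT_OWNED_KEYS:
            updated[key] = value
    return updated

# Made-up example stop: the mapper recorded the sign text in alt_name.
stop = {
    "highway": "bus_stop",
    "ref": "12345",
    "name": "Old MOT name",
    "alt_name": "Name printed on the sign",
}
result = apply_gtfs_update(stop, {"name": "New MOT name", "ref": "12345"})
```

With this approach a manually added alt_name is never touched, so no second script-managed name tag is needed.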

Adding the on-the-ground name in the “name” tag and the MOT in another does complicate things.

Would adding something like this to the Wiki be sufficient?


If the on-the-ground name is not the same as the MOT name, please:
1. Add it to alt_name.
2. Optionally add a note that mentions this difference.
3. Optionally report the problem to MOT.


Good approach!
I’d like to suggest adding the appropriate email or web address to number 3 above.

The on-the-ground notes have been added to the Wiki.

From now on, updates will usually run on Saturday evenings.

Most bus stops have an “addr:street” and an “addr:number” now. This covers most of Israel. I wonder how this affects navigation apps such as OsmAnd. Does it improve the search?

What would be the suitable OSM equivalent of “רציף”? E.g.


55137,ת. רכבת כרמיאל/רציפים, רחוב:  עיר: כרמיאל רציף: 2   קומה:  ,32.923817,35.298353
55137,ת. רכבת כרמיאל/רציפים, רחוב:  עיר: כרמיאל רציף: 3   קומה:  ,32.923817,35.298353
55137,ת. רכבת כרמיאל/רציפים, רחוב:  עיר: כרמיאל רציף: 4   קומה:  ,32.923817,35.298353
55137,ת. רכבת כרמיאל/רציפים, רחוב:  עיר: כרמיאל רציף: 5   קומה:  ,32.923817,35.298353
63111,מסוף אורנית, רחוב:  עיר: שומרון רציף: 1   קומה:  ,32.107438,34.999197
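For reference, the description column in the sample lines above can be split into its parts with a short sketch like the following. It assumes the columns are stop_code, name, description, latitude, longitude (inferred from the sample, not confirmed), and that the labels רחוב (street), עיר (city), רציף (platform) and קומה (floor) always appear in that order. The function name is hypothetical.

```python
import csv
import re

# Labels in the order seen in the sample lines:
# רחוב = street, עיר = city, רציף = platform, קומה = floor.
DESC_PATTERN = re.compile(r"רחוב:(.*?)עיר:(.*?)רציף:(.*?)קומה:(.*)")

def parse_stop_line(line):
    """Split one sample line into columns and break up the description."""
    stop_code, name, desc, lat, lon = next(csv.reader([line]))
    match = DESC_PATTERN.search(desc)
    if match is None:
        raise ValueError("unexpected description format: %r" % desc)
    street, city, platform, floor = (part.strip() for part in match.groups())
    return {
        "stop_code": stop_code,
        "name": name.strip(),
        "street": street,      # empty string when the field is blank
        "city": city,
        "platform": platform,  # the רציף value under discussion
        "floor": floor,
        "lat": float(lat),
        "lon": float(lon),
    }

line = "63111,מסוף אורנית, רחוב:  עיר: שומרון רציף: 1   קומה:  ,32.107438,34.999197"
parsed = parse_stop_line(line)
```

The sketch would break if a street or city name itself contained one of the labels, so it is only meant to show where the רציף value sits in the data.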

I have made a stupid mistake. I added “addr:number”, which is a nonexistent tag.

But what should be used instead? “addr:housenumber” doesn’t seem right either.

https://wiki.openstreetmap.org/wiki/Key:addr

addr:number changed to addr:housenumber in https://www.openstreetmap.org/changeset/59727988

Osmose now jumps on "suspicious tag combination: highway together with addr:*".

I don’t think the address should be saved in the bus stop node in OSM. This might cause duplicate addresses when there’s already an address node (or tag) on a nearby building.

I apologize for the problematic tagging. I am not very familiar with address tagging and should have studied this further first.

What do you propose? It is possible to revert this and put “addr:housenumber” and “addr:street” in the description tag. But I would prefer something that allows clients like OsmAnd to use the addresses even in areas where no one has tagged the houses. That would be much better for usability.

Is it OK to keep “addr:street” and put the housenumber in the description?

Isn’t Osmose wrong here? (Assuming we keep addr:street only)

Since address duplication is dangerous and may cause unknown behavior in client address lookup, I have decided to move addr:housenumber to gtfs:addr:housenumber in the meantime, even before the discussion is finished. Please feel free to post your opinions on what should be done with addr:housenumber and addr:street.
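The tag move itself amounts to a key rename on each affected node. A minimal sketch, assuming tags are held as a plain dictionary (the helper name and the sample values are made up for illustration):

```python
def move_tag(tags, old_key, new_key):
    """Return a copy of tags with old_key renamed to new_key, if present."""
    updated = dict(tags)
    if old_key in updated:
        updated[new_key] = updated.pop(old_key)
    return updated

# Made-up example node; only the house number is moved, addr:street stays.
before = {"highway": "bus_stop", "addr:street": "Example Street",
          "addr:housenumber": "12"}
after = move_tag(before, "addr:housenumber", "gtfs:addr:housenumber")
```

Nodes without the old key pass through unchanged, so the same function can be applied to every bus stop.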

Changesets: #60009323, #60010547

I’m investigating ways to increase the “bus factor”. Currently, when fetching new MOT GTFS files, the script needs to compare them with the files fetched in the previous run. This means that if I lose both my PC and my PC backup, or if I ever get hit by a bus, running the script properly would be a bit fiddly, because one would have to reconstruct the lost files.

I would like to make the script completely stateless locally. This should be technically possible: fetch the latest bus stop changeset by SafwatHalaby_bot, and use that as the “old GTFS file”. This would make the script dependent solely on the OSM servers, and not on any local hard drive.
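The comparison step of such a stateless run could look roughly like this. It is only a sketch: it assumes both the previous state (reconstructed from the bot’s last changeset) and the freshly fetched GTFS data have already been loaded into dictionaries keyed by stop ref. How the old state is retrieved from the OSM servers is left out, and all names and sample values are hypothetical.

```python
def diff_stops(old_stops, new_stops):
    """Compare two {ref: tags} mappings.

    Returns three sorted lists of refs: added, removed, and changed.
    """
    old_refs, new_refs = set(old_stops), set(new_stops)
    added = sorted(new_refs - old_refs)
    removed = sorted(old_refs - new_refs)
    changed = sorted(
        ref for ref in old_refs & new_refs
        if old_stops[ref] != new_stops[ref]
    )
    return added, removed, changed

# Made-up data: "old" stands for the bot's last uploaded state,
# "new" for the GTFS file fetched in the current run.
old = {"100": {"name": "A"}, "200": {"name": "B"}, "300": {"name": "C"}}
new = {"100": {"name": "A"}, "200": {"name": "B2"}, "400": {"name": "D"}}
added, removed, changed = diff_stops(old, new)
```

Only the refs in these three lists would need to be touched in OSM, which keeps manual mapper edits on unchanged stops intact.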

This might require a major rewrite, so while we’re at it, I would like to make the script independent of JOSM. This should make running it headless natural and pave the way for complete automation.

Replying to an old point:

According to https://wiki.openstreetmap.org/wiki/Tag:public_transport%3Dplatform , ref= should be used for that. I’m not sure whether that tag is supposed to go on the highway=bus_stop node or on something else.

Glad to hear of the planned bus factor improvements.

There is a semantics issue here. The thing called “platform” in the Wiki is essentially a bus stop, and we already use the “ref” tag for bus stop numbers.

The thing called a “ratzeef/platform” in the MOT GTFS is better translated into a “terminal number”, available only for some bus stops, mainly in central bus stations. So far I couldn’t find a Wiki entry which discusses it.

I don’t think you can call it “terminal number”. The רציף/platform in the GTFS is a platform, just like a train station platform.

A bus terminal is what we call מסוף אוטובוסים in Hebrew. The Tel Aviv Central Bus Station or Arlosoroff Terminal are bus terminals. They don’t have numbers.

The “platform” in the GTFS file is there so that transit navigation apps can point users to the right platform for their bus. The GTFS file does not have accurate coordinates for each platform, so it uses the same coordinates and the same name for all of them. For example, “Tel Aviv Central Bus Station 7th Floor/Platforms” ( https://www.openstreetmap.org/node/1803094757 ) is one node in OSM and one stop_code in the GTFS, but it has many lines in stops.txt, which differ only in their stop_id and platform number.

Considering this, I don’t see how keeping the “platform number” is useful to OSM in any way, and I suggest you ignore this in your automated imports, and only have one OSM node per GTFS stop_code.
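A minimal sketch of that suggestion, assuming stops.txt uses the standard GTFS column names (stop_id, stop_code, stop_name, stop_lat, stop_lon). The stop_id values and the English stop names in the sample are made up for illustration; the stop codes and coordinates are the ones quoted earlier in the thread.

```python
import csv
import io

def one_stop_per_code(stops_txt):
    """Keep the first stops.txt row for each stop_code.

    Rows that differ only by platform share a stop_code, so they
    collapse into a single entry (one OSM node per stop_code).
    """
    unique = {}
    for row in csv.DictReader(io.StringIO(stops_txt)):
        unique.setdefault(row["stop_code"], row)
    return list(unique.values())

# Illustrative sample: two platform rows share stop_code 55137.
sample = """stop_id,stop_code,stop_name,stop_lat,stop_lon
1001,55137,Karmiel Railway Station/Platforms,32.923817,35.298353
1002,55137,Karmiel Railway Station/Platforms,32.923817,35.298353
2001,63111,Oranit Terminal,32.107438,34.999197
"""
stops = one_stop_per_code(sample)
```

Keeping the first row is an arbitrary choice here; since the platform rows share name and coordinates, any of them would yield the same node.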