Automated incremental Bus Stop (GTFS) updates

@Safwat - I’m happy to defer to you on how to handle conflicts. You have far more experience than I there.

@zstadler - This may sound heretical, but why don’t we import the routes without importing the ways? I realise that the schema specifies that member ways are mandatory, but (1) adding the stops serves a real use-case (it enables apps to be written that can’t be written without it; travellers wouldn’t care whether the precise path is represented in the dataset, so long as they see where the buses stop); (2) adding the stops would be an accurate, incomplete edit, which is generally a good thing (as opposed to an inaccurate edit, which would be frowned upon), which is to say, we shouldn’t let the perfect (having routes that have both ways and stops) be the enemy of the good (having routes that have just the stops); (3) since OSM is a wiki, data consumers already have to be defensive and validate the relations they work on.

@anonymous_gushdan_mapper - As I said, the thinking is that if we import the data, somebody will write a smartphone app that uses it for navigation. That app would be usable in any country that has bus routes defined in OSM. At the moment it’s not possible to have such an app for Israel because OSM doesn’t have the necessary data for Israel (though we have the bus stops with gtfs:id’s, which really is fantastic, but doesn’t enable navigation apps to be written based on OSM data alone). Re good-looking, I don’t think that’s a valid argument. OSM should map the world as it is. If you don’t find the shapes of bus routes aesthetically pleasing, ask the MOT to change the routes… but if the world is spaghetti-ish, then OSM’s map data should contain spaghetti. Regarding your argument about the transport map, isn’t the right answer to that to ask the maintainers of the osm.org slippy map (carto) to add a mode that shows only railroads but not bus lines? Again, the primary criterion for map data is accuracy.

Like @anonymous_gushdan_mapper, I see no value in entering data into OSM that would not be used. This is also applicable to the proposal to enter routes without their way segments. Existing sites and applications assume that bus routes follow the required scheme. As a result, their handling of ill-constructed routes is unpredictable. This is why standards are created.

Indeed, OSM allows using any tags you like. However, when using existing tags, one is expected to comply with the standards. Not using the standards can be considered bad mapping because of its effects on users of these tags/schemes.

@dsh4,
You can create a relation that includes just the bus stops of a route, and use it in a new application or site that you will build, but please avoid using the route=bus tag.

@dsh4 - it’s not about being “ugly”, it’s about being usable. Just like we don’t map individual trees (unless they have special significance). OSM is not meant to collect every possible detail about the world, just ones that would be usable.

This episode of Map Men is a good explanation of this philosophy in a broader context https://www.youtube.com/watch?v=kwprznh3d-o

If every street in Tel Aviv is covered in a red line that indicates a bus line, it won’t be too usable.

There exist apps that support bus routing in many countries. Saying that adding this data to Israel specifically will enable such apps is a bit far-fetched, as it’s much easier for a developer to just consume GTFS feeds than interact with the OSM API, or keep an updated copy of the entire OSM database.
Also, bus routing that is based on OSM only would not be very useful, as it won’t contain the actual schedule - which is very important when doing public transit routing.

route=bus data exists for some European cities, but I haven’t seen an app that uses it. Do you know of any such app? And if not, why do you think adding Israel specifically would cause these apps to be created?
What would be the use case of a transit routing app that doesn’t have the actual schedule?
What would be the use case of a bus route map so dense that it can’t possibly be used for navigation?

None, but that’s a presentation layer problem, which is a different kettle of fish to the “which data should be added to the map” question.

Agreed with your good points about schedule and GTFS feeds.

@zstadler Yes, I thought that counter-argument might be offered :slight_smile: I suppose I should try and convince the tagging list to make the way members optional (or, more generally, to invent some incomplete=yes tag to facilitate the “accurate, incomplete survey” case).

Thanks everyone for all the enlightening feedback. It’s clear there’s no consensus on proceeding so I’ll drop the matter (and seek some non-OSM-based solution to my original problem).

Can you elaborate on the original problem? Perhaps we could assist…

The discussion on routes was essentially about having a partial copy of MOT GTFS information within the OSM DB. The idea to link this data with other OSM data, such as roads, was dropped during the conversation. As such, I wonder what value OSM could bring to your original problem.

The Saturday updates were postponed due to algorithm changes. Although this change has been proposed several times before with no objections, I will propose it once more and wait 1-3 weeks in case someone has comments.

The change is based on ground experience. Are user edits better than the original MOT data? The answer appears to be NO for tags, and YES for bus stop locations. Therefore, the algorithm will change as follows:

  • The bot will OVERRIDE all name tag changes users make. As explained before, the reasoning is that a bus stop’s MOT name is the one true official name, and the name used on digital monitors, bus stop voice systems, etc. Therefore, even if it is logically incorrect or has some spelling errors, it is the correct name of the stop as long as it is not fixed upstream in MOT.
  • As before, the bot WILL NOT override bus stop location changes (unless MOT has a more recent update). However, if a user moves a stop only slightly (less than 4 meters), the move is assumed to be an accident and the bot will OVERRIDE the location, snapping it back to the original MOT position.
  • The rest of the behavior remains identical, e.g. the bot will not re-add user-deleted stops (unless MOT has updated a stop after deletion), and so on.
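
The rules above can be sketched in Python roughly as follows. This is an illustration only, not the bot's actual source: the `Stop` class, the `resolve` function and the `mot_updated_since_user_edit` flag are hypothetical names, and using haversine distance to measure "less than 4 meters" is an assumption.

```python
import math
from dataclasses import dataclass
from typing import Optional

SNAP_THRESHOLD_M = 4.0  # from the post: moves under 4 meters count as accidental


@dataclass
class Stop:
    name: str
    lat: float
    lon: float


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in meters; an assumed metric."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def resolve(mot: Stop, osm: Optional[Stop], mot_updated_since_user_edit: bool) -> Optional[Stop]:
    """Return the stop the bot should leave in OSM, or None to keep it deleted.

    `osm` is None when a user deleted the stop.
    """
    if osm is None:
        # Deleted stops are re-added only if MOT updated them afterwards.
        return mot if mot_updated_since_user_edit else None

    # A more recent MOT update wins for the location as well.
    if mot_updated_since_user_edit:
        return Stop(mot.name, mot.lat, mot.lon)

    # Tiny moves are treated as accidents and snapped back to MOT.
    if distance_m(mot.lat, mot.lon, osm.lat, osm.lon) < SNAP_THRESHOLD_M:
        return Stop(mot.name, mot.lat, mot.lon)

    # Otherwise keep the user's location, but the name is always MOT's.
    return Stop(mot.name, osm.lat, osm.lon)
```

For example, a stop nudged by about one meter would be snapped back, while a stop moved by roughly a hundred meters would keep its user-chosen position and only have its name reset.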

edit: removed redundant points that have already been made before.

Sounds good!

I thought that if we imported the set of stops in each bus route into the OSM DB, then bus routing apps could be written that would work both in Israel and in other countries, without depending on the peculiarities of each country’s upstream bus route formats. If we don’t do such an import, I’ll keep using per-country public transport routing apps, that’s all.

Cheers.

I apologize for the delay in the incremental updates. The code requires certain updates before the next run, which I haven’t had the time to make. I will do this as soon as I can.

An incremental update has been applied with the new code: https://www.openstreetmap.org/changeset/59589851

The generation of the “description” tag has been refined in the code, but the tag itself was intentionally not updated in this changeset and will be updated in a separate one.

Version 2 is out. As a mapper, here is what you should know:

https://wiki.openstreetmap.org/wiki/User:SafwatHalaby/scripts/gtfs#Information_for_mappers

Version 2 now adds “addr:street” and “addr:number”, a much cleaner description tag, and overriding of user changes for certain tags. Additionally, the PTv2 scheme is now used in addition to highway=bus_stop. Lastly, certain parts of the source code have been cleaned up.

No runs were performed with version 2 as of now. (The run described in the previous post was v1 + name overrides). But the source code can be found at the repository. I am making sure everything works as expected first. Also, this is a final opportunity for anyone to leave comments prior to running it.

Version 2 first run: #59668811, #59670297, #59670983

@Safwat Appreciate the continued work on this. What’s to be done if the name in MoT’s database doesn’t match the name on the ground? (The station’s name is included in the black-ruled-yellow-background station sign and usually also in the in-station maps.) The wiki page says that the MoT database name wins, but I think we should map both names: the MoT name for interoperability with other services, the on-the-ground name due to https://wiki.openstreetmap.org/wiki/Good_practice#Map_what.27s_on_the_ground .

Hello dsh4,

I am glad you like my work :slight_smile:

I wish you had brought up the issue earlier. This was decided after several messages and after I waited a long time for comments. Nevertheless, I am open to change. Let us discuss this. As of now, I believe it’s not worth the effort to handle this problem. Here are some reasons.

This change is based on past experience. Nearly all mapper bus stop edits are armchair fixes. They are not the “on the ground name”; the mapper sees a typo, or a stop having the name of a long-gone road, and changes it. Even if the change makes sense, it is not OK to change it so long as the actual stop name has not changed on the sign and in the MOT data. It seems most mapping nowadays is armchair mapping, and that kind of mapping cannot possibly tell whether the physical name differs from the MOT name.

Experience also shows that we as a community are incapable of maintaining the stops. There are 27k stops that change rapidly every day. Maintenance would be a full-time job for several people. The staleness of the stops prior to the introduction of the script, and the “gtfs:verified” tag, which I removed yesterday, are both testimony to this. That tag was effectively dead. We don’t have the power to physically survey the stops and verify them and flip the gtfs:verified flag or update the stops. An “on_the_ground_name” tag would be the new stale “gtfs:verified”, consuming 27KiB multiplied by ~10 or more with little use.

The name physically printed on the bus stop sign is becoming less important. Digital screens, which are increasingly common right next to the stops, voice systems in buses, and other apps and services all use the digital MOT name. Consequently, the digital names are highly maintained.

MOT has an active support E-mail. They are willing to change their data if it has mistakes. Highly confusing mistakes can be fixed on their part, and trivial mistakes too. Since the data is heavily relied on, perhaps they are even willing to physically change the on-the-ground name for critical mistakes, though I have never tried this.

If this is a solely theoretical problem, then I think we ought to ignore it. A “third dataset” in addition to the two existing ones worsens everything - the code, mapper confusion, consumer confusion, the amount of warning logs I get, the debugging time required, the Israel map size, and more. However, if actual problems are arising because of this, then that’s something to consider.

So I think it should be added only if it’s really needed, and right now, I am not convinced it is. But I am open minded and would love to hear other opinions.

By the way, if there are specific examples of MOT-ground discrepancies, I would love to hear about them.

On further thought, I think there’s a way that already works: manually add a different name tag such as alt_name whenever a ground name differs from the MOT name. This doesn’t require 2 name tags per stop and doesn’t require any script changes because the script would ignore alt_name or similar tags.

Adding the on-the-ground name in the “name” tag and the MOT in another does complicate things.

Would adding something like this to the Wiki be sufficient?


If the on-the-ground name is not the same as the MOT name, please:
1. Add it to alt_name.
2. Optionally add a note that mentions this difference.
3. Optionally report the problem to MOT.


Good approach!
I’d like to suggest adding the appropriate email or web address to number 3 above.

The on-the-ground notes have been added to the Wiki.

From now on, updates will usually run on Saturday evenings.

Most bus stops have an “addr:street” and an “addr:number” now. This covers most of Israel. I wonder how this affects navigation apps such as Osmand. Does it improve the search?

What would be the suitable OSM equivalent of “רציף” (platform)? E.g.


55137,ת. רכבת כרמיאל/רציפים, רחוב:  עיר: כרמיאל רציף: 2   קומה:  ,32.923817,35.298353
55137,ת. רכבת כרמיאל/רציפים, רחוב:  עיר: כרמיאל רציף: 3   קומה:  ,32.923817,35.298353
55137,ת. רכבת כרמיאל/רציפים, רחוב:  עיר: כרמיאל רציף: 4   קומה:  ,32.923817,35.298353
55137,ת. רכבת כרמיאל/רציפים, רחוב:  עיר: כרמיאל רציף: 5   קומה:  ,32.923817,35.298353
63111,מסוף אורנית, רחוב:  עיר: שומרון רציף: 1   קומה:  ,32.107438,34.999197
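
To make the question concrete, here is a small Python sketch that pulls the platform value out of lines like the ones above. It assumes the layout seen in these samples (code, name, description, latitude, longitude, with the description carrying the Hebrew labels for street, city, platform and floor in that order); `parse_stop_line` and the dictionary keys are hypothetical names, and this does not suggest which OSM tag the value should go into.

```python
import csv
import re

# Assumed layout of the description field, based only on the sample lines:
# street (רחוב), city (עיר), platform (רציף), floor (קומה), in that order.
DESC_RE = re.compile(r"רחוב:(.*?)עיר:(.*?)רציף:(.*?)קומה:(.*)")


def parse_stop_line(line):
    """Split one comma-separated stop line and extract the description parts."""
    code, name, desc, lat, lon = next(csv.reader([line]))
    match = DESC_RE.search(desc)
    if match is None:
        raise ValueError("unexpected description format: %r" % desc)
    street, city, platform, floor = (part.strip() for part in match.groups())
    return {
        "code": code,
        "name": name.strip(),
        "street": street,
        "city": city,
        "platform": platform,  # empty string when the stop has no platform
        "floor": floor,
        "lat": float(lat),
        "lon": float(lon),
    }
```

Fields that are blank in the source, such as the street and floor in these samples, come back as empty strings rather than being dropped.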