OpenStreetMap Forum

The Free Wiki World Map

#51 2017-07-31 18:09:10

SwiftFast
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

"[...] I don't see any better way of doing this and IMHO, they are not less important than any other fixme"

The vast majority of the GTFS fixmes are false positives. The other fixmes, on the other hand, almost always require attention.

Maybe we could use another tag, e.g. gtfs:accuracy_fixme/source="Auto-imported GTFS, check for accuracy!"/precision=10m.

Preferably something already being used.

#52 2017-08-03 18:36:22

dsh4
Member
Registered: 2017-06-24
Posts: 43

Re: Israel GTFS release

I've asked about accuracy=* on help.osm.org:

https://help.openstreetmap.org/question … seable-way

The discussion is ongoing.

#53 2017-08-04 12:44:38

SwiftFast
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

Worth noting this comment by SomeoneElse:

The way that the NaPTAN bus stop data import was handled in the UK was to have a "naptan:verified" tag, initially set to "no". Other imports have had similar schemes.

#54 2017-08-04 19:42:05

dsh4
Member
Registered: 2017-06-24
Posts: 43

Re: Israel GTFS release

We can combine the methods, e.g., add gtfs:verified=no and accuracy=20m (or whatever the worst-case error is known to be).

#55 2017-09-19 10:30:15

SwiftFast
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

The fixmes were replaced with gtfs:verified=no. I did not add accuracy because I don't know the error rates. Changesets: #51169519, #51170352, #51171122, #51174070.

Fixmes are down from ~31,000 to 1,895, so only about 6% of them are left. That's pretty manageable.

Last edited by SwiftFast (2017-09-19 10:30:38)

#56 2017-09-21 18:30:16

dsh4
Member
Registered: 2017-06-24
Posts: 43

Re: Israel GTFS release

@SwiftFast thanks!  (OT: also, thanks for the Daliyat al-Karmel market fix the other day)

#57 2017-09-22 12:06:45

anonymous_gushdan_mapper
Member
Registered: 2016-12-17
Posts: 23

Re: Israel GTFS release

Now that all bus stops imported via GTFS have gtfs:verified=no, wouldn't it make sense to write a script that downloads the GTFS DB every week and only updates bus stops that have gtfs:id and gtfs:verified=no? Stops move around often (especially in Tel Aviv, where there are road works), new stops are regularly added, and old stops are regularly removed.

At the moment, OSM has a lot of stops that aren't there anymore, for example the ones in Petach Tikva's Jabotinsky middle bus lane (which has been closed for the light railway construction for a while now). Since Israel has so many bus stops, it makes no sense to maintain them manually: no human can track all the changes and keep the map up to date. I know automated edits are generally frowned upon in OSM, but that's the only reasonable way to keep the bus stops up to date on the map.

#58 2017-09-22 13:52:03

SwiftFast
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

Yes, an updating script is needed. It's on my to-do list. See previous posts in this thread. (Though I'm not sure I agree on the specifics you mentioned. Bus stops without gtfs:verified=no should still receive updates).

#59 2017-09-22 17:11:35

SwiftFast
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

I managed to extract the bus stops from the GTFS and import them into the JOSM scripting plugin. I'll now experiment with different merging techniques.

#60 2017-09-22 20:06:46

SwiftFast
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

Some errors detected so far:

WARN: two osm bus stops same ref. ref=31628, id:1803070181 OR id:518136779
WARN: two osm bus stops same ref. ref=3029, id:1803078612 OR id:1803007057
WARN: two osm bus stops same ref. ref=31395, id:1803085968 OR id:518140581
WARN: two osm bus stops same ref AND gtfsid. ref=15602, gtfsid=27870. id:1803054257 OR id:3725071189
WARN: two osm bus stops same ref. ref=39116, id:454353493 OR id:1803001240
WARN: two osm bus stops same ref. ref=32011, id:458715051 OR id:1803043127
WARN: two osm bus stops same ref. ref=39235, id:542705445 OR id:1803087556
WARN: two osm bus stops same ref. ref=32942, id:1802986047 OR id:941922250
WARN: two osm bus stops same gtfsid. gtfsid=16198, id:1803035944 OR id:1803045065
WARN: two osm bus stops same ref. ref=39306, id:1530647519 OR id:1803080068
WARN: two osm bus stops same ref. ref=9941741, id:1803002992 OR id:4911337809
WARN: two osm bus stops same gtfsid. gtfsid=32804, id:1803002992 OR id:4911337809
WARN: two osm bus stops same ref. ref=39454, id:893250637 OR id:1803005694
WARN: two osm bus stops same ref. ref=39479, id:1803016847 OR id:4298607800
WARN: two osm bus stops same gtfsid. gtfsid=557, id:1803016847 OR id:4298607800
WARN: two osm bus stops same ref. ref=47, id:1803020946 OR id:1803064907
WARN: two osm bus stops same ref. ref=34;35, id:755156662 OR id:752933252
WARN: two osm bus stops same ref. ref=9945554, id:1803040289 OR id:3684783003
WARN: two osm bus stops same gtfsid. gtfsid=33293, id:1803040289 OR id:3684783003
WARN: two osm bus stops same ref. ref=39291, id:1803032401 OR id:1537849739
WARN: two osm bus stops same ref. ref=3;5, id:753080387 OR id:753076542
WARN: two osm bus stops same ref. ref=13037, id:1803001219 OR id:4521026196
WARN: two osm bus stops same gtfsid. gtfsid=26054, id:1803001219 OR id:4521026196
WARN: two osm bus stops same gtfsid. gtfsid=32972, id:1802988407 OR id:2763718615
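
For anyone who wants to reproduce this kind of check outside JOSM, here is a minimal Python sketch. It is not the script that produced the log above. It assumes each OSM bus stop has already been loaded as a dict with an "id" and a "tags" dict (a hypothetical structure), and the function name find_duplicates is made up for illustration.

```python
from collections import defaultdict


def find_duplicates(stops, key):
    """Return {tag value: [OSM ids]} for every value of `key` used by more than one stop.

    `stops` is assumed to be a list of dicts shaped like
    {"id": 123, "tags": {"ref": "31628", ...}} (hypothetical structure).
    """
    by_value = defaultdict(list)
    for stop in stops:
        value = stop["tags"].get(key)
        if value is not None:
            by_value[value].append(stop["id"])
    # Keep only the values shared by two or more stops.
    return {value: ids for value, ids in by_value.items() if len(ids) > 1}


stops = [
    {"id": 1803070181, "tags": {"highway": "bus_stop", "ref": "31628"}},
    {"id": 518136779, "tags": {"highway": "bus_stop", "ref": "31628"}},
    {"id": 1803078612, "tags": {"highway": "bus_stop", "ref": "3029"}},
]

for ref, ids in find_duplicates(stops, "ref").items():
    print("WARN: same ref", ref, "on ids", ids)
```

The same function can be called with key="gtfs:id" to catch the second kind of duplicate.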

#61 2017-09-23 08:05:20

SafwatHalaby
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

Here's a merging idea.

Problem: Dealing with conflicts between mapper edits and gtfs data.
Solution: "The most recent version is the correct version" philosophy.

- The first gtfs update would update everything. Conflicts are resolved by prioritizing the gtfs file's version. This is a "necessary evil" but is only needed once.  (edit: I might be able to mitigate this by tracing bus stop OSM history).
- Some time passes, and users update some of the bus stops.
- The ministry of transportation updates some bus stops in its database and publishes a new gtfs file.
- The next gtfs update would inspect the difference between the new gtfs file and the older gtfs file. Only bus stops that have had their data (in the gtfs file) changed since the last file are updated. So, conflicts are resolved by prioritizing the gtfs file version, but only for the bus stops that were changed by the ministry since the last update. The rest of the bus stops are left intact.
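
The "diff between two gtfs files" step could look roughly like the Python sketch below. It assumes both dumps have already been parsed into dicts keyed by the stop's ref, with the stop's data as the value. The names diff_stops, old and new are placeholders, and the sample stops are invented.

```python
def diff_stops(old, new):
    """Compare two GTFS dumps, each given as {ref: stop data}.

    Returns three sets of refs: stops added by the ministry, stops removed,
    and stops whose data changed between the two dumps.
    """
    added = new.keys() - old.keys()
    removed = old.keys() - new.keys()
    modified = {ref for ref in old.keys() & new.keys() if old[ref] != new[ref]}
    return added, removed, modified


# Invented sample data, only to show the shape of the input.
old = {
    "100": {"name": "Stop A", "lat": 32.08, "lon": 34.78},
    "200": {"name": "Stop B", "lat": 32.09, "lon": 34.79},
}
new = {
    "100": {"name": "Stop A", "lat": 32.08, "lon": 34.78},   # unchanged
    "200": {"name": "Stop B2", "lat": 32.09, "lon": 34.79},  # renamed
    "300": {"name": "Stop C", "lat": 32.10, "lon": 34.80},   # new
}

added, removed, modified = diff_stops(old, new)
print(added, removed, modified)
```

Only the refs in these three sets would be touched in OSM. Stop "100" is identical in both dumps, so any mapper edits to it would be left alone.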

Last edited by SafwatHalaby (2017-09-23 10:31:16)

#62 2017-09-23 10:44:14

anonymous_gushdan_mapper
Member
Registered: 2016-12-17
Posts: 23

Re: Israel GTFS release

This sounds like a good idea. +1 from me.

#63 2017-09-24 16:25:10

dsh4
Member
Registered: 2017-06-24
Posts: 43

Re: Israel GTFS release

How up-to-date is the data in the gtfs file?  E.g., if a change is made to the Ministry's database on Sunday, how long is it until the publicly-available gtfs file reflects the change?  (Note that this is independent of how often the auto-update script is run)

How about some escape hatch, e.g., "The script never updates objects that have gtfs:auto-update=no"?  (gtfs:auto-update would only be set manually, not by any script)

#64 2017-09-25 13:54:32

anonymous_gushdan_mapper
Member
Registered: 2016-12-17
Posts: 23

Re: Israel GTFS release

The GTFS file updates nightly. So, realistically, if the official database is changed on Sunday morning, the changes will be in the GTFS file by Monday.

Last edited by anonymous_gushdan_mapper (2017-09-25 13:54:50)

#65 2017-09-25 19:08:18

SafwatHalaby
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

I downloaded a gtfs file a couple of days ago and another one today. They weren't the same: 2 stations were deleted and 11 were added. There are indeed frequent updates, apparently nightly.

dsh4 wrote: How about some escape hatch, e.g., "The script never updates objects that have gtfs:auto-update=no"? (gtfs:auto-update would only be set manually, not by any script)

That could be done, but why? Note that the above strategy only touches stations when the ministry updates them. If you modify a station, the bot will not force-modify it back to fit the gtfs database. It'll only modify it if the ministry actually updates the station.

#66 2017-09-26 21:39:19

SafwatHalaby
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

The gtfs:id is needless overhead, because "ref" is unique enough. I've removed it as part of the cleanup. Also, I've changed source=israel_gtfs_v1 to israel_gtfs, because the plan is to make incremental updates, and v1 no longer makes sense.

#67 2017-09-27 07:13:17

SafwatHalaby
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

Some people on the talk list seem to be interested in the permission, so I've decided to translate the old comment by @yxejamir.

Original:

בנוגע לתנאי שימוש – מותר להעתיק ולשכפל את המידע ולהפיצו כראות עיניכם. ידידי, יהונתן קלינגר מסביר:

    ראשית, אתר דאטה.גוב אומר בצורה מפורשת כי 'המידע פתוח לשימוש חופשי ולא נדרש כל אישור מיוחד'; אבל, גם אם הדבר לא היה כך, העקרון הוא פשוט: סעיף 5 לחוק זכויות יוצרים קובע כי על 'עובדה או נתון' לא יחולו זכויות יוצרים. לכן, במספר מקרים בהם הועתקו מאגרי מידע כמו ספרי טלפונים, מודעות דרושים או מדריכים מקצועיים, פסק בית המשפט כי ההעתקה מותרת, וכי אין זכות יוצרים בליקוט (מעריב נ' אול יו ניד, קווי מידע נ' בל תקשורת). הסיבה לכך היא פשוטה: במאגר מידע אין את ה'יצירתיות' הדרושה לצורך מתן הגנה בזכויות יוצרים.

Translation:

"Regarding terms of use: you may copy and duplicate the information and distribute it as you see fit. My friend, Jonathan Klinger, explains:

Firstly, the data.gov site explicitly states that 'the information is open for free use and no special permission is required'.

But even if that were not the case, the principle is simple: Section 5 of the Israeli Copyright Law [2007] states that copyright does not apply to 'a fact or a datum'. Therefore, in several cases where databases such as phone books, job ads, or professional guides were copied ([Hebrew names of court cases below]), the court ruled that the copying is permitted and that there is no copyright in a compilation. The reason is simple: a database lacks the 'creativity' required for copyright protection."

Court cases:
מעריב נ' אול יו ניד, קווי מידע נ' בל תקשורת

Last edited by SafwatHalaby (2017-09-27 07:17:29)

#68 2017-09-27 10:53:26

anonymous_gushdan_mapper
Member
Registered: 2016-12-17
Posts: 23

Re: Israel GTFS release

See also the terms and conditions from data.gov.il:

https://data.gov.il/terms

שימושים מותרים

אתה רשאי להעתיק את המידע, להפיץ אותו, להעמיד אותו לרשות לציבור, לשדר אותו, לבצע שינויים טכניים במידע וליצור ממנו יצירות נגזרות -- בכל מדיום או פורמט. אתה רשאי לעשות שימוש במידע באופן מסחרי ובאופן שאינו מסחרי.

Translation:

Allowed usage:
You're allowed to copy the data, distribute it, make it available to the public, transmit it, perform technical changes to the data, and create derivative works from it, in any medium or format. You're allowed to use the data both commercially and non-commercially.

So I think it's okay to use this data in OpenStreetMap.

Last edited by anonymous_gushdan_mapper (2017-09-27 10:53:38)

#69 2017-09-27 11:09:19

anonymous_gushdan_mapper
Member
Registered: 2016-12-17
Posts: 23

Re: Israel GTFS release

I'm not on the talk list, but I read the thread on the archives. To address the accuracy & verification concern:

Israel has something like 30,000 bus stops, and they change daily all across the country. There's no way human mappers could ever verify the accuracy of all of them, unless you have someone working full-time on this. However, the data is considered extremely accurate, and inaccuracies are quite rare. We do have a system that announces the name of the next stop, which uses this data.

I think people from other countries don't realize that this is not data from a single private operator, nor from a single city: it's government-generated data that controls the entire public transportation network in the country. If a bus stop is not in this dataset, it doesn't exist. There will never be anything more accurate for bus stops in Israel than this dataset.

If accuracy is important to us, we *must* implement this importing script, otherwise the data on OSM will get stale quickly, just like the current data is stale and shows a lot of bus stops that have since been moved or canceled.

People who have read this entire forum thread probably know this already, but I'm posting this comment to help people coming from the talk list to understand the subject.

#70 2017-09-27 13:45:16

SafwatHalaby
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

Thanks for the post!

So here's my precise plan. I already have a working prototype, but there are some bugs that need resolving:

Column 1: Old GTFS dump
Column 2: New GTFS dump
Column 3: Openstreetmap

For each bus stop ref, find out in which columns it exists and in which
it doesn't. If multiple bus stops have the same ref in any column,
the script halts until I manually intervene.

Exception: platforms (ratzefeem) sometimes have
db ref duplication that we should merge into one.

X       : A single bus stop with that reference exists in that column
-       : No bus stop with that reference exists in that column
=>      : action to be taken

1 2 3
- - X  => Nothing. (or maybe delete?)
- X -  => Create.
- X X  => Update. (We would have created but an OSM mapper created it first)
X X -  => Create.
X - X  => Delete.
X - -  => Nothing. (We would have deleted but an OSM mapper deleted it first)
X X X  => Update.

Updating action: scans all tags and:
- if col1's tag value does not equal col2's tag value: 
     sets osm bus stop tag value to col2's tag value.
- tags that exist only on the OSM bus stop are not touched (e.g. shelter, wheelchair).
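
To make the table and the updating action concrete, here is a rough Python sketch. It is not the prototype itself: the names decide and update_tags are made up for illustration, and tags are assumed to be plain dicts. For the "- X X" row there is no old GTFS entry, so the sketch passes an empty dict for the old tags, which makes every GTFS tag count as changed. That is my assumption, since the table doesn't spell it out.

```python
# (in old GTFS, in new GTFS, in OSM) -> action, copied from the table above.
ACTIONS = {
    (False, False, True): "nothing",   # or maybe delete
    (False, True, False): "create",
    (False, True, True): "update",     # a mapper created it first
    (True, True, False): "create",
    (True, False, True): "delete",
    (True, False, False): "nothing",   # a mapper deleted it first
    (True, True, True): "update",
}


def decide(in_old, in_new, in_osm):
    """Look up the action for one ref. A ref that is in no column never occurs."""
    return ACTIONS[(in_old, in_new, in_osm)]


def update_tags(old_tags, new_tags, osm_tags):
    """Apply the updating action and return the new OSM tags.

    A tag is overwritten only if its value differs between the old and new
    GTFS dumps. Tags the GTFS data doesn't carry (shelter, wheelchair, ...)
    are never touched.
    """
    result = dict(osm_tags)
    for key, new_value in new_tags.items():
        if old_tags.get(key) != new_value:
            result[key] = new_value
    return result


# Invented example: the ministry renamed the stop, a mapper added a shelter tag.
old_tags = {"ref": "100", "name": "Old name"}
new_tags = {"ref": "100", "name": "New name"}
osm_tags = {"ref": "100", "name": "Old name", "shelter": "yes"}
print(decide(True, True, True), update_tags(old_tags, new_tags, osm_tags))
```

Because unchanged GTFS tags are skipped, a mapper's correction to a name survives as long as the ministry doesn't change that name itself.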

Last edited by SafwatHalaby (2017-09-27 13:48:32)

#71 2017-09-27 15:32:02

anonymous_gushdan_mapper
Member
Registered: 2016-12-17
Posts: 23

Re: Israel GTFS release

Regarding the first row of the table ("- - X"), I think deleting is the right approach. If it has a GTFS ref, it means it was in the GTFS long ago, or someone entered an incorrect GTFS ref when manually adding the stop. Either way, it's stale data. If someone added a bus stop that isn't in the GTFS (those can't legally exist for public buses, but can for private shuttles, such as shuttles from train stations to industrial zones), it won't have a GTFS ref, so we shouldn't touch it.

Alternatively, if we want to care about mappers accidentally adding private bus stops with a ref field (idk what they'll write in such field in this case) and we want to preserve this data, we can add a condition that in the first table case (exists in OSM, not in the last two GTFS releases) only delete it if it has source=israel_gtfs, otherwise - keep it intact.

Another point worth considering: In Israel, each stop has a short identification number that is written on the stop sign. That is stop_code in the GTFS[1] and not the stop_id in the GTFS. We should make sure we use that, and not the stop_id field when we set the "ref" tag. stop_code is more useful for humans. Also, stop_id can theoretically change without the stop itself changing (I saw it happen in the past), but stop_code is pretty static.

[1] see documentation: http://media.mot.gov.il/PDF/HE_TRAFFIC_PUBLIC/GTFS.pdf
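
As an illustration of keying on stop_code rather than stop_id, here is a small Python sketch that reads a stops.txt. The column names (stop_id, stop_code, stop_name, stop_lat, stop_lon) are the standard GTFS ones. The function name load_stops and the sample rows are invented, and I'm assuming the file is plain comma-separated text.

```python
import csv
import io


def load_stops(fileobj):
    """Read a GTFS stops.txt and return {stop_code: stop data}.

    The dict is keyed by stop_code (the number printed on the stop sign),
    and stop_code is also what goes into the "ref" tag. stop_id is ignored.
    """
    stops = {}
    for row in csv.DictReader(fileobj):
        code = row["stop_code"]
        stops[code] = {
            "ref": code,
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        }
    return stops


# Invented sample rows, not real stops.
sample = io.StringIO(
    "stop_id,stop_code,stop_name,stop_lat,stop_lon\n"
    "1001,25001,Example Stop A,32.0800,34.7800\n"
    "1002,25002,Example Stop B,32.0900,34.7900\n"
)
stops = load_stops(sample)
print(sorted(stops))
```

With a real feed, the file could be opened with open("stops.txt", encoding="utf-8-sig", newline="") and passed to load_stops in place of the StringIO object.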

#72 2017-09-27 15:38:20

anonymous_gushdan_mapper
Member
Registered: 2016-12-17
Posts: 23

Re: Israel GTFS release

Another point worth thinking about: At the moment, OSM only has the Hebrew names for the bus stops, but the GTFS file also contains translations of these names from Hebrew to English and Arabic (see translations.txt).

When importing stops, it'd be a good idea to import their translated name too, if one exists.

Last edited by anonymous_gushdan_mapper (2017-09-27 15:38:46)

#73 2017-09-28 07:56:31

SafwatHalaby
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

Another point worth considering regarding "- - X" is West Bank area C. Do they have marked bus stops? Do they have their own ref system? If so, the rules inside area C should never be "delete" for "- - X" (for bus stops lacking israel_gtfs), regardless of what they are in Israel. Luckily we have the proper relations to do this easily if needed.

anonymous_gushdan_mapper wrote: Another point worth thinking about: At the moment, OSM only has the Hebrew names for the bus stops, but the GTFS file also contains translations of these names from Hebrew to English and Arabic (see translations.txt).

Good point! I'll first finish the "minimum viable product" and then see if I can implement this.

Last edited by SafwatHalaby (2017-09-28 07:57:25)

#74 2017-09-28 13:45:17

anonymous_gushdan_mapper
Member
Registered: 2016-12-17
Posts: 23

Re: Israel GTFS release

You're correct about area C.
I think the safest way would be to never delete any bus stop that doesn't have source=israel_gtfs - this way we'll be sure we don't destroy Palestinian bus stops or private shuttle bus stops.

#75 2017-09-29 08:30:22

SafwatHalaby
Member
Registered: 2017-04-10
Posts: 395

Re: Israel GTFS release

I agree. This seems like the most reasonable approach.

- - X and israel_gtfs > delete
- - X without israel_gtfs > ignore
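
In code, this rule for a stop that exists only in OSM could be as small as the Python sketch below. The function name is hypothetical, and it assumes the import marks its stops with source=israel_gtfs exactly.

```python
def action_for_osm_only_stop(tags):
    """Decide what to do with a stop found in OSM but in neither GTFS dump.

    Only stops that the import itself created (source=israel_gtfs) are
    deleted. Anything else, such as private shuttle stops or stops in
    area C, is left alone.
    """
    if tags.get("source") == "israel_gtfs":
        return "delete"
    return "ignore"


print(action_for_osm_only_stop({"highway": "bus_stop", "source": "israel_gtfs"}))
print(action_for_osm_only_stop({"highway": "bus_stop", "ref": "47"}))
```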
