Street names in Israel have several fundamental problems

@yrtimiD: Yes, name:en1, name:en2, etc. are used by Nominatim and by OSM apps on Android (Navigator).

What do you think about the “Ha” Situation and my suggestions around it?

  1. Keep in mind that some streets are specifically named “Carmel” and some are “HaCarmel”. I don’t know if any cities have both, I couldn’t find any. Google Maps accepts “Carmel” as an alternate for “HaCarmel”, but not the reverse. That seems like a reasonable approach for us too.

  2. From OSM Editing Standards and Conventions: “Watch out for apostrophes. The same rule applies. If the street sign has an apostrophe, the OSM data should have an apostrophe.” However, in my opinion it is fine to add an alternate name without the apostrophe.

Another try:
Dictionary of all he/en names from all ways with highway=* https://docs.google.com/spreadsheet/ccc?key=0AolLjmdDjvyydGJWWkF4U0dFSlFocU92M3QwSnhDZVE

All ways with highway=* and missing one of (name, name:en, name:he): https://docs.google.com/spreadsheet/ccc?key=0AolLjmdDjvyydEhDdmpFR0gtRlZaZnBVVUtvSHdESUE
This one has id and version columns, so I’ll know how to write the changes back. Also, to update only the changed lines, we can put any value into the “update” column.
I did not import all the name1, name:he1, etc. forms, but name, name:he and name:en must always exist, right?
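The “missing one of (name, name:en, name:he)” check behind the second spreadsheet can be sketched in a few lines of Python. This is only an illustration: the `missing_name_tags` helper, the ids and the sample rows are made up, and each row is assumed to be a plain dict keyed by tag.

```python
REQUIRED_TAGS = ("name", "name:he", "name:en")

def missing_name_tags(row):
    """Return the required name tags that are absent or blank in one row."""
    return [tag for tag in REQUIRED_TAGS if not (row.get(tag) or "").strip()]

# Hypothetical sample rows, not real OSM data.
rows = [
    {"id": "1", "name": "הכרמל", "name:he": "הכרמל", "name:en": "HaCarmel"},
    {"id": "2", "name": "הגליל", "name:he": "", "name:en": "HaGalil"},
    {"id": "3", "name": "Carmel"},
]

# Map each incomplete row's id to the tags it still needs.
incomplete = {
    row["id"]: missing_name_tags(row) for row in rows if missing_name_tags(row)
}
print(incomplete)
```

Rows that come back with an empty list already carry all three tags and would not need to appear in the “missing” sheet.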

Want to try?

  • Google seems to have full-text search (it’s Google, what did you expect :) )
  • In OSM we should save both versions in the way’s tags. Imagine somebody tells you to visit them on Carmel street: you would always have to check both “HaCarmel” and “Carmel”, in Hebrew or in English. And if you are thorough, you would also try “Carmiel” in English. I’m fairly sure Google keeps a global table of every street name with all its translations and variants, rather than one per street. So if you search for Carmel street you will find all such streets in Israel, no matter whether a translator in Kfar Saba spelled it differently on the street sign than the translator in Eilat.

We should always save both versions (both English and Hebrew):

name:en = HaCarmel
name:en1 = Carmel

I would prefer to reduce the number of different regional translations. Israel is a small country, yet every village can use its own translation.
If we decide on a stricter rule, it will be easier for everybody to find what they are searching for. Google, for example, has a strict translation rule, and so does iGo.
This way you will find the street you are searching for, as long as you understand the rules used in OSM.

I would remove all apostrophes from the translation, whether or not they are written on the street sign.
If somebody wants to keep the variant with the apostrophe, it should be written as name:en1 etc.

  1. Carmel and HaCarmel are actually different names. See for example:
    HaCarmel: http://goo.gl/maps/VwKi7
    Carmel: http://goo.gl/maps/Ni9Au
    We have to choose the correct one for name, name:he, name:en, etc.
    For name:en1, name:he1, etc. we can put the other one.

  2. Nominatim already removes apostrophes when you search - try searching for “Earls Court” and “Earl’s Court”, for each search you will get results with and without apostrophe. I think the name on the sign has to always be one of the stored names. Whether it should be name:en or name:en1, I’d have to think more.
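To illustrate the idea of matching names regardless of apostrophes, here is a minimal Python sketch. It is not Nominatim’s actual code; the function names and the set of characters treated as apostrophes are assumptions made for this example.

```python
# Characters treated as apostrophes here (an assumption for this sketch):
# ASCII ', right/left single quotation marks, and the modifier apostrophe.
APOSTROPHES = "'\u2019\u2018\u02bc"

def strip_apostrophes(name):
    """Return the name with every apostrophe-like character removed."""
    return "".join(ch for ch in name if ch not in APOSTROPHES)

def same_ignoring_apostrophes(a, b):
    """Compare two names case-insensitively, ignoring apostrophes."""
    return strip_apostrophes(a).casefold() == strip_apostrophes(b).casefold()

print(strip_apostrophes("Earl\u2019s Court"))
print(same_ignoring_apostrophes("Ha'Carmel", "HaCarmel"))
```

With a comparison like this on the search side, the stored tag can keep whatever the sign says and both spellings still match.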

IMHO, it’s the search engine’s job to recognize mistyped queries, and it’s very wrong to add every possible typing or translation variant. The name tag must hold the one correct value, and the search engine must know how to derive all the other values. That is not possible now, but it surely will be.

I have created a different kind of table that includes the first 5030 streets (sorted alphabetically by the name field).

https://docs.google.com/spreadsheet/ccc?key=0AjoRSMeOZcXDdGFaYjFZcXB6eS1RTVR1RGhmWjI3M2c

How is this table different from the one yrtimiD created?

  1. It includes all streets, whether or not something is missing (this makes it searchable, and you can most probably copy and paste the street names without needing to translate).
  2. It marks all missing name:he and name:en fields with a different color. If somebody writes something in such a field, its background changes to white.
  3. It also includes the update field, which changes color if somebody writes something in it.
  4. It includes the Russian and Arabic translations. As it is very easy to copy and paste things, this could help a lot with adding those translations to existing streets.

Why did I create this list only for the first 5030 entries?

Because Google Drive is not able to load all 21,000 entries in one table.
And because I think we should start with a small chunk and then extend it.
This way we will not cause any damage.

For everybody who would like to know how to create this list themselves … here is the command line “code”.
Afterwards I loaded the result into Excel and used the sort function.


rem Convert ways and relations to nodes and drop author information
osmconvert israel_and_palestine.osm.pbf --all-to-nodes --drop-author -o=temp\israel_nodes.o5m
rem Keep only named residential streets that are not tagged as being in a neighbouring region
osmfilter temp\israel_nodes.o5m --keep="highway=residential and name=* and is_in!=Egypt and is_in!=Jordan and is_in!=Syria and is_in!=Lebanon and is_in!=Gaza" -o=temp\israel_streets.o5m
rem Clip to the Israel polygon and write the name tags as CSV with a headline
osmconvert temp\israel_streets.o5m -B=poly\israel.poly --csv-headline --csv="@id @lon @lat is_in highway name name:he name:he1 name:he2 name:he3 name:ar name:ru name:en name:en1 name:en2 name:en3 name:en4 name:en5 name:en6" -o=israel_streets_withArRuEnHe.csv
pause
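The Excel sorting step could also be scripted. The Python sketch below assumes the CSV is tab-separated (as far as I know this is osmconvert’s default when no --csv-separator is given) and starts with the headline row; the `read_streets` helper and the sample ids are invented for illustration.

```python
import csv
import io

def read_streets(lines):
    """Parse tab-separated rows (with a headline) into dicts, sorted by name."""
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    # Rows with a missing name sort first instead of raising an error.
    return sorted(reader, key=lambda row: row["name"] or "")

# Small in-memory sample standing in for the real export (made-up ids).
sample = io.StringIO(
    "@id\tname\tname:he\tname:en\n"
    "1000000000000002\tHaGalil\t\tHaGalil\n"
    "1000000000000001\tCarmel\t\tCarmel\n"
)
streets = read_streets(sample)
print([row["name"] for row in streets])
```

For the real file, the same function can be given a file object opened with `open("israel_streets_withArRuEnHe.csv", encoding="utf-8", newline="")`.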

I don’t think that we will get there in the near future.
It is very easy to add more fields for software to search through, but it is a lot harder to create a working algorithm that handles the different translation styles.
Especially for Israel, where we have such a “strange” right-to-left language :)

@yrtimiD: Can we use my table or is there a problem with the ID or other stuff?

I have added many entries to yrtimiD’s table. I have also highlighted in yellow where I think there is probably a small mistake, but don’t want to fix it without looking at name:he1 or name:en1 first.

It would be better to work from Mr_Israel’s table (once the changes to yrtimiD’s table are imported). Currently though I find Mr_Israel’s table hard to work with, because the columns name:he1-4 are too wide. If that is changed, so that it is possible to see the most important names (name, name:he, name:en) at the same time, it will be better because it is more complete.

Hey Eric,

Of course you can change the size of the columns. This shouldn’t be an issue.
What you are describing is just a display problem.
We could make them very narrow and grey them out if they are empty anyway.
This would make it even easier to ignore them…

I have changed the view of the table. More of it should fit into the browser window now.

@yrtimiD: Can you tell me if you can reimport my table (technically speaking)?
I personally don’t know how to do that, which is why I haven’t done anything on my table so far. But I could finish very quickly if you tell me that you are able to work with it.

I think yes, but to be sure - I’d need a version column.
Also, your ID column looks very strange to me.

If you tell me how to get your columns, I will add them.
I’m using osmconvert to create the table, and the details are listed below.

yrtimiD - can you upload every entry in your table that has a number in the Update column, or for which all three entries (name, name:en, name:he) are present? After that is done, we can use Mr_Israel’s table (or a variation of it).

Mr_Israel: I must have a browser issue…

Here: https://www.dropbox.com/s/8w81rgmuxd0x7lk/israel_only.poly
is my Israel-only .poly file.
Because I don’t know the Palestinian areas well, I tried to include all settlements with Hebrew names.
Feel free to edit and publish.

I haven’t received much feedback about the problems I described.
So here again is a kind of summary… Please give me your opinion. Otherwise it doesn’t make sense to work through so much data and update OSM without talking about these problems first.

**1. Street name translations not standardized**

Eric22: (?)
yrtimiD: (?)
Mr_Israel: I would try to standardize the street names in some way. I would recommend using a standard translation, regardless of the translation written on the street sign.

Current situation:


מנדלי מוכר ספרים        Mendele Mocher Sfarim
מנדלי מוכר הספרים        Mendele Mocher Sforim
מנדלי מוכר הספרים        Mendele Moher Sefarim
מנדלי מוכר ספרים        Mendele Moher Sefarim
מנדלי מוכר ספרים        Mendelei Mocher Sfarim
מנדלי מוכר ספרים        Mendeley Moher Sfarim
מנדלי מוכר ספרים        Mendeli Mokher Sfarmin

What I would use in all 7 cases:


name = מנדלי מוכר הספרים
name:he = מנדלי מוכר ספרים
name:he1 = מנדלי מוכר ספרים
name:en = Mendele Mocher Sfarim
name:en1 = Mendele Mocher Sforim
name:en2 = Mendele Moher Sefarim
name:en3 = Mendelei Mocher Sfarim
name:en4 = Mendeley Moher Sfarim
name:en5 = Mendeli Mokher Sfarmin
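Collapsing the seven English spellings listed above into one preferred value plus numbered alternates can be sketched like this in Python. The `assign_name_tags` helper is hypothetical, and which spelling counts as “preferred” is exactly the open question of this thread; the sketch simply takes it as an input.

```python
def assign_name_tags(prefix, preferred, variants):
    """Put the preferred spelling in `prefix` and each other unique
    spelling, in order of first appearance, in prefix1, prefix2, ..."""
    tags = {prefix: preferred}
    seen = {preferred}
    for variant in variants:
        if variant not in seen:
            seen.add(variant)
            # len(tags) is 1 for the first alternate, 2 for the second, ...
            tags[f"{prefix}{len(tags)}"] = variant
    return tags

spellings = [
    "Mendele Mocher Sfarim",
    "Mendele Mocher Sforim",
    "Mendele Moher Sefarim",
    "Mendele Moher Sefarim",
    "Mendelei Mocher Sfarim",
    "Mendeley Moher Sfarim",
    "Mendeli Mokher Sfarmin",
]
tags = assign_name_tags("name:en", "Mendele Mocher Sfarim", spellings)
for key, value in tags.items():
    print(key, "=", value)
```

Because duplicates are dropped, the seven rows yield name:en plus name:en1 to name:en5, so no spelling that is currently in the data is lost.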

**2. Ha (“ה”) or no Ha in front of the street name**

Eric22: (?)
yrtimiD: (?)
Mr_Israel:
Use the street name as written on the street sign for the “name” tag of the way and translate the words to English.
Please try as much as possible to use name:en1 for the root of the word, to make it easier to find.

Example:
name:en HaGalil
name:en1 Galil
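A rough Python sketch of deriving the name:en1 suggestion from a name:en value that starts with “Ha”. The `ha_variants` helper is hypothetical. It only handles a single word, and it assumes the article is followed by a capital letter, so that words like “Hadar” or “Haifa” are left alone.

```python
def ha_variants(name_en):
    """Return (name:en, suggested name:en1) for a name with a Ha prefix.

    The suggestion is None when the name does not look like Ha + root.
    """
    # Longer prefixes first, so "Ha'" wins over plain "Ha".
    for prefix in ("Ha'", "Ha\u2019", "Ha-", "Ha"):
        rest = name_en[len(prefix):]
        if name_en.startswith(prefix) and rest[:1].isupper():
            return name_en, rest
    return name_en, None

print(ha_variants("HaGalil"))
print(ha_variants("Hadar"))
```

Since Carmel and HaCarmel can be genuinely different streets, the result is only a candidate for an alternate tag and should not replace the name on the sign.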

**3. With or without an apostrophe after Ha (“ה”) in the English translation**

yrtimiD: (?)
Eric22:
Nominatim already removes apostrophes when you search - try searching for “Earls Court” and “Earl’s Court”, for each search you will get results with and without apostrophe. I think the name on the sign has to always be one of the stored names. Whether it should be name:en or name:en1, I’d have to think more.

Mr_Israel:
I would like to remove all apostrophes after Ha, no matter what is written on the street sign.
There is no reason to keep them and create even more variations of street names with this strange character.
It shouldn’t matter whether the translation on the street sign is written with an apostrophe. We should remove all of them from the name:en tag. If you think the information should still be saved, we could save it as name:en1.

I checked the manual for osmconvert, and it looks like it is impossible to output the version of an object.
Also, I now know how to convert your ID values back.
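For reference, a small Python sketch of that ID conversion. It assumes the default behaviour described in the osmconvert help for --all-to-nodes, where way ids are increased by 10^15 and relation ids by 2*10^15; the `original_object` helper and the sample id are made up.

```python
# Default offsets applied by --all-to-nodes (assumed from the osmconvert help).
WAY_OFFSET = 10**15
RELATION_OFFSET = 2 * 10**15

def original_object(converted_id):
    """Map an id from the node-only export back to (object type, original id)."""
    if converted_id >= RELATION_OFFSET:
        return "relation", converted_id - RELATION_OFFSET
    if converted_id >= WAY_OFFSET:
        return "way", converted_id - WAY_OFFSET
    return "node", converted_id

print(original_object(1000000000123456))
```

If the export was made with a non-default offset, the two constants would have to be changed accordingly.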

I needed the version value to be sure I do not overwrite somebody else’s changes.
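The version check could look roughly like this in Python. Everything here is hypothetical: the row layout, the ids, and the `current_versions` mapping, which would have to be fetched from OSM separately (not shown).

```python
def split_by_version(rows, current_versions):
    """Separate rows whose version still matches the live object
    from rows that somebody else has edited in the meantime."""
    safe, conflicts = [], []
    for row in rows:
        if current_versions.get(row["id"]) == row["version"]:
            safe.append(row)
        else:
            conflicts.append(row)
    return safe, conflicts

# Made-up spreadsheet rows and made-up current versions.
rows = [
    {"id": 101, "version": 3, "name:en": "HaGalil"},
    {"id": 102, "version": 1, "name:en": "Carmel"},
]
current_versions = {101: 3, 102: 2}

safe, conflicts = split_by_version(rows, current_versions)
print([r["id"] for r in safe], [r["id"] for r in conflicts])
```

Only the rows in `safe` would be uploaded; the conflicting ones need a manual look first.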

So, Mr_Israel, I think you can continue your fixes, and when you finish, I’ll apply all non-empty values from the changed rows back to the ways and upload them to OSM.

I would actually prefer the Wikipedia variant of the translation in name:en (for example http://en.wikipedia.org/wiki/Mendele_Mocher_Sforim).
Also, I’d drop all variants that are 100% incorrect, like “Mendeli Mokher Sfarmin”.

I’m really not sure we need to keep all those strange English variants. I’d keep only one, the most commonly used translation (if it matches the Wikipedia variant).
For Hebrew names there must be only one value, as all the others are surely mistakes.

About Ha* and Ha’*, I really don’t know which is better or correct, but we must certainly normalize everything to one scheme.

So how would you like to decide which version is correct?
With or without “Ha”?

Example:

מנדלי מוכר ספרים
מנדלי מוכר **ה**ספרים

I don’t want to be the one who deletes a correct value just because I think I’m more accurate than the person who did the editing.
We should try never to lose any data.

Besides that, we also have the issue of Jerusalem. Should we translate it as name:en=Jerusalem or name:en=Yerusalaim?

This will not be answered by Wikipedia.

And also I’m not sure with “Mokher” and “Moher” and “Mocher”.
iGo for example would use “Mokher”.

IMHO, the version “מנדלי מוכר הספרים” is wrong. But here we do have the wiki for an answer.
For other cases it’s quite difficult to know what is right, as we have all seen many road signs with awful typos.
I asked my friend (his mother studied linguistics), so tomorrow we’ll possibly get an answer about apostrophe usage after Ha.

Let’s start by fixing all typos and all common errors like:


א.ד. גורדיון (wrong Yud)
אהרונוביץ (no ' at the end)
אירוס ארגמן (must surely be HaArgaman in Hebrew)
אלוף הניצחון or אלוף הנצחון
אריה דולצין or אריה דולצ'ין
all variants of ארלוזורוב must surely be ד"ר חיים ארלוזורוב
the same goes for ביאליק: it must be the full name and not a shortened form
etc...