Frontend transliterator: translit. A battle against the ??????'s

Speaking for my Legend HCX:

In general the device is able to display all(?) latin1 characters:

But: When compiling the map, mkgmap changes all street names to uppercase
(unless you use the --lower-case option). But there is no upper case
for the “ß”, so it is converted to “SS”.
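
The ß to SS expansion is ordinary Unicode case mapping, so it can be reproduced outside mkgmap. A minimal sketch in Python, for illustration only; whether mkgmap uses the same mapping internally is an assumption, and the street name is made up:

```python
# Unicode full case mapping has no single-character uppercase for "ß",
# so upper-casing expands it to "SS".
name = "Hauptstraße"  # made-up example name
upper = name.upper()
print(upper)          # HAUPTSTRASSE

# Lower-casing the result does not bring the "ß" back.
lower_again = upper.lower()
print(lower_again)    # hauptstrasse
```

This is why a device that lower-cases the stored name again shows "ss" rather than "ß".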

The Garmin device converts back to lower case in the tooltips and in other fields.

If the --lower-case option is used, the street names are displayed
as A… in the map (only first letter is shown).

Chris

Chris, you are talking about what mkgmap can do or does. But I want to know something about the Garmin. I asked if you had or knew of an .img file that contains a ß. It does not matter who put it in.

But your picture shows something very nice. Just above the Süntelstasse hint: ÄËäÜß.

Isn’t that a ß at the end? Did you type it in for a waypoint?

Yes, that is a ß entered in a waypoint name.

Here is a gmapsupp.img generated with --lower-case, so you have a lot of …straße

http://www.megaupload.com/?d=J0LT749R

Chris

Thank you.

That is a very small map. I could hardly find the bbox on my device.

This is how it shows in the 60Cx.
[Image: ringel s]

Well isn’t this a strange device?
-It can show a ß and it cannot.
-It can show lowercase and it can not.

What happened at Garmin to make this possible?

So I know now that translit should make an ss of it too.

It can not show rotated lower case characters.

But the question remains why the device is not able to convert them to upper case. :)

Chris

Ahhh now I see. Rotation is the problem.

Thank you for pointing that out.

Now isn’t this nice:

[Image: transliterated]

First results from translit published on http://garmin.na1400.info/routable.php

No question marks!

I have to add here that this will only be true for place names. The next step will be to do the road/street names too (highway=…).

At the moment there are translit results for belarus, bulgaria, kaliningrad, romania, russia and ukraina.

Please comment and express your wishes.

This tool would be useful to the wider community - do you intend to share it on the mkgmap-dev mailing list?

Can this tool also solve problems with Greek, Arabic and Chinese characters?

Please see my reply on this question: http://forum.openstreetmap.org/viewtopic.php?pid=47770#p47770

With Greek, yes. See: http://forum.openstreetmap.org/viewtopic.php?pid=47500#p47500

With Arabic? I don’t know. Can Arabic be transliterated? If yes, please provide information (links).

With Chinese? Maybe. There are some results. See: http://forum.openstreetmap.org/viewtopic.php?pid=47496#p47496

Wikipedia has a table for Arabic characters.
However, I cannot judge whether it is usable for our purposes.

http://en.wikipedia.org/wiki/Romanization_of_Arabic

Thank you Walter.

For Georgia there was info in http://en.wikipedia.org/wiki/Georgian_alphabet

For place names there is not much to do, as almost all of them also have name:en or int_name. This is the list of names that were transliterated. The same file is displayed in three different programs (on XP): Wordpad, Notepad and Firefox.

[Image: display in different programs]

Wordpad does not understand UTF-8, so it displays all bytes as if they were characters. Notepad understands that it is UTF-8 and converts to the right characters, but then lacks the font. Firefox understands everything (without being told that it is UTF-8).

From all programs I copy-pasted the first three lines to this post (you never know which conversions are applied when using the clipboard).
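
The difference between the programs can be imitated in a few lines of Python. This is only a sketch: it assumes that Wordpad on XP treats each byte as one Windows-1252 character, which the thread does not state explicitly.

```python
name = "ვარძია"

# The bytes as stored in the UTF-8 file: 3 bytes per Georgian letter.
raw = name.encode("utf-8")
print(len(name), len(raw))  # 6 18

# Wordpad-like view (assumed cp1252): every byte becomes one character.
# Bytes that cp1252 does not define are replaced by U+FFFD.
as_cp1252 = raw.decode("cp1252", errors="replace")
print(len(as_cp1252))       # 18

# Notepad/Firefox-like view: the bytes are decoded as UTF-8 again.
as_utf8 = raw.decode("utf-8")
print(as_utf8 == name)      # True
```

Whether the decoded text is then readable still depends on the font, which is the Notepad case described above.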

ვარძია=vardzia
ქვაბისხევი=kvabiskhevi
აწყური=atsnuri

ვარძია=vardzia
ქვაბისხევი=kvabiskhevi
აწყური=atsnuri

ვარძია=vardzia
ქვაბისხევი=kvabiskhevi
აწყური=atsnuri

I would like to see comments on the transliteration results. Someone from Georgia out here?

As reported on the IRC:

For Turkish you would transliterate ö and ü to o and u - not to oe, ue as in German

Yes, I already do that for Turkish. I have not bothered with German yet. But then the 60Cx can display Ü and Ö. Or not when rotated?

On second thought: if in UTF-8 the Turkish ö is coded with the same byte values as the German ö (and I think that will be the case), then the transliteration could differ only if translit keeps its tables per area. But I think that shortly I will do away with all areas (the .gpx files) and use only one transliteration table for the whole world, as I already do now for the 's.
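
To make the per-area idea concrete, here is a minimal Python sketch. The table contents, the area names and the function `translit_area` are hypothetical; translit's real tables are not shown in this thread, and German is not handled by it yet.

```python
# Hypothetical per-area tables: the same character maps differently.
TABLES = {
    "germany": {"ö": "oe", "ü": "ue", "ß": "ss"},
    "turkey": {"ö": "o", "ü": "u"},
}

def translit_area(name, area):
    """Replace each character using the table of the given area."""
    table = TABLES[area]
    return "".join(table.get(ch, ch) for ch in name)

# The character itself is identical in both languages (U+00F6),
# so the bytes alone cannot tell which table to use.
print("\u00f6".encode("utf-8"))           # b'\xc3\xb6'

print(translit_area("Köln", "germany"))   # Koeln
print(translit_area("Göreme", "turkey"))  # Goreme
```

With a single world-wide table, only one of the two mappings for ö can survive.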

About transliterating greek.

In Greece many places already have an int_name or name:en. What's left are 652 nodes with a place and a name tag that need transliteration. At least, translit thinks so, and it transliterates the UTF-8 encoded string. If the transliteration result is the same as the UTF-8 encoded string, then the name was already in ANSI. Translit counted 559 names already in ANSI. So only 90 tags were added.

For ways these counts were 7097, 5644 and 1439.
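
The "already in ANSI" check can be sketched as follows. The table `GREEK` is a tiny hypothetical excerpt that only covers the letters of two names from the results in this thread; it is not translit's real table, and the function names are made up.

```python
# Tiny hypothetical excerpt of a Greek table (precomposed letters).
GREEK = {
    "Σ": "S", "ί": "i", "γ": "g", "ρ": "r", "ι": "i",
    "Μ": "M", "ο": "o", "ν": "n", "ή": "i",
}

def translit(name):
    """Replace every character that has an entry in the table."""
    return "".join(GREEK.get(ch, ch) for ch in name)

def needs_tag(name):
    # If transliteration changes nothing, the name was already Latin.
    return translit(name) != name

names = ["Σίγρι", "Μονή", "Sigri"]
total = len(names)
already = sum(1 for n in names if not needs_tag(n))
added = total - already
print(total, already, added)  # 3 1 2
print(translit("Σίγρι"))      # Sigri
print(translit("Μονή"))       # Moni
```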

On the maps for Greece you can see that very often the name is spelled both in Greek and in an international form.
Here are some transliteration results:

name in utf8 = name transliterated
Άγιος Νικόλαος (Agios Nikolaos)=Agios Nikolaos (Agios Nikolaos)
Ευηνοχώρι (Evinochori)=Euienochori (Evinochori)
Πυλαία - Pylea=Pulaia - Pylea
Μικρόκαστρο (Mikrokastro)=Mikrokastro (Mikrokastro)
Σκάλα Συκαμνιάς=Skala Sukamnias
Σίγρι=Sigri
Ζωνιανών=Zonianon
Λιμένας Χερσονήσου=Limenas Chersonisou
Τσούτσουρος=Tsoytsouros
Λιβάδιον=Livadion
Μαρίες (Maries)=Maries (Maries)
Αναφωνήτρια (Anafonitria)=Anafonitria (Anafonitria)
Καταστάρι=Katastari
Βολίμες (Volimes)=Volimes (Volimes)
Μονή=Moni
Άγιος Γεώργιος (Agios Georgios)=Agios Georgios (Agios Georgios)
Αγία Μαρίνα (Agia Marina)=Agia Marina (Agia Marina)
Μονόσπιτα (Monospita)=Monospita (Monospita)
Κορομηλέα (Koromilea)=Koromielea (Koromilea)
Αλίαρτος (Aliartos)=Aliartos (Aliartos)
Βάγια (Vagia)=Vagia (Vagia)
Μαυρομμάτι (Mavrommati)=Maurommati (Mavrommati)
Άσκρη (Askri)=Askrie (Askri)

The weekly updated img files on http://garmin.na1400.info/routable.php now contain transliterations for 21 areas:
albania, belarus, bulgarije, cyprus-turkish, czech-republic, estonia, georgia, greece, hungary, kaliningrad, latvia, lithuania, macedonia, moldavie, poland, romania, russia, serbia, slowakia, turkey and ukraina. Attention: only nodes with a valid place AND a name tag are transliterated when needed (valid place values: town, city, suburb, village, hamlet).

So for instance in Greece
http://api.openstreetmap.org/api/0.6/node/83945071

will show up in your Garmin as “Agios Nikolaos (Agios Nikolaos)”, while
http://api.openstreetmap.org/api/0.6/node/83943762

will show up as “??? ??? (Agios Andreas)” because it is an island, and island is not in the list of valid place values.
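
The restriction amounts to a simple filter on the tags of a node. A minimal Python sketch; the function name `is_candidate` and the example tag sets are made up for illustration and are not the actual tags of the two nodes above:

```python
# The valid place values listed in this thread.
VALID_PLACES = {"town", "city", "suburb", "village", "hamlet"}

def is_candidate(tags):
    """True if a node has a name tag AND a valid place tag."""
    return "name" in tags and tags.get("place") in VALID_PLACES

# Made-up tag sets, not the real tags of the nodes linked above.
village = {"place": "village", "name": "Μονή"}
island = {"place": "island", "name": "Μονή"}
unnamed = {"place": "town"}

print(is_candidate(village))  # True
print(is_candidate(island))   # False: island is not a valid place value
print(is_candidate(unnamed))  # False: no name tag
```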

I now understand that such a restriction is unwanted. In the next update translit will handle all nodes with a name tag.

greencaps, do you feel confident by now about publishing your translit program?
Or maybe even adding it into mkgmap?

The program at the moment is restricted to nodes. Once ways and relations are also transliterated, publication would make sense. Lambertus is the only one who uses (and tests) it. If translit does what I want it to do, I will offer it to the community. In which way, I still don’t know.

Adding it to mkgmap is a different story. I have never looked into the source of mkgmap, so I do not know how the code could be integrated. But of course integration into mkgmap is preferable to running as a frontend. An extra option --translit that forces mkgmap to throw away all code pages and use just one transliteration table (the one from translit, for instance) would be nice.

Would it make sense to introduce a name:translit tag for transliterated names and let mkgmap choose one of them, e.g., mkgmap --name-tag-list=name:en,int_name,name:translit,name or (for native users) just mkgmap --name-tag-list=name:translit,name? The transliteration could be implemented as a separate preprocessing step or integrated in mkgmap so that it can be executed on the fly, when a substitution of name:translit is requested.
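
The selection order proposed here can be sketched in Python. This only mimics the idea of trying the tags from left to right; it is not mkgmap's implementation, `pick_name` is a made-up helper, and name:translit is the tag proposed in this post, not an established one.

```python
def pick_name(tags, name_tag_list):
    """Return the value of the first tag in the list that is present."""
    for key in name_tag_list.split(","):
        value = tags.get(key)
        if value:
            return value
    return None

# Made-up node: native name plus the tag added by the frontend.
tags = {"name": "Σίγρι", "name:translit": "Sigri"}

print(pick_name(tags, "name:en,int_name,name:translit,name"))  # Sigri
print(pick_name(tags, "name"))                                 # Σίγρι
```

With this order a hand-made name:en or int_name always wins over the automatic transliteration.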

ISO 15924 and IANA define some language subcodes, such as sr-Cyrl and sr-Latn for Serbian written in Cyrillic or Latin, but I did not find any subcode for transliterated Russian or Greek, for instance. We could of course invent such codes ourselves, but it could be controversial to add, say, name:gr-Latn and name:ru-Latn to each Greek or Russian location in the OpenStreetMap database. (It could provoke edit wars, because not everyone is following the current version of ISO 9.)

That's exactly what happens in Lambertus’ toolchain. The tag is added by the frontend.

Yes. Transliteration and adding the tag. But all depends on the structure of mkgmap.

I will not look at that too much.