Frontend transliterator: translit. A battle against the ??????'s

Once all tables are combined (except for the Thai and Chinese ones), the result is a world transliteration table with 339 entries.

It was time to try it on the 's. To my joy, everything went as I thought it would. Running on OSM data that contained parts of Russia, Ukraine, Lithuania, Romania and Greece, translit transliterated everything as if it had a separate table for every country.

For instance, the way in Lithuania shown above (http://api.openstreetmap.org/api/0.6/way/27950733) would come out as:

When I saw this I realised that the algorithm used for the nodes, which picks a transliteration table by checking whether the lat/lon lies within a country's borders, was superfluous.
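Why one combined table can replace the per-country lookup can be sketched in a few lines. This is only an illustration in Python, not translit's actual code: the table contents, the names `CYRILLIC`, `GREEK`, `LATIN`, `WORLD_TABLE` and the function `transliterate` are all made up for the example. The point is that the scripts occupy different code points, so the tables can be merged without conflicts between scripts, and no country lookup is needed.

```python
# Hypothetical mini tables; the real translit tables are not shown in this thread.
CYRILLIC = {"Ю": "Yu", "л": "l", "я": "ya"}
GREEK = {"Δ": "D", "έ": "e", "λ": "l", "τ": "t", "α": "a"}
LATIN = {"ß": "ss"}

# Each script uses its own code points, so merging creates no conflicts
# between scripts and one table serves every country.
WORLD_TABLE = {**CYRILLIC, **GREEK, **LATIN}


def transliterate(text, table=WORLD_TABLE):
    """Replace every character found in the table; leave all others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


print(transliterate("Юля"))
print(transliterate("Δέλτα"))
print(transliterate("Юля Δέλτα Straße"))
```

Characters that are not in the table (plain Latin letters, digits, spaces) pass through untouched, which is why mixed-script names need no special handling.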

One size fits all

A nice demonstration of the potential of one transliteration table is this way on the border of Russia and China:

http://api.openstreetmap.org/api/0.6/way/39159352
http://www.openstreetmap.org/browse/way/39159352

If you click the links you will see that your browser has no difficulty displaying names whose first half consists of Cyrillic characters and whose second half consists of Chinese ones. This is because it’s UTF-8.

... ...

Above you see the same way twice. The first text was copied and pasted from a browser, the second from text in WordPad. (Copying, pasting and displaying UTF-8 in different programs is a story of its own…).

Translit does not mind the combination of Cyrillic and Chinese: it transliterates it all and adds the missing tag:












Edit: well, in this case adding a tag was not needed as there is already a name:en. But I found it too beautiful not to tell…

http://api.openstreetmap.org/api/0.6/way/10930885
This way from Greece treated by translit:

Hi Greencaps,

Is your program available for download ?

Chris

No, not yet. As you can read, the implementation keeps changing. It is at an early stage of development.

It is now being tested by Lambertus. The first results are not visible yet (I mean on http://garmin.na1400.info/routable.php). I have to spend more time on the transliteration table(s). First I want to see that it runs at Lambertus' end the way I want it to.

After that we will see.

Are you interested in a special country/language?

Ukraina 63240172.img

Part of 63240172.img (Ukraine) displayed by GPSMapEdit. I hope the question marks are gone after next weekend's update.

It’s my fault that the transliterated names aren’t showing up yet. I simply forgot to add ‘name:engels’ to the list of tags used for displaying the name. This is fixed now, but I’m running into a bug (not related to translit) that lets Mkgmap crash on a lot of tiles. This has to be fixed before I run a new update (also, a new planet will be available tomorrow, which I want to use for the next update).

I am sure the transliteration will work fine in general, because adding the Chinese name:zh_py worked fine as well.
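The idea of a list of tags used for displaying the name can be sketched as follows. This is a hypothetical Python illustration, not the actual Mkgmap configuration: only the tag names come from this thread, while the order of preference, the name `DISPLAY_NAME_TAGS` and the function `display_name` are assumptions made for the example.

```python
# Assumed order of preference; only the tag names themselves appear in this thread.
DISPLAY_NAME_TAGS = ["name:en", "name:engels", "name:zh_py", "name"]


def display_name(tags):
    """Return the value of the first non-empty name tag, or None if there is none."""
    for key in DISPLAY_NAME_TAGS:
        value = tags.get(key)
        if value:
            return value
    return None


# A way that carries only the original name plus the tag added by translit.
way_tags = {"name": "Юля", "name:engels": "Yulya"}
print(display_name(way_tags))
```

If a tag is missing from such a list, its value is never considered, which matches the symptom described above: the transliterated names existed in the data but did not show up on the map.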

Europe. :slight_smile:

So, are any European countries missing?

Chris

Well, you have seen the list. Only Eastern Europe.

Even the ß will not be transliterated yet.

@Lambertus: the forum clock is off by one hour. I posted at 11:50.

Edit: In my profile checking “Daylight savings is in effect (advance times by 1 hour).” did it.

I use this perl module for transliteration: http://search.cpan.org/~sburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm
Maybe its tables will be useful :slight_smile:

But the German “ß” is part of the Latin-1 charset and doesn’t need to be transcribed to “ss”.

It does, because what counts is whether a Garmin device can display it.

I think a GPSmap 60Cx cannot. Well, it is difficult to find out. On Lambertus’ site, ß in OSM street names appears as ss. In City Navigator it is only ss as well. There will be a reason for that, I think.

If you have or know of a small .img file with ß’s, please give me a link. I’m eager to try it out.

Thank you. I see that all has been done before.

I will sift through it, but at first glance it looks like a transliteration from two-byte Unicode (see the Bei Jing example on that page). But OSM data comes as UTF-8 (1 to 4 bytes per character). Do you first convert from UTF-8 to two-byte Unicode (UCS-2) before using this function?

It converts from Perl’s internal Unicode representation.
So the code is something like this:

use Encode;
use Text::Unidecode;
.....
# decode the UTF-8 bytes into Perl's internal characters, then transliterate to ASCII
my $transliterated_string = unidecode( decode( 'utf8', $utf8_string ) );
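The reason for decoding first can be shown with a small sketch. It is written in Python rather than Perl, purely as an illustration, and the function name `utf8_lengths` is made up for the example. In UTF-8 a character takes a varying number of bytes, so a transliteration table has to be applied to decoded characters, not to raw bytes.

```python
def utf8_lengths(text):
    """Return the number of UTF-8 bytes used by each character of text."""
    return [len(ch.encode("utf-8")) for ch in text]


raw = "aßД北".encode("utf-8")   # raw bytes, as they would be read from a UTF-8 file
text = raw.decode("utf-8")      # decoded into a string of characters

print(len(raw), len(text))      # byte count versus character count
print(utf8_lengths(text))       # bytes per character
```

After decoding, a table lookup works one character at a time, whether that character was stored as one byte (Latin), two (ß, Cyrillic) or three (Chinese).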

Speaking for my Legend HCX:

In general the device is able to display all(?) Latin-1 characters:

But: When compiling the map, mkgmap changes all street names to uppercase
(unless you use the --lower-case option). But there is no upper case
for the “ß”, so it is converted to “SS”.

The Garmin device converts back to lower case in the tooltips and in other fields.
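The round trip described above can be reproduced with ordinary Unicode case conversion. The sketch below uses Python as a stand-in and assumes that the uppercasing behaves like Unicode's full case mapping; it does not show mkgmap's or Garmin's actual code, and the street name and the function `round_trip` are just examples.

```python
def round_trip(name):
    """Upper-case a name and lower-case it again, as in the map build plus device display."""
    return name.upper().lower()


street = "Hauptstraße"
print(street.upper())      # ß has no single upper-case letter and becomes SS
print(round_trip(street))  # lowering SS gives ss, the ß does not come back
```

Once the name has been upper-cased, the information that it contained a ß is lost, so any later conversion to lower case can only produce ss.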

If the --lower-case option is used, the street names are displayed
as A… in the map (only first letter is shown).

Chris

Chris, you are talking about what mkgmap can do or does. But I want to know something about the Garmin. I asked if you had or knew of an .img file that contained a ß. It does not matter who put it in.

But your picture shows something very nice. Just above the Süntelstasse hint: ÄËäÜß.

Isn’t that a ß at the end? Did you type it in for a waypoint?

Yes, that is a ß entered in a waypoint name.

Here is a gmapsupp.img generated with --lower-case, so you have a lot of …straße

http://www.megaupload.com/?d=J0LT749R

Chris

Thank you.

That is a very small map. I could hardly find the bbox on my device.

This is how it shows in the 60Cx.
ringel s

Well isn’t this a strange device?
-It can show a ß and it cannot.
-It can show lowercase and it can not.

What happened at Garmin to make this possible?

So now I know that translit should turn it into ss too.

It cannot show rotated lower-case characters.

But the question remains why the device is not able to convert them to upper case. :slight_smile:

Chris

Ahhh now I see. Rotation is the problem.

Thank you for pointing that out.