OK. So your tiles are the large areas that can be selected on your site. Then there is not much you can do.
As I said in my first post, the ???'s are in the .img files downloaded from your site (open a Russian one in WordPad, scroll a long way down and you will see them). So it is a mkgmap issue. I once read somewhere that mkgmap would transliterate when appropriate and possible. But apparently it does not do so here.
The goal is to transliterate a Cyrillic name, not to translate it. To translate a name you need a dictionary. For transliteration you only need a small table (26 characters or so) that substitutes a Latin character for each Cyrillic one. I have no dictionary of Russian place names, so transliteration is the way to go. I cannot put a transliteration in int_name or name:en, as it is not really either of those. So there has to be a new tag.
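A minimal sketch of such a substitution table in Python. The table below is an illustrative assumption (one common English-style romanization), not the table my program or mkgmap actually uses, and `transliterate` is a hypothetical helper name.

```python
# Illustrative Cyrillic-to-Latin table. This is an assumption for the sketch,
# not an official standard and not the table used elsewhere in this thread.
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "i", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "'", "э": "e", "ю": "yu", "я": "ya",
}


def transliterate(name: str) -> str:
    """Replace each Cyrillic letter by its Latin counterpart from TRANSLIT."""
    out = []
    for ch in name:
        latin = TRANSLIT.get(ch.lower())
        if latin is None:
            # Not a letter from the table (space, digit, Latin letter): keep it.
            out.append(ch)
        elif ch.isupper():
            # Keep the capital: "Ж" becomes "Zh", not "zh".
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)


print(transliterate("Жуково"))
print(transliterate("Верхний Уфалей"))
```

With this table, "Жуково" comes out as "Zhukovo". A German-style table would only differ in the dictionary values, so the function itself would stay the same.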
About the other Russian mapmakers: Cloudmate is nice but not routable. I had already looked at the page you pointed to but found nothing suitable.
So I want to write a program using the API to retrieve all the nodes with ‘place’ and add a transliteration if needed.
A ‘place’ node like the following is OK, as it has a ‘name:en’ tag.
The following node has no int_name and no name:en.
So I would like to add a name:trans.
<tag k="name" v="Жуково"/>
should read
and transliterated
That all can be automated.
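A rough sketch of that check in Python, using only the standard library. The sample node (its id and coordinates), the helper name `add_translit_tag` and the default tag key are assumptions made up for illustration. The transliteration function is passed in, so any table can be used.

```python
import xml.etree.ElementTree as ET

# Made-up sample in OSM XML form; id and coordinates are placeholders.
SAMPLE = """<osm>
  <node id="1" lat="56.0" lon="60.0">
    <tag k="place" v="village"/>
    <tag k="name" v="Жуково"/>
  </node>
</osm>"""


def add_translit_tag(node, transliterate, tag_key="name:trans"):
    """Add a transliteration tag to a 'place' node that lacks a Latin name.

    Returns True if a tag was added, False if the node was left alone.
    """
    tags = {t.get("k"): t.get("v") for t in node.findall("tag")}
    if "place" not in tags or "name" not in tags:
        return False  # not a named place
    if "name:en" in tags or "int_name" in tags or tag_key in tags:
        return False  # already has a Latin name of some kind
    ET.SubElement(node, "tag", {"k": tag_key, "v": transliterate(tags["name"])})
    return True


root = ET.fromstring(SAMPLE)
for node in root.iter("node"):
    # The lambda stands in for a real transliteration function.
    if add_translit_tag(node, lambda name: "Zhukovo"):
        print(ET.tostring(node, encoding="unicode"))
```

Nothing is uploaded here. The sketch only changes the XML in memory, which is the same state my program is in now.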
At the moment I can already drag a rectangle on the map and download all the OSM data for that area in small pieces, then inspect the nodes and add a transliteration tag where needed. I can get everything nearly ready to be uploaded (well, I’m almost finished). The only thing I have not done is upload a change or changes.
OK, I understand now. Maybe the name:trans tag should be name:latin for clarity? And would it be possible to post this on the Mkgmap mailing list and see if the problem can be fixed in the Mkgmap code?
That is the better approach. Where should I do that? Sorry, I don’t know my way around yet.
Of course it could be name:latin. But there is much more in it.
A transliteration into English already looks different from one into German.
So it would be name:trans:en and name:trans:de.
But now that I have made some runs, it looks as if your idea to put the transliteration in name:en or int_name is not so bad. ;-).
For an area in the Urals I get these transliterations. (On the left is the place name in Cyrillic. If you don’t see normal Cyrillic, then change the charset of your browser. On the right is the transliteration, English style, I think.)
Белянка >>> Belyanka
Нязепетровск >>> Nyazepetrovsk
Верхний Уфалей >>> Verkhnii Ufalei
Дружный >>> Druzhnii
Шутихинское >>> Chutikhinskoe
Осеево >>> Oseevo
Заречье >>> Zarets’e
Майский >>> Maiskii
Каргаполье >>> Kargapol’e
Долговское >>> Dolgovskoe
Бакланское >>> Baklanskoe
The name of the tag to put this in does not matter for my program. I will try to finish this program without actually uploading anything. Meanwhile I will look at the mkgmap group.
Now I will make a run to see how many places in Russia need an English translation/transcription.
Yesterday I quickly drew a rough ‘frontier’ line for the Russian Federation. That gave 44,666 bboxes of size 0.25 to be retrieved. After every call and inspection of the result the program pauses for a second. The pauses alone account for more than twelve hours. This morning about one third was done.
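The tiling and the arithmetic behind the twelve hours can be sketched like this in Python. It assumes that 0.25 means 0.25 degrees per side, which may not match what my program really does. `tile_bbox` is a hypothetical helper, and the rectangle in the example is arbitrary, not the Russian frontier.

```python
import math


def tile_bbox(min_lon, min_lat, max_lon, max_lat, step=0.25):
    """Yield (left, bottom, right, top) tiles of at most `step` degrees per side."""
    cols = math.ceil((max_lon - min_lon) / step)
    rows = math.ceil((max_lat - min_lat) / step)
    for r in range(rows):
        for c in range(cols):
            left = min_lon + c * step
            bottom = min_lat + r * step
            # Clip the last column and row to the requested rectangle.
            yield (left, bottom, min(left + step, max_lon), min(bottom + step, max_lat))


# Arbitrary example rectangle: 1 degree wide, 0.5 degree high.
tiles = list(tile_bbox(60.0, 56.0, 61.0, 56.5))
print(len(tiles))

# One second of pause per request, for the 44,666 bboxes mentioned above.
pause_hours = 44666 * 1.0 / 3600
print(round(pause_hours, 1))
```

The pause estimate comes to roughly 12.4 hours, before counting any download time.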
Result so far: 7131 places already have an English translation, and 14787 still need to be translated/transliterated. (Two thirds of the run is still to be done.)
I now save the node id in the to-do list, so the next run can use it to retrieve the node directly.
I’ve consulted the IRC channel on this issue. They told me that most renderers internally make the same conversion that you want to do for Mkgmap, but without uploading the results to the database. So the recommended solution is to fix this in Mkgmap instead of in the OSM database.
From how I read that thread: it’s difficult to determine automatically which transliteration table you should use.
So ultimately I wish we could automatically transliterate from Japanese, Indian, Malayan, Greek, Russian, Chinese, etc to English so a readable map for the whole world would be possible… Dunno if this is possible at all, my searches all lead to transliteration from a specified language to English instead of language detection…
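Detecting the script (not the language) of a name is at least easy. Here is a small Python sketch using the standard `unicodedata` module. `dominant_script` is a hypothetical helper, and taking the first word of the Unicode character name as the script is a simplifying assumption.

```python
import unicodedata
from collections import Counter


def dominant_script(text):
    """Return the most frequent script among the letters of `text`, or None.

    The script is approximated by the first word of each character's Unicode
    name, for example 'CYRILLIC' from 'CYRILLIC CAPITAL LETTER ZHE'.
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else None


print(dominant_script("Жуково"))
print(dominant_script("น้ำไหลลงเขา"))
```

This only narrows the choice down. Cyrillic, for instance, is shared by several languages, so the script alone does not say which transliteration table is the right one.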
I found this page which seems to do a pretty solid job and is using Java, although I cannot find any source code. Perhaps a good starting point for Mkgmap? Or this python solution.
Google translate does a fair job. I entered “น้ำไหลลงเขา” and told it to detect the language and translate to English. It came up with “Water flows downhill.” Google detected Thai and translated correctly.
My program to add a missing name:en or name:trans:en or whatever tag works. As a test I let it handle fewer than ten nodes in a changeset. All went OK.
Now to be sure that only places which are in a selected country are handled I need a borderline around the area.
Hmmm… that can be found in OSM data. So I wrote a module that collects the ways representing the border. Starting with a relation id recursively retrieving other relations, ways and nodes. Then build a .gpx file. All took quite a while on 60189 (Russian Federation). So I tested on smaller countries like 102879 Austria, 161033 Mongolia.
For Austria I got a lot of ways, and they do lie on the frontier. OK so far. But the ways are in random order: the next way does not start where the previous one stopped. To make things worse, the direction of the ways is not consistent. Most run from east to west (for the Germany/Austria frontier line). I need a closed curve around a country to determine whether a “lat,lon” is inside that country. So I am now working on sorting the ways and reversing them where needed.
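A minimal Python sketch of both steps: chaining the ways into one ring (reversing a way when it is stored the other way round) and then a ray-casting test for a point. The helper names and the square test data are made up for illustration. The sketch assumes the ways form one single closed ring and treats lon/lat as flat x/y, which would need extra care for a country that crosses the 180° meridian.

```python
def join_ways(ways):
    """Chain ways (lists of (lon, lat) points) into one ring, reversing as needed."""
    remaining = [list(w) for w in ways]
    ring = remaining.pop(0)
    while remaining:
        for i, way in enumerate(remaining):
            if way[0] == ring[-1]:
                ring.extend(way[1:])             # way continues the ring as stored
            elif way[-1] == ring[-1]:
                ring.extend(reversed(way[:-1]))  # way is stored backwards
            else:
                continue
            remaining.pop(i)
            break
        else:
            raise ValueError("ways do not form a single connected line")
    return ring


def point_in_ring(lon, lat, ring):
    """Ray casting: count the ring edges crossed by a ray going east from the point."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Made-up square "country": ways in random order and mixed direction.
ways = [
    [(0, 0), (4, 0)],
    [(4, 4), (4, 0)],
    [(0, 4), (0, 0)],
    [(4, 4), (0, 4)],
]
ring = join_ways(ways)
print(ring[0] == ring[-1])
print(point_in_ring(2, 2, ring), point_in_ring(5, 2, ring))
```

A real border relation can contain several rings (exclaves, islands), so the chaining would have to start a new ring whenever the current one closes.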
Until now I had no look at mkgmap because all this takes time. It is nice to see that others react. I will study all links later.
There is another solution if mkgmap cannot handle the transliteration. I read that mkgmap works on raw OSM data, that is, the data files in XML format. My program (well, another version of it) could add the missing tags to those XML files.
I posted a question about this on the Mkgmap mailing list but have not received any response so far.
So it indeed seems like there are two options: upload the transliterated tags with name:trans:en or preprocess the data each time a new update is performed. But I’m afraid that preprocessing the entire planet file on each update will take a very long time.
Besides that, you are only transliterating Cyrillic languages, and there are so many more languages that need transliteration. Maybe this needs to be discussed on the main OpenStreetMap mailing list…
I just noticed the date changed to 14/10/09 so I tried a download again. All the roads I added are still not showing. Maybe I added the tags after 14/10/09.
The easiest way to find out is to see if the changes are visible in Potlatch…if they are, then there’s a problem on Lambertus’ end. If they’re not, then the problem is at Skywoolf/reinholdM’s end.
Yes, the first report of missing ways could be the result of bad timing, but those ways should definitely be in the map by now. So if they aren’t, then there’s a problem somewhere…