name:th-Latn for romanized versions of street names in Thailand?

Hello all!

I understand it is common practice in Thailand to use name:en for the romanized (RTGS) version of the Thai street name.

I am the author of StreetComplete and recently, a user from Greece complained that the app tags name:en for the romanized Greek name. In Greece, int_name is commonly used for romanized street names.
He argued that the romanized version is not “English” but is what it is, a romanization of the original name.

I reckoned that he actually has a point, so I am thinking about whether to change it so that for Thailand, int_name or, even better, name:th-Latn is used instead of name:en for the romanization. What do you think?

Here are the results of the research I did regarding usage of the tags for romanizations of the street signs for the different countries:
https://github.com/westnordost/StreetComplete/issues/1953#issuecomment-658408761

I completely agree with your reasoning on this topic and have been meaning to write a similar post asking the same question. My question was going to be centered around the existing name:en for the Chiang Mai 700 Year Anniversary Road, ถนนสมโภชเชียงใหม่ 700 ปี, which is currently “Thanon Somphot Chiang Mai 700 Year”. So, yes, I’m interested in discussing this further.

It seems silly to be using Mae Nam Ping as a transliteration of แม่น้ำปิง and calling it English. In English, it’s the Ping River as we all very well know. Using the word “wat” as English for temple is not correct. In the waterway situation, there are many variations in names for types of waterways, huai, huai nam, mae nam, lam nam. A similar naming variation exists in the U.S. We have creek, branch, brook, kill, river, stream, etc. The point is, we will need to come up with English equivalents for all of the Thai variants should we decide to change our methodology.

A more serious problem is that there are words in Thai that simply have no English translation or transliteration. These can be worked on as we go. Either way, it’s a lot of work.

Whether we settle on int_name or name:th-Latn is not all that important to me. However, I am willing to help make the changes to whatever extent I can.

Cheers,

Dave

name:th-Latn should be used for the romanized Thai name, which is different from the English name, like the “Thanon Somphot Chiang Mai 700 Pi” example Dave mentioned.
However, I think the English name in name:en should follow the way names are usually translated on local bilingual signs, where some words are usually not translated, e.g. wat, khlong, huai.

I think of name:en as meaning just that, an approximation of English. If you check the Eiffel
Tower, there are about 40 languages tagged, for example name:de=Eiffelturm.

English is probably the closest to an international language, but to claim that seems a bit
colonial to me (it used to be French). I dislike the tag int_name.

Some changes are obvious, like ‘thanon’ should be written as ‘road’ for name:en. ‘Wat’ is
in the English dictionary and doesn’t need translation.

Hello,

name:en is used in the Thailand tagging for the name as you would use it in written/spoken English when mentioning that element.

So you might visit the “Science Centre for Education” in Sukhumvit “Road”, not Sun Witthayasat Phuea Kansueksa in Thanon Sukhumvit.

In Thailand, some “category” is often part of the name as printed on the sign, as in many other countries. We do not transliterate that, but translate it into English. The “real basename” must somehow be transliterated from Thai script into the Latin alphabet. For this we use the RTGS.

There are, as always and in all languages, exceptions to the rule. We do not translate some words at all, because they have a specific meaning in Thailand or are tightly coupled with the name. So the museum above is opposite “Soi 61”. And as a tourist you probably visit “Wat Pho” and not the Pho temple. For such tightly coupled names you can cross-check Wikipedia, as it is handled there the same way (for the English Wikipedia): https://en.wikipedia.org/wiki/Wat_Pho

The same applies to other facilities like schools, restaurants, hotels, government offices, police stations and so on.
It is not “Rongphayaban Songsoem Sukkhaphap Tambon”, it is “Subdistrict Health Promoting Hospital”.

All clearly English, and not a transliteration. If for some reason the transliterated name is important, we can add it in name:th-Latn: “Maenam Ping” instead of “Ping River”, as mentioned above.

There are certainly some tagging inconsistencies in Thailand. Dave spotted some. If you go sightseeing in Chiang Mai, you probably go with “Ping River Cruise” to see the Rambo filming location, and it is “River”. It is also the railway bridge over the River Khwae that tourists visit.
https://en.wikipedia.org/wiki/en:Khwae%20Yai%20River

So name:en is in my opinion the correct one, because English is used for the extended naming words.

int_name is described in the Wiki as the name internationally used for a place (in case it differs from the locally used name).
I can only think of Krung Thep here, which all call Bangkok in international context.

The “int_name” is problematic: based on what would you decide what is an international language? Let’s use the numbers of international tourists in Thailand. The majority are Chinese, so int_name should actually be Chinese, right?
It fails in a similar way to mixing languages in the name tag (like using parentheses).

The only way to have a neutral database is to have individual name-tags and allow the renderer to combine them in a way suitable for the desired target audience.

The Greek community used to follow this tagging style. Back in old times, when there was no other bi-lingual rendering available, I did not only create the bi-lingual Thaimap, but had also bi-lingual renderings on request of the local communities. I retired renderings for Iran and Greece in Jan-2012.

The style, as of 2010-08-17, is still available and used the same “name:en” we also have in Thailand:
case when (tags ? 'name:en') then tags->'name:en' else name end AS name,
https://downloads.osm-tools.org/bilingual/

I’m with Tom on this one - and we don’t need yet another semantic tag creeping in. My Garmin/Mapsource and other renderings of the Thai map are designed to read name:en tags. Adding a relatively unused tag (for Thailand) is change for change’s sake, & I could see a lot of info no longer displaying.
StreetComplete is a great app and I urge you not to change Thailand because some Greek purist thinks that way. In Thailand, name:en is best described as the common translation from whatever source and personally, I think it should stay that way.

StreetComplete is a great app which gives me the ability to tag streets directly in the field … if my additions are difficult to see because the “new” tagging method info is rarely displayed, then I’ll simply stop using it, and that defeats your purpose.

So to put it another way, if I use your app to tag a street, your assumption will be that the translation is based on the RTGS system … and that may not be the case. How will the user know where the village translation came from … and let’s not forget that quite often you can see two different English spellings of a village - one on the sign coming in, and a different version on the sign as you leave.

So I would resist universal change on the basis of one Greek user’s comments and don’t fix something that’s not broken, at least not in our neck of the woods.

Rgds, Russ.

My Garmin/Mapsource and other renderings of the Thai map are designed to read name:en tags. Adding a relatively unused tag (for Thailand) is change for change’s sake, & I could see a lot of info no longer displaying.

Russ, two things to note. Nobody is lobbying to stop using name:en. What we’re trying to decide is where to put the RTGS transliteration instead of in the name:en tag. If anything, the name:en tag will actually represent the names of the object better for English speakers if we follow this through.

Secondly, the way our Garmins decide to display names should have nothing to do with the mapping we do. The firmware used in most Garmins won’t display Thai characters because it’s old technology and English centric. It can’t even display something in ALL CAPS because it will only capitalize the first letter no matter what you do. You are a very productive mapper but IMO you sometimes allow the limitations of your Garmin device or the Garmin compatible maps you use to influence how you map things. Let’s not allow those limitations to influence how we approach this discussion.

In addition, it’s not only a “Greek purist” who is pushing for this change, it’s also some very dedicated Thailand mappers. So bear with this thread for a while and let’s see where it goes.

Respectfully,

Dave

And we have some odd names which are Thai transliterations of English words…
https://www.openstreetmap.org/node/1583925239
name=โรงแรม ซิตี้ปาร์ค
name:en=City Park Hotel

To summarize, this would be the correct tagging:


name         = ถนนสุขุมวิท
name:en      = Sukhumvit Road
name:th      = ถนนสุขุมวิท
name:th-Latn = Thanon Sukhumvit

Which of these should be tagged then depends on what is visible on the street sign. If “road” is visible on the street sign, then of course using name:en is reasonable, as the name contains English components; if it is a complete romanization, name:th-Latn would be reasonable.
Usually, Thanon (Soi, etc) is not translated. Specifically Sukhumvit seems to be a bit of an exception here because it is commonly referred to as just “Sukhumvit” (without Thanon), so to have an English translation to “Sukhumvit Road” sounds fine.

In reply to Russ McD:

It would be of course possible to duplicate the user’s input in StreetComplete to name:en and name:th-Latn alike. But I am not sure if people would be so happy about it.

Also, name:xx-Latn is not exactly fringe tagging. Yes, it is not really used in Thailand so far, but it is in other countries, such as Korea and Japan. This is relevant because generally, (OpenStreetMap-based) software is written to work for the whole world, not just a specific country.

So, any software that works with Korea, Japan and possibly others, and any software that works with Greece and the many countries in Eastern Europe that use Cyrillic script, needs to have implemented a reasonable fallback for the display name. A reasonable fallback that takes into account the mentioned variants and more is one line of code in Kotlin (a Java-like language). (?: means “take left if available, otherwise take right”)


val displayName = tags["name:"+userLang] ?: tags["int_name"] ?: tags["name:"+countryLang+"-"+userScript] ?: tags["name"]

Given that it is this trivial to implement, I would go so far as to claim that any software that does not support this is probably either legacy software that is no longer maintained, or its developers are not really trying.
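
To make the fallback order concrete, here is the same chain as a small Python sketch, applied to the Sukhumvit tags listed above. The function name `display_name` and its parameters are made up for this illustration and are not taken from any existing software.

```python
def display_name(tags, user_lang, country_lang, user_script):
    # Try each key in order of preference and return the first one present.
    # This mirrors the chained ?: operators in the Kotlin line above.
    for key in (
        f"name:{user_lang}",                   # e.g. name:de
        "int_name",                            # international name, if tagged
        f"name:{country_lang}-{user_script}",  # e.g. name:th-Latn
        "name",                                # local name as a last resort
    ):
        if key in tags:
            return tags[key]
    return None


tags = {
    "name": "ถนนสุขุมวิท",
    "name:th": "ถนนสุขุมวิท",
    "name:th-Latn": "Thanon Sukhumvit",
}
print(display_name(tags, "de", "th", "Latn"))  # Thanon Sukhumvit
print(display_name(tags, "th", "th", "Thai"))  # ถนนสุขุมวิท
```

A German-speaking user with no name:de and no int_name available gets the romanized Thai name instead of Thai script, which is the point of the fallback.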

Sukhumvit 7 is “Soi Sukhumvit 7” not Thanon.
In general, on Bangkok Metropolitan Administration signs like this, a soi (alley) that is named after the main thanon (road) plus a number does not have the soi prefix, neither in Thai nor in English.
The old name that was in use before the numeric name was assigned is displayed below it. (ซอยเลิศสิน ๒ / Soi Loet Sin 2)

As far as I know, outside Bangkok, thanon is mostly translated to road on the English sign, while soi is sometimes translated to alley, but mostly it is not translated.
So I suggest, for name:en, to translate thanon to road, but not translate soi.

In addition, in Bangkok, although the roadside sign does not translate thanon, the overhead sign attached to the traffic light does translate it to road.
Such as https://f.ptcdn.info/352/048/000/oj6o5zeuuznRtLLRMSQ-o.jpg , while the side road sign is “Thanon Asok Montri”

I agree.

There are also cases where the word soi appears before the rest of the name and in those cases, I map it as it is shown on the sign. In some areas of Chiang Mai the word “lane” is used for “soi” in the English version. In those cases too, I think we should map it as it appears on the sign. But there are problems with that simple approach.

Notably, in my neighborhood of Nong Hoi, there are several sois that end in a Thai character, for example, บ้านช้างคำ หมู่ 11 ซอย 4ค. I transliterated that to Ban Chang Kham Mu 11 Soi 4D. I’m not sure how correct that is but it made no sense to me to leave one single character in Thai script while the rest was in English. I brought this up in this forum years ago and that helped me resolve the issue. I mention it here only because this edge case will have to be dealt with in a useful and consistent manner. This sort of naming isn’t limited to Nong Hoi; I’ve also noticed it in Chang Phuak and elsewhere.

There are no rules without exceptions :)

Sometimes mistakes slip in, sometimes old signs persist. Like with that Lane/Alley instead of Soi. I certainly remember and have references where “Lane” is used in Chiang Mai. I am a bit uncertain regarding Alley.

Often space is limited on the street signs. So in the English label the base name is written without Road/Thanon. This will lead to inconsistencies in mapping also coming from the mappers. I think we agreed a long time ago on using the term “Road” for streets. In its full form and not as “Rd.”

For spelling mistakes on road signs (They exist, similar to village signs, schools, etc) we map both variants in name and alt_name to give people a chance to find the relevant thing using the search functionality.

Go Gai, Ko Kai, … is like “ABC” in English. Sois using this are then typically named abc in English, like Dave does.

Ok, so in the end, what would that mean for StreetComplete?

Should everything stay as it is? (always offer user to add English name which tags name:en)

Or should the user be able to select both “English” (name:en) and “Romanization of Thai” (name:th-Latn) as an additional language displayed on the street sign?

Or something else?

1770 ways with “Thanon” in the name:en
https://overpass-turbo.eu/s/Wdf

17345 ways with “Road” in the “name:en”
https://overpass-turbo.eu/s/Wdg

so it is clear that the English “Road” is used in the majority of cases.

You could warn if “Rd” is entered (certainly a mistake) or “Thanon” (not wanted in name:en as per discussion, hint to entering in th-Latn).
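
A rough sketch of such a warning, in Python for illustration only. This is not StreetComplete's code; the function name `check_name_en` and the message texts are invented for this example.

```python
import re


def check_name_en(value):
    """Return a list of warnings for a candidate name:en value (hypothetical helper)."""
    warnings = []
    # "Rd" or "Rd." as a separate word is almost certainly an abbreviation of "Road".
    if re.search(r"\bRd\b\.?", value):
        warnings.append('Abbreviation "Rd" found: write "Road" in full.')
    # "Thanon" is a romanization, not English: hint at name:th-Latn instead.
    if re.search(r"\bThanon\b", value):
        warnings.append('"Thanon" found: consider name:th-Latn for the romanized name.')
    return warnings


print(check_name_en("Sukhumvit Rd."))
print(check_name_en("Thanon Sukhumvit"))
print(check_name_en("Sukhumvit Road"))
```

The word-boundary checks keep names that merely contain the letters “Rd” inside a longer word from triggering the warning.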

I am no real fan of the name:th-Latn. The RTGS can be created quite ok by software. For example using this library by Chulalongkorn University: https://pypi.org/project/tltk/
This is how Sven creates romanized entries where the name:en tags are missing.
And adding a string replace of s/Thanon (.+)/\1 Road/ would even do it right for English instead of pure RTGS.
I thought of doing this for Thaimap, but had no time to further go into implementation.
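
The string replace mentioned above can be sketched in Python like this. The romanized input is assumed to come from elsewhere (for example from a romanization library); no tltk calls are shown because this sketch does not depend on its API. The function name `rtgs_to_english` is made up for the example.

```python
import re


def rtgs_to_english(rtgs_name):
    # Equivalent of s/Thanon (.+)/\1 Road/, anchored to the whole string so
    # that only a leading "Thanon" is moved to the end as "Road".
    return re.sub(r"^Thanon (.+)$", r"\1 Road", rtgs_name)


print(rtgs_to_english("Thanon Sukhumvit"))  # Sukhumvit Road
print(rtgs_to_english("Soi Sukhumvit 7"))   # Soi Sukhumvit 7 (unchanged)
```

Names that do not start with “Thanon” pass through untouched, which matches the suggestion elsewhere in this thread to leave soi untranslated.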

I think what we REALLY need are names. Names in Thai script. They are the major point of the map and should be the focus in Thailand. Not some name:en, name:cn or similar.
After Facebook disappeared and HOT got their money for drawing blobs called buildings, we are left with tens of thousands of unnamed or untagged features.

Here the StreetComplete app could give a chance for entering this data with a very low effort and nearly no learning required regarding OSM.

As an alternative we could still think of a way to have Facebook asking their users. We discussed this in one of the conference calls with Facebook. If we come up with a good strategy to validate this input it might work as well.

Ok, given the vast majority of roads using “X Road”, then I can keep the StreetComplete as it is.

It does. Actually, it autocompletes to Road and warns if the user still enters abbreviations. Also, ถ. is autocompleted to ถนน, the same for ซ. for Thai.
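
For readers unfamiliar with the Thai abbreviations, the expansion could look roughly like the following Python sketch. It is an illustration only, not the app's actual implementation; the table and the function name `expand_thai_prefix` are assumptions, and ซ. is taken to stand for ซอย (soi).

```python
# Hypothetical abbreviation table: abbreviated prefix -> full word.
THAI_ABBREVIATIONS = {
    "ถ.": "ถนน",  # thanon (road)
    "ซ.": "ซอย",  # soi
}


def expand_thai_prefix(name):
    # Replace a leading abbreviation with the full word and keep the rest
    # of the name exactly as entered.
    for abbreviation, full in THAI_ABBREVIATIONS.items():
        if name.startswith(abbreviation):
            return full + name[len(abbreviation):]
    return name


print(expand_thai_prefix("ถ.สุขุมวิท"))  # ถนนสุขุมวิท
```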

Of course, but so is automatic generation of a name:en tag, as you write yourself. Also, the point is that sometimes the signs do not actually use correct RTGS - the surveyors are just noting down the name as they see it on the sign.

Right, this is of course possible and the first thing the users should enter. There are very few StreetComplete users in Thailand though. There is also no Thai translation of the app, maybe this is a barrier.

Well, the initial question was whether StreetComplete should be changed to automatically tag English names with name:th-Latn.
As the app was largely designed to have a user directly enter the info they read from a street sign, how do they decide where the translation came from?
I see English, then it’s tagged as name:en!

We have a map where everything from dual carriageways to airport runways has been tagged as residential roads, a map where we have thousands of squares with just area=yes tags, a map where “tracks” need changing to the paved Provincial roads they have become, & a map where numerous incorrect Bangkok edits still need reverting … and we are worried about this!
I just wish we could put the important issues first.

So, Mr. StreetComplete - if you want to make everyone happy, give them the three tagging choices! Preferably with a choice of default, as I for one know which I will use every time.

I understand that you are frustrated with the situation in Thailand and that the topic of which name tag to use has little importance to you when compared with the other mapping challenges in Thailand. I reckon that StreetComplete itself has no relevance to the biggest tagging problems in Thailand.
However, this doesn’t change the fact that I do not intend to make such big decisions on how something is to be tagged without consulting the (local) community first.

This was a misunderstanding, then. Currently, StreetComplete offers the user to add a name in Thai (obviously) and optionally in English for a street sign in Thailand and tags it accordingly.

The initial question was if StreetComplete should be changed to instead offer the user to add a name in Thai and in Romanized Thai for a street sign in Thailand. So, that the user is going to give the name in “Romanized Thai” (name:th-Latn) would be in any case clear for the user.

The default choice is obviously Thai and this will obviously not be changed. To add another language, you always have to tap once more and select which language you want to add.

The current question is if both options should be available (English and Romanized Thai) or only English.

Well, I implemented it now like this:

Thai-Street-Name-1.webm

“en” is still shown as the second choice, “th-Latn” as the third choice.

That looks fine, well done.

Are you aware that in this case the name suggestion did suggest the name of the main road? Someone had been lazy and did not tag the Soi 48 correctly. Soi is missing in the full name. So the one in your example would then be Soi 50.

Nothing to really blame your algorithm for. Any idea to make it more failsafe?