Lat/long outside Thailand could easily be filtered out, as could data with only 2 decimal places; 5 decimal places is precise enough (roughly 1 m).
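The filtering described above could be sketched roughly like this. The bounding box is an approximation of Thailand's extent, and the lat/lon values are assumed to arrive as strings from the CSV (so the original decimal precision is still visible):

```python
# Rough bounding box for Thailand (approximate, for illustration only)
LAT_MIN, LAT_MAX = 5.6, 20.5
LON_MIN, LON_MAX = 97.3, 105.7

def decimal_places(value: str) -> int:
    """Number of digits after the decimal point in the raw string."""
    _, _, frac = value.partition(".")
    return len(frac)

def keep_row(lat: str, lon: str) -> bool:
    """Keep only rows inside Thailand with usable coordinate precision."""
    try:
        lat_f, lon_f = float(lat), float(lon)
    except ValueError:
        return False  # unparseable coordinates
    if not (LAT_MIN <= lat_f <= LAT_MAX and LON_MIN <= lon_f <= LON_MAX):
        return False  # outside Thailand
    # 2 decimal places is ~1 km of uncertainty; require at least 3 (~100 m)
    return decimal_places(lat) >= 3 and decimal_places(lon) >= 3
```

The precision check works on the raw strings rather than the parsed floats, since `float("13.75")` and `float("13.75000")` are indistinguishable after parsing.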
I’ve been playing with this for a few hours, and it works well. My interest is in naming towns and villages; schools and wats only incidentally, as they are easy to see from aerial imagery.
I’ve already found a school/wat/village that I could now name, but am reluctant to mess with such an awesome database without some input.
It’ll be useful to mark the bad ones IMO so that the EGA guys can send feedback to the source.
I have an idea: I think I can convert all of it into some data format that can be imported as a layer into JOSM, perhaps a GPX file. Then you can go through it really fast. How does that sound?
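A minimal sketch of that GPX conversion, using only the standard library. It assumes the rows have already been reduced to (name, lat, lon) string tuples; JOSM can open the resulting file as a GPX layer:

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"

def csv_to_gpx(rows, out_path):
    """Write (name, lat, lon) tuples as GPX waypoints.

    `rows` is an iterable of string tuples; lat/lon are kept as the
    original strings so no precision is lost in a float round-trip.
    """
    gpx = ET.Element("gpx", version="1.1",
                     creator="school-import-sketch", xmlns=GPX_NS)
    for name, lat, lon in rows:
        wpt = ET.SubElement(gpx, "wpt", lat=lat, lon=lon)
        ET.SubElement(wpt, "name").text = name
    ET.ElementTree(gpx).write(out_path, encoding="utf-8",
                              xml_declaration=True)
```

Each school becomes one waypoint, so a mapper can click through them in JOSM and delete or rename as they go, rather than cross-referencing a spreadsheet by hand.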
Well, I’ve done 24634 to 24672. Very tedious, and only about 0.09% done! Any help appreciated. I’ve marked each school in the last Excel file column with ‘done’ or ‘gps wrong’. Of course, that’s only on my computer.
Really fast is essential. I can do one line in 3 minutes if I skip the nearby editing that’s generally needed. That comes out to about 57 forty-hour work weeks. I like OSM a lot, but not that much.
I just did a bit more research: the data is available under the Open Government License - Thailand, which is almost a verbatim copy of the Open Government License UK, the same licence the Ordnance Survey OpenData is covered under, so it should be fine!
I’d really like to have some professional or official clarification of this before we proceed. I can’t seem to access data.go.th at the moment, but from what I remember the Thai OGL included conditions such as not using the information for illegal purposes, something which can’t be assured under the OSM contributor terms.
I just finished going through 500 lines of this data, all of Nakhon Phanom. I was able to name about 350 villages.
Only 30 had wrong GPS data, and about 100 didn’t use the village name in the name of the school. About 30 were already named, making a good crosscheck. The school names usually start with the village name, with a few often untranslatable syllables tacked on the end.
@Mishari: I remember we had a video chat with Dietmar a while ago. Could his house-number validation tool be used for this purpose as well? That way it would only list unprocessed data, filtering out existing/correct entries.
Splitting the data into provinces could make it easier for some as well.
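The per-province split could be a short standard-library script along these lines. The `province` column name and the input layout are assumptions about the source CSV:

```python
import csv
import os
from collections import defaultdict

def split_by_province(in_path, out_dir, province_col="province"):
    """Split one big CSV into one file per province.

    The column name `province` is an assumption about the source data;
    adjust it to match the actual header.
    """
    groups = defaultdict(list)
    with open(in_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        for row in reader:
            groups[row[province_col]].append(row)
    for province, rows in groups.items():
        safe = province.replace("/", "_")  # keep filenames valid
        out_path = os.path.join(out_dir, f"schools_{safe}.csv")
        with open(out_path, "w", newline="", encoding="utf-8") as out:
            writer = csv.DictWriter(out, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
```

Each mapper could then claim one province file, which also makes “mark ranges as processed” much easier to coordinate.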
If the file were an online document (e.g. Google Docs), mappers could mark ranges as processed. Still, automatic filtering would be more reliable and less manual work.
I agree with you that we should refrain from an automatic import and use this as a chance to also double-check the spelling and location of nearby temples and villages. Even if the process takes a bit longer that way, the quality we have at the end will be much higher.