Schools Lat/Lon

Hi Everyone,

Data.go.th has released coordinates and other details for 46,000 schools and education centers as Excel files: https://data.go.th/DatasetDetail.aspx?id=8548e3ab-00bf-4eae-b29a-156a4aa52c0d The license is Open Government License - Thailand. I’m not sure if it’s compatible with the ODbL, but even if you’re concerned, at the very least it will be useful for things like verification or deriving district names.

Best regards
Mishari

This file will load into OpenOffice Calc. There is a wealth of data: school name, lat/long, district, post code, etc.

Most schools are easy to recognize on the aerial imagery, and the wat next to a school will usually have the same name, as will the village.

A wonderful source, thanks very much!!

Could this be automated? 46,000 is a bunch of edits.

A script that puts an amenity=school at each lat/long, and maybe the other metadata? Sure would be nice!

Thanks Tom. I don’t recommend automation. Thai government sources of similar data have had quality issues, such as flipped lat/lon.

Thanks, Mishari.

Lat/long outside Thailand could easily be filtered out, as could data with only 2 decimal places; 5 is enough.

I’ve been playing with this for a few hours, and it works well. My interest is in naming towns and villages; schools and wats only incidentally, as they are easy to see from aerial imagery.

I’ve already found a school/wat/village that I could now name, but am reluctant to mess with such an awesome database without some input.
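
A minimal sketch of such a filter, which also flags rows where lat and lon look flipped. The bounding box is a rough approximation of Thailand, the 3-decimal threshold is arbitrary, and the function names are hypothetical; none of these come from the dataset. Coordinates are taken as text, the way they appear in the spreadsheet, so that the number of decimal places can be counted.

```python
# Rough bounding box for Thailand (assumed values, adjust as needed).
LAT_MIN, LAT_MAX = 5.0, 21.0
LON_MIN, LON_MAX = 97.0, 106.0
MIN_DECIMALS = 3  # assumed threshold: anything with fewer decimals is flagged


def decimals(text):
    """Count the digits after the decimal point in a coordinate string."""
    text = text.strip()
    return len(text.split(".", 1)[1]) if "." in text else 0


def classify(lat_text, lon_text):
    """Return 'ok', 'swapped', 'outside', 'imprecise' or 'invalid'."""
    try:
        lat, lon = float(lat_text), float(lon_text)
    except (TypeError, ValueError):
        return "invalid"
    in_box = LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX
    # If the pair only fits the box after exchanging the values,
    # the source probably flipped lat and lon.
    swapped = LAT_MIN <= lon <= LAT_MAX and LON_MIN <= lat <= LON_MAX
    if not in_box:
        return "swapped" if swapped else "outside"
    if min(decimals(lat_text), decimals(lon_text)) < MIN_DECIMALS:
        return "imprecise"
    return "ok"
```

Rows classified as anything other than 'ok' would still need a human look; the sketch only sorts them into piles.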

10°E 100°N … well outside the globe.
I don’t like automation either: many schools are already mapped, and we’d get as many duplicates.
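
One way to spot likely duplicates before adding anything would be to compare each point from the spreadsheet with schools that are already mapped nearby. A rough sketch, assuming the existing schools are already available as (lat, lon) pairs (how to obtain them is outside this sketch) and using an arbitrary 200 m threshold:

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def likely_duplicate(candidate, existing, threshold_m=200.0):
    """True if any existing school lies within threshold_m of the candidate."""
    lat, lon = candidate
    return any(
        haversine_m(lat, lon, elat, elon) <= threshold_m
        for elat, elon in existing
    )
```

This compares every candidate with every existing point, which is fine for one province at a time but slow for the whole country.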

I’ve done about 20 of these (about 0.04%!). They go pretty fast, about 5 min each. 3 had bad GPS data.

Is there a source ref?

Thanks!

It’ll be useful to mark the bad ones IMO so that the EGA guys can send feedback to the source.

I have an idea, I think I can convert all of it into some data format that can be imported as a layer into JOSM. Perhaps GPX file. Then you can go through it really fast. How does it sound?

Best regards
Mishari

Well, I’ve done 24634 to 24672. Very tedious and only about 0.09% done! Any help appreciated. I’ve marked each school in the last column of the Excel file with ‘done’ or ‘gps wrong’. Of course, that’s only on my computer.

Really fast is essential. I can do one line in 3 minutes if I don’t do all the nearby editing generally needed. That comes out to about 57 40-hour work weeks. I like OSM a lot, but not that much.
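
Spelling that estimate out, using the 46,000 rows and 3 minutes per row quoted in this thread:

```python
rows = 46_000          # schools in the spreadsheet
minutes_per_row = 3    # time per line without the nearby editing

hours = rows * minutes_per_row / 60
weeks = hours / 40     # 40-hour work weeks

print(f"{hours:.0f} hours, about {weeks:.1f} 40-hour weeks")
# prints "2300 hours, about 57.5 40-hour weeks"
```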

Thanks, Tom

I converted the data into an OSM file, you can find it here: https://www.mishari.net/wp-content/uploads/2016/05/school_data.osm_.gz

The code is here (released into the public domain):


#!/usr/bin/env python3
# Convert the school CSV export into an OSM XML file that JOSM can open.

import csv
import xml.etree.ElementTree as ET

osm = ET.Element('osm')
osm.set("version", "0.6")
osm.set("upload", "false")
osm.set("generator", "Mishari")


def add_tag(node, key, value):
    """Attach a <tag k=... v=.../> child to an OSM node."""
    tag = ET.SubElement(node, 'tag')
    tag.set("k", key)
    tag.set("v", value)


# Python 3 decodes the file while reading, so no .decode() calls are needed.
with open("ExportSchoolClickEdu_2.csv", encoding="utf-8", newline="") as csvfile:

    reader = csv.DictReader(csvfile)

    for i, row in enumerate(reader):

        # Check the coordinates before creating the node, so that rows with
        # unusable values do not leave an empty node in the output.
        try:
            float(row["Latitude"])
            float(row["Longitude"])
        except (TypeError, ValueError):
            continue

        node = ET.SubElement(osm, 'node')
        node.set("id", "%d" % (-i - 1))  # negative ids mark objects not yet uploaded
        node.set("action", "modify")
        node.set("visible", "true")
        node.set("lat", row["Latitude"])
        node.set("lon", row["Longitude"])

        add_tag(node, "name", row["SchoolName"])
        add_tag(node, "addr:subdistrict", row["SubDistrict"])
        add_tag(node, "addr:district", row["District"])
        add_tag(node, "addr:province", row["Province"])

        # Only keep post codes that are purely numeric.
        try:
            int(row["PostCode"])
            add_tag(node, "addr:postcode", row["PostCode"])
        except (TypeError, ValueError):
            pass


# Write the XML to standard output; redirect it to a .osm file.
ET.dump(osm)

Hi Mishari,

That could work, or it could start World War 3 and I wouldn’t know the difference :-).

It might be a start to break the file into 76 files, one for each province.

I can draw map features, but that’s about all.

Thanks, Tom
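
Splitting by province could be sketched roughly as follows. This assumes the same CSV export and the "Province" column used in the script above; the function name and the output layout (one CSV per province in a separate folder) are only an illustration.

```python
import csv
import os
from collections import defaultdict


def split_by_province(src_path, out_dir, column="Province"):
    """Write one CSV per province and return {province: row count}."""
    groups = defaultdict(list)
    with open(src_path, encoding="utf-8", newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        for row in reader:
            # Rows without a province go into a file called "unknown".
            province = (row.get(column) or "").strip() or "unknown"
            groups[province].append(row)

    os.makedirs(out_dir, exist_ok=True)
    for province, rows in groups.items():
        # Avoid path separators in the file name.
        safe_name = province.replace("/", "_").replace("\\", "_")
        out_path = os.path.join(out_dir, safe_name + ".csv")
        with open(out_path, "w", encoding="utf-8", newline="") as out:
            writer = csv.DictWriter(out, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
    return {province: len(rows) for province, rows in groups.items()}
```

Each per-province CSV could then be run through the conversion script separately, giving one smaller .osm file per province.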

Hi Tom,

You can download the .gz file above, extract it (you may need to rename it so that it ends in .osm) and open it in JOSM.
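
If you don’t have an unzip tool handy, the extract-and-rename step can also be done with a few lines of Python. A small sketch; the function name is made up, and the file names in the usage line are just the download name and a name ending in .osm:

```python
import gzip
import shutil


def gunzip_to_osm(gz_path, osm_path):
    """Decompress a .gz file and save the result under a name ending in .osm."""
    with gzip.open(gz_path, "rb") as src, open(osm_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return osm_path
```

For example, gunzip_to_osm("school_data.osm_.gz", "school_data.osm") would leave a school_data.osm file next to the download.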

Best regards
Mishari

That I can do, thanks for your patience!

Let me know how it goes.

I just did a bit more research. The data is available under the Open Government License - Thailand, which is almost a verbatim copy of the UK Open Government Licence. That is the license covering Ordnance Survey OpenData, so it should be fine!

I’d really like to have some professional or official clarification of this before we proceed. I can’t seem to access data.go.th at the moment, but from what I remember the Thai OGL included conditions such as not using the information for illegal purposes, something which can’t be assured under the OSM contributor terms.

I just finished going through 500 lines of this data, all of Nakhon Phanom. I was able to name about 350 villages.

Only 30 had wrong GPS data, and about 100 didn’t use the village name in the name of the school. About 30 were already named, making a good crosscheck. The school names use first the village name, then a few often untranslatable syllables tacked on the end.

Mishari’s program worked brilliantly, but for a Potlatch user it’s a bit hard. There is also an ominous warning about strongly discouraging use of the data layer! http://s1356.photobucket.com/user/TomLayo/media/Mishari%20Tool_zpsu6okimhz.png.html
is a screen shot of the north half of the data.

I got the process down to a few cut-and-paste steps, and will share info if there is any interest.

Oops, make that 250. Anyway, starting Udon Thani now.

Tom, if you want, you can look through the data sets at http://forum.openstreetmap.org/viewtopic.php?id=54805 and I can convert them for you as needed.

I will also need a mapping between the raw data fields and their corresponding OSM tags.

Best regards
Mishari

I’m interested in the botanical gardens and tourism sites also.

For now, I’m good with the raw data. Most of the time is spent on fixing nearby roads and such, not so much on mapping the data.

This is clearly a very long term project.

Thanks!

Regards, Tom