Hello Thailand OSM Community,
I am really glad to hear about the outcome of your meeting. We have the same goals and I'm happy to work with the community to make OSM better with high quality mapping. I acknowledge your feelings about communication and appreciate you understanding my position, but rest assured that we plan to continue openly sharing our work through forums and lists.
For those not on the import wiki, where many questions about the process have been asked, I have added the questions and answers to our wiki page to make them easier to follow.
For the test run we are happy to do the following:
Define an area in Thailand where you plan to import. This area has to be communicated to us and agreed on beforehand.
We plan to start importing from the provinces in the south of Thailand. As mentioned previously in the import list we have created some sample data for Yala to share. To make it easy I have pasted it below.
Phase 1 - Generating Road Masks. These are the masks from the initial machine learning output. They have not been processed into a vector format. We have offered to share our tile service with anyone. Sample 1 | Sample 2 | Sample 3 | Sample 4
Phase 2 - Creating Road Vectors. These are the processed road vectors merged locally with current OSM data. These .osm files have NOT been edited or validated by people and would not be uploaded as is. Please download the package here.
Phase 3 - Human Validation. This is the final output validated by mappers twice in order to be uploaded to OSM. Here are screenshots with the machine generated roads highlighted for human mappers to validate:
Sample 1 | Sample 2 | Sample 3 | Sample 4
We also ask that you choose an area having high-resolution Bing imagery available. As we don't get access to the DigitalGlobe imagery you are using we want to have a way to verify the accuracy of your work.
Luckily, it's only a matter of weeks before DG refreshes the imagery, as they mentioned.
But in the meantime we think Songkhla could work, since the Bing imagery there is good. iD - Songkhla, Using JOSM - Bounding box for Songkhla (minlat="7.0751953" minlon="100.546875" maxlat="7.1630859" maxlon="100.6347656")
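To get a feel for how large the proposed Songkhla test area is, here is a minimal Python sketch that wraps the bounding box quoted above. The `BBox` class, its method names, and the spherical Earth approximation are illustrative assumptions, not part of any OSM tool.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation (assumption)


@dataclass(frozen=True)
class BBox:
    """Hypothetical helper holding a bounding box in decimal degrees."""
    minlat: float
    minlon: float
    maxlat: float
    maxlon: float

    def contains(self, lat: float, lon: float) -> bool:
        """True if the point lies inside the box (edges included)."""
        return (self.minlat <= lat <= self.maxlat
                and self.minlon <= lon <= self.maxlon)

    def size_km(self) -> tuple:
        """Approximate (width_km, height_km) of the box."""
        km_per_deg = math.pi * EARTH_RADIUS_KM / 180.0
        mid_lat = math.radians((self.minlat + self.maxlat) / 2)
        height = (self.maxlat - self.minlat) * km_per_deg
        # Longitude degrees shrink with the cosine of the latitude.
        width = (self.maxlon - self.minlon) * km_per_deg * math.cos(mid_lat)
        return width, height


# Values copied from the bounding box quoted in the post.
SONGKHLA = BBox(minlat=7.0751953, minlon=100.546875,
                maxlat=7.1630859, maxlon=100.6347656)

if __name__ == "__main__":
    width, height = SONGKHLA.size_km()
    print(f"Songkhla test area: {width:.1f} km x {height:.1f} km")
```

Under these assumptions the box is roughly 10 km on each side, which seems a manageable size for a manual review.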
We ask you to restrict the import to higher road classes. So don't import agricultural and rice-field tracks.
We can do this, but I humbly ask you to consider letting us give you everything our machine generates before we add it to OSM? This way you can see exactly what the ML is picking up.
This will give you a much more accurate view of what we would like to do going forward. Going through and deleting roads our machine generates would be fairly easy for us, but it is time-consuming and counterproductive, and you might find that the quality is high enough to accept.
From my experience with OSM communities, agricultural roads are the least likely to get drawn, so I think this is where Facebook could potentially help. Having the road network was by far one of the most helpful features for giving people context of where they are mapping, both for remote mapathons and field mapping.
If after the sample you feel differently I am happy to re-visit deleting agricultural roads.
We also ask you to only import roads where your algorithm has a high confidence level of the geometry and classification.
Yes, I totally agree with you here. We only add roads with a high confidence level. While we do have algorithmic help for tagging, our editors manually check each edit for quality in geometry and classification. We have done a great deal of research and training, and have discussions for each task to make sure we are mapping and tagging uniformly. In regards to tagging, we will ultimately rely on the community to guide us.
Using DG's Vivid+ imagery, we have found we can get a 30% increase in road coverage for Thailand. Because imagery in OSM is much older in most areas, we are able to pick up newer roads with this imagery. Here are some examples of places we have been able to get road masks for in Thailand compared with OSM's current imagery. Image 1, Image 2, Image 3, Image 4
Here is an example of how we plan to divide the tasks for the country. The colors indicate the current density of roads, going from blue to red for high-density areas. Of the 77 province boundaries, we plan to start in the Southern Region in the province of Yala. Image
Please let us know if you agree on the test area in Songkhla so we can prep the data for this area and share it with you before adding it to OSM.
After that I agree to wait for your feedback before proceeding to import directly to OSM and planning out a strategy for scaling up.
If there is another meeting, I am happy to join or come in person, as I would love the opportunity to meet the community.
Drishtie Patel on behalf of the OSM at Facebook Team
Last edited by DrishT (2017-04-07 21:17:41)
Being an active contributor in the deep south, I'm very keen for you to get started with Yala, Pattani, Narathiwat, as I know there are very few mappers down there.
I've glanced at the samples you've provided, seems to look ok, waiting to hear people's ideas on how we move forward on this specific case.
A strong requirement was that we have proper imagery to verify the geometry of the detected roads. The geometry comes from the AI and is probably the most difficult thing to fix if it is not up to our standards. So this is something we have to evaluate carefully.
Yala does not have Hi-Res Bing imagery. Only Landsat images. Mapbox z17 imagery seems to be there. Compare the area west of the city center to see the difference.
As you are familiar with the area, please suggest a bounding box for Facebook to start with.
We have created some sample data for Songkhla where Bing is good. Using iD editor -Songkhla, Using JOSM - Bounding box for Songkhla (minlat="7.0751953" minlon="100.546875" maxlat="7.1630859" maxlon="100.6347656")
Please take a look at our sample .osm files here and let us know what you think.
Drishtie Patel on behalf of the OSM at Facebook Team
I had a look at tile67.
I opened the data in JOSM and moved the new data to a separate layer.
JOSM validator seems to be happy with it, so at least no obvious mistakes in the data.
The areas I have checked seem to have proper geometry. As this likely depends on the underlying imagery, it is up to your validators to spot bad geometry before importing a way into OSM. Here we obviously have imagery that is easy for your AI to process.
How certain are you regarding the positional accuracy of the imagery you are using?
For example, around these coordinates:
Google has Digital Globe 2017 imagery, which sounds like the image set you are using.
We can see that the road here is actually drawn south of the position. So was some offset applied to the position or do you have even another image set?
Bing imagery largely agrees with the DigitalGlobe imagery regarding the position of the roads. The north/south way on the very left is a bit off. This might result from differences in the orthorectification applied.
Bing imagery was taken March 2013.
Mapbox images of the same area show a significant offset of roughly 30m in an east/west direction.
The few existing GPS tracks in that area support the alignment of Bing/DG, but does your verification process include a step to check for positional errors? You should add a verification step for each tile to ensure that the alignment of the imagery used is plausible.
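One way such a per-tile alignment check could look is sketched below in Python. It compares points taken from GPS tracks with the matching points on the traced road and reports the mean offset in metres. The function names, the pairing of points, and the 10 m tolerance are assumptions chosen for illustration, not a description of Facebook's actual process.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation (assumption)


def offset_m(ref, other):
    """Return (east_m, north_m) of `other` relative to `ref`.

    Both arguments are (lat, lon) tuples in decimal degrees. Uses a local
    equirectangular approximation, which is adequate over tens of metres.
    """
    lat0, lon0 = ref
    lat1, lon1 = other
    m_per_deg = math.pi * EARTH_RADIUS_M / 180.0
    north = (lat1 - lat0) * m_per_deg
    east = (lon1 - lon0) * m_per_deg * math.cos(math.radians((lat0 + lat1) / 2))
    return east, north


def mean_offset_m(pairs):
    """Average (east_m, north_m) offset over (gps_point, traced_point) pairs."""
    offsets = [offset_m(gps, traced) for gps, traced in pairs]
    n = len(offsets)
    return (sum(e for e, _ in offsets) / n,
            sum(v for _, v in offsets) / n)


def alignment_plausible(pairs, tolerance_m=10.0):
    """True if the mean offset stays within the (hypothetical) tolerance."""
    east, north = mean_offset_m(pairs)
    return math.hypot(east, north) <= tolerance_m


if __name__ == "__main__":
    # Made-up pair: the traced point sits about 30 m east of the GPS point.
    pairs = [((7.1, 100.6), (7.1, 100.600272))]
    east, north = mean_offset_m(pairs)
    print(f"mean offset: {east:.1f} m east, {north:.1f} m north")
    print("plausible:", alignment_plausible(pairs))
```

An east/west shift of roughly 30 m, like the one seen with the Mapbox imagery, would fail a check like this, while small residual differences would pass.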
Some of your highway=service roads could have additional tagging to clarify what kind of service road it is.
This one looks much like parking aisles:
Also I am still not convinced how useful it is to have highway=track like these:
I understand the "usefulness" is highly subjective, so waiting for feedback from Mishari as he said they want such ways for cycling.
Here you classified a road as "unclassified":
I might have tagged this as a highway=service as it only serves as a way to reach the National Library and maybe another building in the woods.
This is probably a general corner case of tagging and not specific to your import. But we will just see a higher quantity of these.
highway=service around something looking like dormitories.
What is the opinion of others on such corner cases? Would it make sense to leave a note in the area saying this is a corner case and needs local survey? This is what a typical mapper probably would do.
Here, around the aquaculture, I would have used highway=track, adding a surface=paved tag if the surface is clearly visible.
Here it might be an unpaved residential road. With all those houses and driveways certainly not agricultural.
This highway=unclassified does not really interconnect villages or neighborhoods. Could be a residential if it is a housing estate. Otherwise service.
Positive thing: You did not connect roads from housing estates with outside roads where there is obviously a wall, even though the validator warns about a nearby unconnected way.
Why is there a new node at the end of a way Johnny created? That one is not connected to the road network:
In the past you had added surface tags to ways. Your sample data does not contain any. Is it planned to do this in the future?
So a lot of points. Mostly of the "minor" category. Think these edits are as good or bad as the average remote mapper.
Regarding tagging: Do we want to have import=yes tagged on all those ways? You should certainly refer to a wiki page with details of your import in the changeset comment of the upload. This should also clearly explain why you are using a "disallowed" source of DigitalGlobe. Having that source tag seems to be justified, as the geometry might differ from the imagery available to general mappers, and the tag avoids confusion.
Another point: I strongly recommend adding the attribute upload="never" to the <osm> element in your sample data to prevent accidental uploads while the community does the review.
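For anyone preparing such files, a minimal Python sketch of setting that attribute with the standard library is shown here. The function name and the tiny sample document are made up for illustration; real sample files would be read from disk instead.

```python
import xml.etree.ElementTree as ET


def mark_never_upload(osm_xml: str) -> str:
    """Return the .osm document with upload="never" set on the <osm> root."""
    root = ET.fromstring(osm_xml)
    if root.tag != "osm":
        raise ValueError("expected an <osm> root element")
    root.set("upload", "never")
    return ET.tostring(root, encoding="unicode")


# Invented two-line sample, not taken from the shared data package.
SAMPLE = """<osm version="0.6" generator="example">
  <node id="-1" lat="7.1" lon="100.6" />
</osm>"""

if __name__ == "__main__":
    print(mark_never_upload(SAMPLE))
```

Running this over each sample file before sharing it should keep the rest of the data unchanged and only touch the root element.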
Thank you for your and your team's contributions to OSM. I looked at tile80 and was generally happy with the mapping. I did note that some of your edits were higher risk than I would have been happy with; that is, they joined up visible road sections with obscured sections of road. For example, at 7.0948845 100.6209229 you show a residential road passing between two buildings. My reading of the Bing image is that a footpath at best is there, and perhaps no continuous path at all. I am curious how you made your decision. Do you have access to better images, is your AI considering factors that I haven't thought of, or is your process resulting in a higher risk of false positives than I am comfortable with?