Test Drive - AI-assisted road import by Facebook

Riding my bike through the province west of Phetburi, I was faced with an annoyingly bad map quality.
Many roads were unpaved, some of them actually tracks. Some “residentials” did not have any house along them.
The worst case I encountered can be seen in this photo:

According to the map, there were “unclassified” roads along both sides of the canal, which I already changed (you may need to look into the history of the ways):
http://www.openstreetmap.org/way/514899120/history
http://www.openstreetmap.org/way/514899118/history

I suggest that the tagging policy be revised.

  • If a minor road may be unpaved, add the surface=unpaved tag in case of doubt. Finding out that it has a concrete or asphalt cover while expecting an unpaved road is not as bad as the other way round. Note that we assume a residential/service/unclassified etc. to be paved, a track not to be paved if no other information is given.
  • Public roads may end in an agricultural track (and that track may then connect to another public road). In case of doubt, cut the way into pieces with a highway=track in between.
  • If you are not sure if it is an agricultural/forestry track or a public road, prefer highway=track.
  • The traffic on “residential” is mainly caused by the people living in that road (or other people visiting them). “unclassified” roads lead to such “residential” roads. So, many of the current “residential” should actually be “unclassified”.
  • If there is no house along the road, it is not a residential.
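
The bullet points above could be sketched as a small decision helper. This is only an illustration of the proposed policy, not an existing tool: the function name and its parameters (`houses_along_road`, `connects_settlements`, `maybe_agricultural`, `surface_known`) are made up for this example, and the inputs would still have to come from a mapper looking at imagery or at the ground.

```python
from typing import Dict, Optional

# Highway classes that data consumers tend to assume are paved
# when no surface tag is present (see the first bullet point).
ASSUMED_PAVED = {"residential", "service", "unclassified"}


def suggest_tags(houses_along_road: bool,
                 connects_settlements: bool,
                 maybe_agricultural: bool,
                 surface_known: Optional[str] = None) -> Dict[str, str]:
    """Suggest tags for a minor road according to the proposed policy."""
    if maybe_agricultural:
        # Not sure whether it is a farm/forestry track: prefer track.
        highway = "track"
    elif houses_along_road and not connects_settlements:
        # Traffic is mainly caused by the people living along the road.
        highway = "residential"
    else:
        # No houses, or a road leading on to other roads.
        highway = "unclassified"

    tags = {"highway": highway}
    if surface_known is not None:
        tags["surface"] = surface_known
    elif highway in ASSUMED_PAVED:
        # Surface in doubt on a class assumed to be paved: say so.
        tags["surface"] = "unpaved"
    return tags


print(suggest_tags(houses_along_road=False,
                   connects_settlements=True,
                   maybe_agricultural=False))
```

A track gets no surface tag here because it is assumed to be unpaved anyway when nothing else is given.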

The first point is most important in my opinion.

Dear Facebook team (Digital Globe AI team) … I wonder if anyone else feels like me … in that this massive project has just fucked up a lot of data integrity and quality (in Thailand, at least) in the name of rapid progress to meet your needs whatever they may be.

I mean in real life, I ride a lot in rural Chiang Mai & Chiang Rai provinces, somewhere I doubt you ever visit. Last weekend, to circumvent a village traffic jam, I headed down some “residential roads” … and of course, soon ended up in a muddy quagmire in the middle of an orchard. Not a bloody house in sight ! Definitely not residential and not even marked as unpaved !

Back on the computer, I see loads of residential roads, running through orchards and fields. Some do connect villages but clearly have a dirt surface, in which case our Wiki guidance would say to tag them as an “unpaved, unclassified road”.
All of which show the AI tag and those dubious user names … RVR006 and the like.

Well Dear Stephanie in Menlo Park, California … why did you do this ? Why not look at the road, zoom in, check its surface, see it runs through an orchard… then mark it as a track, or at the very least, give it a dirt surface.

So where do we go from here … our rural mapping in our region can no longer be trusted, and how long before the data becomes the norm in car navigation devices, and these poor people in their family saloons end up getting stuck in orchards and farmyards.
And then what, the word goes out … “don’t use OSM maps, they are a bunch of crap”. Years of hard human input by dedicated Thai mappers is rendered useless.

I see in many places, ways connect in illogical places, and simply end in the middle of nowhere because your AI systems don’t realise the trees obscure short sections, and simply stop.
Your validation can’t work, because to correct things properly … well that would take a lot of time … as much as we take plotting in the first place.

And the accuracy is arguably just too much … I often see a straight road, which we all can see is straight… having a zillion nodes after it’s been input by you guys … trust me, when a road is plotted, it’s much easier to make it straight! A human would see that and not use all those unnecessary nodes, which can only serve to slow and clutter the OSM dataset.
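
To illustrate what dropping the unnecessary nodes means in practice, here is a minimal sketch of the well-known Douglas-Peucker line simplification. It is not what the import team or any editor actually runs, as far as I know; it assumes the node coordinates are already projected to a planar system in metres, and the `tolerance` value is an arbitrary choice for the example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def _distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from p to the segment a-b (planar coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points: List[Point], tolerance: float) -> List[Point]:
    """Drop nodes lying within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _distance_to_segment(points[i], points[0], points[-1])
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        # Every intermediate node is close enough: keep only the ends.
        return [points[0], points[-1]]
    # Keep the farthest node and simplify both halves separately.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


straight_road = [(0.0, 0.0), (100.0, 0.2), (200.0, -0.3), (300.0, 0.1), (400.0, 0.0)]
print(simplify(straight_road, tolerance=1.0))
```

With a tolerance of one metre the five-node straight road above collapses to its two end nodes, while a real bend larger than the tolerance is kept.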

And just what is your motive … you seem to enjoy inputting every road your system generates … but with a complete lack of care for the type of road, its name, its surface, or if it connects. When a human maps, we favour plotting the roads that connect the towns, or go somewhere. We mark the through (main) roads through villages as unclassified, in the hope that navigation systems prefer them … with you guys, apart from a few exceptions, everything just gets banged in as residential.
And that’s not good for the poor residents in these side streets watching lorries get stuck every day … more bad press for OSM.

Stephan takes the time to individually show you how you are failing us … I just see so much bad mapping, I’m too overwhelmed to try.

Should I lobby the DWG on reversing EVERY changeset attributed to this “team” ? I’ll await your comment before reposting this in more visible areas of the forum.

Russ, your experience seems to be worse than my worst experience. That area west of Phetburi was the worst part. Another area almost as bad was near Surat Thani (also such an AI import).

My impression is that for you the most important point is the “unpaved issue” too, i.e. roads which should have the “surface=unpaved” tag (or rather “highway=track”) but don’t.
That issue occurred to me far more often than last year, i.e. before the AI imports.

The “residential issue” does not exist for me on bike, my map style renders unclassified and residential equally.
Because I know that things look different for other vehicles, I improve the data when I see such issues while integrating my data collected on ground. But let’s also not forget that there are some other mappers who use residential far too often.

But I must also say that the other 2000 km or so were not that bad. As I’ve just started integrating my data, I cannot yet tell the reason - I could have preferred roads mapped by other means, cycled in regions where the AI team hasn’t started work yet, where things were easier for their algorithms, …

If it turns out that the bad experience is typical for their quality, I’d also vote for a revert.

Interestingly, I was contacted offline by another mapper, who revealed the FB AI import for Egypt was completely reverted because it was so bad !

As a bit of background on the problems in Egypt (I did a bit of the analysis of what had been added but not the revert):

What seemed to happen there was that any long straight feature with a colour differential was being interpreted as a “road”. The effect of this was that pipelines, canal banks, and (in residential areas) walls were being detected in error as roads. There was also the same sort of problem that you’re describing - everything was either “road” or “not a road”; there was no thought given to residential / track / tertiary / whatever. Where roads went through residential areas (for example a few houses around a dusty square) the roads were missed completely. Generally speaking, though, the Egypt additions were far worse than the worst of the Thailand ones (unless anyone knows any different).

With regard to Thailand, in this case I’d suggest the usual - if a mapper has done something in error, tell them about it, tell them what they’ve done wrong and tell them what they need to do to fix it. Obviously in the case of brand new mappers we need to be helpful and tolerant of people who are still learning, but these accounts have been around for nearly a year now and have made a significant number of edits.

At least 2 of the accounts who edited the ways mentioned earlier have been active within the last 24 hours, but no-one has tried to contact them via changeset discussion comment since October. I’d be surprised if they monitor this forum, so I wouldn’t assume that they are even aware of a problem.

What I’d also do is monitor current edits by the mappers in question and raise issues as soon as they appear. Obviously some things will only get picked up when you’re actually on the ground, but it’s often possible to say that something looks “unlikely” (e.g. road vs track) without a visit.

Best Regards,
Andy (from the Data Working Group)

@SomeoneElse: DrishT - the author of this thread - is our contact to the Facebook team.
You are right, contacting individual members directly may speed up the information flow to them.

The “wall issue” in Egypt is different from a “wall issue” in Thailand: often new neighborhoods built near towns have a wall around them, and there is only a single entry to the neighborhood.
I found residential roads from such a neighborhood connected to a track (which of course was tagged as an unclassified) nearby. The wall was not ignored.

I sent the following message to the Facebook Team members who were active in the Phetburi area:

The people contacted are:
VLD003
VLD009
VLD010
RVR007
VLD011
VLD007
RVR002
RVR009

Hello,

Thank you as always for your feedback. Happy New Year! I hope you’ve all had a lovely holiday season. Sorry for the delay in response; we are just getting back into work mode :)

I’m sad to hear you are seeing issues. I hope you see we have been responsive to all requests and quickly make edits whenever anyone in the community asks. In some cases the requests conflict with each other, so it is not as easy to make these decisions. I understand your frustration on roads and routing, especially having travelled through many rural areas in Africa, South America and Asia including Thailand! So let’s see how we can make this better. Our goals are the same, making OSM better, and we are super grateful for your on-the-ground knowledge.

As far as quality and data integrity, while our roads are generated by AI our team edits each task 3 different times using the Tasking Manager to ensure various team members are editing consistently and at a high level of quality. We also use quality checks created at Facebook and open source ones such as JOSM, OSMCha, OSMOSE and Keep Right. Additionally our Quality Analysts continuously check live data.

There is quite a bit to digest from your e-mails, so let me try to summarize so I can work with the team to make sure we can help solve this. The two main issues are:

  1. Too Many nodes on straight roads - We are looking into this and will make changes accordingly.
  2. Road Tagging - The tagging is all done by the editing team and is not automated. We are cognizant of the limitations of remote mappers, so we try to follow the best practices we’ve learned from the local community and OSM guidelines.
    1. Tracks - Earlier we were told by some community members not to use this tag, however some have found it useful. After studying local edits, we decided to keep a minimum amount of Tracks for roads that are mostly agricultural or forestry.
    2. Residential - In an effort to make sure we do not neglect homes, we chose to mark as residential any roads which serve as access to housing, without the function of connecting settlements. Sometimes this means we are making the decision if we see 1-5 buildings on that road. I understand things change on the ground and maybe those buildings do not actually exist as we see them in satellite imagery. In those cases we are super grateful if you can re-tag the roads based on your knowledge. If you have different suggestions to tackle these roads, please do share and we will be happy to take them into account and make changes.
    3. Unclassified - As seen from most of the local mapping in Southern Thailand, unclassified is used to link villages and hamlets. This is why we mark minor roads that are of a lower classification than tertiary, but are not used to access houses, as unclassified roads.
    4. Paved/Unpaved - As much as we would like to add this tag, it is difficult to do so with high accuracy as a remote mapper. If you, as the local community, have some group guidance on this, it would be great for us to follow. We assume based on OSM guidelines that the use of the road is what determines the Highway tag and the quality is then set in this case as either paved or unpaved. Is this the same way you see roads?

The best part of OSM is the community working together to make the map better. If your rules for tagging are different from the way we are doing it, could you please provide some examples (way IDs and how you would tag them differently) so we can learn with you and make the map better, especially with your local knowledge. I think reverting thousands of really good edits due to some tagging concerns would be a loss for OSM.

For holidays and weekends you can always also e-mail my team directly at osm@fb.com as a faster way to reach us.

Best,
Drishtie

Actually (and this applies to all sides of the debate here) I’d suggest that changeset discussion comments are a better way forward than a private email address. That way everyone in the community can see what potential issues there are and what is being done to address them.

Thank you for your response, DrishT and colleagues. I still welcome your project and your efforts, and hope to be able to provide some hints leading to an improved quality of the data.

I’ll collect some more examples when I work on my data from the holidays.

Meanwhile, you can take a look at
https://www.openstreetmap.org/#map=14/8.5383/99.0690

The 5 “circles” there look like small settlements in a plantation.
I came from North to the settlement in the north-east, via bad tracks, some of them are not yet on the map.
Some roads in that settlement are paved, some are not which can be well detected in the imagery (Digital Globe Premium Imagery).
The “roads” to the west and east should be tracks.
The road leading out of it towards the south was once paved, but most of the asphalt cover has been lost (which cannot be detected in the imagery).
The road in the center of the area leading east towards the major road #4037 is a wide asphalt road (unclassified, or perhaps even tertiary; I did not see kilometer stones or DRR signs there). With the imagery, you can see that (assuming you come from the east) it turns north/right just before you connect it to a track, and at the next junction west/left.
Almost all other “roads” in that area should be tracks.
And it is easy to see that you missed some parts of the circular roads in the settlements.

It will take a few days until I edit that region. My last updates were with the GPS data and waypoints from Dec 7, and I visited the region described here on Dec 23.

Hello Bernhard,

This is a great example and we are in complete agreement on this. Look forward to more examples :)

Appreciate your feedback.

Thanks,
Drishtie

By the way, when I wrote the message above, I opened that region in a web browser and started iD. In order to see the roads in the imagery, I had to move some nodes of the ways, because the ways are shown with very thick lines in iD by default (don’t know if or how to change that).
Maybe that could be a problem for your team also? In JOSM, the lines are thinner and the imagery can easily be seen without moving any nodes away.

During the last days I mapped along the coast south of Hua Hin. It is a very touristic region, so many mappers have been there already and contributed a lot of data. Consequently, the imports did not contribute so much and there were fewer problems, at least north of Prachuap Khiri Khan.
Well, south of Prachuap Khiri Khan it was … terrible. But not due to the imports. A user did an enormous amount of work in 2013/14, and got many things wrong: apart from highway classification, also waterways (generally layer=-1, i.e. below the surface of the earth), which were also connected to roads and railways.
When you learn from such data, you can but learn it the wrong way.

I’d like to add a couple of examples with import issues which you asked for:
way id(s) original classification → correct classification
514017526 unclassified → track
513728209 residential → track
514040248+514040250 joined
514037787+514037788 joined
514332683 service → track
514345749 residential → unclassified
513730178 too many nodes
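
To make a list like this easier to act on, it could be turned into structured records with a history link per way. The sketch below is only a suggestion: the record layout and the function names are invented for this example, and the link pattern is simply the one used for the two canal ways at the top of this thread.

```python
from typing import Dict, List


def parse_correction(line: str) -> Dict[str, object]:
    """Parse one line such as '514017526 unclassified → track'."""
    ids_part, _, note = line.strip().partition(" ")
    # Several ways that belong together are written as id+id.
    way_ids = [int(part) for part in ids_part.split("+")]
    if "→" in note:
        old, new = (s.strip() for s in note.split("→", 1))
        return {"ways": way_ids, "from": old, "to": new}
    # Anything else ("joined", "too many nodes") is kept as free text.
    return {"ways": way_ids, "issue": note.strip()}


def history_links(way_ids: List[int]) -> List[str]:
    """Build history links in the same form as used earlier in the thread."""
    return [f"http://www.openstreetmap.org/way/{way_id}/history" for way_id in way_ids]


for line in ["514017526 unclassified → track",
             "514040248+514040250 joined",
             "513730178 too many nodes"]:
    record = parse_correction(line)
    print(record, history_links(record["ways"]))
```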

Individual mappers have to be ultimately responsible for their own individual edits. It’s great that there’s also a direct contact within the team that’s directing these mappers, but one of the strengths of OSM is the ability of individual mappers to communicate among themselves.

W makes the lines thin in the iD editor. Would it be possible to collect Mapillary or OpenStreetCam data to help the remote mappers?

More careless work from the FB import team…

Not sure if a one-off, but good evidence of the imports destroying our core Highway data through either carelessness or lack of vetting. I have corrected it.

Actually, we can’t see from this picture what kind of road it is. We can only see it isn’t paved. I agree that residential roads should have houses alongside, at least from time to time (access to rural houses will not always have dense buildings along).

If the track is legally accessible, why would you suggest to interrupt the road network at this spot, is this about surface or ownership or access restrictions?

+1, although what is a track depends a lot on the local legislation and context. In rural settings, most roads, also “normal public roads” are typically serving agricultural and forestry purposes. The decision for track is if they only serve for these purposes.

So for today’s examples of the AI mess, recently I was in Ban Pong near Phayao. A couple of times my GPS showed that the main 1193 hwy seemed to be off from its true position. Back on the computer, and with the benefit of multiple GPS traces and newer imagery, we can see that the Bing image it was traced from is “off”, but the newer DG-Std image is correctly aligned.
However, I also see that a mass of residential roads has been added by Micheal (VLD011) without checking for image alignment, so now in addition to the Hwy, we have to correct all those.

Furthermore, in just that one screenshot above, working downwards, I see …

1/. The road below the fuel station is actually a track leading to fields. That’s a market above it, not a house.
2/. The school on the left has its access road drawn as residential, and for some reason, it does not continue in a loop back to the road. Local mappers would know this and draw it as a service road, with some even adding a permissive tag.
3/. Further down on the left, we see a residential loop going around the back of the village Health Centre… of course, this would also be a service road. The health centre is not, and does not look like a house, and besides, the access road passes across the front of the building too … guess the AI missed that one.
4/. And opposite the health centre, we have the District Office … also marked as residential and with only one section of the access road drawn… in reality it’s a loop, and of course should be a service road.

So my point yet again is … if in one small area, we find so much bad mapping, just how much damage has been done to the whole of the Thailand Map. Corrections that local mappers will never have the time to rectify.

I will also send this post to the team osm@fb.com, but please, I have not got time to comment to individuals, or post in every changeset affected as some OSM purists might suggest.

I can spend an hour a day flagging these issues, or get on with the more worthwhile job of inputting data. I know the DWG sees these posts so please, how do we go about reversing all the “imported roads” so that we local mappers at least have a chance to enter the data correctly. Can someone help me write a script that deletes all roads marked as import=yes ?
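
As a read-only starting point, before anything gets deleted, the candidate ways in one area could at least be listed with an Overpass query. The sketch below only builds the Overpass QL text. It assumes the imported roads really do carry import=yes as described above, the function name and the bounding box are made up for the example, and sending the query to an Overpass server is left out on purpose.

```python
def build_import_query(south: float, west: float,
                       north: float, east: float,
                       timeout: int = 60) -> str:
    """Build an Overpass QL query listing highways tagged import=yes.

    Overpass expects the bounding box as (south, west, north, east).
    The query only reads data; it does not change or delete anything.
    """
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout}];\n"
        f'way["highway"]["import"="yes"]({bbox});\n'
        "out tags;"
    )


# Hypothetical bounding box, chosen only for illustration.
print(build_import_query(8.5, 99.0, 8.6, 99.1))
```

The output lists way IDs and tags only, which is enough to count the ways and to see which highway values they carry.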

Is the problem really “all import=yes” roads? Ultimately it’s a human that presses the button to add the data, and I’d expect that different people will have different quality thresholds. To take an example from elsewhere, in an African semi-import by several mappers it was clear that there was one whose quality threshold was lower than the others (and in that case adding lots of duplicates). After engaging with everyone involved there we ended up reverting only that one contributor’s additions and letting everyone else continue - it got rid of most of the problems and left most of the valid additions.

If you have not got time to comment on changesets, then the DWG (who, just like most OSM mappers, are all volunteers) are unlikely to be able to analyse the type and scale of the problems, and other mappers in Thailand who may not frequent this forum won’t know either. It’s great to see the examples posted here (and a link to the actual data would make those much more useful), but without specific details we simply don’t know who the main offenders are, and without any comment to reply to we don’t know their side of the story.

With regard to problem editors, what we’d like to be able to do is:

  1. See comments made about a particular mapper’s work

  2. See that mapper either not respond or do the same sort of thing again.

  3. If they continue creating the same problems despite being told about them we can then take action against that mapper.

With regard to problem data, the wiki page at https://wiki.openstreetmap.org/wiki/Change_rollback is a summary of the options available (perl revert scripts and JOSM). The two main problems with any approach are:

  1. Identifying the data to be reverted.

  2. Deciding what to do with data that has been edited by other mappers afterwards (undo all their changes, keep all their changes, or review after the revert)

The approach with the perl revert scripts is generally “throw all problem data at the script, and then subject to (2) above, handle any remaining problems”. With JOSM you’d go backwards through all changes and interactively handle (2) as you go. Depending on the individual situation either of the two main options (or even some other approach) may be better.
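
To make the two problems concrete, here is a rough sketch of how the data could be triaged before any revert: elements touched only by the problem accounts can be reverted mechanically, while elements that other mappers edited afterwards need a decision. This is not how the perl revert scripts or JOSM work internally; the `Version` record and the `triage` function are invented for illustration, and the element histories would have to be obtained separately.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass
class Version:
    element_id: int
    version: int
    user: str


def triage(history: Iterable[Version],
           problem_users: Set[str]) -> Dict[str, List[int]]:
    """Split elements into 'revert' and 'review' buckets."""
    by_element: Dict[int, List[Version]] = {}
    for v in history:
        by_element.setdefault(v.element_id, []).append(v)

    result: Dict[str, List[int]] = {"revert": [], "review": []}
    for element_id, versions in sorted(by_element.items()):
        versions.sort(key=lambda v: v.version)
        users = [v.user for v in versions]
        problem_positions = [i for i, u in enumerate(users) if u in problem_users]
        if not problem_positions:
            continue  # never touched by a problem account
        first = problem_positions[0]
        # Did anyone else edit the element after the first problem edit?
        edited_later = any(u not in problem_users for u in users[first + 1:])
        result["review" if edited_later else "revert"].append(element_id)
    return result


history = [
    Version(1, 1, "RVR006"),
    Version(2, 1, "RVR006"),
    Version(2, 2, "local_mapper"),   # hypothetical later fix by someone else
    Version(3, 1, "local_mapper"),
]
print(triage(history, {"RVR006"}))
```

In this toy history, way 1 could be reverted outright, way 2 needs a human decision because a later edit would be affected, and way 3 is left alone.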