How to report/correct a long list of inconsistencies in hierarchy?

Hi!

We’re currently playing around with OpenStreetMap data and found quite a few inconsistencies between administrative areas and the corresponding label nodes. The inconsistencies make it really hard to automatically build a trustworthy hierarchy.

We found 53 relations where the admin_level of the relation and the area differ.
We found 1024 relations where the place of the relation and the area differ.
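A minimal sketch of how such a comparison could be done, assuming the relations and their label nodes have already been loaded into plain dictionaries. The data layout, the function name `find_mismatches` and the sample ids and values are made up for illustration; this is not the script that produced the numbers above.

```python
def find_mismatches(relations, nodes, key):
    """Return (relation_id, node_id, relation_value, node_value) for every
    relation whose label node carries a different value for `key`."""
    mismatches = []
    for rel in relations:
        node_id = rel.get("label")        # id of the member with role "label"
        node_tags = nodes.get(node_id)
        if node_tags is None:             # no label node loaded: nothing to compare
            continue
        rel_value = rel["tags"].get(key)
        node_value = node_tags.get(key)
        if rel_value != node_value:
            mismatches.append((rel["id"], node_id, rel_value, node_value))
    return mismatches


# Invented sample data, only to show the expected input shape.
relations = [
    {"id": 1, "label": 10, "tags": {"admin_level": "8", "place": "city"}},
    {"id": 2, "label": 20, "tags": {"admin_level": "4"}},
]
nodes = {
    10: {"admin_level": "2", "place": "city"},
    20: {"admin_level": "4", "place": "state"},
}

print(find_mismatches(relations, nodes, "admin_level"))
print(find_mismatches(relations, nodes, "place"))
```

Note that a tag missing on one side is reported as a mismatch too; whether that should count is a judgment call.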

Here is the Google Spreadsheet with the two tables and all the inconsistencies:

https://docs.google.com/spreadsheets/d/1yu1-3F8VYF8hA4ZuE_P_bIyuFpz_XZAbWRjlrDd8sdU/edit?usp=sharing

In my opinion, areas and their corresponding label nodes should always have the same admin_level and place, and whenever there are multiple administrative levels there should be a relation for each of them, even if they cover exactly the same area (correct me if I’m wrong).

From what I can see, most of the time the correct admin_level is on the area. Now my question:

Can I just start editing them, or will I break things by doing so? I’m hesitant because it affects some really popular areas/nodes.

I guess you will have to contact either the previous mappers or the communities in which those cities are located. The fact that, e.g., the admin level for London has been “2” for 4 years means either that it is the value the community wants or that nobody noticed because they only look at the relation.

IMHO, you should not change the value if you are not familiar with the local habits. Just inform the local mappers, so they can make the right decision.

Please do not start editing without proper knowledge of each case; check what the error actually is first.
My opinion is that the info shouldn’t be redundant, which implies no admin_level on nodes, but there are other opinions.

Example: Buenos Aires.
Node 81590481 and boundary relation 1224652 both represent the same element (the city) and are linked by the role “label” in the relation.
According to some people, the admin_level=2 refers to Buenos Aires being the capital of the country.
The label of the country Argentina is on another node with place=country: http://www.openstreetmap.org/node/249399280
What admin level should this “Argentina” node have? 2, as it is the label of a country? None?

Same case with Montevideo.

You have the same type of problem with other capitals and the role “admin_centre” (see Rome).

The hierarchy is implied by the boundary relations, not by an admin_level value.
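To illustrate the idea, here is a rough sketch that derives the hierarchy for a point purely from which boundaries contain it, ordering them from largest to smallest area and ignoring admin_level entirely. It assumes simple rings in planar coordinates without holes; real OSM multipolygons would need a proper geometry library. The names and coordinates are invented.

```python
def ring_area(ring):
    """Unsigned area of a ring [(x, y), ...] using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def contains(ring, point):
    """Ray-casting point-in-polygon test for a simple ring."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):          # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def hierarchy_for(point, areas):
    """Names of all areas containing `point`, largest area first."""
    hits = [(ring_area(ring), name)
            for name, ring in areas.items() if contains(ring, point)]
    return [name for _, name in sorted(hits, reverse=True)]


# Invented squares: a country, a city inside it, a subdivision inside the city.
areas = {
    "country": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "city": [(2, 2), (6, 2), (6, 6), (2, 6)],
    "comuna": [(3, 3), (4, 3), (4, 4), (3, 4)],
    "other town": [(7, 7), (9, 7), (9, 9), (7, 9)],
}

print(hierarchy_for((3.5, 3.5), areas))
```

With this approach a subdivision tagged admin_level=6 that lies inside a city tagged admin_level=8 still ends up below the city, because only containment and size are used.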

@AmbiWeb - perhaps you could explain what problem this is causing you?

Hi!

Let me explain the problem in detail:

We started to work on a travel guide based on OpenStreetMap data. To build a valuable website we need a hierarchy for each interesting place on earth, so the user can navigate using this hierarchy. The task here is basically reverse geocoding, as it is done by a number of tools and even by the search on openstreetmap.org.

We started by extracting all the administrative areas (polygons/multipolygons with boundary=administrative and admin_level IN (2 … 11)), as described on this page: http://wiki.openstreetmap.org/wiki/Tag:boundary%3Dadministrative
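The filter described above could look roughly like this, assuming each element’s tags are available as a plain dictionary of strings. The function name `is_admin_area` is made up for this sketch, and how the tags are read from the OSM extract is left out.

```python
def is_admin_area(tags):
    """True if the tags describe an administrative boundary with a
    numeric admin_level between 2 and 11 (inclusive)."""
    if tags.get("boundary") != "administrative":
        return False
    try:
        level = int(tags.get("admin_level", ""))
    except ValueError:                    # missing or non-numeric admin_level
        return False
    return 2 <= level <= 11


print(is_admin_area({"boundary": "administrative", "admin_level": "8"}))
print(is_admin_area({"boundary": "administrative", "admin_level": "city"}))
```

Parsing the value defensively matters here because admin_level is a free-form tag and is not guaranteed to be numeric.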

The next step was to actually build the hierarchy. In our case it is really important to know whether an area is a country, some subnational administrative area (state, district, region), a populated settlement (city, town, village, hamlet) or a subarea of such a settlement (borough, suburb).
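As a sketch of that classification, the four kinds listed above can be mapped from the place tag. The category names and the function `classify` are invented for this example, and it only works when a place tag is actually present on the element being looked at.

```python
# Hypothetical category names; the place values are the ones listed above.
PLACE_KINDS = {
    "country": "country",
    "state": "subnational",
    "district": "subnational",
    "region": "subnational",
    "city": "settlement",
    "town": "settlement",
    "village": "settlement",
    "hamlet": "settlement",
    "borough": "subarea",
    "suburb": "subarea",
}


def classify(tags):
    """Map the place tag to one of the four kinds, or 'unknown'."""
    return PLACE_KINDS.get(tags.get("place"), "unknown")


print(classify({"place": "town"}))
print(classify({"boundary": "administrative", "admin_level": "8"}))
```

The second call shows the weak spot: a relation without a place tag cannot be classified this way at all.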

Now let’s look at two examples in areas where the inconsistencies cause problems:

Stadium Australia
http://www.openstreetmap.org/search?query=Stadium%20Australia#map=17/-33.84712/151.06339

This is how it is currently geocoded:

This is how it should be geocoded:

Anfiteatro Parque Centenario
http://www.openstreetmap.org/search?query=Anfiteatro%20Parque%20Centenario#map=19/-34.60593/-58.43672

This is how it is currently geocoded:

This is how it should be geocoded:

The problem here is that the information about the place can be present in two different locations (the relation or the label node). Quite often the information needed to build the hierarchy is only available on the label node. In addition, most of the tags are also only available on the label node.
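One way to cope with this in code is to merge both tag sets and keep track of where they disagree. The sketch below lets the relation win on conflicts and uses the label node to fill in tags the relation lacks. That precedence is an assumption, not an established rule, and the function name and sample tags are made up.

```python
def merge_tags(relation_tags, label_tags):
    """Merge label-node tags into the relation's tags.

    The relation wins when both carry the same key; the sorted list of
    conflicting keys is returned as well, so they can be reviewed."""
    merged = {**label_tags, **relation_tags}
    conflicts = sorted(
        key for key in relation_tags.keys() & label_tags.keys()
        if relation_tags[key] != label_tags[key]
    )
    return merged, conflicts


# Invented tags for illustration.
relation_tags = {"boundary": "administrative", "admin_level": "8", "name": "Example"}
label_tags = {"place": "city", "admin_level": "2", "name": "Example", "population": "1000"}

merged, conflicts = merge_tags(relation_tags, label_tags)
print(merged)
print(conflicts)
```

The conflict list is the useful by-product: it is the raw material for the kind of report linked in the first post.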

The current inconsistencies make it impossible to build a trustworthy hierarchy for quite a few areas. And as pointed out, even the geocoding on the openstreetmap.org website fails badly because of those inconsistencies.

Hope this helps to give a better understanding. I am quite aware that I cannot fix this kind of thing without involving the local communities, and it should definitely be fixed in the local communities. However, I just don’t have the time to contact a lot of people to get them to fix this.

My current idea is:

  1. Verify with you that this is really a problem that should be addressed and/or come up with a better algorithm for reverse geocoding.
  2. Have someone create challenges for MapRoulette (http://maproulette.org/) to get this fixed in the local communities.

I would really love to read your thoughts on this.

Best Regards
Tobias

What are the results if you ignore the admin_level of the nodes?

For the case in Buenos Aires, please post also in the Argentina subforum.
Anyway, I have never seen a written address with “Comuna …”, so maybe it shouldn’t be used for this. The “Comuna” subdivisions are something new.
Also, the Comunas (admin level 6) are subareas inside the city (admin level 8), so the hierarchy is not strict.

In Uruguay there are political subdivisions (called Municipio, which should be 4<admin_level<8), but in some cases a Municipio has several towns inside it, in other cases there are cities which do not belong to any Municipio, and there are cities that have several Municipios inside them.
You probably haven’t seen the problem in OSM because, although the Municipios are political entities (third level of government), nobody has drawn them in OSM yet, as they are useless and not used in geocoding. But they would raise the same problem you see in Buenos Aires.

The problem may begin with using a fictional admin level for the city.

Part of the problem should be solved by the algorithm.

For example, in Uruguay there are 6 canonical forms of addresses. See page 39 of http://ide.uy/sites/default/files/Modelo%20de%20direcciones%20geograficas%20del%20Uruguay%20ed01_00.pdf
The UML model is also on page 42.

It is also the standard for how addresses should be constructed here (section 8, page 55). For this territory (Uruguay), no algorithm can produce the correct address unless it follows those rules. It could get approximate results, but not the right one.