Fixing OSM <> Wikidata mismatches

I started a project which makes full planet GeoJSON extracts from OSM, mixing it with Wikidata information, for example to get up-to-date population. The project’s website is: https://github.com/hyperknot/country-levels

I have collected the conflicting data in two issues. Considering this is almost 5000 regions, I believe the consistency isn’t bad.

Wikidata ID mismatch
https://github.com/hyperknot/country-levels/issues/2

ISO code mismatch
https://github.com/hyperknot/country-levels/issues/1

I’m fixing these manually now. Can you help me with some parts I don’t understand?

For example there is Daman and Diu region in India, ISO code IN-DD:
https://www.iso.org/obp/ui/#iso:code:3166:IN
https://www.wikidata.org/wiki/Q1158197
https://www.wikidata.org/wiki/Q66710
https://www.openstreetmap.org/relation/1953042
https://www.openstreetmap.org/relation/10087799

I cannot figure this one out. I’ll try to work on fixing the easier ones.

For the avoidance of doubt, if a wikipedia article was used as a source for wikidata, you can’t use wikidata as a source for OSM, as their licence does not permit it. You may of course be able to use the original source that was used to add to wikipedia in the first place - you’d have to look at the licences associated with the sources in wikidata for that.

@SomeoneElse, I’m not adding any data. For ISO codes I always look them up on the official iso.org. The only thing I’m changing is the link from OSM > Wikidata and from Wikidata > OSM.
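To make the two-way link check concrete, here is a minimal sketch of how such a comparison could look. It is not the code used in country-levels. The function name and the input shape (one dict built from OSM `wikidata` tags, one built from the "OSM relation ID" statements on Wikidata) are assumptions for illustration. The relation and item IDs are the Kent ones that come up later in this thread, but the mismatched tag in the sample data is hypothetical.

```python
from typing import Dict, List, Tuple


def find_link_mismatches(
    osm_to_wikidata: Dict[int, str],
    wikidata_to_osm: Dict[str, int],
) -> List[Tuple[int, str, str]]:
    """Return (relation_id, qid, reason) for every link that does not round-trip."""
    mismatches: List[Tuple[int, str, str]] = []

    # Direction 1: follow each OSM wikidata tag and see where Wikidata points back.
    for relation_id, qid in sorted(osm_to_wikidata.items()):
        back = wikidata_to_osm.get(qid)
        if back is None:
            mismatches.append((relation_id, qid, "Wikidata item has no OSM relation ID"))
        elif back != relation_id:
            mismatches.append((relation_id, qid, f"Wikidata item points to relation {back}"))

    # Direction 2: follow each Wikidata statement and see how the relation is tagged.
    for qid, relation_id in sorted(wikidata_to_osm.items()):
        tagged = osm_to_wikidata.get(relation_id)
        if tagged is None:
            mismatches.append((relation_id, qid, "OSM relation has no wikidata tag"))
        elif tagged != qid:
            mismatches.append((relation_id, qid, f"OSM relation is tagged {tagged}"))

    return mismatches


# Hypothetical state: both Kent relations carry the same wikidata tag.
osm_tags = {88071: "Q23298", 172385: "Q23298"}
wikidata_links = {"Q23298": 88071, "Q21694674": 172385}

example_mismatches = find_link_mismatches(osm_tags, wikidata_links)
for relation_id, qid, reason in example_mismatches:
    print(relation_id, qid, reason)
```

Each direction is reported separately, so a single wrong tag can show up twice, once from the OSM side and once from the Wikidata side.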

Here is another conflict, city / county of Kent, UK.
https://www.openstreetmap.org/relation/88071
https://www.openstreetmap.org/relation/172385
https://www.wikidata.org/wiki/Q21694674
https://www.wikidata.org/wiki/Q23298

two-tier county GB-KEN Kent
https://www.iso.org/obp/ui/#iso:code:3166:GB

You also have to be careful that wikidata (especially where sourced from Wikipedia) is not merging things which are conceptually separate on OSM. In particular, administrative regions having the same name as a place (town, city, etc.) often only have one wikidata entry.

Kent in ISO undoubtedly refers to the administrative entity not the ceremonial county.

In my experience from many big and small regions, Wikidata is keeping things more separated than OSM. There are many administrative levels “between two admin_level” which do not have an OSM relation but have an ID on Wikidata.

Now, back to Kent.
“Ceremonial” county is https://www.wikidata.org/wiki/Q23298 <> https://www.openstreetmap.org/relation/88071
“Non-metropolitan” county is https://www.wikidata.org/wiki/Q21694674 <> https://www.openstreetmap.org/relation/172385

Which one should have ISO3166-2: GB-KEN? I guess OSM is correct, that the non-metropolitan has the ISO, and Wikidata is wrong. Is it always the non-metropolitan which has the ISO in the UK?

I guess from the fact that Medway has an ISO, it must be the one which doesn’t include Medway, that is non-metropolitan.

I would strongly advise that you desist from editing both wikidata and OSM when you clearly do not understand English administrative structures. There is a significant risk of degrading the data in both sources. UK wikidata tags on OSM have generally been added by very experienced wikidata contributors who are often also long-standing OSM contributors (e.g., Edward Betts & nyuriks). I would suggest asking them questions regarding wikidata items first.

I agree and I know that OSM started in the UK and this is exactly why I was so surprised that there are so many mismatches between OSM and Wikidata.

On OSM I’ve edited 19 relations in the UK only, feel free to check out and review my changesets: https://www.openstreetmap.org/user/zsero/history
On Wikidata I’ve done 45 edits related to OSM relation ID, feel free to check them out and review: https://www.wikidata.org/w/index.php?title=Special:Contributions/Hyperknot

In the end I arrived at a state where all councils match on ISO codes and GSS codes, except “Bournemouth, Christchurch and Poole” which doesn’t have an ISO code.

Having said that, there are clearly issues. On one changeset I don’t agree with a local, expert OSM user about the Wikidata ID: https://www.openstreetmap.org/changeset/83271335

I’ll post my last comment on that changeset here as well:

There is a whole category of “English unitary authority council (Q21561328)” on Wikidata, with 59 items in total, none of which have GSS or ISO codes or OSM ids.
Query here: https://w.wiki/MU2

The “unitary authority of England (Q1136601)” category on the other hand, contains 57 items, 55 with valid GSS and ISO codes and OSM ids.
Query: https://w.wiki/MU3

So it seems to me that there is definitely an effort on Wikidata to keep these items separate: one set with ISO and GSS codes, the other without. Since OSM clearly refers to the admin region with ISO and GSS codes, I believe they should link to the matching Wikidata ones.
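For anyone who wants to rerun this kind of comparison, here is a rough sketch of a query builder. It is not a copy of the two short-linked queries above, only an assumed equivalent. It relies on my understanding that P31 is "instance of", P300 is "ISO 3166-2 code", P836 is "GSS code (2011)" and P402 is "OSM relation ID", so please verify these before relying on it. The function name is made up for this example.

```python
def build_class_audit_query(class_qid: str) -> str:
    """Build a SPARQL query listing the items of one class with optional codes.

    The result is meant to be pasted into the Wikidata Query Service.
    Items missing a code simply come back with that column empty.
    """
    # Guard against anything that is not a plain QID, since the value is
    # interpolated directly into the query text.
    if not (class_qid.startswith("Q") and class_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata QID: {class_qid!r}")

    return f"""
SELECT ?item ?itemLabel ?iso ?gss ?osmRelation WHERE {{
  ?item wdt:P31 wd:{class_qid} .
  OPTIONAL {{ ?item wdt:P300 ?iso . }}
  OPTIONAL {{ ?item wdt:P836 ?gss . }}
  OPTIONAL {{ ?item wdt:P402 ?osmRelation . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
ORDER BY ?itemLabel
""".strip()


# The two classes discussed in this thread.
territorial_query = build_class_audit_query("Q1136601")
council_query = build_class_audit_query("Q21561328")
print(territorial_query)
```

Using OPTIONAL for every code keeps items without ISO, GSS or OSM IDs in the result, which is exactly what you need when counting how many items in a class lack them.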

The “unitary authority of England (Q1136601)” class is for the unitary authorities as territorial entities. So if your OSM object represents eg geographical information about the boundaries or territorial extent of the authority, and/or which places or territories it may include, then it is a Wikidata item from this class that you should be linking to.

The “English unitary authority council (Q21561328)” class appears to have been created for councils as employers and service-providers. I’m not entirely sure why somebody thought this was necessary. But evidently somebody did. (Amongst other things, it makes for quite a tricky judgement as to which item the Wikipedia article for the authority should link to, and probably breaks sitelinks between Wikipedia and Commons, as well as links from items in the class above to external sites). But anyway, unless your OSM object is specifically for the council as an employer or a service-provider, items in this second Wikidata class are probably /not/ the items you want to link to, and it is the items in the previous class above that you probably do want.

Hope this helps.
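As an illustration of the rule above, here is a small sketch that picks the item to link a boundary relation to, based only on its classes. The two class IDs are the ones named in this thread. The function name, the input shape and the candidate item IDs are hypothetical placeholders, not real Wikidata items.

```python
from typing import Dict, List, Optional

TERRITORIAL_CLASS = "Q1136601"  # unitary authority of England
COUNCIL_CLASS = "Q21561328"     # English unitary authority council


def pick_boundary_item(candidates: Dict[str, List[str]]) -> Optional[str]:
    """Return the one candidate that is a territorial entity, else None.

    `candidates` maps an item ID to the list of classes it is an instance of.
    None means "no clear answer, ask a human", either because no candidate is
    territorial or because more than one is.
    """
    territorial = [
        item for item, classes in candidates.items()
        if TERRITORIAL_CLASS in classes
    ]
    if len(territorial) == 1:
        return territorial[0]
    return None


# Placeholder IDs, for illustration only.
example_candidates = {
    "Q-example-territory": [TERRITORIAL_CLASS],
    "Q-example-council": [COUNCIL_CLASS],
}
print(pick_boundary_item(example_candidates))
```

Returning None in ambiguous cases is deliberate: where two items both look territorial, that is a question for people who know the area, not for a script.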

Thanks a lot for replying in detail Jheald!

Wikidata seems to be all aligned now; https://w.wiki/MU3 and https://w.wiki/MU2 show this better than I could describe here.

You are still getting this wrong.

In OSM we use the on-the-ground rule, and local knowledge should always be put ahead of remote armchair changes.

There is a single unitary authority, Shropshire Council; those are the words which appear on bus stops, correspondence, and the Council Offices. There is no Shropshire Unitary Authority.

I have no interest in Wikidata or categories, nor do I care what an ISO or GSS code is. I am a mapper who is keen to ensure that my area is mapped accurately by what is On The Ground.

All I am saying is that Q21694759 is a duplication and is therefore incorrect and should be deleted. It does not even point to the English wikipedia page.

If you have no interest in Wikidata, then why do you care about which Wikidata ID it maps to?
If you do have an interest in Wikidata, then why don’t you follow the advice of Jheald, who was kind enough to join our forum to help with this issue? He has 450,000 edits in Wikidata and recommends that we use one of the categories.

@Jheald very useful comment, explains a lot.

@zsero: can you please discuss what you are doing on the correct channel for discussing changes in the UK, talk-gb? There are several people there who take a great interest in ensuring the ISO & government codes for entities mapped on OSM are accurate.

Do you mean I should open a thread here? https://forum.openstreetmap.org/viewforum.php?id=5
Or is talk-gb a different place?

Here: https://lists.openstreetmap.org/listinfo/talk-gb