Open data for localities

Hello everyone. I am Aleksandar Matejevic, editorial lead at the Microsoft Open Maps Team. I have a question regarding this dataset: https://www.scb.se/en/services/open-data-api/open-geodata/localities/

Since it is open, and assuming there are no issues with licenses, we would like to add these polygons to OpenStreetMap and connect them with the place nodes that are already present on the map. This would improve geocoding, make the map look better, and give us population numbers as well. We would keep the ref for each polygon so that it would be easy to run checks against new datasets in the future.

Does this sound OK to the community? Any suggestions are welcome.

Thanks,
Aleksandar Matejevic

SCB data is potentially a source of errors as well. SCB has places which only exist for statistical reasons. You could in fact get highly misleading geocoding results instead.

This means that even if you do the obvious job here - look for missing place names - you would potentially end up adding those places which should not be in OSM.

This seems to be “tätorter” from SCB.

Do you have an osm- or geojson-file with the polygons in question so that we may review them?

How do you plan to tag the polygons?

How will you conflate with existing polygons for landuse=residential, retail, commercial, industrial etc.? In particular for larger localities, such as a town which consists of several connected localities/suburbs?

How many of these localities are already mapped in OSM with a place=* node?

How exactly will these polygons improve geocoding, given that most localities already exist as a place=* node? Do you have an example?

@Wulfmorn - Thanks for the input. The idea is to add these as boundary=census, and then to create place relations with nodes that exist in OSM and whose name matches the name of the SCB polygon. If the two cannot be matched, no relation will be created.
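The matching rule described above could be sketched roughly as follows. This is only an illustration under assumptions, not the actual import tooling: the function names, the node dictionary layout, the case-insensitive name comparison and the toy coordinates are all hypothetical, and a real run would use a proper geometry library on the SCB polygons.

```python
from typing import Optional

Point = tuple[float, float]  # (lon, lat)

def point_in_polygon(pt: Point, ring: list[Point]) -> bool:
    """Ray-casting test: True if pt lies inside the ring."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through pt?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def match_place_node(polygon_name: str, ring: list[Point],
                     nodes: list[dict]) -> Optional[dict]:
    """Return the single node inside the ring with the same name, else None."""
    hits = [
        n for n in nodes
        if n["name"].casefold() == polygon_name.casefold()
        and point_in_polygon((n["lon"], n["lat"]), ring)
    ]
    # Assumption: only an unambiguous match leads to a relation.
    return hits[0] if len(hits) == 1 else None

# Toy data in arbitrary units (not real coordinates).
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
nodes = [
    {"id": 1, "name": "Alsike", "lon": 2.0, "lat": 2.0},
    {"id": 2, "name": "Källbacken", "lon": 1.0, "lat": 3.0},
    {"id": 3, "name": "Alsike", "lon": 9.0, "lat": 9.0},  # outside the ring
]
match = match_place_node("Alsike", square, nodes)
```

Here only node 1 is both inside the ring and named like the polygon, so it is the one returned; a polygon name with no matching node inside gives None and no relation would be created.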

@NKA can you open the GeoPackage file from SCB in QGIS, or do you still need me to export it to OSM or GeoJSON?
We would add these as boundary=census since these are statistical polygons, not place polygons, so there will be no redundancy with residential, retail, commercial or any other area. We do realize that towns consist of suburbs, but as far as I can see there are no suburb polygons in this dataset, just larger polygons for cities, towns, villages…
I have found ~1800 out of 2011 polygons that match a place tagged on a node in OSM. The number is approximate only because there are more place nodes that fall just outside the polygon, so it could get even higher. Example: https://ibb.co/Y2KrhMQ

As for how it will improve geocoding: search for Källbacken on https://www.openstreetmap.org/ and it will give you "Neighbourhood Källbacken, Knivsta, Knivsta kommun, Uppsala County, 741 92, Sweden", but Källbacken is a neighbourhood of Alsike. What currently happens is that the neighbourhood point is connected to the closest town, which leads to the wrong geochain. Example: https://ibb.co/GtSvyXt

This would be fixed by adding census polygons and creating relations. Also, houses, POIs, streets, etc. within Källbacken would be in the correct geochain, with Alsike as the town. And if a kommun boundary ran between them, the issue would be at an even higher level. A polygon defines the boundary better than an algorithm and makes geocoding more precise.
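To make the difference concrete, here is a minimal sketch of "closest town" versus "containing polygon" assignment. It is not how any particular geocoder is implemented; the town names, positions and the bounding-box simplification of the area are hypothetical and in arbitrary planar units.

```python
import math

# Hypothetical toy layout in arbitrary planar units (not real coordinates).
TOWNS = [
    {"name": "TownA", "pos": (0.0, 0.0)},
    {"name": "TownB", "pos": (10.0, 0.0)},
]
# TownB's area, simplified to a bounding box (min_x, min_y, max_x, max_y).
AREAS = [{"name": "TownB", "bbox": (3.0, -2.0, 12.0, 2.0)}]

def nearest_town(pt, towns):
    """Fallback: attach the point to the closest town node."""
    return min(towns, key=lambda t: math.dist(pt, t["pos"]))["name"]

def containing_area(pt, areas):
    """Return the name of the first area whose box contains the point."""
    x, y = pt
    for a in areas:
        min_x, min_y, max_x, max_y = a["bbox"]
        if min_x <= x <= max_x and min_y <= y <= max_y:
            return a["name"]
    return None

def parent_town(pt, towns, areas):
    """Prefer containment; fall back to distance when no area matches."""
    return containing_area(pt, areas) or nearest_town(pt, towns)

neighbourhood = (4.0, 0.0)
by_distance = nearest_town(neighbourhood, TOWNS)
by_polygon = parent_town(neighbourhood, TOWNS, AREAS)
```

In this toy layout the neighbourhood is closer to the TownA node but lies inside TownB's area, so distance alone picks TownA while containment picks TownB.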

I believe users here need to see the polygons in order to have an opinion about them.

I think the problem with this proposal is that there is no one-to-one relation between these polygons and place names. One polygon may contain several place names of the same “level”. Also, the natural reach of place names may go far beyond these polygons. SCB has created these polygons based on how close households are grouped together, not on how any areas are named.
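How often a polygon contains several place nodes of the same level could be checked before any import. A rough sketch, assuming a spatial join has already produced rows of (polygon ref, place value, node name); the row layout, the refs and the names below are made up for illustration.

```python
from collections import defaultdict

def ambiguous_polygons(rows):
    """Group place nodes per (polygon ref, place value) and keep groups
    with more than one node, i.e. no one-to-one match is possible."""
    groups = defaultdict(list)
    for ref, place, name in rows:
        groups[(ref, place)].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}

# Hypothetical join result: polygon P1 holds two villages, P2 only one town.
rows = [
    ("P1", "village", "A"),
    ("P1", "village", "B"),
    ("P1", "neighbourhood", "C"),
    ("P2", "town", "D"),
]
flagged = ambiguous_polygons(rows)
```

Polygons that show up in the result would need manual review rather than an automatic relation.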

In your example, Källbacken may well be part of Knivsta although it is within the Alsike polygon. In fact, all addresses in Källbacken have Knivsta as their addr:city.

So I think for many places, these polygons will obscure geocoding rather than improve it.

A real improvement would instead be if you could coax them to release addresses as open data.

Can you point me to that dataset, or is it still not published? If not, do you have a contact?

They require substantial payment and likely extensive licenses :slight_smile: