Im trying to do a dump into tableau of the central america data on openstreetmap to create some custom visualisation.
For that, ive converted the .pbf file on central america from openstreetmap into an excel form and now tried to do a simple left join in sql between the dataset files containing points (which has the longitude and lattitude coordinates) and the other dataset containing multipolygons.
However when i do the left join, im using the osm_id across the 2 files to make the join (this seems to be the only unique identifier i could find in the 2 datasets)
On executing the script, the join finds no pairings.
I also tried doing an inner join and got no results.
From what i can see, it seems it cannot find any links between the 2 datasets. Is the osm_id not the right field to be using to make the join?
Is it actually possible to join the 2? If so, how?
Im trying to do this because i want to pull in some info from the multipolygons dataset around, but since that one doesnt have coordinates i have to join it with the points dataset to get the longitude and latitude.
For context, the points dataset has about 800k rows, and the multipolygons dataset has about 1.4million rows.
I’m not familiar with the joins you’re trying to do, but I suspect you may be skipping a step. A multipolygon relation would typically contain ways, not nodes. If you look at one of the multipolygons, you’ll likely see references to the member ways. You’d need to look up those ways first, and then look up the nodes that compose those ways. Trying to jump straight from a relation to nodes won’t work for multipolygons.
You also need some logic to compute the areas described by the multipolygon. This logic isn’t simple and has to handle several possibilities.
I think tools like Osm2pgsql do this for you. https://wiki.openstreetmap.org/wiki/Osm2pgsql
the “points” file is the only one with x and y coordinates which also contains ‘osm_id’ field.
The “multipolygons” file has both an ‘osm_id’ field and an ‘osm_way_id’ field.
However when i do a join between the 2 osm_id fields across points file and multipolygons file i dont get a match.
You said before i was missing a step and a table?
Do i need to bring in another one of those tables and do a join with osm_way_id first before trying to join with the osm_id?
I think this is the source of your confusion. While these fields are labelled the same, they’re not the same IDs.
There are three different object types in the OSM database: nodes, ways, and relations. Each of the object types has its own ID numbering starting at 1. For example, you can have node #1, way #1, and relation #1, and these all refer to different objects in the database.
For the multipolygon relations you’re looking at, each relation has its own ID (“osm_id”) within the relation numbering. This multipolygon relation contains a number of member way objects, each referenced by its ID within the way numbering (“osm_way_id”). Going further, each of these ways contains an ordered list of the nodes that compose the way, each referenced by its ID within the node numbering. Therefore, if you want the nodes that ultimately compose a multipolygon, you’ll first need to do a join to look up the ways that compose the relation, and then look up the nodes that compose those ways.