Forest import

I would consider using the forest mask from the Forest Centre of Finland as an alternative source for the import. People taking part in this discussion probably know that in Finland even the topographic database of the National Land Survey of Finland does not contain forests. Forest is the default: everything that is not classified as something else is assumed to be forest.

The documentation of this product is at https://www.metsakeskus.fi/sites/default/files/tietotuotekuvaus_metsamaski.pdf. This is my translation of the process:

Parcels with an area of at most 1.5 hectares that contain buildings have been removed from the total area.
If a parcel is bigger than 1.5 hectares, an area of 0.5 ha has been removed around the buildings.
Features that are something other than forest in the topographic database of NLS have been removed (fields, built-up areas, lakes, etc.). Railway areas have been removed as cadastral parcels. Roads and power lines, if they have not been formed into cadastral parcels, have been removed by a width that depends on the corresponding feature class.
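
The building rules above can be sketched roughly as follows. This is only an illustration of how I read the description, not the Forest Centre's code: the function name and the constants are made up for the example, the 0.5 ha deduction is assumed to apply once per parcel rather than per building, and the real product works on geometries instead of plain areas.

```python
# Hypothetical sketch of the parcel rules described above; not the Forest Centre's code.
SMALL_PARCEL_MAX_HA = 1.5  # parcels up to this size that contain buildings are dropped
BUILDING_CUTOUT_HA = 0.5   # area removed around buildings on larger parcels


def remaining_forest_ha(parcel_ha: float, has_buildings: bool) -> float:
    """Return the area (ha) of a parcel left in the mask after the building rules."""
    if not has_buildings:
        return parcel_ha
    if parcel_ha <= SMALL_PARCEL_MAX_HA:
        return 0.0
    # Assumption: one 0.5 ha deduction per parcel, however many buildings it has.
    return parcel_ha - BUILDING_CUTOUT_HA


if __name__ == "__main__":
    for area, built in [(1.0, True), (1.5, True), (4.0, True), (4.0, False)]:
        print(area, built, remaining_forest_ha(area, built))
```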

The forest mask is certainly not perfect but the process feels very reasonable.

The original Finnish description of the dataset, translated in full:

Among other things, parcels with an area of at most 1.5 hectares that
contain buildings have been removed entirely from the total area. On
larger parcels, 0.5 hectares have been removed around the buildings.
Areal features of the topographic database that are outside forestry
use (for example water bodies, fields, built-up areas) have been
removed entirely as they are. Railway areas have been removed as
cadastral parcels. Road and power line areas that have not been
separated into cadastral parcels have been removed according to a
width defined by their class. In addition, small parcels without owner
information and small slivers created by geometry processing have been
removed from the dataset.

-Jukka Rahkonen-

Another option could be to use these vector datasets: https://aineistot.metsaan.fi/avoinmetsatieto/Metsavarakuviot/Maakunta/ but I am not sure whether the license is compatible with OSM. And when I look at the data for my own forests, I think it is far too detailed for OSM, and most of the parcels should be merged before import.

I’m sorry to hear this.

P

Yes, there are fields in the National Land Survey topographic data and I presume those can be imported quite easily. However, there should be some kind of filtering: OSM could have newer information (like a residential area on former agricultural land).
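
One very simple form of such filtering, as a sketch only: skip every candidate field whose bounding box overlaps anything already mapped in OSM, and leave those cases for manual review. The function names and the bounding-box input format are assumptions made for this example, and a real import would compare actual geometries rather than boxes.

```python
# Hypothetical sketch: drop candidate fields that overlap anything already in OSM.
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def bboxes_overlap(a: BBox, b: BBox) -> bool:
    """True if the two bounding boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def fields_safe_to_import(candidates: List[BBox], existing_osm: List[BBox]) -> List[BBox]:
    """Keep only candidates that do not overlap any existing OSM feature."""
    return [
        c for c in candidates
        if not any(bboxes_overlap(c, e) for e in existing_osm)
    ]


if __name__ == "__main__":
    candidates = [(25.0, 65.0, 25.1, 65.1), (26.0, 65.0, 26.1, 65.1)]
    existing = [(25.05, 65.05, 25.2, 65.2)]  # e.g. a newer residential area
    print(fields_safe_to_import(candidates, existing))
```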

P

Those are licensed CC-BY-4.0 and I think it is compatible with OSM.

https://www.metsaan.fi/yleistietoa-avoimesta-metsatiedosta

P

I’m afraid this is not just about numbers.

You cannot import auto-generated low-quality data into OSM, full stop. If 100% of the Finnish community wants auto-generated low-quality data then they need to find themselves another project, perhaps “OpenAutoImportMap” or so.

The Finnish community will never be able to manually rectify this huge amount of data; future members of the Finnish community will despair and run away because it is much more work trying to fix your data than to create it from scratch.

Your attitude towards Zaltys - essentially “if you criticize my import then you have to commit to tracing the forest manually in the areas where you object” is not acceptable.

An import can be good if it forms a basis for the community to work on; if there is a commitment and a plan to make it good. But your import is a capitulation: “We’ll never be able to map this so we might as well import it so that the map at least looks nice and green”. But with this attitude you are gravely harming OSM’s prospects in Finland.

What you can do is this:

Run your algorithm to prepare parcels of auto-generated forest, and then find mappers who are willing to take such a parcel, manually verify and correct it against imagery, and then upload. If nobody is interested in doing it in some areas, then those areas remain without forest until someone does. (And if mappers just take your data and upload without fixing obvious problems, their contributions are reverted for foul play.) This will be a slow process but at least one that generates quality data.

What you can also do is this:

Set up an openstreetmap.fi tile server that generates map tiles which draw forests based on your auto-generated data. Then if anyone wants a map that looks nice and green at the expense of correctness, they can go to openstreetmap.fi, without having to pollute OpenStreetMap with “guessed” forest data.

What you can not do is:

Leave your buggy data in OSM and/or upload more of the same.

Best
Frederik

Ok.

Any suggestions on how to remove all those forest plots? I saw some Perl scripts written by you, but I don’t know Perl.

Rgs,

P

So when will those forest plots be removed? It is now nearly impossible to draw new roads, tracks, or paths (in Potlatch) because they get connected to the borders of all those forest plots wherever a very thin gap has been left open between them (where the Finnish national map shows a path).

You should consider using JOSM as your editor for OSM. In JOSM you can disable “snap to line” by holding Ctrl while placing a node.

I have been travelling quite a lot recently. I haven’t received any feedback or help on the deletion process, so it seems obvious that I have to plan and execute it myself (and I’m willing to do it). However, it will take a little longer.

My idea is simply to remove all the forest plots that I imported, unless someone else has edited them.

P

Hi,
I appreciate your willingness to remove your edits. Unfortunately I cannot help you with that. Although maybe this particular large import did not work, I hope that you can find some other way of importing useful data, that will improve the map!
Regards

What is the current state of this? I have run into a lot of erroneous forests created by this import in my new neighborhood in Oulu and started cleaning them up, which requires a lot of fiddling around with multipolygons.

But after thinking about it again, I was wondering whether manual fixing is the right way here. Should I just leave these - although they interfere with other edits - so that they can be cleaned up automatically?

Regards
Farad

I need to prepare custom scripts to do the reverts: the existing tools would leave some features in the database.

My plan is that the custom revert script won’t touch features that somebody has updated. So, if you clean up or modify those forest plots, I won’t revert (i.e. delete) them.
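
The selection rule could look roughly like the sketch below. It is not the actual revert script: the function name, the dictionary layout and the changeset IDs are assumptions for illustration. It relies only on the fact that an OSM element created by the import and never edited since is still at version 1 and still carries the import changeset ID.

```python
# Hypothetical sketch of the "do not touch edited features" rule; not the real script.
from typing import Dict, List, Set


def untouched_import_elements(elements: List[Dict], import_changesets: Set[int]) -> List[Dict]:
    """Keep only elements that are still at version 1 and were created by the import."""
    return [
        e for e in elements
        if e["version"] == 1 and e["changeset"] in import_changesets
    ]


if __name__ == "__main__":
    sample = [
        {"type": "way", "id": 1, "version": 1, "changeset": 100},  # untouched import
        {"type": "way", "id": 2, "version": 3, "changeset": 250},  # edited later
        {"type": "way", "id": 3, "version": 1, "changeset": 999},  # someone else's way
    ]
    print([e["id"] for e in untouched_import_elements(sample, {100})])
```

Note that a way keeps its version number when only its nodes are moved, so a real script would also have to check the nodes of each way before deleting it.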

P

What is the update on this?

Hi!

I have prepared a Python script for reverting and tested it. If you would like to help me test it (and develop the script further), I can share it. Or I can put it into a GitHub repo or something.

P

A GitHub repo would be great, since then the algorithm itself could be discussed.

farad

Hi! :)

I’m quite new to OSM and read about this import yesterday. I’d like to quickly point out a way to use the data that might be appreciated by the community. Sadly, I won’t have time to do this myself, so this is just meant as a quick idea for anyone interested.

Firstly, I agree that this whole import needs to be reverted because what you did, posiki, is overfitting of your data (https://en.wikipedia.org/wiki/Overfitting).

But in my opinion, such a large amount of data is a huge chance to improve OSM if it is used in the right way. Such an import should not cause any harmful interference with past or future edits of the community. After reverting the first import, you could follow these steps:

  1. Vectorization of the pixel forest data: Initial forest patches just as you did, posiki.

  2. Creation of new forest patches, based only on new points at the center of each line of the initial forest patches. Like this, you effectively average each pair of neighboring points. This eliminates most of the pixel structure without increasing the number of points. And you avoid overfitting: you use a minimum number of points to represent the forest shape at the accuracy of the underlying data set. This shows the accuracy of the data to future mappers and makes future improvements as easy as possible because they need to move only a few points for corrections.

  3. Only cut out highways of type tertiary and above. Like this, you avoid pseudo mapping of small roads and paths with small blank lines in the forest as you did before. Thus, future mappers won’t run into accidentally merging your forest points with their new highway points.

  4. Cut out all existing OSM areas except forest with a 3 m buffer (buildings, landuse, lakes, sports grounds etc.). Cut out forest without a buffer. Like this, you avoid sharing points with any existing OSM object except forest. Thus, future mappers can change these objects without first having to detach them from your forest.

  5. Deletion of all forest patches below 2300m². The LUKE data set’s pixels are sized 16m x 16m. So you should at least delete all patches below an area of 3x3 pixels, 48m x 48m, approx. 2300m² to avoid messy small forest patches.

  6. Test import and visual inspection at randomly chosen areas.

  7. Community Discussion. Adding additional image processing steps if needed.

  8. Full import.
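
Steps 2 and 5 could be sketched roughly like this. It is only an illustration under assumptions: the function names are made up, rings are given as open lists of (x, y) points in a projected coordinate system in metres, and holes in patches are ignored. The area threshold is the 3 x 3 pixel square from step 5, i.e. 48 m x 48 m = 2304 m².

```python
# Hypothetical sketch of steps 2 (midpoint smoothing) and 5 (minimum area filter).
from typing import List, Tuple

Point = Tuple[float, float]  # projected coordinates in metres

PIXEL_M = 16.0                      # pixel size stated in step 5
MIN_AREA_M2 = (3 * PIXEL_M) ** 2    # 3 x 3 pixels = 48 m x 48 m = 2304 m^2


def midpoint_ring(ring: List[Point]) -> List[Point]:
    """Replace each vertex by the midpoint of the edge that starts at it.

    `ring` is an open ring (the first point is not repeated at the end),
    so the result has exactly as many points as the input.
    """
    n = len(ring)
    return [
        ((ring[i][0] + ring[(i + 1) % n][0]) / 2,
         (ring[i][1] + ring[(i + 1) % n][1]) / 2)
        for i in range(n)
    ]


def ring_area(ring: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


def keep_patch(ring: List[Point]) -> bool:
    """True if the patch is at least as large as a 3 x 3 pixel square."""
    return ring_area(ring) >= MIN_AREA_M2


if __name__ == "__main__":
    square = [(0.0, 0.0), (48.0, 0.0), (48.0, 48.0), (0.0, 48.0)]
    smoothed = midpoint_ring(square)
    print(ring_area(square), keep_patch(square))
    print(ring_area(smoothed), keep_patch(smoothed))
```

One thing the sketch makes visible: midpoint smoothing shrinks convex corners, so it matters whether the area filter is applied before or after the smoothing.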

In my opinion, OSM needs to welcome and professionally process external data to keep up with e.g. AI-based map services in the future. That’s why I want to thank you, posiki, for caring about importing this data. I hope I could help a little with this.

Cheers,
smartsmartie

Hi!

Just a note: this forest removal will happen over the next few months. Some results can already be seen at https://osm.org/go/03JQguR-?layers=N

Again, sorry for all the bad feelings this forest import has caused.

P

Well,

I will revert my forest import, and leave it to the rest of the community to figure out what to do afterwards.

P
PS. Sorry for not answering sooner on this…

I would like to say good job @posiki ! The algorithm is quite useful for creating data which can then be imported in parts by mappers.

I fully agree that this should be done manually, and as long as the forest polygons focus on large enough forests, I think the import should be quite doable for people.

For example, the Norwegian N50 import started in 2014/2015 and is closing in on being finished now.
https://wiki.openstreetmap.org/wiki/Import/Catalogue/N50_import_(Norway)

This data source is old, but still valid. So manual review is required to adjust stuff during import.

Another import we are doing in Norway is import/merging of 4.3 million buildings https://wiki.openstreetmap.org/wiki/Import/Catalogue/Norway_Building_Import

  • 14 people have imported 50% of the data so far in the last 6 months, so large imports like these are fully doable, as long as the data is easily available in human manageable areas.

Some tips to make it easier to import and deal with for other people:

  • Reduce smoothing of edges → easier to deal with polygons for users coming into OSM
  • Is it required to stop the forest when you reach a road? I see no problem with letting the road run through the forests.
    → Continuous forest across highways makes it much easier to correct roads when better imagery arrives.
  • Focus on large forest areas first → Easier to review → Quickly adds the forest-look to the map.

Good luck! I hope you find a good method to do this in the Finnish OSM community. Maybe some of us from Norway could join in once the data is easily reviewable and importable.