ODbL in Thailand

@amai: Please stop with the conspiracy theories here. There is a legal mailing list for in-depth discussion of the new license. Right now there is no license change. By agreeing to the new contributor terms you give the OSMF the possibility to change the license. It doesn’t need to be ODbL. It might also be CC 4.0, but that would also require agreement to the new CT.

I don’t know whether you are a lawyer specialized in international copyright law. But according to some lawyers, our data is not protected. So some US company COULD just take the data, use it and ignore the share-alike (because you can’t copyright facts).

PD would make the data freely available (Creative Commons suggested this), but with this we would also lose the share-alike. And a lot of people have trouble with the idea that some “evil” company could “abuse” the data entered by volunteers without giving back to the community.

In my opinion it’s the right way to have a clear license situation that also covers the commercial use of our data. OSM is also about commercial use.
For non-commercial use nearly no one would need OSM. In Thailand the data provided by Google still has greater coverage than OSM. And it may even be used commercially as long as you comply with Google’s TOU. For private non-commercial use there are even more alternatives.

So why do OSM? Surely we want it to be the best and most comprehensive data provider in the world.
When companies get interested in our data it’s an appreciation of our work and quality.

Please read some information from Creative Commons, the people who created the current CC-BY-SA.
http://creativecommons.org/weblog/entry/26283
They state that our data is not protected due to the problem that facts are not covered by copyright. They admit that they need to consider database law.

Stephan

@stephankn: Please stop odbl or new CT advocacy here!!

There is and has been no democratic decision of any board or even all members which was in favour of even asking people for a change. That is just a decision by a minority.

As for the license/ct itself:
There is no known lost case anywhere that would have troubled anybody within OSM. There is no proof that ODbL would prevent that anyway.
And also it’s not theory… As for me, I am 100% sure I don’t get income related to OSM; for the ODbL drivers we know the opposite.
And it has also been admitted that their business would benefit; that is not a secret.

The given link shows a comment from the author, and it is not in favour of ODbL. In any case “that facts are not covered by copyright” is a weak and at best local (country-specific) argument. When I create map data, that is pretty much like creating an artwork or taking a picture. And in Germany (yes, I am German) that is clearly covered by copyright, even if it just shows “reality”. Even if I took a picture of your face, I would be the sole copyright holder :slight_smile:

But back to the local focus: From a neutral point of view the benefit of any license switch is disputable, the problems are obvious and guaranteed.

amai

Please provide proof of why “those” would benefit for their business.

Creative Commons suggested using CC0 for scientific data. Wouldn’t that be the best option for their business? They could do whatever they want without any need to give back to the community.
But they decided against what’s best for a business and said they wanted a license that provides better guarantees of share-alike than the existing one.
That doesn’t sound very logical if you accuse them of trying to cheat the community.

In real life it is usually the most likely explanation that is true. Extraordinary claims require extraordinary proof.

As for German law, I’m quite sure that mapping facts in a database is not covered by copyright law. It does not meet the threshold of originality.
In Europe our work is protected by database law, something that is missing in the US, for example.
That idea was not brought up by the OSMF but by Creative Commons (see link, that’s why they are working on 4.0 regarding this) and the lawyers behind ODbL. Neither party is related to those “evil” OSMF board members you talk about.

Why should it be a bad thing to make money with OSM data? Are you jealous because you don’t?
Think of Linux. Companies like Red Hat or SuSE. They made millions out of the voluntary work of others. Think of Apache. Millions of installations world-wide. What do they give back to the community?
The community happily contributed because the GPL gave them the certainty their work stayed free (freedom, not free beer).

I like the idea of my work being useful. Making the world a better one :slight_smile: If some company uses free data and the quality of the product is high enough that people pay for it, be happy that you have been part of the community creating the data. And because we have free data we can always take the data and make a similar product available free of charge.

There was a long discussion on the legal mailing list about how the share-alike of ODbL could be circumvented. They found that there is no practical way to do so. No danger from that side.

Because their customers don’t want to share what they pay for… Or the other way round: fewer customers want non-exclusive content…

Same for a simple picture of my or your face - shot within 2 seconds, but copyrighted.
… Note that a database which cannot be used to produce works out of it is quite useless…

And note that I don’t reproduce “facts”! A phone number (http://www.osmfoundation.org/wiki/License/Why_CC_BY-SA_is_Unsuitable) is a fact in any representation. I create a small excerpt from reality, filtered by various technical tools and hard work. So my mapping is quite different compared to the quoted phone/address book.

Even if I were, it wouldn’t count as an argument!?
Funny: if one doesn’t agree to ODbL, people show up with accusations of paranoia, working for companies in the same business, being jealous, wanting to become a lonely hero and other interesting things.

Ok, actually you are right that this is not the perfect forum for that.
However, it is important that people understand what is to come, and the ODbL map for Thailand would be missing many, many ways.

amai

Phase 4 of the license change has been in effect for about a week now. That means only users that have agreed to the contributor terms can continue editing. Mails went to all mappers that had not responded yet.

For Thailand most data is already relicense-ready as the majority of mappers agreed to the new contributor terms.

Of the top 100 contributors (for ease of counting, I used the last editor on nodes), the majority responded and agreed to the new contributor terms. These 100 mappers contributed about 99% of the data.
Only two actively voted against relicensing their data. Unfortunately amai still fears the license change. If he does not relicense, then 4% of the nodes and 12% of the ways are lost.
2% of the nodes (0.7% of ways) are attributed to mappers that did not respond yet.
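
The "last editor on nodes" count described above could be reproduced roughly as follows. This is only a sketch under assumptions, not the script behind the published statistics: it reads a plain OSM XML extract, relies on the `user` attribute that OSM XML carries on each node, and uses made-up sample data and user names.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def nodes_by_last_editor(osm_file):
    """Count nodes per last editor in an OSM XML file or file object."""
    counts = Counter()
    for _event, elem in ET.iterparse(osm_file, events=("end",)):
        if elem.tag == "node":
            # Anonymous edits carry no "user" attribute.
            counts[elem.get("user", "(anonymous)")] += 1
            elem.clear()  # free memory on large extracts
    return counts


def share_lost(counts, declined):
    """Percentage of nodes whose last editor is in `declined`."""
    total = sum(counts.values())
    lost = sum(n for user, n in counts.items() if user in declined)
    return 100.0 * lost / total if total else 0.0


# Tiny made-up sample; the user names are hypothetical.
SAMPLE = b"""<osm version="0.6">
  <node id="1" lat="13.75" lon="100.50" user="mapper_a"/>
  <node id="2" lat="13.76" lon="100.51" user="mapper_a"/>
  <node id="3" lat="13.77" lon="100.52" user="mapper_b"/>
  <node id="4" lat="13.78" lon="100.53" user="decliner"/>
  <node id="5" lat="13.79" lon="100.54"/>
</osm>"""

counts = nodes_by_last_editor(io.BytesIO(SAMPLE))
print(counts.most_common())
print(f"{share_lost(counts, {'decliner'}):.1f}% of nodes at risk")
```

Counting only the last editor is a shortcut: an object last touched by an agreeing user can still have a declining user earlier in its history, so such numbers are a rough first estimate.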

The statistics are available on my server:
http://downloads.osm-tools.org/check-odbl-th/check-odbl-th-20110625.html

Another interesting figure might be that the amount of data in Thailand has doubled since the beginning of the year. Certainly a lot of this is based on armchair mapping. I hope people on the ground catch up and enrich the geometry with data like street names and POIs. Please don’t forget to add the names also (mainly) in Thai script. You can contact me if you need assistance in doing so.

Keep on mapping,

Stephan

Hi stephankn,

thanks for keeping us up to date. A lot has changed since my last post. It’s amazing and I hope it will go on like this.

I would say that we still need some more Thai people joining us, but I am confident this will happen over time. Still a lot of work to do in Thailand. But OK, we all like the country and like doing some useful work in our spare time. And it is too hot to hurry here :slight_smile:

Greetings,
WanTan

Another update.

The statistics now list edits from anonymous users as declined.

The License Working Group now suggests replacing non-ODbL data in case you have the same data yourself. So delete the affected nodes/ways and replace them with your surveyed data.
In case you don’t have local knowledge, please do not delete data, as we want to keep a usable map.
LWG announcement: http://lists.openstreetmap.org/pipermail/talk/2011-July/059458.html

ODbL statistics are here:
http://downloads.osm-tools.org/check-odbl-th/

The script now also generates a filter line to be used with osmfilter.
I prepared extracts that can be used as a separate layer in JOSM to find spots that need to be reworked. The JOSM licensechange plugin helps, Potlatch can also display the license status.
Decompress with 7zip then open in JOSM.

File listing last edits of anonymous users:
http://downloads.osm-tools.org/thailand-anon-20110723.7z

File listing last edits of all declining users:
http://downloads.osm-tools.org/thailand-declined-20110723.7z

Here you can see it on an overview image:
http://downloads.osm-tools.org/thailand-declined-20110723.jpg

The Bangkok area has a lot of streets that are drawn from aerial images. It’s easy to replace this. A residential area is very obvious on aerial images. Please pay attention as some images might have an offset. Always download the GPS tracks of the area you are editing and correlate the image with GPS tracks. Be especially careful in Bangkok as the GPS tracks can suffer from multipath effects.

I have included links to the Where Did You Edit (WDYE) service. You can see the region where the user is mapping.

http://downloads.osm-tools.org/check-odbl-th/check-odbl-th-20110802.html

Edit: corrected link

Until July I only replaced objects with a licence change issue when I touched them in the course of my mapping. But now there are the LWG statement about replacing non-ODbL data and the JOSM plugin LicenseChange. I have been searching for and replacing such objects since the beginning of August. To be precise, it’s not just a replacement, it’s an improvement: taking the Bing offset into account and adding more objects with higher accuracy than the deleted objects. By the way, it was raining cats and dogs here and I finished adding the collected data from my last trips and reached 500,000 nodes last modified.

Khon Kaen province was my first target. Most of the data to be replaced was about 170 ways with 2800 nodes that one decliner had traced from Bing. In the bounding box for the province there are now fewer than 20 nodes left which I can’t replace. Just deleting them could be regarded as vandalism even if it’s a small number. Some might still agree.

The JOSM plug-in from Frederik helps a lot. It’s fast and checks the whole history, not only the last change. With this I found that “long ago” I had added the Thai name to some of the hospitals added by someone who has not agreed yet.

Now I’m going to replace issues in the whole Northeast. Again from the major decliner about 1000 ways with about 12,000 nodes traced from Bing. A lot but easy to do.

I can’t resist adding that with JOSM and the plug-in it’s a “pleasure to hunt down” the licence change issues. I think it’s better than waiting until a worldwide tool deletes the data. A mapper with local knowledge is better at assessing the issues case by case. And it’s easier to replace existing data than to “re-add” it. The transition will be smoother. But the major advantage for me as a mapper is that I don’t have to care about licence change issues any more when editing in a cleaned area.

Happy mapping from a happy mapper
Willi

Edit: corrected link to plugin

The Northeast is done. There are licence change issues left which I couldn’t resolve. Maybe others who have touched the objects can do that.


                   Total   Loss   Loss possible   Reduction possible
Relations:           201      1               0                   10
Ways:             37,739     16              19          9 (121) (*)
Ways (km):        65,375    428             207       597 (5,832) km
Nodes untagged:  493,185    609             261                  216
Nodes tagged:      8,724      1             282                    5

(*) 9 ways still have nodes of non-agreeing users. For the other 112 ways the nodes have been deleted, but the non-agreeing users are still in the histories. Thus these ways still get flagged by the plug-in.
The figures are inexact because the boundaries used are imprecise and some ways cross them.

Thanks for the new table. It’s great that the old tables are still available. Comparing the tables from July 23rd and August 19th shows that the definitive loss is down from 85,000 (4.77%) to 70,000 (3.67%) nodes. Major decliner down from 65,000 (3.66%) to 51,000 (2.71%) nodes. And anonymous edits down from 19,000 (1.06%) to 17,000 (0.92%) nodes. Recently I recorded GPS tracks and characteristics of the highways 1009 and 1192 (Doi Inthanon), which were created anonymously, and replaced them. As far as I’ve seen, most anonymous edits are in the Northern Region. Unfortunately the link W doesn’t work for anonymous users.

Would it be possible to count the untagged nodes (just nodes of ways) and the tagged nodes (assume mainly POI and some other) separately?

Hi,

I uploaded another update of the data.

http://downloads.osm-tools.org/check-odbl-th/check-odbl-th-20111121.html

If you followed the OSM announcements you might already know that the date of the license switch might be the 1st April next year.

So up to that date we should try to replace anonymous creations and data of disagreeing or non-responding users with more accurate data.
With the availability of Bing imagery in large parts of Thailand it should be possible to also have accurate road geometry even in situations where GPS has problems. The curvy road to Doi Suthep is one example where Bing is a lot better than my GPS tracks.
It is likely the same in other parts.

A quite useful tool is the license check plugin in JOSM.

It does not work for anonymous edits, though.

Would it help if I provided updated extracts containing the untouched anonymous data? Like the ones I published before?

Stephan

PS: Willi, what is the purpose of the different counting? Would require some additional parsing of the data but not too complicated.

Untagged nodes are mainly nodes of highways or other features which can be traced easily and quickly from Bing satellite images. Even drawing an average line from several GPS traces is faster, at least by a factor of 10, than adding tags to POIs, especially in a foreign language. As I’ve written earlier in this thread, in August I replaced non-ODbL data in the Northeast, mainly data which one decliner had just traced from Bing. It took me about 2 weeks to replace about 14,000 untagged nodes without going anywhere. But replacing the tagged non-ODbL nodes would require travelling and more time for editing.

Thus I think counting tagged and untagged nodes separately would tell more about the effort needed.
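
A separate count like this needs only a small amount of extra parsing. The following is a minimal sketch, assuming a plain OSM XML extract in which tagged nodes carry `<tag>` child elements; the sample data is made up and this is not the statistics script itself.

```python
import io
import xml.etree.ElementTree as ET


def count_tagged_untagged(osm_file):
    """Return (tagged, untagged) node counts for an OSM XML file or file object."""
    tagged = untagged = 0
    for _event, elem in ET.iterparse(osm_file, events=("end",)):
        if elem.tag == "node":
            # A node with at least one <tag> child is a POI or similar feature;
            # a node without tags is usually just a vertex of a way.
            if elem.find("tag") is not None:
                tagged += 1
            else:
                untagged += 1
            elem.clear()
    return tagged, untagged


# Made-up sample: two way vertices and one POI.
SAMPLE = b"""<osm version="0.6">
  <node id="1" lat="16.43" lon="102.83"/>
  <node id="2" lat="16.44" lon="102.84"/>
  <node id="3" lat="16.45" lon="102.85">
    <tag k="amenity" v="hospital"/>
  </node>
</osm>"""

tagged, untagged = count_tagged_untagged(io.BytesIO(SAMPLE))
print(f"tagged: {tagged}, untagged: {untagged}")
```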

Today I got a list of changesets where the user accepted the new contributor terms but wanted to stay anonymous.
Good news: all nodes, ways and relations listed as “anonymous” belong to a user who agreed to the contributor terms. That’s about 17,000 nodes that are safe.

I updated my statistic script to reflect this change.

Frederik also put an interactive map online that uses his ODbL history service to trace back the history of objects. All data belonging to users who have not agreed to the CT is highlighted on the map. We have the next three months to do a survey of these areas. Let’s make the data even better than before.

http://tools.geofabrik.de/osmi/?view=wtfe&lon=100.45877&lat=15.18415&zoom=7&overlays=overview,wtfe_point_harmless,wtfe_line_harmless,wtfe_point_modified,wtfe_line_modified_cp,wtfe_line_modified,wtfe_point_created,wtfe_line_created_cp,wtfe_line_created

I did a graph to show the development of the safe nodes in Thailand:

Over 70 percent of the data is ready for the transition to the new license. A huge number of additional nodes and ways might be ready too, but a simple analysis will not tell, so the history needs to be evaluated for accurate numbers. We also have nearly 80 people who once edited in Thailand but never answered the request to relicense their data.

Do you know any of them? Please try to contact them. It would be great if they decided to accept the new contributor terms.

Stephan

To keep you updated:

According to a quick lookup based on this morning’s data, there are 34 users who did not respond and 2 who actively declined the new CTs.
http://downloads.osm-tools.org/check-odbl-th/check-odbl-th-20120307.htm

I ran a detailed count of the ODbL status, and currently we will have a loss of less than two percent. Still, some of these elements carry valuable tags.

Total nodes in DB: 2387515

  • safe for ODbL: 2375273 (99.49%)
  • tainted: 12242 (0.51%)

Total ways in DB: 176494

  • safe for ODbL: 174192 (98.70%)
  • tainted: 2302 (1.30%)

Total relations in DB: 1015

  • safe for ODbL: 1008 (99.31%)
  • tainted: 7 (0.69%)
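
The percentages above can be re-derived from the totals and the tainted counts. A short check, using only the figures listed above (the helper name `share` is just for illustration):

```python
def share(part, total):
    """Percentage of `part` in `total`, rounded to two decimals."""
    return round(100.0 * part / total, 2)


# (total, tainted) per element type, taken from the list above.
counts = {
    "nodes": (2387515, 12242),
    "ways": (176494, 2302),
    "relations": (1015, 7),
}

for kind, (total, tainted) in counts.items():
    safe = total - tainted
    print(f"{kind}: safe {safe} ({share(safe, total):.2f}%), "
          f"tainted {tainted} ({share(tainted, total):.2f}%)")
```

The safe counts and all six percentages come out as listed, and the largest loss (ways, 1.30%) is indeed below two percent.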

Stephan

edit: inserted more detailed statistic, corrected wrong percentage calculation

We have three more weeks, still about 2000 ways left which will get lost.

Here is a diagram to show the countdown. I’ll try to update it daily.

If you extrapolate you’ll see we will not be ready by the end of the month. So please increase your effort in remapping.
It would be a shame if we lose valuable tags that are edited on top of tainted ways just because we did not replace the way with a better clean one.
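
One way to do such an extrapolation is a straight-line fit through the daily counts. The sketch below is an assumption about how one might do it, not how the diagram was produced, and the daily figures in it are invented for illustration (only the starting value of about 2000 ways comes from this post).

```python
def day_when_zero(samples):
    """Fit a least-squares line through (day, remaining) pairs.

    Returns the day on which the fitted line reaches zero,
    or None if the count is not decreasing.
    """
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    if slope >= 0:
        return None
    intercept = mean_y - slope * mean_x
    return -intercept / slope


# Hypothetical daily counts of tainted ways (day 0 = today).
samples = [(0, 2000), (1, 1950), (2, 1900)]
finish = day_when_zero(samples)
print(f"fitted line reaches zero on day {finish:.0f}")
```

With these invented numbers the line reaches zero on day 40, well beyond the three weeks that are left, which is the kind of gap the diagram is meant to show.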

Certainly there are areas which do not have aerial images and where GPS tracks have not been uploaded. Here you could try to reach the mappers who did not respond yet. They all received a message from OSMF already. It could be that the email address is no longer active. Maybe you know them from another forum and can contact them that way.

The statistics on http://downloads.osm-tools.org/check-odbl-th/ now only distinguish between clean and unclean, but take into account the full history of the elements.

Stephan

I do not think that we will lose so many ways.
Look at the typical scenario: a dissenting user traced a few minor roads from Bing. He connected the minor roads to major roads, thus adding nodes to the major road. What will be the consequence? Those additional nodes of the major road will be lost, not the whole road. Correct me, if I am wrong.
Another example: someone created a road and added a name tag with English content. amai corrected that and changed “name” to “name:en”. What will happen? The “name:en” tag will be lost. Since also the “deletion” of the original name tag is an edit by amai, also that deletion should be undone, and we will find the English road name again in the name tag instead of the name:en tag. Correct me, if I am wrong.
The only problem I see is with the roads created by dissenting users. When I use the OSM inspector and select “Ways created” only, the problem is not that big.

It depends on how cleverly the switch bot will be programmed. Internally a way has nodes and tags. Together they form a version of a way. As they are not treated independently with respect to the object version, it all depends on the logic of the bot.
In the simplest case the last “clean” version would be used. So all subsequent edits on nodes and tags would be reverted.

The better approach would be to treat tag and node edits independently. So in your example only the extra nodes would be lost. I hope it will be implemented this way, but it is still work in progress.
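
To make the difference between the two approaches concrete, here is a small sketch. It is a simplified illustration under assumptions, not the actual bot: a way history is modelled as a list of versions with `user`, `nodes` and `tags`, all names and data are hypothetical, and nodes removed by declining users are ignored.

```python
def last_clean_version(versions, declined):
    """Simplest case: the version just before the first edit by a declining user."""
    clean = None
    for v in versions:
        if v["user"] in declined:
            break
        clean = v
    return clean


def rebuild_independently(versions, declined):
    """Replay the history, keeping only tag and node changes made by agreeing users."""
    tags = {}
    prev_tags = {}
    node_ok = {}  # node id -> True if first added by an agreeing user
    for v in versions:
        agreed = v["user"] not in declined
        for n in v["nodes"]:
            node_ok.setdefault(n, agreed)
        if agreed:
            for key, value in v["tags"].items():
                if prev_tags.get(key) != value:  # tag added or changed in this version
                    tags[key] = value
            for key in prev_tags:
                if key not in v["tags"]:         # tag removed in this version
                    tags.pop(key, None)
        prev_tags = v["tags"]
    nodes = [n for n in versions[-1]["nodes"] if node_ok[n]]
    return {"nodes": nodes, "tags": tags}


# Hypothetical history: v2 (declining user) adds junction node 9 and moves
# "name" to "name:en"; v3 (agreeing user) adds a surface tag on top.
history = [
    {"user": "mapper_a", "nodes": [1, 2, 3],
     "tags": {"highway": "primary", "name": "Example Road"}},
    {"user": "decliner", "nodes": [1, 2, 9, 3],
     "tags": {"highway": "primary", "name:en": "Example Road"}},
    {"user": "mapper_b", "nodes": [1, 2, 9, 3],
     "tags": {"highway": "primary", "name:en": "Example Road", "surface": "paved"}},
]

simple = last_clean_version(history, {"decliner"})
better = rebuild_independently(history, {"decliner"})
print(simple["tags"])
print(better)
```

In this example the simple approach falls back to version 1 and loses the later `surface` tag, while the independent replay drops only node 9 and the `name:en` change and keeps the `surface` tag.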

In this easy case yes. But what happens to edits built on top of amai’s? A tag can be changed back and forth. And again it depends on how the algorithm is implemented. Personally I would favor keeping the latest clean version of a tag.

Even assuming that would be done, it still is causing trouble. We’ll end up with lots of tags that need to be fixed. Currently we see the problem highlighted by the plugin in JOSM. Easy to spot and fix.
In the future it will be hidden in all the data.

So doing it right before the switch saves time later.

The rebuild working group is trying to figure out the implementation details.
http://lists.openstreetmap.org/pipermail/rebuild/2012-March/000099.html

Yes. I have personally been working on these problematic ways for many hours already, and so have others. So the number of affected ways is down already.

I tend to be on the safe side and have a rather pessimistic view regarding the loss of data. So all we could do now to make the data ready for the switch is a plus. I wish your optimistic assumptions come true, but I certainly feel better if we do not have to rely on it.
Quite often it is also possible to really improve the data. I have remapped data from old Yahoo images where the layout of streets and buildings has changed in the meantime. So remapping is also quality improvement.

Stephan

The OSMF announced a schedule for the database cleanup.
http://www.osmfoundation.org/wiki/License/Rebuild_Plan

On the 27th the database will go into read-only mode until cleanup is done.

In Thailand we’re quite lucky compared to other countries. As measured today, only 0.8% of the ways here will be degraded.

Still, the majority of tainted nodes/ways is related to the armchair mapping of amai. We have another five days to reduce the effects of the database cleanup. It won’t be this easy again to spot the problems.
A good time to start the sprint…

Stephan

It’s not yet clear whether it’s the 27th or later, but expect the deadline for remapping to approach fast.

We have 2467 tainted nodes and 758 tainted ways left.

Having a closer look at the ways:
In the north-east only harmless edits are left.
On Samui there will be more damage. As neither GPS tracks nor imagery is available, no more remapping is possible.
Similar on Samui.
Phuket will also lose a few ways.

The most severe damage is still in Bangkok. It still has tagged ways that build on top of tainted ways.

The latest editions of the statistics provide a link to overpass to fetch tainted ways. Copy URL and paste in JOSM “Open location”. You could also save the OSM file to hard disk and open in any other editor.
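
For illustration, a link of this kind could be built as below. This is a sketch under assumptions: the exact query used on the statistics page is not shown in this thread, the way IDs are hypothetical, and the public `overpass-api.de` endpoint with an Overpass QL `way(id:...)` query is assumed.

```python
from urllib.parse import quote, unquote

# Assumed public Overpass endpoint; the statistics page may use another one.
OVERPASS = "https://overpass-api.de/api/interpreter"


def overpass_url_for_ways(way_ids):
    """Build a URL that fetches the given ways plus their nodes, with metadata."""
    ids = ",".join(str(i) for i in way_ids)
    # way(id:...) selects the ways, (._;>;) adds their member nodes,
    # "out meta" includes version, user and timestamp.
    query = f"way(id:{ids});(._;>;);out meta;"
    return f"{OVERPASS}?data={quote(query)}"


# Hypothetical way IDs.
url = overpass_url_for_ways([101, 102])
print(url)
```

The printed URL is what would be pasted into JOSM’s “Open location” dialog.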

http://downloads.osm-tools.org/check-odbl-th/

Let’s do the finish…

Stephan