#26 2017-06-16 15:10:20

zstadler
Member
Registered: 2012-05-05
Posts: 348
Website

Re: Automated edits for branches

I may have misunderstood what you meant by "Sticky?".

I thought you were asking about making this thread sticky in the forum. That, I assume, has nothing to do with templates...

Offline

#27 2017-06-16 16:01:44

SwiftFast
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

This is what I meant.

The first post of this thread links to the templates which are used by the bot to edit the branches.

Offline

#28 2017-06-18 12:45:05

tdctdctdc
Member
Registered: 2017-05-23
Posts: 25

Re: Automated edits for branches

I'd like to help with the other templates, starting from supermarkets.

I want to be sure I understand the "syntax" and the script operation correctly. Is this correct?

{ tags: { 
	"name": "מגה", "name:he": "מגה", "name:en": "Mega",
	"name:ar": "", "name:ru": "", "brand":"Mega",
	"shop":"supermarket", "website": "..."
	}, alt_find: [], variants: [

		{ tags: {
			"name" : "מגה בעיר", "name:he":"מגה בעיר", "name:en": "Mega Bair"
			}, alt_find: ["Mega Ba'ir", "Mega Ba'Ir", "Mega BaIr"] 
		},

		{ tags: {
			"name" : "מגה בול", "name:he":"מגה בול", "name:en": "Mega Bool" 
			}, alt_find: [] 
		}
	]
},

Does the script consider case ("Bool" vs "bool")?
Does the script match the variant before the main brand ("Mega bool" before just "Mega")?

Offline

#29 2017-06-18 19:51:13

SwiftFast
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

The syntax is standard JSON and your sample appears fine. I should add a blank template to make this easier.

The script checks all matches. If something matches both a variant and a brand, variant wins. If something matches two unrelated branches or two variants, errors are thrown and my manual intervention is required.

Matching is completely case- and whitespace-insensitive: "bool" and "Bool" are considered the same.
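
A minimal sketch of the resolution rules described in this post, not the bot's actual code. The names `normalize`, `candidates`, `matches`, `resolve` and `AmbiguousMatch` are hypothetical, and the substring test is an assumption standing in for the real matching rules, which the thread does not spell out.

```python
class AmbiguousMatch(Exception):
    """Raised when a name matches two unrelated brands or two variants."""


def normalize(text):
    # Case- and whitespace-insensitive comparison, as described in the post.
    return "".join(text.split()).casefold()


def candidates(template):
    # Every non-empty name:* value plus every alt_find entry of a template.
    names = [v for k, v in template["tags"].items() if k.startswith("name")]
    names += template.get("alt_find", [])
    return {normalize(n) for n in names if n}


def matches(poi_name, template):
    # ASSUMPTION: substring containment stands in for the bot's real test.
    haystack = normalize(poi_name)
    return any(c in haystack for c in candidates(template))


def resolve(poi_name, brands):
    hits = []
    for brand in brands:
        variant_hits = [v for v in brand.get("variants", []) if matches(poi_name, v)]
        if len(variant_hits) > 1:
            raise AmbiguousMatch(f"{poi_name!r} matches two variants")
        if variant_hits:
            hits.append((brand, variant_hits[0]))  # the variant wins over its brand
        elif matches(poi_name, brand):
            hits.append((brand, None))
    if len(hits) > 1:
        raise AmbiguousMatch(f"{poi_name!r} matches unrelated brands")
    return hits[0] if hits else None


# Sample template, abbreviated from the one posted earlier in the thread.
MEGA = {
    "tags": {"name": "מגה", "name:en": "Mega", "name:ar": "", "brand": "Mega"},
    "alt_find": [],
    "variants": [
        {"tags": {"name": "מגה בעיר", "name:en": "Mega Bair"}, "alt_find": ["Mega Ba'ir"]},
        {"tags": {"name": "מגה בול", "name:en": "Mega Bool"}, "alt_find": []},
    ],
}
```

Empty values such as `"name:ar": ""` are filtered out in `candidates`, because an empty string would otherwise be contained in every name.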

I am considering writing the "matching algorithm" explanation in a more friendly manner.

Last edited by SwiftFast (2017-06-18 19:53:26)

Offline

#30 2017-06-18 19:57:16

SwiftFast
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

For your specific example, Mega Bool has a "fuzzy match" for mega but a full match for "mega bool", so it is matched to mega bool.

The main (Mega) brand template tags are then applied, and then mega bool's tags are applied on top of them, overriding some of them.
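
The layering described above can be sketched as a two-step dictionary merge. This is an illustration only; `effective_tags` is a hypothetical helper name, and the sample data is abbreviated from the template posted earlier in the thread.

```python
def effective_tags(brand, variant=None):
    # Start from a copy of the main brand's tags...
    tags = dict(brand["tags"])
    if variant is not None:
        # ...then apply the variant's tags on top, overriding shared keys.
        tags.update(variant["tags"])
    return tags


MEGA = {"tags": {"name": "מגה", "name:en": "Mega", "brand": "Mega", "shop": "supermarket"}}
MEGA_BOOL = {"tags": {"name": "מגה בול", "name:en": "Mega Bool"}}

merged = effective_tags(MEGA, MEGA_BOOL)
```

Here `merged` keeps `brand` and `shop` from the main template, while `name` and `name:en` come from the variant.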

Last edited by SwiftFast (2017-06-18 19:57:45)

Offline

#31 2017-06-19 18:12:38

tdctdctdc
Member
Registered: 2017-05-23
Posts: 25

Re: Automated edits for branches

What about the type of the POI? Is it checked/matched as well? Or only the name is matched and all other tags are copied?

For example, if we have amenity=restaurant, name=Hapoalim - will it match to Bank Hapoalim's template?

Offline

#32 2017-06-19 19:06:34

SwiftFast
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

If a POI's amenity=* or shop=* value does not already equal the matched brand's amenity or shop tag, the POI is left alone and a warning is written to a log for me to inspect manually.
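
A rough sketch of that guard, under assumptions: the function names, the logger name `branch_bot`, and the sample tags are all hypothetical, and the real bot may check more than these two keys.

```python
import logging

log = logging.getLogger("branch_bot")  # hypothetical logger name


def category_conflict(poi_tags, template_tags, keys=("amenity", "shop")):
    # Return the first key where the template sets a value the POI lacks
    # or disagrees with, or None when the categories already agree.
    for key in keys:
        expected = template_tags.get(key)
        if expected is not None and poi_tags.get(key) != expected:
            return key
    return None


def apply_template(poi_tags, template_tags):
    key = category_conflict(poi_tags, template_tags)
    if key is not None:
        log.warning("skipping POI %r: %s differs from template", poi_tags.get("name"), key)
        return dict(poi_tags)  # left alone, flagged for manual inspection
    return {**poi_tags, **template_tags}


BANK_TEMPLATE = {"amenity": "bank", "name": "Bank Hapoalim", "brand": "Hapoalim"}
restaurant = {"amenity": "restaurant", "name": "Hapoalim"}
unchanged = apply_template(restaurant, BANK_TEMPLATE)
```

With these inputs the restaurant named "Hapoalim" comes back untouched, because its amenity=* value does not equal the bank template's.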

I put considerable thought into these edge cases, but the documentation is probably a bit tricky to read as of now.

Last edited by SwiftFast (2017-06-19 19:06:55)

Offline

#33 2017-06-20 10:06:47

tdctdctdc
Member
Registered: 2017-05-23
Posts: 25

Re: Automated edits for branches

Added some supermarkets to the templates.

In anticipation of possible questions:
* I extracted the most common names from an Overpass query for shop=supermarket on the central part of Israel.
* I tried to find "official" names on the supermarkets' websites, Wikipedia or the most common results on Google.
* Whenever I couldn't find an official English name, I used the Hebrew Academy's transcription rules (http://hebrew-academy.org.il/wp-content … ion2_1.pdf) - Ba'Ir, Bul, BaShkhuna, etc.
* Russian names are based on my personal transcription rules, which may not be optimal :)
* Shufersal Big - according to news articles, this variant is canceled and converted to "Deal", so added "Big" as alt_find for "Deal".
* Tiv Ta'am "in the city" - on their website, the branches in Hebrew are named with only a "סיטי" suffix, so that's what I used.
* Yesh - it appears (from their website, I have no personal knowledge) that the brand is Yesh BaShkhuna. It has a variant - Yesh Hesed. There doesn't seem to be only "Yesh".
* Eden Teva Market - on their website, the branding is "Eden Teva" only, without the "market", so that's what I used.
* When searching for "Mister Zol", I came across multiple news articles that it was sold to and became "Coop Shop", so I added it as alt_find.

Offline

#34 2017-06-21 09:17:25

tdctdctdc
Member
Registered: 2017-05-23
Posts: 25

Re: Automated edits for branches

Added pharmacies.

Offline

#35 2017-06-21 18:45:13

SwiftFast
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

Thanks for the effort! I'll review the changes and run the bot as soon as time permits.

Offline

#36 2017-06-22 12:43:34

tdctdctdc
Member
Registered: 2017-05-23
Posts: 25

Re: Automated edits for branches

What do you think about using the banks templates for ATMs as well?

Offline

#37 2017-06-23 15:38:06

SwiftFast
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

tdctdctdc wrote:
What do you think about using the banks templates for ATMs as well?

It sounds reasonable. Do you think this algorithm change is sufficient?

if (amenity=bank or amenity=atm), attempt regular bank matching rules.

Offline

#38 2017-06-23 15:58:45

SwiftFast
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

Some of your alt_finds are redundant, e.g. "SuperPharm", "Super Pharm". Template values are normalized before matching, ignoring upper/lower case, spaces, and dashes.
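
A small sketch of why such entries are redundant, assuming the normalization simply drops whitespace and ASCII hyphens and folds case. The helper names are hypothetical, and the bot's real normalization may treat other dash characters differently.

```python
import re


def normalize(value):
    # Drop whitespace and ASCII hyphens, then fold case.
    return re.sub(r"[\s\-]+", "", value).casefold()


def redundant_alt_finds(names, alt_find):
    # An alt_find is redundant if it normalizes to a value already covered
    # by a name tag or by an earlier alt_find entry.
    seen = {normalize(n) for n in names}
    redundant = []
    for alt in alt_find:
        key = normalize(alt)
        if key in seen:
            redundant.append(alt)
        seen.add(key)
    return redundant


dupes = redundant_alt_finds(["Super-Pharm"], ["SuperPharm", "Super Pharm", "New Pharm"])
```

Both "SuperPharm" and "Super Pharm" normalize to the same string as "Super-Pharm", so only "New Pharm" adds a new way to find the brand.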

Last edited by SwiftFast (2017-06-23 16:02:18)

Offline

#39 2017-06-23 16:14:17

tdctdctdc
Member
Registered: 2017-05-23
Posts: 25

Re: Automated edits for branches

SwiftFast wrote:
Some of your alt_finds are redundant: e.g. "SuperPharm", "Super Pharm".

I just added all the spelling/formatting variations that I saw being used; I wasn't aware that spaces and dashes are ignored. Redundant is better than missing, right? :)

Offline

#40 2017-06-23 16:19:45

tdctdctdc
Member
Registered: 2017-05-23
Posts: 25

Re: Automated edits for branches

SwiftFast wrote:
Do you think this algorithm change is sufficient?
if (amenity=bank or amenity=atm), attempt regular bank matching rules.

Perhaps it needs to be adjusted to use operator=* instead of name=* for ATMs? It seems to be the recommended and more commonly used way.
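
One way the two suggestions could combine, as a sketch only: `bank_match_key` is a hypothetical helper, and falling back to name=* when an ATM has no operator=* is an assumption that nobody in the thread proposed.

```python
def bank_match_key(poi_tags):
    # Pick the tag value to feed into the regular bank matching rules.
    amenity = poi_tags.get("amenity")
    if amenity == "bank":
        return poi_tags.get("name")
    if amenity == "atm":
        # ASSUMPTION: prefer operator=*, fall back to name=* if it is missing.
        return poi_tags.get("operator") or poi_tags.get("name")
    return None  # not a bank or ATM: bank templates do not apply


key = bank_match_key({"amenity": "atm", "operator": "Bank Hapoalim"})
```

Anything that is neither a bank nor an ATM yields `None`, so the bank templates are never tried for it.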

Offline

#41 2017-06-23 17:11:12

SwiftFast
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

I updated the templates you added: I removed redundant alt_finds and replaced http with https where possible.

Offline

#42 2017-06-23 18:42:36

SwiftFast
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

Excellent job, @tdctdctdc. I tried a test run and it seems to work well, and I can see no major problems. Would you like to make further changes before I run this?

Offline

#43 2017-06-23 18:57:27

SwiftFast
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

It seems some people are not making the proper distinction between pharmacy and clinic, e.g.:

https://www.openstreetmap.org/node/1110299067

https://www.openstreetmap.org/node/1078636168

I wouldn't let the bot automatically assume those are pharmacies. Some of them are probably not. Perhaps we should drop the Makkabi and Clalit pharmacies for now.

Last edited by SwiftFast (2017-06-24 09:36:54)

Offline

#44 2017-06-24 13:46:08

tdctdctdc
Member
Registered: 2017-05-23
Posts: 25

Re: Automated edits for branches

SwiftFast wrote:
Would you like to make further changes before I run this?

I have no more changes for now, go ahead and run.

SwiftFast wrote:
I wouldn't let the bot automatically assume those are pharmacies. Some of them are probably not. Perhaps we should drop the Makkabi and Clalit pharmacies for now.

If you think it's best. I agree about Clalit - since they don't have a separate brand for their pharmacies, there is a potential for errors. As for Maccabi - I think if the POI is named "Maccabi Pharm" - it's very likely it's really the pharmacy and not the clinic.

Offline

#45 2017-06-24 19:44:06

SwiftFast
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

Changes applied to supermarkets and pharmacies.

Offline

#46 2017-10-19 21:19:23

SafwatHalaby
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

I think the algorithm needs several improvements and simplification before further runs. It's a bit bulky, and it cannot handle certain cases (e.g. AM:PM is either marketplace or convenience, and Clalit can be a clinic or a pharmacy).

Last edited by SafwatHalaby (2017-10-20 05:42:08)

Offline

#47 2017-10-20 09:14:59

tdctdctdc
Member
Registered: 2017-05-23
Posts: 25

Re: Automated edits for branches

You could exclude problematic cases (like Clalit and AM:PM) completely, or perhaps apply only name changes and not touch the existing value of amenity=*.

Offline

#48 2017-10-22 09:35:23

SafwatHalaby
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

There are more issues, e.g. the bot insists on modifying names that shouldn't be modified, and I have to intervene manually. The code is also hard to follow.

Since brand editing volume is low, a human can track the edits manually. I'm considering a simpler CSV scheme, where the bot generates a CSV with changes and with suggested default values according to templates, and the humans decide which ones to copy. The decisions are saved in the CSV and fed back to the bot.
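
A rough sketch of what such a CSV round trip could look like. Everything here is hypothetical: the column names, the "y" marker for acceptance, and the function names are illustrative choices, not part of any existing bot.

```python
import csv
import io

# Hypothetical column layout; "accept" is filled in by a human reviewer.
FIELDS = ["osm_id", "tag", "current", "suggested", "accept"]


def write_suggestions(rows, stream):
    # The bot writes one row per proposed change, with "accept" left blank.
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow({**row, "accept": ""})


def read_decisions(stream):
    # The bot reads the reviewed file back and keeps only accepted rows.
    accepted = []
    for row in csv.DictReader(stream):
        if row["accept"].strip().lower() == "y":
            accepted.append((row["osm_id"], row["tag"], row["suggested"]))
    return accepted


buffer = io.StringIO()
write_suggestions(
    [{"osm_id": "n1", "tag": "name:en", "current": "mega", "suggested": "Mega"}],
    buffer,
)
```

A reviewer would then edit the `accept` column in a spreadsheet or text editor, and `read_decisions` would return only the rows marked "y".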

This is just one of several ideas I'm brainstorming. I might just fix the code.

Last edited by SafwatHalaby (2017-10-22 09:35:38)

Offline

#49 2017-11-15 07:30:08

SafwatHalaby
Member
Registered: 2017-04-10
Posts: 451
Website

Re: Automated edits for branches

I think I pinpointed the most problematic issues with the current tagging scheme.

Currently, the bot has "brands" (e.g. Shufersal) and "sub-brands" (e.g. Shufersal Sheli, Yesh). Brands are added to the brand tag. A sub-brand, if present, is added to the name tag; otherwise the brand itself is added to the name tag.

This creates two major problems:

  • The sub-brand occupies the name tag, meaning that adding a unique name overrides the sub-brand if both a unique name and a sub-brand are present.

  • The brand is often not present on the ground. E.g. Mercantile or Yesh are "sub-brands" of Discount and Shufersal, respectively, and yet the ground truth makes no mention whatsoever of Discount or Shufersal at these branches. So only the "sub-brand" has physical meaning; the rest is corporate management detail unrelated to OSM. Furthermore, brand ownership is dynamic in the long run. Sometimes brands/sub-brands are transferred between companies, e.g. if the Mercantile brand is ever transferred to some other bank, we'd have to re-tag the "brand" from "Discount" to whoever the new owner would be, despite the ground truth remaining the same.

I suggest ditching the "sub-brand" concept altogether. There are just brands on the ground. Examples:

  • brand=Mercantile, name=Mercantile (or a unique name if known)

  • brand=Shufersal Sheli, name=Shufersal Sheli (or a unique name if known)

  • brand=Delek, name=Delek (or a unique name if known)

  • brand=HaPoalim, name=HaPoalim (or a unique name if known)
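
The examples above follow one rule, which can be sketched as a tiny helper. `tags_for_branch` is a hypothetical name, and the unique branch name used below is invented purely for illustration.

```python
def tags_for_branch(brand, unique_name=None):
    # Flat scheme: brand=* holds the on-the-ground brand, and name=* holds
    # the branch's unique name when known, otherwise the brand again.
    return {"brand": brand, "name": unique_name or brand}


default_tags = tags_for_branch("Mercantile")
# "Shufersal Sheli Example Street" is a made-up unique name.
named_tags = tags_for_branch("Shufersal Sheli", "Shufersal Sheli Example Street")
```

Under this scheme a unique name no longer displaces the brand, since the brand always lives in its own tag.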

Offline

#50 2017-11-15 19:43:41

tdctdctdc
Member
Registered: 2017-05-23
Posts: 25

Re: Automated edits for branches

I think we should look at this from a business/marketing point of view, and make the following distinctions:

  • Entities that are owned by the same parent company, but are not really related (Shufersal Sheli vs Yesh). They don't share the same "branding" (name, logo, tagline, colors, corporate identity, etc.). In my opinion, they can be separated into different brands.

  • Entities that are clearly sub-brands (Shufersal Sheli vs Deal vs Express; Bank Hapoalim vs Hapoalim business) can remain as-is.

The case of Mercantile vs Discount bank is a difficult one. On one hand, it fits the definition of a sub-brand (name, logo, etc). On the other hand, it appears to be a separate company (yes, controlled by Discount, but still separate).

The argument about re-tagging Mercantile if transferred to a different bank is weak in my opinion. Same could be argued if Shufersal Extra branches would be sold to Rami Levi, for example. Of course it would need to be re-branded and re-tagged.

Offline
