Automated edits for branches

The syntax is standard JSON and your sample appears fine. I should add a blank template to make this easier.

The script checks all matches. If something matches both a variant and a brand, the variant wins. If something matches two unrelated brands or two variants, the script throws an error and I have to intervene manually.

Matching is completely case- and whitespace-insensitive: bool and Bool are considered the same.
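To make the rule concrete, here is a minimal sketch of what such a normalization could look like. The function name `normalize` is my own and not taken from the bot's code. Stripping dashes is included because the thread later mentions that spaces and dashes are ignored for template values; whether the bot does it exactly this way is an assumption.

```python
import re

def normalize(value: str) -> str:
    """Return a comparison key that ignores case, whitespace and dashes.

    Hypothetical helper: the real bot may normalize differently.
    """
    # Drop every run of whitespace or "-" characters, then fold case.
    return re.sub(r"[\s\-]+", "", value).casefold()

# Two spellings are treated as the same name when their keys are equal.
same = normalize("Bool") == normalize("bool")
```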

I am considering writing the “matching algorithm” explanation in a more friendly manner.

For your specific example, Mega Bool has a “fuzzy match” for mega but a full match for “mega bool”, so it is matched to mega bool.

The main (Mega) brand template tags are then applied, and then mega bool’s tags are applied on top of them, overriding some of them.
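The rules described so far could be sketched roughly as follows. This is only an illustration: the template layout, the tag values, the `AmbiguousMatch` exception and the use of substring containment as the "fuzzy match" are all my assumptions, not the bot's actual code or data.

```python
# Hypothetical template layout; the tag values are placeholders.
TEMPLATES = {
    "mega": {
        "tags": {"name": "Mega", "shop": "supermarket"},
        "variants": {"mega bool": {"tags": {"name": "Mega Bool"}}},
    },
}

class AmbiguousMatch(Exception):
    """A POI matched two variants or two unrelated brands."""

def normalize(text):
    # Ignore case and whitespace when comparing names.
    return "".join(text.split()).casefold()

def resolve(poi_name, templates=TEMPLATES):
    """Return the tags to apply to a POI, or None if nothing matches."""
    name = normalize(poi_name)
    brands, variants = [], []
    for brand_key, brand in templates.items():
        # Assumption: a "match" means the template key occurs in the name.
        if normalize(brand_key) in name:
            brands.append(brand_key)
        for variant_key in brand.get("variants", {}):
            if normalize(variant_key) in name:
                variants.append((brand_key, variant_key))

    if len(variants) > 1:
        raise AmbiguousMatch(f"{poi_name!r} matches variants {variants}")
    if variants:
        brand_key, variant_key = variants[0]
        if any(b != brand_key for b in brands):
            raise AmbiguousMatch(f"{poi_name!r} matches unrelated brands {brands}")
        # Brand tags first, then the variant's tags override some of them.
        tags = dict(templates[brand_key]["tags"])
        tags.update(templates[brand_key]["variants"][variant_key]["tags"])
        return tags
    if len(brands) > 1:
        raise AmbiguousMatch(f"{poi_name!r} matches brands {brands}")
    if brands:
        return dict(templates[brands[0]]["tags"])
    return None
```

With these placeholder templates, `resolve("Mega Bool")` picks the variant and layers its tags on top of the brand's, while `resolve("Mega")` returns only the brand tags.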

What about the type of the POI? Is it checked/matched as well? Or only the name is matched and all other tags are copied?

For example, if we have amenity=restaurant, name=Hapoalim - will it match to Bank Hapoalim’s template?

If a POI’s amenity=* or shop=* does not already equal the matched brand’s amenity or shop tag, then the POI is left alone and a warning is put in a log for me to manually inspect.
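As a rough sketch of that check, assuming tags are plain dictionaries: the function names, the logger name and the Bank Hapoalim tag values below are made up for illustration. Note that under this reading a POI with no amenity/shop tag at all is also left alone, since its value does not "already equal" the brand's.

```python
import logging

log = logging.getLogger("brand_bot")  # hypothetical logger name

def type_conflict(poi_tags, brand_tags):
    """True if the POI's amenity/shop differs from the brand's."""
    for key in ("amenity", "shop"):
        if key in brand_tags and poi_tags.get(key) != brand_tags[key]:
            return True
    return False

def apply_brand(poi_tags, brand_tags):
    """Return the POI's new tags, or the original tags on a type conflict."""
    if type_conflict(poi_tags, brand_tags):
        # Leave the POI alone and record it for manual inspection.
        log.warning("type mismatch, skipped: %r vs %r", poi_tags, brand_tags)
        return poi_tags
    merged = dict(poi_tags)
    merged.update(brand_tags)
    return merged

# Placeholder brand tags, not the real template.
bank = {"amenity": "bank", "name": "Bank Hapoalim"}
restaurant = {"amenity": "restaurant", "name": "Hapoalim"}
result = apply_brand(restaurant, bank)  # unchanged: restaurant is not a bank
```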

I put considerable thought into these edge cases, but the documentation is probably a bit tricky to read as of now.

Added some supermarkets to the templates.

In anticipation of possible questions:

  • I extracted the most common names from an Overpass query for shop=supermarket on the central part of Israel.
  • I tried to find “official” names on the supermarkets’ websites, Wikipedia or the most common results on Google.
  • Whenever I couldn’t find an official English name, I used the Hebrew Academy’s transcription rules (http://hebrew-academy.org.il/wp-content/uploads/Transcription2_1.pdf) - Ba’Ir, Bul, BaShkhuna, etc…
  • Russian names are based on my personal transcription rules, which may not be optimal :slight_smile:
  • Shufersal Big - according to news articles, this variant was discontinued and converted to “Deal”, so I added “Big” as alt_find for “Deal”.
  • Tiv Ta’am “in the city” - on their website, the branches in Hebrew are named with only a “סיטי” suffix, so that’s what I used.
  • Yesh - it appears (from their website, I have no personal knowledge) that the brand is Yesh BaShkhuna. It has a variant - Yesh Hesed. There doesn’t seem to be only “Yesh”.
  • Eden Teva Market - on their website, the branding is “Eden Teva” only, without the “market”, so that’s what I used.
  • When searching for “Mister Zol”, I came across multiple news articles saying that it was sold and became “Coop Shop”, so I added it as alt_find.

Added pharmacies.

Thanks for the effort! I’ll review the changes and run the bot as soon as time permits.

What do you think about using the banks templates for ATMs as well?

It sounds reasonable. Do you think this algorithm change is sufficient?:

if (amenity=bank or amenity=atm), attempt regular bank matching rules.

Some of your alt_finds are redundant, e.g. “SuperPharm” and “Super Pharm”. Template values are normalized in a way that ignores upper/lower case, spaces, and dashes.

I just added all the spelling/formatting variations that I saw being used, I wasn’t aware that spaces and dashes are ignored. Redundant is better than missing, right? :slight_smile:

Perhaps it needs to be adjusted to use operator=* instead of name=* for ATMs? It seems to be the recommended and more commonly used way.
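A small sketch of how the two ATM suggestions might combine, purely as a discussion aid. The function names are hypothetical, and falling back to name=* when an ATM has no operator=* is my own assumption, not something agreed in this thread.

```python
def uses_bank_rules(poi_tags):
    """Banks and ATMs both go through the regular bank matching rules."""
    return poi_tags.get("amenity") in ("bank", "atm")

def text_to_match(poi_tags):
    """Pick which tag value is compared against the bank templates."""
    if poi_tags.get("amenity") == "atm":
        # Prefer operator=* for ATMs; fall back to name=* (assumption).
        return poi_tags.get("operator") or poi_tags.get("name")
    return poi_tags.get("name")

atm = {"amenity": "atm", "operator": "Hapoalim"}
matched_text = text_to_match(atm) if uses_bank_rules(atm) else None
```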

I updated the templates you added: I removed the redundant alt_finds and replaced http with https where possible.

Excellent job, @tdctdctdc. I tried a test run and it seems to work well, and I can see no major problems. Would you like to make further changes before I run this?

It seems some people are not making the proper distinction between pharmacy and clinic, e.g.:

https://www.openstreetmap.org/node/1110299067

https://www.openstreetmap.org/node/1078636168

I wouldn’t let the bot automatically assume those are pharmacies. Some of them are probably not. Perhaps we should drop the Makkabi and Clalit pharmacies for now.

I have no more changes for now, go ahead and run.

If you think it’s best. I agree about Clalit - since they don’t have a separate brand for their pharmacies, there is a potential for errors. As for Maccabi - I think if the POI is named “Maccabi Pharm” - it’s very likely it’s really the pharmacy and not the clinic.

Changes applied to supermarkets and pharmacies.

I think the algorithm needs several improvements and simplification before further runs. It’s a bit bulky, and it cannot handle certain cases (e.g. AM:PM is either marketplace or convenience, and Clalit can be a clinic or a pharmacy).

You could exclude problematic cases (like Clalit and AM:PM) completely, or perhaps apply only name changes and not touch the existing value of amenity=*.

There are more issues, e.g. the bot insists on modifying names that shouldn’t be modified, and I have to intervene manually. The code is also hard to follow.

Since brand editing volume is low, a human can manually track the edits. I’m considering a simpler CSV scheme, where the bot generates a CSV with changes and with suggested default values according to templates, and the humans decide which ones to copy. The decisions are saved in the CSV and fed back to the bot.
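To illustrate the round trip, here is a minimal sketch using Python's standard csv module. The column names, the "accept" keyword and the sample row are invented for this example; the actual scheme is still undecided.

```python
import csv
import io

# Hypothetical columns: what the bot proposes plus a human decision.
FIELDS = ["osm_id", "key", "current", "suggested", "decision"]

def write_proposals(proposals, out):
    """Write the bot's suggestions with an empty decision column."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for proposal in proposals:
        writer.writerow({**proposal, "decision": ""})

def read_accepted(src):
    """Return (osm_id, key, new_value) for rows a human marked 'accept'."""
    return [
        (row["osm_id"], row["key"], row["suggested"])
        for row in csv.DictReader(src)
        if row["decision"].strip().lower() == "accept"
    ]

if __name__ == "__main__":
    buf = io.StringIO()
    # Placeholder row, not real OSM data.
    write_proposals(
        [{"osm_id": "node/1", "key": "name", "current": "mega", "suggested": "Mega"}],
        buf,
    )
    print(buf.getvalue())
```

The bot would write the file once, a human fills in the decision column in a spreadsheet or text editor, and the next run applies only the accepted rows.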

This is just one of several ideas I’m brainstorming. I might just fix the code.