Add uBlacklist support, advanced blocking rules, regex matching...
It seems that a significant number of ".it" domains consist entirely of spam, most likely because they are offered for free. Such spam domains have appeared at least once in Kagi search results.
User level: Since Kagi cannot currently tell whether such domains are spam, one way to keep them out of users' search results is to let users block all domains whose suffix is ".it". This should be a general feature, since the landscape of the internet will change and the domains favored by spammers will shift over time. A user who adds that domain suffix to their blocklist has clearly decided that losing some legitimate sites (false positives) is worth the reduction of spam in their search results.
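To make the suffix idea concrete, here is a minimal Python sketch of suffix-based filtering. It is only an illustration of the request, not how Kagi works internally; the names `BLOCKED_SUFFIXES`, `is_blocked` and `filter_results` are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical per-user blocklist of domain suffixes (not an actual Kagi setting).
BLOCKED_SUFFIXES = {"it", "pl"}


def is_blocked(url: str, blocked_suffixes=BLOCKED_SUFFIXES) -> bool:
    """Return True if the URL's host equals or ends with a blocked suffix."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    labels = host.split(".")
    # Compare whole labels only, so "it" blocks example.it but not bit.ly.
    return any(".".join(labels[i:]) in blocked_suffixes for i in range(len(labels)))


def filter_results(urls, blocked_suffixes=BLOCKED_SUFFIXES):
    """Drop every result whose host falls under a blocked suffix."""
    return [u for u in urls if not is_blocked(u, blocked_suffixes)]
```

Matching on whole labels rather than on a raw string suffix avoids accidentally blocking hosts that merely end in the same letters.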
Service level: Just like uBlock Origin has certain "default" blocklists that it suggests to new users, Kagi can also curate blocklists they believe will be useful to the users. I'm uncertain as to the specifics of how this should be implemented, since that depends on Kagi and the community.
mesaoptimizer We should probably just support the uBlacklist format:
https://github.com/iorate/ublacklist#description
An example of a public list in that format:
https://raw.githubusercontent.com/arosh/ublacklist-stackoverflow-translation/master/uBlacklist.txt
Supporting the uBlacklist format would be fantastic! The ability to export in the same format would be nice too, since I maintain a GitHub repo of my uBlacklist rules.
In addition to the .it domain spam, I've recently started seeing .pl spam following the same pattern of randomized domain names.
Steps to reproduce:
Try various queries? I don't have one on hand that's causing this result at the moment...
Even when coming from the US and searching in English, you may see spammy results from various .pl domains, including:
- ngmd.roofmasters.pl
- ejslxb.poradnik-kuchenny.pl
- ubl.stacjakomputerowa.pl
- mithc.lkstworkow.pl
- lwz.tomexplast.pl
- rvlir.moto-arena.pl
- rdgep.karczma-raznawozie.pl
- wgvwys.warsztat-kulinarny.pl
Expected behavior:
Results shouldn't include spammy .pl domains.
Debug info:
Firefox 103.0.1 on macOS 12.5. Kagi server US-EAST.
I'm experiencing the same issue. There's a planned feature to add uBlacklist blocklist support, which includes blocking entire TLDs.
I ran into the same .pl random domain spam earlier today. Being able to block TLDs would be fantastic.
Am I right that as a workaround, until TLD blocking is implemented, it is possible to create a Lens that excludes certain TLDs from search results?
Currently Kagi can't compete with extensions like uBlacklist when it comes to domain blocking. Handling large numbers of domains is very difficult because there is no support for bulk editing and removal of domains. The ability to import public rulesets, similar to Subscriptions in uBlacklist, would make sharing and editing domain blacklists/whitelists much easier.
For full compatibility with existing blacklists, Kagi would need support for Match Patterns and Regular Expressions. These two features by themselves, even without support for importing lists, would already make Personalized Results in Kagi a lot more powerful.
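As a rough illustration of what Match Pattern support involves, the Python sketch below converts a simplified match pattern of the form `scheme://host/path` into a regular expression. It assumes the usual browser-extension semantics (`*` as scheme means http or https, a leading `*.` in the host also covers the bare domain, `*` in the path matches anything) and is not uBlacklist's or Kagi's actual implementation. The function names are hypothetical.

```python
import re


def match_pattern_to_regex(pattern: str) -> re.Pattern:
    """Convert a simplified match pattern such as '*://*.example.com/*'."""
    m = re.fullmatch(r"(\*|https?)://(\*|(?:\*\.)?[^/*]+)(/.*)", pattern)
    if m is None:
        raise ValueError(f"unsupported match pattern: {pattern!r}")
    scheme, host, path = m.groups()
    # '*' as scheme is assumed to mean http or https only.
    scheme_re = "https?" if scheme == "*" else re.escape(scheme)
    if host == "*":
        host_re = "[^/]+"
    elif host.startswith("*."):
        # '*.example.com' covers example.com and all of its subdomains.
        host_re = r"(?:[^/]+\.)?" + re.escape(host[2:])
    else:
        host_re = re.escape(host)
    # '*' in the path matches any run of characters.
    path_re = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(f"{scheme_re}://{host_re}{path_re}")


def url_matches_any(url: str, patterns) -> bool:
    """Return True if the whole URL matches at least one match pattern."""
    return any(match_pattern_to_regex(p).fullmatch(url) for p in patterns)
```

Under these assumptions, a single pattern like `*://*.pl/*` would cover an entire TLD, and `*://*.pinterest.com/*` would cover a domain together with its subdomains.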
Currently it is not uncommon to have large amounts of repeated domains in the Personalized Results, such as:
pinterest.com
pinterest.at
pinterest.be
pinterest.ca
pinterest.co
pinterest.ch
pinterest.cl
pinterest.de
...
Instead of manually adding these domains, users could import public lists like this. Lists like this could also be simplified to a single entry using RegEx:
/^https?:\/\/([^\/]+\.)?pinterest\./
These options would make it much easier to handle large amounts of URLs.
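For illustration, here is a small Python sketch of how one regular expression can stand in for the whole list of per-country Pinterest domains. The expression is one possible variant written for Python's `re` module; it also tolerates subdomains such as `www`. The names `PINTEREST_RULE` and `is_pinterest` are hypothetical.

```python
import re

# One hypothetical rule replacing pinterest.com, pinterest.de, pinterest.at, ...
# The optional group allows subdomains such as "www." before "pinterest.".
PINTEREST_RULE = re.compile(r"^https?://(?:[^/]+\.)?pinterest\.")


def is_pinterest(url: str) -> bool:
    """Return True if the URL's host is pinterest.<anything> or a subdomain of it."""
    return PINTEREST_RULE.search(url) is not None


results = [
    "https://www.pinterest.de/pin/123/",
    "https://pinterest.ca/ideas/",
    "https://example.com/pinterest.html",
]
# Keep only the results that the rule does not block.
kept = [u for u in results if not is_pinterest(u)]
```

Note that a rule this broad would also match an unrelated host such as pinterest.example.com, so the trade-off between brevity and precision is left to the user.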
We added Kagi support for uBlacklist.
This was one of the most popular feature requests, and we shipped support today. Looking forward to your feedback.
georgee Thank you very much for the commit to uBlacklist.
I will wait for uBlacklist for Safari 8.7.2 (perhaps), because uBlacklist for Safari 8.7.1 does not list Kagi.
I just installed uBlacklist and it works great with Kagi. However, when viewing an image in fullscreen, blocked sites still appear in the modal (to the left and right of the current image). It would be nice if blocked sites could also be removed from that list.