Hi mesaoptimizer and @Vlad,
mesaoptimizer If we are filtering out instead of downweighing, yes. Downweighing is not as intuitive to implement as compared to filtering out, given that you retrieve a tiny block of search results from external indexes instead of every search result. I'd say filtering out is the easier implementation to go with for now.
But don't filtering and downweighting have the same effect for Kagi?
- If Kagi filters out 10 results out of 20, Kagi "needs" to fetch another 10 results to fill out the current search.
- If I downweight 10 results out of 20, all or some of them could fall below the threshold Kagi uses to decide what counts as an "acceptable result", and Kagi has to fetch replacements all the same.
mesaoptimizer You could add a user option to exempt the top 10,000 sites from this filter, for example, but the Kagi user base most likely don't use these sites.
I'm not so sure about that, and in general I don't agree with this stance: we're saying here that "obviously" Kagi users will not need the "most famous" sites if they have ads. But just as Kagi doesn't display help messages to a user who searches for "how to kill myself", I don't think Kagi should treat some results differently from others for me based on popularity.
That said, I share this desire too.
Vlad I welcome further discussion on the topic. How would you implement this feature?
Even a simple implementation can be very complex.
It seems like some kind of rule based system where you can say
if website has MORE than XX TRACKERS and is MORE than 100,000 in TRAFFIC, then REMOVE it from results.
where capitalised keywords are actual rules the user can modify.
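To make the quoted rule concrete, here is a minimal sketch of how such a user-editable rule could be evaluated. Everything in it is an assumption for illustration: the `Site` and `Rule` types, the field names, and the reading of "MORE than 100,000 in TRAFFIC" as a traffic rank above 100,000 (that is, a less popular site). It is not how Kagi works.

```python
from dataclasses import dataclass


@dataclass
class Site:
    url: str
    trackers: int      # hypothetical: number of trackers detected on the page
    traffic_rank: int  # hypothetical: 1 = most visited site


@dataclass
class Rule:
    max_trackers: int           # the user-editable "XX TRACKERS"
    rank_cutoff: int = 100_000  # the user-editable "100,000 in TRAFFIC"


def matches(site: Site, rule: Rule) -> bool:
    """True when the site has MORE trackers than allowed AND ranks beyond the cutoff."""
    return site.trackers > rule.max_trackers and site.traffic_rank > rule.rank_cutoff


def apply_rule(sites: list[Site], rule: Rule) -> list[Site]:
    """REMOVE every site that matches the rule, keeping the original order."""
    return [s for s in sites if not matches(s, rule)]
```

With a rule like `Rule(max_trackers=5)`, a tracker-heavy site inside the top 100,000 is kept, while an equally tracker-heavy site outside it is removed.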
My opinions, in sequence, are:
1) rule out the "popularity" dimension: to me it's a really fragile criterion
2) introduce the "TRACKER_NUMBER" threshold as an input in the user preferences
3) if the user sets this parameter (say, to 5), you retrieve the same results as before, but every result that has more than TRACKER_NUMBER trackers is shown below every other "good" result. So something like: first, the group of "good" results, ranked appropriately; then, the group of "bad" results, with their relative ranking maintained.
4) these results could be highlighted in some way (a background color, something like this) to help users distinguish them.
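As a rough sketch of points 2 to 4, the reordering is just a stable partition of the result list that is already fetched. The `Result` type, its `tracker_count` field, and the function name are hypothetical; I'm assuming a tracker count is available for each result.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    tracker_count: int  # hypothetical: trackers detected on the result page


def demote_tracker_heavy(results: list[Result], tracker_number: int):
    """Move results with MORE than tracker_number trackers below the others.

    Relative ranking is preserved inside each group. Returns the reordered
    list and the index where the "bad" group starts, so the UI can highlight
    everything from that index onward.
    """
    good = [r for r in results if r.tracker_count <= tracker_number]
    bad = [r for r in results if r.tracker_count > tracker_number]
    return good + bad, len(good)
```

No result is dropped and nothing extra is fetched, which is why this step should not add cost: the list has the same length before and after.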
What is the point of this proposal?
a) Kagi doesn't incur additional costs
b) the user gets immediate feedback on "how spammy is my result list" and can act accordingly. For example, if all results but the first two are "bad", the user could try another search, refine the search terms, etc.
c) you can gather user feedback before moving to a further step that would affect the monthly costs.
After this "middle ground", you could look at how this parameter is set across the entire user base, and we could then consider whether:
- this is sufficient
- users are asking for filtering, even at the cost of paying more
Does that help? I hope so.