
One of my biggest complaints about Kagi's search quality is the result snippet. Choosing between long and short doesn't make sense, because length is not an indication of quality. It's quite frustrating to search with Google and find out that Kagi actually returned the same result but showed a poor-quality snippet that did not catch your attention.

One example of a snippet where long is actually worse than short:


(the query was "rllib fine tuning")

My suggestion is to use some simple heuristics for better matching and choose automatically which one to show the user, such as taking the snippet with more distinct matches. I know there is no ideal solution, but anything would be better than selecting by length.

This could probably be improved by cheap, open-source language models in the future, but any heuristic should be enough for now.

  • Vlad replied to this.

    Browsing6853 It is not immediately clear how the shorter snippet in the example is superior to the longer one. I am sure there are also cases where the longer one appears more useful than the shorter.

    Can we create a dataset of at least 10 examples and an objective way to measure the usefulness of the snippet so that we can extrapolate the heuristics to determine its utility?

      Vlad Here are some examples (10 examples is too many...)



      ("chocolate LDL cholesterol" - long)



      ("quantum computing decoherence rsa" - short)



      ("mastodon make account private" - short)



      ("android ios planned obsolescence" - long)

      I found an inconsistency in the highlighting too (I think I already reported it before, not sure):


      ("crispr odin" - check the bold "Cas9")

      Given the results are the same, I would just consider some metric over the matched words (exact matches or synonyms, whatever), like the number of distinct query words matched, as I said before. If the user added a specific word, they probably want to be sure it exists in the result before clicking it.

      • Vlad replied to this.

        Browsing6853

        I would just consider some metric over the matched words (exact matches or synonyms,

        Using this, the longer snippet would win 3/5 times in your examples? (the three middle examples) Is that an outcome you would be OK with?

          Vlad In the examples I added a note of which snippet was better (2 longs + 2 shorts + 1 whatever). The real rate of bad snippets is actually much lower because most of the results just have the same snippet for long and short.

          Nonetheless, I depend a lot on the snippets (and I think plenty of other people do too). They are literally the biggest part of the content on a results page, and seeing something that important reduced to a long/short choice in a premium service is very disappointing.

          • Vlad replied to this.

            Browsing6853 That is only because snippet quality is subjective. You have seen in the above examples that if we apply your idea (more matches, for example), you would not always get the snippet you wanted.

            A preference between longer and shorter is an actual need (it was posted as a feature suggestion on this site, if I remember correctly, and since it was easy to do, we did it).

            I like the idea of having a 'default' mode that automatically picks the 'best' one. We just need to decide the criteria (algorithm) for describing what 'best' means.

            Happy to hear your further thoughts on this.

              Vlad Oh, I think I explained the idea badly. The heuristic is about counting the number of distinct matches.

              First case: [chocolate, LDL, cholesterol] vs [chocolate, cocoa, cholesterol] = 3 vs 2 (cocoa probably just matched with chocolate)

              Second case: [quantum, computers, computing] vs [quantum, computers, decoherence] = 2 vs 3

              Third case: [mastodon, make, account] vs [making, account, private, mastodon] = 3 vs 4 (the title should be counted as a match, so that hard-to-miss matches like "mastodon" in this case are not a penalty)

              Fourth case: [planned, obsolescence, android, ios] vs [planned, obsolescence, ios] = 4 vs 3 (although maybe "samsung" could match with android too).

              In all cases, this heuristic gives what is, in my opinion, the better snippet.

              A basic implementation would just take the size of the set (no repeats) of matched words. An improvement would be treating synonyms as the same word ("chocolate" = "cocoa") and ignoring stop words that are not critical to the measurement. Another improvement would be using word embeddings to select the snippet closest to the query (probably overkill too).

              • Vlad replied to this.