18 days later

Language and region can be orthogonal and 2-dimensional. The upsides are well enumerated already.

The downside is a multiplicative increase in complexity: if, say, the medium-term plan is to support 10 languages and 20 regions, that's 20 × 10 = 200 possible combinations. And that would still be very far from covering the Catalan case, which would probably require thousands or tens of thousands of language-region pairs (if implemented in descending order of each pair's popularity).
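To make the arithmetic concrete, here is a tiny sketch that counts language-region pairs. The codes are placeholders (only the list sizes matter), and `pair_count` is a name made up for illustration.

```python
from itertools import product

def pair_count(languages, regions):
    """Number of distinct (language, region) pairs that would need support."""
    return len(list(product(languages, regions)))

# Placeholder inputs: 10 languages x 20 regions, as in the post above.
medium_term = pair_count(range(10), range(20))

# The 3 x 3 trial suggested in the same post (codes chosen arbitrarily).
trial = pair_count(["en", "fr", "ja"], ["us", "ch", "jp"])
```

The count is simply the product of the two list sizes, so a 3 × 3 trial means 9 pairs to evaluate instead of 200.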

Maybe a trial implementation could be done with just 3 languages and 3 regions? And then evaluate the real world pros/cons a year later?

4 months later

Hello.

I found a few other discussions concerning language support, but to avoid mixing, I thought it was better to create a new thread.

I am in Switzerland, I work in English as a programmer, and my wife is Japanese, so let's say I change language often.

I did some thinking, and there are three parameters I'd like to be able to manipulate:

  • query language
  • response language
  • region

The query language is important because it cannot be detected reliably in a lot of situations. For example, many French words are spelled the same as in English, and some Japanese kanji are the same as Chinese/Taiwanese characters.

The response language is of course also important, and it should be a separate setting from the query language. In many situations they will be the same, but not always: for example, if I search for French lessons in Japanese, I query in French for Japanese results.

Finally, the region. This setting already exists; I am mentioning it only for completeness.

Also, the browser language is irrelevant for queries and results; it matters only for choosing the default UI language.

I would love the following features:

  • syntax to specify the query/response language in a query, like: l:fr for response language, ql:fr for query language and r:ch for region.
  • the ability to add language lenses separately from existing lenses. A language lens would include the three settings as lists, for example: query language: raise fr, en, block de; response language: only fr; region: world
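As an illustration of the proposed l:/ql:/r: syntax, here is a minimal parsing sketch. It is not how Kagi parses queries; `ParsedQuery` and `parse_query` are hypothetical names, and it assumes operators are whitespace-separated tokens with 2 or 3 letter codes.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical operators from the suggestion above:
#   l:<code> response language, ql:<code> query language, r:<code> region
_OPERATOR = re.compile(r"^(ql|l|r):([A-Za-z]{2,3})$")

@dataclass
class ParsedQuery:
    terms: str
    response_lang: Optional[str] = None
    query_lang: Optional[str] = None
    region: Optional[str] = None

def parse_query(raw: str) -> ParsedQuery:
    """Split operator tokens out of a raw query and keep the remaining words."""
    fields = {"l": "response_lang", "ql": "query_lang", "r": "region"}
    parsed = ParsedQuery(terms="")
    words = []
    for token in raw.split():
        match = _OPERATOR.match(token)
        if match:
            # Store the code lowercased under the matching field.
            setattr(parsed, fields[match.group(1)], match.group(2).lower())
        else:
            words.append(token)
    parsed.terms = " ".join(words)
    return parsed

example = parse_query("cours de japonais l:ja ql:fr r:ch")
```

Tokens that merely look similar, such as url:foo, are left in the search terms untouched.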

    kuon Thanks for a thoughtful suggestion. Having to pick just one, what would be the most important improvement you'd like to see in Kagi first?

      Merged 2 posts from Language lenses.

        Vlad Result language is the most important and is a blocking feature for me: I need it to use Kagi for non-programming searches. I would love to see the following, in order of priority:

        1. A result language selection filter, so that only this language appears in the results. A text syntax is OK, like l:fr.
        2. A dropdown menu for selecting the result language: like 1, but with a GUI.
        3. The ability to create a language "lens", with result languages (plural, meaning include/exclude/order rules) and region (it could sit in the same place as the region dropdown, which would be split in two).
        4. A query language specifier, also in the language lens. This would fix some edge cases where, for example, the query is in Japanese but the characters are interpreted as Chinese. These really are edge cases, but I think it is important to think about them at some point. I don't have a real example at hand, but I've been bitten by this and I know it is an issue, though I agree it would be better with more context.
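The include/exclude/order rules of such a language lens could be sketched roughly as follows. This is only an assumption about how a lens might behave, not an existing Kagi feature: `LanguageLens`, `apply_lens` and the (url, lang) result shape are all invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class LanguageLens:
    # Hypothetical lens settings, named after the rules in the list above.
    only: set = field(default_factory=set)           # if non-empty, keep just these
    block: set = field(default_factory=set)          # always drop these
    raise_order: list = field(default_factory=list)  # preferred languages, best first
    region: str = "world"

def apply_lens(results, lens):
    """Filter and reorder results, given as (url, lang) tuples in rank order."""
    kept = [
        r for r in results
        if r[1] not in lens.block and (not lens.only or r[1] in lens.only)
    ]

    def tier(result):
        lang = result[1]
        if lang in lens.raise_order:
            return lens.raise_order.index(lang)
        return len(lens.raise_order)  # unlisted languages come last

    # sorted() is stable, so the original ranking survives within each tier.
    return sorted(kept, key=tier)

results = [("a", "de"), ("b", "en"), ("c", "fr"), ("d", "ja"), ("e", "fr")]
raised = apply_lens(results, LanguageLens(block={"de"}, raise_order=["fr", "en"]))
french_only = apply_lens(results, LanguageLens(only={"fr"}))
```

Here `raised` puts the French results first, then English, then everything else that is not blocked, while `french_only` corresponds to item 1 of the list.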

          kuon

          result language selection filter, only this language appears in the result, a text syntax is OK, like l:fr.

          Thanks for prioritizing. How would this look in the UI (as l:fr will not easily be discoverable)?


            Vlad I design a lot of "prosumer" applications, like medical database search tools, and we faced the same problem with "syntax discoverability". A few solutions we found:

            • display a placeholder with a few syntax examples
            • display a small unobtrusive tooltip: "try l:fr to search in French"
            • mention it in the examples on your homepage and in the documentation

            We found that search syntax was faster and not harder for users, even doctors or nurses with an absolute hatred for computers.

            Now I agree that the UI is also important, and in this case I would just put a "language" dropdown between region and order, and in this dropdown put a message like "you can organize this list in your preferences here".

            Another thing I can share is that many websites (GitHub and GitLab come to mind) mix "text query" and "UI query". Basically, if you click a label, it adds "label:foo" as text in the search input field, but when you try to delete or edit it, it behaves like a single character. We found no way to edit a query set through the UI without quirks or frustration. So in our last app we supported syntax in the search query, like l:fr, and also had a language dropdown. If you select a language in the UI, it is not coupled to or reflected in the search input field, but there is a visual indication that the UI is doing something to the search. Then, in the search results, we would write something like "language requested in the query: fr, but en was selected in the UI. We used en for this search, do you want to search again in fr?" And if you click that, the UI resets and l:fr is left in the search field.
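The conflict handling described in that last paragraph can be sketched like this. It is a guess at the logic based only on the description above; `resolve_language` is a hypothetical helper, and the notice wording is adapted from the post.

```python
from typing import Optional, Tuple

def resolve_language(query_lang: Optional[str],
                     ui_lang: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (language used for the search, notice to display or None).

    When the query syntax and the UI dropdown disagree, the UI selection
    wins and a notice offers to rerun the search with the query's language.
    """
    if query_lang and ui_lang and query_lang != ui_lang:
        notice = (
            f"Language requested in the query: {query_lang}, but {ui_lang} "
            f"was selected in the UI. We used {ui_lang} for this search. "
            f"Search again in {query_lang}?"
        )
        return ui_lang, notice
    # No conflict: use whichever source specified a language, if any.
    return ui_lang or query_lang, None

used, notice = resolve_language("fr", "en")
```

Keeping the two inputs separate and reconciling them only at search time avoids having to mirror UI state into the text field.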

            I hope it is not too blurry, as I am in a plane right now 🙂

            I wish you the best.

            a month later

            I’ve been privately exchanging with Vlad about it, but I want to point out that, currently, using Kagi in Belgium is problematic for French speakers: despite my language being set to French, every result, including Wikipedia cards, is in Dutch.

            This is sometimes close to being a joke: looking for "Belgique" (the French name of Belgium) in Kagi gives me the following Wikipedia card:
            https://nl.wikipedia.org/wiki/Belgique

            Most websites, including Google, solve this by considering Belgium as two separate regions: Belgique (FR) and België (NL).

            DuckDuckGo solves this by following either the preferred language or the browser language, without relying on the region to assume any language.

            I consider this a blocker before switching my family to Kagi.

              Hi ploum,

              We currently do have some code specifically for handling the Belgium French/Dutch situation by checking your browser language. I think it definitely stands to be improved; the current effect may be too subtle.

              It would be super helpful if you could give an example of the language codes your browser is sending to us, so I can check how the current code is handling your requests. If you are familiar with using the Network Inspector, it is the Accept-Language header, which should be found on most requests.

              Or, it seems a site like http://www.ericgiguere.com/tools/http-header-viewer.html should be able to show you.

              Regardless, we'll see what we can do! Thanks 😺

                My browser sends the following:

                fr,fr-FR;q=0.8,en-US;q=0.5,en;q=0.3

                As far as I can tell, a lot of computers are configured the same way in France and in the south of Belgium. I’m not sure that many browsers send a fr-BE code (I’m not even sure that it exists). As far as I know, fr-FR is used in France, Belgium and Switzerland; those are considered the same language.

                Also, be aware that IP location is mostly not helpful: a French-speaking ISP may have its datacenter in the Dutch region. Also, almost 15% of the population lives in Brussels, which is a bilingual area.
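For reference, here is a minimal sketch of reading an Accept-Language value like the one quoted above. It says nothing about Kagi's actual code: `parse_accept_language` and `preferred_language` are hypothetical helpers, and matching on the primary subtag alone is a simplifying assumption.

```python
def parse_accept_language(header: str):
    """Return (tag, q) pairs sorted by descending q; ties keep header order."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # default weight when no q parameter is given
        for param in params:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        entries.append((tag, q))
    # Python's sort is stable even with reverse=True.
    return sorted(entries, key=lambda e: e[1], reverse=True)

def preferred_language(header: str, supported):
    """First primary subtag (e.g. 'fr' from 'fr-FR') found in `supported`."""
    for tag, q in parse_accept_language(header):
        primary = tag.split("-")[0].lower()
        if q > 0 and primary in supported:
            return primary
    return None

header = "fr,fr-FR;q=0.8,en-US;q=0.5,en;q=0.3"
parsed = parse_accept_language(header)
choice = preferred_language(header, {"nl", "fr"})
```

With this header, a Belgian user choosing between Dutch and French would get French regardless of region, because the decision uses only the language part of each tag and never needs a fr-BE entry.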

                  7 days later

                  In multilingual regions, users might prefer a specific language but fall back on the rest if there are too few matches.

                  Concrete use cases where lowering a language would help:

                  • While searching in Cyrillic in Ukraine, I prefer Ukrainian news and versions of sites, and want to only see russian when relevant Ukrainian results end.
                  • While searching in English, I expect English and occasional Ukrainian results, but nothing in russian.

                  Can something like raising/lowering domains be implemented for entire languages? As a global option, a lens, a sub-region, etc. This should also apply to things like Wikipedia infoboxes.
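One way to picture the two use cases is a preferred set plus a fallback set of languages. This is only a sketch under assumptions: `order_with_fallback` and the (title, lang) result shape are made up for illustration, and "uk"/"ru"/"en" are the usual ISO 639-1 codes.

```python
def order_with_fallback(results, preferred, fallback=()):
    """Reorder results given as (title, lang) tuples in relevance order.

    Preferred-language results come first, in their original order.
    Fallback-language results are appended after them.
    Results in any other language are dropped.
    """
    preferred, fallback = set(preferred), set(fallback)
    first = [r for r in results if r[1] in preferred]
    later = [r for r in results if r[1] in fallback and r[1] not in preferred]
    return first + later

results = [("n1", "ru"), ("n2", "uk"), ("n3", "en"), ("n4", "uk"), ("n5", "ru")]

# Cyrillic query: Ukrainian first, the lowered language only afterwards.
cyrillic = order_with_fallback(results, preferred=["uk"], fallback=["ru"])

# English query: English and Ukrainian mixed by relevance, nothing else.
english = order_with_fallback(results, preferred=["en", "uk"])
```

The same two sets could be stored globally, per lens, or per sub-region; the sketch does not assume any of these.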

                  Merged 1 post from Raising/lowering entire languages.

                    z64 I'm in a similar situation (English & French-speaker in Belgium), but I want all UIs in English so my browser sends en-US,en;q=0.9. I also get plenty of results in Dutch, which I nearly never want, and which is a dealbreaker for my usage.

                    Just throwing a slightly different idea into the discussion:

                    1. Allow users to input all the languages they speak and want to have results for into preferences
                    2. Block language results not matching any of those languages (maybe optional)
                    3. Make the region selector select only the regional aspects (e.g. preferring shops from that country, amazon.<localtld> vs amazon.com, disambiguating Paris, Texas from Paris, France, and things like that), with no language-related settings anymore.

                    A quick workaround that works moderately well for my use case is to add site:ch if I want Swiss results, as &r=ch (selecting the Switzerland region) usually doesn't work well: about 70% of results are from France.

                    This is especially important when I want to order physical objects, as I want to avoid international shipping.

                    When you search from Brussels (Belgium), most of the results are in Dutch.
                    But since Belgium is a trilingual country (French/Dutch/German), getting only Dutch results is not right.

                    As a native French speaker, I always have to change the website language to either French or English.

                    It would be neat to see Kagi taking that into account, so that results are not Dutch only.