RoxyRoxyRoxy

There are quite a few practical cases where this comes up, and there's a broader sense in which it clashes with (at least my) intuitions about how to interact with a search engine.
On a practical level, it usually comes up when you're doing some kind of "wide-net" or "deep" search. As an example, let's say you want to return every page from the FT that has mentioned sandwich or sandwiches (this isn't totally contrived; I have actually performed this search). site:ft.com ("sandwiches" OR "sandwich") works mostly fine: you get a couple of pages of results, and you can adjust with time-range operators to walk the entire list. It's a little annoying, since in this case you know you are mostly getting signal - and most of the rest can be resolved with -site:markets.ft.com or whatever - so the noise-reducing function of limiting search pages just gets in your way. But it's not a huge deal: you don't lose anything you want, and you can be reasonably thorough in making sure of that.
We can then consider searches like site:ft.com "small caged mammal" OR "smallcagedmammalness" -site:markets.ft.com and it seems like it works again, not meaningfully worse than another search engine. I am pretty sure all the relevant results are in the first two pages, even, which is certainly better than I'd expect elsewhere. But, unlike other popular search engines, I can't look ahead, see if there are any remaining relevant results or if it's just junk that makes it past the filter with e.g. the title of an article in a sidebar. I also can't as easily loosen the search, drop the quotes and see if there's any other string that might be used to refer to the same concept of "small caged mammal". Sure, I should expect that doing so would involve wading through a bit of trash, but I might reasonably want to make that tradeoff and continually evaluate how much more time I think is worthwhile to invest in searching as I go.
You could certainly do so by repeatedly thinking up and appending other terms to include or exclude, or by moving the time window as before, but this is much the same kind of drudgery, only more labour-intensive. It gets worse the more terms you add to the query, since each of those potentially comes with multiple forms itself.
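To make that scaling concrete, here's a toy sketch. The term lists are hypothetical stand-ins for the kind of concepts discussed above; the point is that OR-groups keep the query's length linear in the number of forms, while covering every exact combination separately grows multiplicatively:

```python
from itertools import product

# Hypothetical example: each concept has several surface forms, and a
# quoted-phrase search must account for all of them.
term_forms = [
    ['"sandwich"', '"sandwiches"'],
    ['"small caged mammal"', '"smallcagedmammalness"'],
    ['"statistical authority"', '"statistics office"', '"statistical agency"'],
]

# One OR-group per concept: query length grows linearly with the forms.
or_query = "site:ft.com " + " ".join(
    "(" + " OR ".join(forms) + ")" for forms in term_forms
)
print(or_query)

# But if you have to enumerate exact combinations (e.g. as separate
# narrow searches), the count is the product of the list sizes.
combinations = list(product(*term_forms))
print(len(combinations))  # 2 * 2 * 3 = 12 variant queries
```

Add a fourth concept with three forms of its own and you're at 36 variants; this is the "extremely unwieldy" regime described next.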
For my example it's not too bad, but for anything more substantial I think you can see how it could get extremely unwieldy. Imagine trying to search for other discussions of the concept outside the FT or the ONS, especially in reference to their usage, or about analogous topics in other national statistical authorities. Using specific quotes obliges you to invest a lot of time and creative mental energy to cover your bases, while dropping them makes it easier for bad results to push out the good. Lenses mitigate this, substantially in some cases, but take up-front time to create and have their own limitations.
(As a note, this is a huge issue when searching for research, at least on statistical methodologies and applications, because there you have a huge variety in the ways substantially similar concepts can be phrased, and a wide-net search is always going to return a lot of "close but no cigar"-type results.)
Which leads to the clash with my intuitions about search engines. When we use queries to refine searches, I see it as tuning the signal-to-noise ratio, accepting that I often won't have a perfect query that returns only what I want while also excluding only what I don't. If in doubt, having the option to cast a wide net and then skim through pages of results to find the handful I was actually looking for is a good fallback: I can choose to increase my false-positive rate to decrease my false negatives (as a proportion of results actually surfaced). If a query is my only method of adjusting the type I/type II error ratio, the time and effort needed to get excellent results can be dramatically larger than in the case where I can simply sample more of the results.
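To put that tradeoff in concrete terms, a toy sketch with made-up numbers: relevant hits thin out with depth in a ranked result list, and a hard cap on how many results you can see fixes a ceiling on your recall, whereas being allowed to skim deeper lets you trade time for it:

```python
# Made-up ranked result list: True = relevant, False = junk.
# Relevant hits cluster near the top but a long tail remains.
ranked = [True] * 5 + [False] * 10 + [True] * 3 + [False] * 40 + [True] * 2
total_relevant = sum(ranked)  # 10 relevant results in total

def recall_at(k):
    """Fraction of all relevant results visible in the top k."""
    return sum(ranked[:k]) / total_relevant

print(recall_at(20))  # two "pages" of 10: only 8 of 10 reachable -> 0.8
print(recall_at(60))  # skimming the whole list recovers the rest -> 1.0
```

With the cap, the only way to reach the last two relevant results is to rewrite the query; without it, I can just decide whether another few pages of skimming is worth my time.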
There's another point about how the lack of observability of results beyond the two-page horizon makes it harder to know how to refine a query, but the thrust of my argument is that the current limit on results is of a kind with skimming pages of mostly-garbage. The mitigations (lenses, less slop) make Kagi better in many cases, but there's a big tail of searches where it can be painful to try and use.
I apologise for the length of this post; I wanted to demonstrate how friction arises - for me - even in simple, constrained cases, and how things start to blow up and become unwieldy. I have thought quite a lot about this, because to me it is, by far, the biggest issue I currently have with Kagi (~40% to ~50% of my searches, and about 75% of my research-ish queries, are still with Google because of this).