LLM suitability audit for default lenses

blightbow

This is a two-pass Claude Code assessment of the value of the default lenses under the v1 search API. The assessment used a custom MCP tool that exposes the lens_id parameter and a list of default lenses. The second pass was an adversarial review that challenged the findings and methodology of the first agent.

The original discussion started on the kagi-api Discord channel, and Margret asked me to repost it here with added detail. Considering the volume of feedback, I've done my best to break it down for online readability.

First agent

Headline: only 3 of the 8 enumerated lenses earn their slot

| Verdict | Lenses |
|---|---|
| Valuable (hard to replicate with query operators) | academic, pdfs, kagi documentation |
| Replicable with site: union (lens adds no value) | forums |
| Harmful (surfaces tag pages instead of question pages on specific queries) | programming |
| Low utility / misnamed | news 360 (no-op on fresh news), fediverse forums (returns Reddit + HN, not fediverse), usenet/archive (returns modern blogs, not archives) |

The programming finding is the most striking: it actively degraded a query that asked a specific technical question, surfacing StackOverflow tag-listing pages instead of question pages. Recommending it as an LLM-facing slug is a foot-gun.

The two misnamed lenses (fediverse forums, usenet/archive) don't deliver what their names imply — naming them as "always-available" invites the LLM to reach for the wrong tool.

Agent's recommendation

Trim the enumerated catalog in the Field description to three slugs: academic, pdfs, kagi documentation. Keep shareable IDs and full URLs as accepted forms (that's the durable value — user-curated stored profiles can't be replicated with operators). Keep the generic-behavior caveats (case sensitivity, silent fallback, workflow caveat, intersection/override precedence) since those apply to any slug value.

The agent estimates this removes about 80 words from the Field description, eliminates two foot-guns, and concentrates the documentation budget on the lenses that earn it.

Second agent (adversarial review of initial findings)

The adversarial round substantially shifted the picture. Three of the eight verdicts were overturned.

Revised verdict table

| Lens | First agent | Adversarial review | Reason for overturn |
|---|---|---|---|
| forums | drop | keep | Lens reaches Discourse instances, Facebook Groups, vendor forums (Swift Forums, discourse.nixos.org) that an LLM-written 3-domain site: union doesn't reach |
| programming | drop | confirm drop | Narrow-utility band rather than "harmful across the board": works for general/beginner queries, degrades on expert-niche queries; hard for the LLM to predict which |
| news 360 | drop | keep | First agent's GPT-5 query was already news-saturated at baseline, masking the lens. On cold-news queries (Tigray humanitarian, Argentina labor reform) the lens shifts to actual news outlets (Al Jazeera, Reuters, argentinareports.com) |
| fediverse forums | drop | confirm drop | Activates only on fediverse-native vocabulary queries (!community@instance syntax), not paraphrased queries; brittle in a way agents can't predict |
| usenet/archive | drop | confirm drop | Doesn't reach groups.google.com or archive.org Usenet collections even on Usenet-native topic queries |
| academic | keep | confirm keep | The 3-domain operator union returns preprints; the lens reaches peer-reviewed venues an LLM wouldn't enumerate |
| pdfs | keep | drop | filetype:pdf actually outperforms the lens: the lens is a soft prior with site biases, not a strict PDF filter (returned GitHub markdown when asked for PDFs) |
| kagi documentation | keep | confirm keep | Stable across multiple Kagi-self-referential queries (search, Orion, Assistant) |

Net change: the first agent had 3 keeps / 5 drops; the adversarial review has 4 keeps / 4 drops, with forums and news 360 moving to keep and pdfs moving to drop.
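
For anyone who wants to double-check the tally, here is a minimal sketch that recomputes it from the revised verdict table. The dictionary just transcribes the table rows ("confirm keep" and "confirm drop" are collapsed to "keep" and "drop"); the variable names are my own and not part of any Kagi tooling.

```python
from collections import Counter

# (first-agent verdict, adversarial verdict) per lens, transcribed from the table.
verdicts = {
    "forums": ("drop", "keep"),
    "programming": ("drop", "drop"),
    "news 360": ("drop", "keep"),
    "fediverse forums": ("drop", "drop"),
    "usenet/archive": ("drop", "drop"),
    "academic": ("keep", "keep"),
    "pdfs": ("keep", "drop"),
    "kagi documentation": ("keep", "keep"),
}

first_tally = Counter(first for first, _ in verdicts.values())
second_tally = Counter(second for _, second in verdicts.values())

# A verdict counts as overturned when the two rounds disagree.
overturned = [lens for lens, (first, second) in verdicts.items() if first != second]

print(dict(first_tally))   # {'drop': 5, 'keep': 3}
print(dict(second_tally))  # {'keep': 4, 'drop': 4}
print(overturned)          # ['forums', 'news 360', 'pdfs']
```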

The methodological signal is the bigger finding

The reviewer's note that the value-assessment loop has not converged is the more important takeaway. Two rigorous rounds moved 3 of 8 verdicts. That tells us something about the parameter itself, not just our process:

Lens value is query-shape-dependent. Whether a lens helps depends on whether the baseline is already saturated with the lens's target class. The LLM caller would have to reason about query shape to decide whether to engage the lens — which adds to the cost of using it.

The enumerated catalog needs more than one query per lens to assess fairly. Single-query verdicts demonstrably mislead. The reviewer recommends 2 queries per lens with deliberate phenotype variance (general vs niche, saturated vs cold baseline).

Doc-trimming pressure biases toward "drop it." The reviewer specifically flagged that running the value assessment in the same pass as drafting docs is methodologically unsound — drafting pressure creates a bias toward exclusion. The first agent's forums and news 360 verdicts both swung that way.
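
One way to enforce the "2 queries per lens with phenotype variance" recommendation is to write the probe plan down as data and check it before spending any search calls. The sketch below is only an illustration: the `Probe` class, the `under_probed` helper, and the phenotype labels are names I made up, and tagging the pdfs query as "general" is my own assumption. The queries themselves are ones that were actually run in this assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    lens: str
    query: str
    phenotype: str  # hypothetical labels, e.g. "general", "niche", "saturated", "cold"


def under_probed(probes, min_phenotypes=2):
    """Return lenses whose probes cover fewer than min_phenotypes distinct phenotypes."""
    seen = {}
    for probe in probes:
        seen.setdefault(probe.lens, set()).add(probe.phenotype)
    return sorted(lens for lens, kinds in seen.items() if len(kinds) < min_phenotypes)


plan = [
    Probe("programming", "getting started with python decorators", "general"),
    Probe("programming", "rust tokio select macro cancellation safety", "niche"),
    Probe("news 360", "openai gpt-5 release announcement", "saturated"),
    Probe("news 360", "Tigray famine humanitarian access", "cold"),
    # Only one phenotype for this lens, so it should be flagged.
    Probe("pdfs", "CRDT survey distributed systems consistency", "general"),
]

print(under_probed(plan))  # ['pdfs']
```

Keeping the plan separate from the doc-drafting step also helps with the bias noted above, since the keep/drop call is made against a fixed probe set rather than while trimming text.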

blightbow

Here are the exact queries executed, as requested by Margret:

Round 1 — Initial value-assessment UAT

Lens: forums

Query: how do experienced developers handle deeply nested async error handling

Baseline (no lens) first 3 URLs:

stackoverflow.com/questions/79168642/how-can-i-handle-and-debug-errors-in-complex-asynchronous-applications

stackoverflow.com/questions/57844527/error-handling-with-deeply-nested-async-functions

stackoverflow.com/questions/40849282/catching-errors-from-nested-async-await-functions

With lens first 3 URLs:

forums.swift.org/t/concurrency-async-await-actors/6516

news.ycombinator.com/item?id=44416221

news.ycombinator.com/item?id=31050650

Verdict context: Real composition shift away from Stack Overflow toward Discourse + HN — scope tracks the lens name.

Lens: programming

Query: rust tokio select macro cancellation safety

Baseline (no lens) first 3 URLs:

stackoverflow.com/questions/74547855/how-to-assess-cancel-safety-of-select

stackoverflow.com/questions/73939262/how-to-prevent-cancelling-futures-in-nested-select-macros

stackoverflow.com/questions/75346665/is-there-a-way-to-avoid-the-usage-of-tokioselect-macro

With lens first 3 URLs:

stackoverflow.com/questions/tagged/rust-tokio?sort=votes

stackoverflow.com/questions/78912186/subscribe-and-listen-to-dynamic-number-of-tokiosyncbroadcast-channels

stackoverflow.com/questions/tagged/rust?tab=Newest

Verdict context: Lens swapped two specific question pages for tag-listing index pages — quality went down for a specific-question query.

Lens: news 360

Query: openai gpt-5 release announcement

Baseline (no lens) first 3 URLs:

en.wikipedia.org/wiki/GPT-5

openai.com/index/introducing-gpt-5/

openai.com/index/introducing-gpt-5-5/

With lens first 3 URLs:

en.wikipedia.org/wiki/GPT-5

openai.com/index/introducing-gpt-5/

openai.com/index/gpt-5-new-era-of-work/

Verdict context: Top two URLs identical, third swapped one OpenAI vendor page for another. No news outlets surfaced — effectively a no-op on this query.
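
A "no-op" call like this one can be made mechanical by diffing the two top-3 lists. This is a rough sketch using the URLs above; `shared_urls` and `host_of` are hypothetical helper names, and taking the host as everything before the first slash only works because these URLs are listed without a scheme.

```python
def shared_urls(baseline, with_lens):
    """URLs that appear in both result lists, kept in baseline order."""
    lens_set = set(with_lens)
    return [url for url in baseline if url in lens_set]


def host_of(url):
    """Host part of a scheme-less URL such as 'openai.com/index/...'."""
    return url.split("/")[0]


baseline = [
    "en.wikipedia.org/wiki/GPT-5",
    "openai.com/index/introducing-gpt-5/",
    "openai.com/index/introducing-gpt-5-5/",
]
with_lens = [
    "en.wikipedia.org/wiki/GPT-5",
    "openai.com/index/introducing-gpt-5/",
    "openai.com/index/gpt-5-new-era-of-work/",
]

overlap = shared_urls(baseline, with_lens)
new_hosts = {host_of(u) for u in with_lens} - {host_of(u) for u in baseline}

print(len(overlap), "of", len(baseline), "URLs unchanged")  # 2 of 3 URLs unchanged
print(new_hosts)  # set(), i.e. the lens introduced no new domains
```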

Lens: fediverse forums

Query: mastodon instance defederation policy debate

Baseline (no lens) first 3 URLs:

en.wikipedia.org/wiki/Mastodon_(social_network)

mastodon.social/@rodhilton/110317964109228557

fediversereport.com/defederation/

With lens first 3 URLs:

reddit.com/r/Mastodon/comments/zf5529/united_federation_of_instances/

reddit.com/r/Mastodon/comments/14y5pti/in_terms_of_fediverse_social_media_what_exactly/

news.ycombinator.com/item?id=36351978

Verdict context: Lens returned Reddit (not fediverse) + HN. The actual fediverse source (mastodon.social post) appears in baseline, not in lens output. Name oversells the scope.

Lens: usenet/archive

Query: linux kernel scheduler history BSD comparison

Baseline (no lens) first 3 URLs:

en.wikipedia.org/wiki/Comparison_of_BSD_operating_systems

en.wikipedia.org/wiki/Comparison_of_operating_system_kernels

en.wikipedia.org/wiki/Linux_kernel

With lens first 3 URLs:

en.wikipedia.org/wiki/Comparison_of_BSD_operating_systems

kernel-internals.org/sched/scheduler-evolution/

dev.to/satorutakeuchi/a-brief-history-of-the-linux-kernel-s-process-scheduler-the-very-first-scheduler-v0-01-9e4

Verdict context: Lens replaced two Wikipedia hits with specialist technical write-ups (kernel-internals.org, dev.to), but no actual Usenet / groups.google sources surfaced. Useful nudge toward specialist deep-dives, but not the scope the name promises.

Lens: academic

Query: retrieval augmented generation benchmark hallucination reduction

Baseline (no lens) first 3 URLs:

genai.stackexchange.com/questions/2662/what-are-the-most-reliable-strategies-to-reduce-hallucinations-in-retrieval-augm

arxiv.org/abs/2505.04847

arxiv.org/abs/2404.08189

With lens first 3 URLs:

link.springer.com/chapter/10.1007/978-3-031-89063-5_42

link.springer.com/chapter/10.1007/978-3-031-93681-4_3

pmc.ncbi.nlm.nih.gov/articles/PMC12540348/

Verdict context: Clean shift away from StackExchange + arXiv preprints toward Springer book chapters + PMC. Scope-correct.

Lens: pdfs

Query: CRDT survey distributed systems consistency

Baseline (no lens) first 3 URLs:

en.wikipedia.org/wiki/Conflict-free_replicated_data_type

stackoverflow.com/questions/34192283/what-is-crdt-in-distributed-systems

dev.to/fedekau/crdts-and-distributed-consistency-part-1-building-a-distributed-counter-22d3

With lens first 3 URLs:

dspace.mit.edu/bitstream/handle/1721.1/145983/3447865.3457963.pdf?sequence=1&isAllowed=y

crdt.tech/papers.html

crdt.tech/implementations

Verdict context: Real PDF (MIT dspace) + PDF-curator domain. Scope-correct shift.

Lens: kagi documentation

Query: how to use lenses

Baseline (no lens) first 3 URLs:

blinksandbuttons.net/how-to-use-camera-lenses/

youtube.com/shorts/foRehwsihQY (contact-lens tutorial)

allaboutvision.com/eyewear/contact-lenses/how-to/contact-lens-beginners-guide/

With lens first 3 URLs:

en.wikipedia.org/wiki/Kagi_(search_engine)

help.kagi.com/kagi/features/lenses.html

help.kagi.com/kagi/getting-started/

Verdict context: Dramatic, scope-correct rescue from camera + contact-lens content to Kagi product docs.

Cross-cutting probes

Probe 1: lens + agreeing site: operator

Query: site:news.ycombinator.com rust borrow checker frustration with lens_id=forums

First 3 URLs:

news.ycombinator.com/item?id=46453062

news.ycombinator.com/item?id=42160501

news.ycombinator.com/item?id=42485536

Verdict context: Lens + agreeing operator compose cleanly; results stay on the agreed domain.

Probe 2: lens + conflicting site: operator

Query: site:reddit.com retrieval augmented generation hallucination with lens_id=academic

Result: No results returned. Empty-hint frontmatter named the lens filter as the most restrictive.

Verdict context: Intersection semantics confirmed — academic scope and reddit.com don't overlap, and the empty-hint diagnostic correctly identifies the lens.

Probe 3: operator-only redundancy check

Query: CRDT survey distributed systems consistency filetype:pdf (no lens)

First 3 URLs:

en.wikipedia.org/wiki/Conflict-free_replicated_data_type

arxiv.org/pdf/1707.01747

arxiv.org/pdf/2310.18220

Verdict context: filetype:pdf alone leaks Wikipedia at #1 and misses crdt.tech (which the pdfs lens surfaced twice). The lens captures PDF-curator domains, not just URL extensions — not fully replicable with operators.

Probe 4: research-shaped query on most-useful lens

Query: mixture of experts routing collapse training instability with lens_id=academic

First 3 URLs:

engrxiv.org/preprint/view/4504

engrxiv.org/preprint/view/4764

ui.adsabs.harvard.edu/abs/arXiv:2510.11370

Verdict context: engrxiv + ADS-indexed arXiv. The academic lens delivers a source mix that would be tedious to assemble by hand with site: clauses.

Round 2 — Adversarial review probes

Round 2 of the lens_id value assessment. All searches issued with limit=3 on the default search workflow. URLs reproduced verbatim from response bodies; ordering preserved.

Lens: programming (confirm with scope note)

Probe 1 — beginner-shape query

Query: getting started with python decorators

Baseline first 3 URLs:

https://stackoverflow.com/questions/12046883/python-decorator-can-someone-please-explain-this

https://stackoverflow.com/questions/17330160/how-does-the-property-decorator-work-in-python

https://realpython.com/primer-on-python-decorators/

With lens first 3 URLs:

https://stackoverflow.com/questions/71333063/how-to-create-a-decorator-with-decorator-attributes

https://stackoverflow.com/questions/12032481/creating-a-decorator-class

https://stackoverflow.com/questions/48966973/proper-use-of-decorators

Verdict context: Lens returns relevant SO Q&A, not tag-listing pages. Beginner queries are fine.

Probe 2 — language-overview query

Query: go language goroutine channels overview

Baseline first 3 URLs:

https://www.reddit.com/r/golang/comments/1iysrny/how_would_you_introduce_goroutines_and_channels/

https://medium.com/@ashutosh7379947493/goroutines-and-channels-in-go-a-beginners-guide-a8acac799212

https://medium.com/@senthilrch/a-comprehensive-guide-to-goroutines-633f380a0def

With lens first 3 URLs:

https://stackoverflow.com/questions/41754198/am-i-using-channels-incorrectly-with-golang

https://stackoverflow.com/questions/35262837/go-goroutine-with-channel-strange-result

https://stackoverflow.com/questions/22621514/is-scalas-actors-similar-to-gos-coroutines

Verdict context: Lens swaps Reddit + Medium tutorials for SO Q&A. Tradeoff, not harm.

Probe 3 — first-agent reproduction (expert-niche)

Query: rust tokio select macro cancellation safety

With lens first 3 URLs:

https://stackoverflow.com/questions/tagged/rust-tokio?sort=votes

https://stackoverflow.com/questions/78912186/subscribe-and-listen-to-dynamic-number-of-tokiosyncbroadcast-channels

https://stackoverflow.com/questions/tagged/rust?tab=Newest

Verdict context: First agent's tag-listing degradation reproduces. Failure phenotype is real for expert-niche query shape.

Lens: forums (overturn)

Probe 1 — Discourse-bearing community

Query: nixos flake systemd service activation debugging

With lens first 3 URLs:

https://discourse.nixos.org/t/help-getting-started-with-flakes/23102

https://www.reddit.com/r/NixOS/comments/1aers38/difficulty_switching_nixos_configs_from_channel/

https://www.reddit.com/r/NixOS/comments/176huw1/anyone_using_rustdesk_and_have_a_systemd_script/

3-domain site: union URLs (site:reddit.com OR site:news.ycombinator.com OR site:stackoverflow.com):

https://www.reddit.com/r/NixOS/comments/1rnq33u/please_help_help_help_im_getting_crazy/

https://www.reddit.com/r/NixOS/

https://www.reddit.com/r/NixOS/comments/1psy5xe/how_to_write_system_testssmoke_tests/

Verdict context: Operator union misses discourse.nixos.org. Lens reaches it. Operator cannot replicate.
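
The "operator cannot replicate" claim boils down to checking which lens results fall outside the site: allowlist. Here is a minimal sketch of that check using the three lens URLs from this probe; `host_in_allowlist` is a hypothetical helper, and the allowlist is the same 3-domain union used above.

```python
from urllib.parse import urlparse

ALLOWLIST = ("reddit.com", "news.ycombinator.com", "stackoverflow.com")


def host_in_allowlist(url, allowlist=ALLOWLIST):
    """True if the URL's host equals an allowlisted domain or is a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == domain or host.endswith("." + domain) for domain in allowlist)


lens_urls = [
    "https://discourse.nixos.org/t/help-getting-started-with-flakes/23102",
    "https://www.reddit.com/r/NixOS/comments/1aers38/difficulty_switching_nixos_configs_from_channel/",
    "https://www.reddit.com/r/NixOS/comments/176huw1/anyone_using_rustdesk_and_have_a_systemd_script/",
]

# Results the lens surfaced that no site: clause in the union could have matched.
outside = [url for url in lens_urls if not host_in_allowlist(url)]
print(outside)
```

Run against this probe, the only URL left in `outside` is the discourse.nixos.org thread, which is the result the operator union cannot reach.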

Probe 2 — multi-platform community

Query: homeassistant zigbee2mqtt sonoff coordinator stick recommendation

With lens first 3 URLs:

https://www.facebook.com/groups/HomeAssistant/posts/3467140986890580/

https://www.facebook.com/groups/HomeAssistant/posts/4259892597615411/

https://www.reddit.com/r/homeassistant/comments/12bcllt/best_zigbee_dongle_for_new_ha_user/

Verdict context: Lens reaches Facebook Groups — outside any LLM-enumerable site: allowlist.

Lens: news 360 (overturn)

Probe 1 — cold news / non-US topic

Query: Tigray famine humanitarian access

Baseline first 3 URLs:

https://en.wikipedia.org/wiki/Famine_in_northern_Ethiopia_(2020%E2%80%93present)

https://theconversation.com/starvation-as-a-weapon-of-war-how-ethiopia-created-a-famine-in-tigray-268395

https://theconversation.com/the-war-in-tigray-the-makings-of-a-man-made-famine-and-what-can-be-done-165406

With lens first 3 URLs:

https://en.wikipedia.org/wiki/Famine_in_northern_Ethiopia_(2020%E2%80%93present)

https://www.aljazeera.com/features/2026/1/23/hunger-death-devastation-no-respite-in-tigray-a-year-after-us-aid-cuts

https://www.aljazeera.com/news/2021/6/10/un-official-says-350000-people-in-famine-in-ethiopias-tigray

Verdict context: Lens swaps 2x TheConversation (analysis) for 2x Al Jazeera (news outlet). Real shift.

Probe 2 — regional politics

Query: Argentina Milei labor reform

With lens first 3 URLs:

https://en.wikipedia.org/wiki/Economic_reforms_of_Javier_Milei

https://www.argentinareports.com/argentina-closer-to-passing-controversial-labor-reform-after-senate-passes-bill/4197/

https://www.reuters.com/sustainability/sustainable-finance-reporting/mileis-labor-reform-drive-sets-up-clash-with-argentina-unions-2026-02-10/

Verdict context: Surfaces regional outlet + wire service. Confirms lens engages on cold/regional news topics.

Lens: fediverse forums (partial overturn)

Probe 1 — topic-about-fediverse phrasing

Query: lemmy community migration kbin

With lens first 3 URLs:

https://en.wikipedia.org/wiki/Lemmy_(social_network)

https://www.reddit.com/r/LemmyMigration/comments/145fdyg/were_moving_to_rkbinmigration_the_redditors_guide/

https://www.reddit.com/r/LemmyMigration/

Verdict context: No fediverse-instance content. First agent's failure mode reproduces.

Probe 2 — fediverse-native syntax

Query: "!selfhosted@lemmy.world" OR "!asklemmy@lemmy.ml"

With lens first 3 URLs:

https://pawb.social/post/41542173

https://lemmy.world/c/selfhosted

https://lemmy.world/signup

Verdict context: Native-syntax query activates the lens. Reaches pawb.social + lemmy.world. Brittle activation pattern.

Lens: usenet/archive (confirm)

Probe 1 — Usenet-native topic

Query: comp.lang.lisp pitfalls common lisp newsgroup

With lens first 3 URLs:

https://en.wikipedia.org/wiki/Lisp_(programming_language)

https://en.wikipedia.org/wiki/Common_Lisp

https://stackoverflow.com/questions/1159208/can-i-use-common-lisp-for-sicp-or-is-scheme-the-only-option

Verdict context: No Usenet archive content despite explicitly Usenet-native topic.

Probe 2 — operator pre-narrowed

Query: site:groups.google.com comp.lang.lisp (with lens)

With lens first 3 URLs:

https://groups.google.com/g/news.lists/c/6PRP7-PGJiU

https://groups.google.com/g/comp.lang.lisp/c/1Z-iMKGaii8/m/pEPOLt4qAXUJ

https://groups.google.com/g/comp.lang.lisp/c/j2Ri5Fu__sc

Verdict context: Results come from the operator; lens adds no observable rank boost beyond what site: already provides.

Lens: academic (confirm)

Operator-replication probe

Query: site:scholar.google.com OR site:ncbi.nlm.nih.gov OR site:arxiv.org retrieval augmented generation hallucination benchmark

Operator-only first 3 URLs:

https://arxiv.org/abs/2505.04847

https://arxiv.org/abs/2401.00396

https://arxiv.org/abs/2502.19209

First agent's lens probe (for comparison) returned 2x Springer book chapters + PMC.

Verdict context: Operator returns preprints only; lens reaches peer-reviewed venues. Different qualitative mix.

Lens: pdfs (overturn)

Operator-replication probe

Query: kubernetes operator pattern best practices

With lens first 3 URLs:

https://github.com/cncf/tag-app-delivery/blob/main/operator-wg/whitepaper/Operator-WhitePaper_v1-0.md (markdown, not PDF)

https://github.com/ramosITBooks/DockerKubernetesIstioELKBooks/blob/main/K8S/Kubernetes%20Best%20Practices.pdf (PDF)

https://github.com/arpanpathak/kubernetes-operator-pattern (HTML repo)

filetype:pdf first 3 URLs (same query, no lens):

https://www.cncf.io/wp-content/uploads/2021/07/CNCF_Operator_WhitePaper.pdf

https://builders.intel.com/docs/networkbuilders/kubernetes-operators-automated-lifecycle-management-technology-guide.pdf

https://arxiv.org/pdf/2411.16639

Verdict context: filetype:pdf returns 3/3 actual PDFs from authoritative sources; lens returns 1/3 actual PDFs with GitHub bias.

Lens: kagi documentation (confirm)

Cross-property probe

Query: orion browser configure

With lens first 3 URLs:

https://en.wikipedia.org/wiki/Kagi

https://en.wikipedia.org/wiki/Kagi_(search_engine)

https://help.kagi.com/orion/

Verdict context: Lens scales across Kagi properties (search + Orion). Stable behavior across product surface.

Cross-cutting notes

Saturation observation: The news 360 baseline-vs-lens diff for the Tigray query is the clearest single demonstration that round-1's GPT-5 query masked the lens's effect. Both produce news-adjacent results, but the lens specifically substitutes wire/news outlets for analysis publications.

Operator-replication asymmetry: forums beats its operator union (Discourse misses), pdfs loses to filetype:pdf (PDF strictness), academic differs in venue mix from a scholar/ncbi/arxiv union. The lens-vs-operator comparison should be table-stakes for every keep/drop decision.

Activation phenotype: fediverse forums engaging only on native-syntax queries is the most unusual finding — none of the other lenses showed query-shape-dependent activation in these probes.

Total KagiSearch calls round 2: 15.

nobodywasishere

Something important to keep in mind is that our results are not always deterministic (same search done a few minutes later can produce slightly different results), so with any of these, I'd recommend performing multiple searches to assess quality.
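
As a rough way to quantify that run-to-run variation, one option is to repeat the same search a few times and average the Jaccard overlap of the top-k URL sets. The sketch below only illustrates the metric: the function names are hypothetical and the three runs are made-up placeholder lists, not real search output.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two URL lists, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(runs):
    """Average Jaccard similarity over every pair of runs (1.0 means fully stable)."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Placeholder results for three repeats of the same query (not real data).
runs = [
    ["a.example/1", "b.example/2", "c.example/3"],
    ["a.example/1", "b.example/2", "d.example/4"],
    ["a.example/1", "b.example/2", "c.example/3"],
]

print(round(mean_pairwise_jaccard(runs), 2))  # 0.67
```

A low score for the baseline alone would suggest that a baseline-vs-lens difference on a single run should not be read as a lens effect.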