TL;DR: LLMs are effective at detecting themselves. I think their accuracy could be vastly improved by further training them for this specific use case and by prompt engineering. Cost may be a prohibitive factor in implementing this approach, although good engineering practices (caching, a cheaper model) could alleviate it quite a bit. If that alone proves insufficient, limiting how many sites are scanned per month or per search may help; an additional subscription, or restricting the feature to the highest tier, may also be options. This approach may also introduce additional latency, but most of what solves the cost problem solves this as well. Overall, I believe this would greatly improve search quality.
kirkmc I've just done some very limited testing (sample size: 5) and it appears that even with a naive, off-the-shelf approach, AI can recognize itself quite well (4/5 correct; 0 false positives). Here's what I did, specifically:
1.) I tested some search results, since I already had an example in mind and that's the intended end use case. I used Qwen-235B (Thinking) for all of these tests - probably a bit overkill. The following two sites were tested (text only):
It got both of them right.
2.) I gave the AI an old document of mine. It correctly identified it as written by a human.
3.) From this point on I used two instances of Qwen, which I'll refer to as agents 1 and 2. I tasked agent 1 with generating a short-to-medium-length article about the differences between ibuprofen and acetaminophen (paracetamol). Agent 2 was then asked to determine whether the text was AI generated, and it correctly answered that it was. I then shut down both instances and spun up two new ones. Agent 1 neo was asked to write a "short-medium article about the differences between ibuprofen and paracetamol in the style of a small-web blogpost"; additionally, it was informed of the situation, told that it was generating synthetic data for this experiment, and instructed to sound human-like and cover its tracks. Agent 2 neo was then, of course, asked to determine if the text was AI generated, and it again answered correctly. Finally, I restarted agent 1's instance once more (agent 1 neo2) and reversed the roles: agent 2 neo was tasked with humanizing the text it had been given, using the insights it gained from detecting that it was AI generated. The humanized text successfully fooled agent 1 neo2.
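For anyone curious, the detection step was nothing fancier than a single classification prompt per text. Here's a minimal sketch of what "agent 2" was doing, assuming Qwen is served behind an OpenAI-compatible endpoint - the URL, API key handling, and model name below are placeholders, not actual Kagi or Qwen values:

```python
# Sketch of the "agent 2" detection step: one classification prompt against an
# OpenAI-compatible endpoint. Endpoint URL and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example-inference-host/v1", api_key="YOUR_KEY")

DETECTOR_PROMPT = (
    "You will be given the main text of a web page. "
    "Decide whether it was written by a human or generated by an LLM. "
    "Answer with exactly one word: HUMAN or AI."
)

def classify(page_text: str) -> str:
    """Ask the model for a one-word verdict on a page's text."""
    response = client.chat.completions.create(
        model="qwen-235b-thinking",  # placeholder model name
        messages=[
            {"role": "system", "content": DETECTOR_PROMPT},
            {"role": "user", "content": page_text},
        ],
    )
    return response.choices[0].message.content.strip()
```

Agent 1 was just the same kind of call with a "write an article about..." prompt instead.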
From this it's clear that LLMs can definitely be used to detect themselves. They are not perfect, although it was promising that no false positives showed up so far - downranking a human site is much worse than failing to detect an AI site, in my book. They can definitely be fooled if enough effort is put in, but I highly doubt that AI slop mills would ever bother, which is also pretty promising. Additionally, it's important to note that this was text only. I've noticed that a hallmark of AI slop sites is their stock images, something that could probably also be detected and factored into the final verdict.
I believe that any sort of prompt engineering or post-training would greatly improve detection accuracy. For example, the AIs seemed completely oblivious to the fact that they love their em and en dashes (—). Sure, they're used sometimes, but rarely outside of books and academia. Informing agent 1 neo2 about this would've made the experiment 5/5. This video does a great job of explaining some other signs, at least in my experience:
Note that, especially for the subtler signs, a lot of fine-tuning would likely be necessary, if it's possible at all.
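To make the em-dash point concrete: a signal like that is cheap to compute and could be fed to the detector alongside the page text. The threshold below is made up purely for illustration; a real deployment would tune it, or let a trained classifier weigh the raw number instead:

```python
import re

def dash_rate(text: str) -> float:
    """Em/en dashes per 1,000 words: a crude stylistic signal."""
    words = max(len(text.split()), 1)
    return len(re.findall(r"[\u2014\u2013]", text)) * 1000 / words

def looks_suspicious(text: str, threshold: float = 5.0) -> bool:
    # Threshold is illustrative only; don't hard-gate on a single signal.
    return dash_rate(text) > threshold
```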
As for existing solutions like originality.ai, I don't think they're a good fit for kagi due to potential data privacy issues. They're also likely to be more expensive in the long run, and they're mostly made for documents, not websites.
While we're on the topic of cost:
- "The average Google user searches three or four times per day or about 100 times per month."
- The average search yields about 30 results without expanding it further (which almost no one does)
Assuming a worst-case scenario where all searches are unique or result caching isn't implemented, this could bring the cost per user up to $9 a month (way too high when kagi's lowest tier is $5/month). That's using cost estimates from kagi assistant for Qwen-235B (Thinking). Cheaper models would likely work equally well if sufficiently trained - you could even try distillation, which should work well in this scenario as far as I understand (https://en.wikipedia.org/wiki/Knowledge_distillation). That could probably bring the cost per evaluation down to around $0.001, for a total of about $3 in the aforementioned worst case. Training would also cost money, of course.
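For transparency, here's the arithmetic behind those figures. The per-evaluation prices are my own rough assumptions (the first is simply backed out of the $9 estimate), not published pricing:

```python
# Back-of-the-envelope cost model; every input here is an assumption.
searches_per_month = 100                # average user, per the quoted figure
results_per_search = 30                 # first page only, nobody expands it
evals_per_month = searches_per_month * results_per_search  # 3,000

cost_per_eval_large = 0.003             # ~$/evaluation implied by the $9 estimate
cost_per_eval_small = 0.001             # ~$/evaluation for a distilled/cheaper model

print(evals_per_month * cost_per_eval_large)  # 9.0 -> ~$9/user/month, worst case
print(evals_per_month * cost_per_eval_small)  # 3.0 -> ~$3/user/month, worst case
```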
If caching doesn't alleviate costs, some other options to consider would be limiting scans to only the first 10 or so sites (most users only click the first three or so links anyway) or making it something the user triggers manually, akin to how site summarization works currently.
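On the caching point: the verdict only depends on the page content, so it can be shared across users and searches. A sketch of what I mean, with SQLite standing in for whatever store would actually be used:

```python
import hashlib
import sqlite3

# Shared verdict cache keyed on a hash of the page text, so any given page is
# only scanned once (until its content changes). SQLite is illustrative only.
db = sqlite3.connect("ai_scan_cache.db")
db.execute("CREATE TABLE IF NOT EXISTS verdicts (key TEXT PRIMARY KEY, verdict TEXT)")

def cached_classify(page_text: str, classify) -> str:
    key = hashlib.sha256(page_text.encode("utf-8")).hexdigest()
    row = db.execute("SELECT verdict FROM verdicts WHERE key = ?", (key,)).fetchone()
    if row:
        return row[0]                    # cache hit: no LLM call, no cost
    verdict = classify(page_text)        # cache miss: pay for one evaluation
    db.execute("INSERT OR REPLACE INTO verdicts VALUES (?, ?)", (key, verdict))
    db.commit()
    return verdict
```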
Personally, I'd be happy with paying a small fee (up to $5/month) or upgrading to a more expensive plan for such features, but I can't speak for everyone.
Search speed may also suffer with this approach, as the LLMs need time to scan the sites. I suppose displaying some sort of loading icon may be an option? That is, first display the "raw" results to the user, and only then start retrieving scan results or scanning the sites in the background.
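Concretely, I'm picturing something along these lines; the function names are made up, and the only point is that rendering the result list and scanning are decoupled:

```python
import asyncio

# Sketch of "show raw results first, annotate later": render the list right
# away, then attach AI verdicts to individual results as they come back.
async def scan_and_badge(result, classify_async, update_ui):
    verdict = await classify_async(result["text"])
    update_ui(result["url"], verdict)     # e.g. swap a loading icon for a badge

async def show_results(results, classify_async, render_ui, update_ui):
    render_ui(results)                    # raw results are shown immediately
    await asyncio.gather(
        *(scan_and_badge(r, classify_async, update_ui) for r in results)
    )
```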
To conclude, I think using LLMs to scan for AI generated text has great potential. My small experiment has shown that they seem to pick up on it quite well even when not specifically post-trained for it or instructed on how to do it. While they can be fooled with some amount of effort, I doubt that most slop sites would go to such lengths - especially as kagi hasn't got that big of a user base. Costs could be an issue, although I find this unlikely if good caching and the right model are applied. This should also reduce latency. Overall, I believe that implementing this sort of scanning would greatly benefit search quality. If I want an AI answer - I'll just ask AI. I hate it when I get AI slop in search results as it's not credible and wastes my time.