I'm not sure if this is a "bug" or a "suggestion" or what, but it's on my mind and it's frustrating and I don't have a solution so I don't think it's a "feature" I'm asking for.
So I watched Freya Holmér's video about their experience searching for information about the .glb file format, from the perspective of wanting to write a parser for the 3D modeling software they're writing. tl;dw - they searched "glb file format" in Google and 80% of the first page was AI generated marketing pages. Bing was worse.
I immediately wanted to see how Kagi handled the situation, and the results definitely left me wanting. It's better, but only marginally. The side card has a bunch of good information (because it's just Wikipedia), and there are a few more real results (including the random r/godot post), but it's still mostly AI generated marketing.
"glb file format"
To be clear, I understand that "glb file format" is a high-level query that you'd expect to get introductory results for, but it's an inherently technical topic and the actual spec was the 20th search result.
It's easier to list the non-generated links from the results that might be useful (in the order they appear):
That's it. I also get that there have always been largely unhelpful "what is this random file format" pages, but at least you knew those wouldn't be helpful. The "GLB File: A Comprehensive Guide" marketing pages are much harder to know to avoid.
"glb file format spec"
I can see arguing that "if you want to implement the format you want the spec, so search for it." There are definitely more useful results for "glb file format spec" but they're hidden amongst all the same AI generated marketing pages.
The previous links are higher up here, again in order, links omitted for previous entries:
I think mostly I expected to have real world information and not a wall of marketing slop. I get that it's a hard problem to solve, and being able to adjust rankings is a great tool in helping make search more useful, but it's unreasonable to expect each individual to derank/block each site that markets using AI.
One thing Freya notes in their video though is the lack of any author in the "blog" posts. Some of that comes down to "corporate marketing blog post" type stuff, but you still see authors listed in older pages, even if it's just "company team."
I had the thought "I guess we'll just have to assume anything published after 2022 is AI slop" and tried filtering by date, and I actually had some surprising results! (The "date" input in the advanced search dialog is misleading, btw: "before date" is really "start date" even though it sounds like it should be "end date".)
The "correct" date range (to 2022)
This is maybe its own bug but... all the results are undated, and they still include the same AI marketing pages that showed up in the unrestricted search results.
Fully bounded ranges
(1990 - 2022)
These results have a lot more forum posts, which are going to be more all over the place, but at least it's humans explaining things.
(1900 - 2022)
The difference from the 1990 results surprised me; it feels a lot more like old Google to me. It's much less forum driven and more company blog focused, but they're obviously human written and tend to have something useful about them, and at least often provide an actual concise, meaningful overview of what a glb file is.
I think a medium between the two would be nice, but honestly, I feel a lot more comfortable in the 1900 results. They provide enough of an introduction to what the .glb format is about without going too off the rails like the 1990 results do.
Unfortunately, it loses out on having access to Wikipedia and to recent but still useful results (like the Just Solve the File Format Problem wiki, which was recently updated, putting it in 2024).
I hope that provides the shape of my frustrations and hopes and why I feel like it's a "bug" in the search space.