The following might seem like an absurd feature to consider at the moment, as it doesn't seem like the filetype: operator has had a lot of content indexed for it yet (or perhaps it's trusting only the URL, rather than the "Content-Disposition" header or similar metadata). So, until the search quality for filetypes is improved, this feature request is likely not worth immediate consideration.
Essentially, the addition of a filesize: operator that allows filtering the document length by either a lower or upper bound would enable one to better distinguish a fake or sample copy of a document from a document that actually contains the desired content. The same thing could apply to image sizes, text/html documents, etc. Perhaps a better name would be content-length: since that way it directly correlates to the value returned in the HTTP response headers at the time of indexing.
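To illustrate what "the value returned from the HTTP response headers" could mean in practice, here is a rough sketch of how an indexer might record that size. I have no idea how Kagi's crawler actually works, so the function name indexed_size and the plain header mapping are purely hypothetical.

```python
from typing import Mapping, Optional


def indexed_size(headers: Mapping[str, str]) -> Optional[int]:
    """Return the Content-Length in bytes, or None if absent or malformed.

    Hypothetical helper: 'headers' is assumed to be the response headers
    captured by a crawler at indexing time.
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("content-length")
    if raw is None:
        # Some responses (e.g. chunked transfers) carry no Content-Length.
        return None
    raw = raw.strip()
    # Accept plain ASCII digits only; signs or other junk mean "unknown".
    if not (raw.isascii() and raw.isdigit()):
        return None
    return int(raw)
```

Results without a usable header would come back as None, so the operator would need to decide whether to hide or keep results of unknown size.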
This suggestion is really just an additional search operator that would likely only augment the filetype: operator, and is in essence a "google-dorking" feature (ref: https://www.exploit-db.com/google-hacking-database) that allows better control of the search results once Kagi is at the point that its indexer starts crawling "interesting" things. Therefore, I'm not sure how many others would actually use it until they know it's available.
As mentioned, it is likely not too useful until Kagi indexes more filetypes (such as filetype:pdf) or MIME types (application/pdf). But when/if that does happen, the ability to apply comparison operations to the size returned from the "Content-Length" HTTP header would allow one to distinguish documents that contain a lot of content from documents that contain only a subset of the desired content.
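As a sketch of what those comparison operations might look like, the snippet below parses a made-up syntax such as content-length:>=1MB into a predicate over the indexed size in bytes. The syntax, the decimal units, and the name parse_size_filter are all my own assumptions for illustration, not anything Kagi supports today.

```python
import re
from typing import Callable, Optional

# Assumed decimal units (1 kB = 1000 bytes); a real operator might choose otherwise.
_UNITS = {"": 1, "b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}

# Hypothetical syntax: content-length:<comparison><number><optional unit>
_PATTERN = re.compile(r"^content-length:(>=|<=|>|<)(\d+)([a-z]*)$", re.IGNORECASE)


def parse_size_filter(term: str) -> Optional[Callable[[int], bool]]:
    """Turn a term like 'content-length:>=1MB' into a size predicate.

    Returns None when the term does not match the assumed syntax.
    """
    match = _PATTERN.match(term.strip())
    if match is None:
        return None
    op, number, unit = match.groups()
    factor = _UNITS.get(unit.lower())
    if factor is None:
        return None  # unknown unit suffix
    bound = int(number) * factor
    comparisons = {
        ">=": lambda size: size >= bound,
        "<=": lambda size: size <= bound,
        ">": lambda size: size > bound,
        "<": lambda size: size < bound,
    }
    return comparisons[op]
```

With results like ("sample.pdf", 48_000) and ("full-spec.pdf", 3_200_000), the predicate from parse_size_filter("content-length:>=1MB") would keep only the second one, which is exactly the "skip the 4-page sample" case.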
Specifically for my needs, I tend to have to dig up ancient documentation or random standards/specifications. Since many of the top results (from the other search engines) are 4-page watermarked samples, I tend to have to scroll through the results hoping to find a home directory or .edu site that is hosting the actual document or source code that I'm looking for. Being able to specify a minimum size would allow me to give up much faster than before.
Actually, it would also be pretty neat for Kagi to show the size (and date) that was indexed in the Kagi "Website ranking adjustment" popup that can be viewed for an indexed result. (I can create this as a separate feature request if it's easier to track that way.)