One of the most useful resources when developing in "non-mainstream" languages (e.g., Nix) is the body of existing public repositories. The most common way to search them is the GitHub search bar (for example, this search query). Currently there are two main issues:
- We depend on having a GitHub account
- This does not search the other code hosting providers (e.g., GitLab, Codeberg, etc.), much less self-hosted repositories. Admittedly we probably don't want to burden the latter with crawlers.
Finding code being used out in the wild can be really useful for libraries/languages with limited documentation. I understand that this may be too compute-intensive for it to be viable, but it may still be worth discussing.
The simplest implementation I can think of would filter first by language (e.g., only .nix files in the example above) and then fuzzy-find the closest matches in those files. I presume that fuzzy matching would reduce the compute burden. At the moment, each of the main code hosting sites implements this feature separately, and only for the repositories it hosts.
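
To make the idea concrete, here is a rough sketch of the "filter by language, then fuzzy-match" approach in Python. It assumes the repositories are already available as files on local disk, treats the file extension as the language filter, and uses the standard library's `difflib.SequenceMatcher` as a stand-in for whatever fuzzy matcher a real service would use. The function name `search_code` and the `limit` and `cutoff` parameters are made up for illustration.

```python
import difflib
from pathlib import Path


def search_code(root, extension, query, limit=5, cutoff=0.4):
    """Return up to `limit` (score, path, line_number, line) tuples, best first.

    Hypothetical helper: `root` is a local directory of checked-out code,
    `extension` (e.g. ".nix") acts as the language filter.
    """
    hits = []
    needle = query.lower()
    # Step 1: keep only files of the requested language.
    for path in sorted(Path(root).rglob(f"*{extension}")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        # Step 2: score every non-empty line against the query.
        for number, line in enumerate(text.splitlines(), start=1):
            stripped = line.strip()
            if not stripped:
                continue
            score = difflib.SequenceMatcher(None, needle, stripped.lower()).ratio()
            if score >= cutoff:
                hits.append((score, str(path), number, stripped))
    # Highest similarity first; the sort is stable, so ties keep scan order.
    hits.sort(key=lambda hit: -hit[0])
    return hits[:limit]
```

For example, `search_code("./repos", ".nix", "services.openssh.enable")` would list the lines in `.nix` files under `./repos` that most resemble the query. Note that this sketch scans every line of every matching file, so its cost grows linearly with the amount of code; a search engine would presumably need some kind of precomputed index instead.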