The llama3-70b should have a quality similar or slightly supperior to claude3-haiku but is cheaper and a lot faster when using the API from groq. (+300t/s there)
https://wow.groq.com/
Groq doesn't support the new Gemma2-27b model yet, which should be at a similar level to llama-70b (or maybe better) and be even faster.
It's possible to try for free at groq.com, I think it's quite insane to have this speed at the instant answers.