Add AI image/music generation in Assistant

smartbeach

As others have said, I don't think image generation is a good fit as a general concept.

The comparison with summarizing text isn't fully fair in my opinion. For me, search is about finding what you're looking for, not creating it. By summarizing a page I can find the information I'm after quicker than reading the whole thing, for example.

The closest thing I can think of when it comes to image generation is something like "What does X look like?" to find out what the thing you're looking for looks like. For music/sound it could be something like "What does X sound like?", and for those kinds of use cases I suspect generative AIs are too immature and would risk hallucinating too much. If one can have references like the assistant it should be fine though. When image and sound generation is more reliable it might be a better idea.

But a general content generator doesn't fit in my opinion. Like, why not build an image editor into Kagi? Because it solves a different problem, just like an image generator.

I don't hate the idea in general and I wouldn't boycott Kagi if it was implemented, but I'd rather see focus and effort spent on things related to searching and finding information rather than competing with GenAI services directly. If it's low effort to implement and it doesn't affect what I pay I don't mind it.

mkhalili

In my case I was looking to replace my ChatGPT Plus subscription with Kagi Assistant, which is why I’m testing Ultimate. I’d rather give my money to Kagi and support its mission, than OpenAI... and I was under the impression that Assistant was fully-featured, offering more models and more privacy. (Ofc I may have missed something.)

Image generation isn’t critical to my workflow, and I’m OK if it’s not included in Assistant. I also understand that having this feature could be excessive for a search engine, and focus is important (‘do one thing and do it well’!).

But then I think the limitations of Assistant vs using the models from their original providers should be clearly signposted on the promotional material for the Ultimate package. If only for the sake of managing expectations.

Vlad

mkhalili Assistant provides all models from OpenAI (except image generation) plus all models from almost all other providers. So unless you want to generate images/music (which is what this thread is about) you should be fine. Did I misunderstand something?

mkhalili

Vlad No, that’s clear now and I’m fine with it. I agree that OpenAI’s image generation probably shouldn’t be included, given Kagi’s focus.

My only proposal was to mention in the promotional copy around Kagi Assistant that OpenAI’s image generation doesn’t come with it. So people know this functionality isn’t there before diving in.

vijexa

I too was considering replacing my chatgpt subscription with an upgrade to kagi ultimate, but lack of image generation is somewhat of a deal breaker. In my case, image generation is something that I use maybe once in 1-2 months, so it doesn't make sense to have a subscription to midjourney or to both kagi ultimate and chatgpt, and I quite like that with chatgpt I get it included. To me it makes sense that kagi assistant would have image generation, but I interpreted kagi assistant to be a better, more privacy-focused alternative to chatgpt with better access to a search engine, not just an evolution of a search product. Perhaps I was mistaken in this interpretation.

Shayden

vijexa same here. Without image generation it feels like an incomplete package compared to what other sites offer. It would also be nice to be able to have a service for private image generation. If I make a presentation and need an image for it, there is not too much of a difference between searching for it in a search engine or using an AI assistant to generate something. It should be added in my opinion. I wouldn't mind if it was an add-on to the ultimate plan for $5 more.

Pictor

I would downvote if I could. I would prefer Kagi NOT do this.

Xytronix

As Vlad is experimenting with different AI providers I would like to highlight my personal favorites and alternatives:

Favorites:

Recraft AI [https://www.recraft.ai/docs#features]
Flux AI [https://docs.bfl.ml]
Ideogram [https://developer.ideogram.ai/ideogram-api/api-setup]
Replicate [https://replicate.com/luma/photon]

Expensive:

Gettyimages [https://developers.gettyimages.com]

Alternatives to consider:

DALL-E 3 by OpenAI
Midjourney
Adobe Firefly

Future relevant:

https://labs.google/fx

Xytronix

Video Generation

Synthesia [https://docs.synthesia.io/reference/introduction]
Sora (OpenAI) [https://sora.com]
Pika [https://pika.art/api]
Runway ML [https://runwayml.com/api]

No API

invideo AI [https://invideo.io]

Xytronix

Reve Image (Halfmoon) (#2 AA, #1 imgsys)
FLUX.1 [pro] (v1.0) (#5 AA, #2 imgsys)
Midjourney v6.1 (#6 AA)
RealVisXL V4.0 (#4 imgsys)
Playground v2.5 (Aesthetic Model) (#28 AA - standard, #5 imgsys - aesthetic)
ColorfulXL-Lightning (#6 imgsys)
Juggernaut XL v9 (#7 imgsys)
Image-01 (#7 AA)
Midjourney v6 (#10 AA)
Ideogram v2 Turbo (#11 AA)
Stable Diffusion 3.5 Large Turbo (#12 AA)
Proteus (#9 imgsys)
Mobius (#10 imgsys)
Fooocus (Quality) (#11 imgsys)
FLUX.1 [fast] (#19 AA, #8 imgsys)

(and the recently released models: 4o native, 2.0 Flash Exp, Ideogram 3.0, ...)

hcurtiss

My wife and I pay for Kagi Ultimate principally for family uses -- and these days mostly for access to ChatGPT. We just added an Ultimate plan for our 15 year old, and in a couple years will add her younger sister too. There are frequently times when it's fun to create AI images using the ChatGPT app on its free tier. For instance, we took a shot of a squirrel standing proudly on top of a rock and we added an American flag. It was epic and is now hanging in my shop. I took a picture of my daughter's shoe and had AI make it look like the puppy had eaten it. Had her totally going! All of this is to say, casual image generation will, I think, become more and more integrated into casual use of AI tools. I don't think I'd add another $10/month for every one of us, but I'd certainly pay Kagi's costs on an à la carte basis. Charge us, and make sure we know the costs up front, and I'll be just fine. Likewise, I would almost certainly pay $10-15 more per month to add that functionality to everyone in our family plan.

destiny188

I would love to have image generation integrated as a "side feature". It's starting to become good enough that it's actually useful in some situations.

As you're probably aware Google just released their new model "nano banana" which is very impressive, accessible via API and quite cheap at 3-4 cents per image.

Banebe

I created an account just to post this. (Well, that is what this forum is for, I concede.)
Me and 3 friends are contemplating upgrading from a family plan to ultimate.
The initial push was because of having access to newest full models.
But not being able to use its multimodal functionality for the output, only for the input, is a big bummer, and it is what currently suggests we should maybe stick with Kagi family and get a separate subscription with any of the providers that includes multimodal output.
Same as many others I would prefer to spend the money with you instead of a tech giant.
"Do one thing and do it well!", as mkhalili put it, resonates a lot with me.
But you are already in competition with OpenAI etc. for the Ultimate subscription, so if it is feasible to provide that functionality, I think it would be a good option for many people.
As the assistant usage is limited anyway, I don't think that additional costs from more usage should be a strong argument.

carl

@Banebe You are deliberately abusing the terms of Kagi and yet make demands on how they should provide their service? The best solution is probably for them to close the account of you and your "friends".

Thibaultmol

This might have been glossed over in the recent announcement, but the new Research Assistant has image generation and editing as one of its available tools.
https://kagi.com/assistant?profile=ki_research

At the time of writing this, it's using Gemini 3 pro image preview, with fallback to OpenAI's image gen (gpt-image-1) in case Gemini is down, and flux kontext pro for editing images (but that will soon also be Gemini 3 pro).

I think it's best that we close this, as image gen is probably the biggest of the audio/image/video gen things that people want to do. You're free to create separate video and music kagifeedback threads, as those are both very different things.

TomatootamoT

This should be marked as done, right?
But I also think it should be transparent how many images the user can generate, or what each generation costs.
