Sure, and limit yourself at the starting point. People underestimate how limiting these tools are: they're trained on a fixed set and can only reproduce noise from here and there.
> they're trained on a fixed set and can only reproduce noise from here and there
This anti-AI argument doesn't make sense; it's like saying it's impossible to reinvent multiplication from reading a times table. You can create new things via generalization or in-context learning (references).
In practice many image generation models aren't that powerful, but Gemini's is.
If someone created one that output multi-layer images/PSDs, which is certainly doable, it could be much more usable.
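For a rough sense of what that would take, here's a minimal sketch in Python with Pillow, using synthetic stand-ins for per-layer model outputs. Pillow can't write real PSDs (that would need something like pytoshop), so this composites the layers into a flat preview and keeps them separate in a multi-page TIFF instead:

```python
from PIL import Image, ImageDraw

# Stand-ins for per-layer outputs from a hypothetical layered image model:
# three same-size RGBA layers (background, subject, overlay effect).
size = (512, 512)
background = Image.new("RGBA", size, (40, 60, 90, 255))

character = Image.new("RGBA", size, (0, 0, 0, 0))
ImageDraw.Draw(character).ellipse((156, 156, 356, 356), fill=(220, 180, 140, 255))

fx = Image.new("RGBA", size, (0, 0, 0, 0))
ImageDraw.Draw(fx).rectangle((0, 400, 512, 512), fill=(255, 255, 255, 90))

layers = [background, character, fx]

# Flattened preview: composite the layers bottom-to-top.
flat = layers[0]
for layer in layers[1:]:
    flat = Image.alpha_composite(flat, layer)
flat.save("preview.png")

# Keep the layers editable as pages of a multi-page TIFF
# (a stand-in for a true layered PSD).
layers[0].save("layers.tif", save_all=True, append_images=layers[1:])
```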
If image generation is anything like code generation, then AI is not good at copying the layout / art style of the coder / artist.
Using Visual Studio, all the AI code generation applies Microsoft's syntax style and not my syntax style. The returned line of code might be correct, but the layout / art / syntax is completely off. This is with a solution of a little under one million lines of code, at the moment, for the AI to work from.
Art is not constant. The artist has a flow and may have an idea, but the art changes form with each stroke, even removing strokes that don't fit. As I see it, AI-generated content lacks the artist's emotion.
Image generation is nothing like AI code generation in this regard. Copying an artist's style is one of the things that is quite easy to do with open-weight models. Go to civitai and there are a million LoRAs trained specifically on recreating artist styles.

Earlier on in the Stable Diffusion days it even got fairly mean-spirited: someone would make a LoRA for an artist (or there would be enough in the training data for the base model to not need one), the artist would complain about people using it to copy their style, and then there would be an influx of people making more and better LoRAs for that artist. Sam Yang put out what was initially a relatively tame tweet complaining about it, and people instantly started trying to train LoRAs just to replicate his style even more closely.
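For concreteness, applying one of those style LoRAs takes only a few lines with Hugging Face diffusers. The model ID and LoRA filename below are placeholders; any SD 1.5 checkpoint and civitai-style .safetensors LoRA work the same way (and this assumes a CUDA GPU):

```python
import torch
from diffusers import StableDiffusionPipeline

# Placeholder base model; substitute whichever SD 1.5 checkpoint you use.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Hypothetical style LoRA file downloaded from civitai.
pipe.load_lora_weights(".", weight_name="artist_style_lora.safetensors")

image = pipe(
    "a castle on a cliff at sunset",
    num_inference_steps=30,
    cross_attention_kwargs={"scale": 0.8},  # LoRA strength
).images[0]
image.save("styled.png")
```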
Note: the original artist whose style Stable Diffusion was supposedly copying (Greg someone, a "concept art matte painting" artist) was in fact never in the training data.
Style is in the eye of the beholder, and it seems the text encoder interpreted his name closely enough for it to work anyway.
Putting it in the context of an anti-AI argument doesn't make sense. AI was everywhere, like in Photoshop brushes, long before it became a general buzzword for LLMs or image generation. I'm not anti-AI, but it's simply true that these models can only come up with a limited set based on their training data. Sure, one can get inspiration from a "times table", but if you only ever see 8s and 9s multiplied, you're limiting yourself.
> If someone created one that output multi-layer images/PSDs, which is certainly doable, it could be much more usable.
This reminds me: if you ask most image models for something "with a transparent background", they'll generate the image on top of a Photoshop checkerboard, and sometimes they'll even draw the checkerboard wrong.
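An easy way to tell whether a generated "transparent" image is the real thing or a painted checkerboard is to look at its alpha channel, e.g. with Pillow (the filename here is a placeholder):

```python
from PIL import Image

img = Image.open("generated.png")  # hypothetical model output

# A genuinely transparent image has an alpha band with values below 255.
# (Palette images with transparency would need a convert("RGBA") first.)
if "A" in img.getbands():
    alpha_min, alpha_max = img.getchannel("A").getextrema()
    if alpha_min < 255:
        print("real transparency: alpha channel present and used")
    else:
        print("alpha channel exists but is fully opaque")
else:
    print("no alpha channel; any 'transparency' is painted in (checkerboard)")
```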
For an artist, the starting point is a blank page, followed by a blur of erased initial sketches and strokes. And sources of inspiration are still a useful thing.