RE: Virtual Cityscapes with a new Stable Diffusion model

in #aiart · last year

Pretty cool fine-tuned model there! Do you think the similar building style and colours across the images is from your prompt(s) or the training data?

Might have to try it out and see what it does with like architectural terms, for example... 🤔

Definitely prompt-based: I switched "futuristic cityscape" to "medieval town" with some other embellishments and got these out. I wanted something more run-down, but it insists on putting nice-quality buildings there, even on muddy roads.

00000-3211241143.png

00004-2854989471.png
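For reference, this is roughly the kind of thing I'm running, via the Hugging Face diffusers library. A minimal sketch only: the checkpoint path is a stand-in for whatever fine-tuned model you have locally, and the negative prompt is my guess at fighting the pristine-building bias.

```python
# Rough sketch of the prompt swap using diffusers.
# "path/to/finetuned-model" is a placeholder for the actual checkpoint.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/finetuned-model", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="medieval town, muddy roads, overcast, detailed",
    # Trying to push back against the model's tendency toward tidy buildings:
    negative_prompt="pristine, new, clean, modern",
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]

image.save("medieval_town.png")
```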

Heh, wow. There are some hints of texture in the walls of the top image, but I see what you mean about the second: slightly jarring, unnaturally clean. I suppose if you added 'rundown buildings' it might push them closer to ruins (holes in the roofs, collapsed walls, exposed framework, that sort of thing), structural degradation rather than just a bit of normal wear. Depends what you're after, of course, but I guess you're wanting complete buildings 😄
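If you want to keep a composition you already like, another trick worth trying (just a guessed-at workflow, not what you're doing) is feeding a finished render back through img2img at low strength with the 'rundown' wording added. Sketch below, reusing your second image's filename as the input:

```python
# Sketch: nudge an existing render toward "rundown" via img2img.
# Low strength re-noises only lightly, so the overall layout survives.
# The model path is a placeholder for your fine-tuned checkpoint.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "path/to/finetuned-model", torch_dtype=torch.float16
).to("cuda")

init = Image.open("00004-2854989471.png").convert("RGB")

image = pipe(
    prompt="medieval town, rundown buildings, holes in roofs, collapsed walls",
    image=init,
    strength=0.35,  # small: re-texture surfaces without redrawing the scene
    guidance_scale=7.5,
).images[0]

image.save("medieval_town_rundown.png")
```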

It's surprisingly hard to get a specific result in my experience!

A picture's worth a thousand words, as the old saying goes, but these image models kind of show that the difference between a hundred-word picture and a thousand-word picture isn't really 1:10.

I think the next big step might be further training image models with RLHF, like how ChatGPT is a refinement of GPT-3 guided by human preference. Imagine telling alfamix to focus on making the walls slightly more timeworn without changing the fundamental composition of the image. Would be interesting...
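Purely as a thought experiment, the "guided by human preference" half could look something like this: a small reward model trained on pairwise picks with a Bradley-Terry style loss (as in the RLHF literature), whose score the generator would then be tuned against. Toy PyTorch sketch with made-up data, nothing to do with any real ChatGPT or alfamix code:

```python
# Toy sketch of the preference-model half of RLHF.
# Core idea: learn a scalar "human preference" score from pairwise
# comparisons, which could then be used to steer a generator.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1),  # scalar preference score per image
        )

    def forward(self, images):
        return self.net(images).squeeze(-1)

reward = RewardModel()
opt = torch.optim.Adam(reward.parameters(), lr=1e-4)

# Stand-in data: pairs where a human preferred image A over image B.
preferred = torch.rand(8, 3, 64, 64)
rejected = torch.rand(8, 3, 64, 64)

# Bradley-Terry loss: maximise P(preferred scores above rejected).
loss = -torch.nn.functional.logsigmoid(
    reward(preferred) - reward(rejected)
).mean()
opt.zero_grad()
loss.backward()
opt.step()
```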

Now, with GPT-4 able to respond to multi-modal prompts, this could be just over the horizon.