Heh, wow. There are some hints of texture in the walls of the top image, but I see what you mean: slightly jarring in the second... unnaturally clean. I suppose if you added 'rundown buildings' it might make them closer to ruins (holes in the roofs, collapsed walls, exposed framework, that sort of thing), so structural degradation rather than just a bit of normal wear. Depends what you're after of course, but I guess you're wanting complete buildings 😄

It's surprisingly hard to get a specific result in my experience!

A picture's worth a thousand words, as the old saying goes, but these image models kind of show that the difference between a hundred-word picture and a thousand-word picture isn't really 1:10.

I think the next big step might be further training image models with RLHF, like how ChatGPT is a refinement of GPT-3 guided by human preference. Imagine telling alfamix to focus on making the walls slightly more timeworn without changing the fundamental composition of the image. Would be interesting...

Now with GPT-4 able to respond to multimodal prompts, this could be just over the horizon.