Lemon
May 22, 2003

Warning: Big Effortpost Incoming

I've been playing around with Stable Diffusion for a while now and find it endlessly fascinating, but do agree to an extent with the comment/criticism that the images it produces, whilst pretty, are often lacking in composition/intent. Obviously this is where img2img comes into play and I've used that a bit, but I wanted to dig a bit deeper and see what I could really get out of it. Basically, could I use Stable Diffusion to create a quite specific image, down to particular points of composition? This definitely isn't anything that hasn't already been done, but I figured it may be interesting to share anyway.

The image I have in mind is a fantasy painting of an adventurer, haggard and bloodied, exiting out of a dungeon back into the open. He is wearing light armor and holding a sword. His expression is not heroic, but shaken due to what he has gone through in the darkness. He is stepping out of an almost pitch-black archway, and around the edges of the picture there is greenery, in sunlight. In general a not-so-subtle symbolism of the beginnings of a rising from the depths back into the light.

I had a general outline of the composition in my mind's eye, but otherwise was happy to let SD take the wheel on a lot of the other details. I scribbled a quick sketch of the basic outline, again noting a few points that I wanted to make sure were included:



At this point I should note that this sketch was the last time I used my tablet for anything; everything else I did drawing/editing-wise was done using my mouse. The other part of this experiment was to see how much of this could be done with basic equipment and free software only.

I'm using an 8GB GTX 1070, and all of the images are generated using the Protogen x3.4 model, on the DPM++ 2M Karras sampling method at 20 steps.

Edit: I'm using Automatic1111 interface with xformers installed also.

My first step is to make suitable base images in order to feed into img2img. I'm going to do separate images for the archway and for the character, as I think this will work better in terms of letting SD pour on the details without losing the composition. Once I've got some decent images, I'll stitch them together and feed them back into img2img at a lower strength, again to hopefully preserve the composition whilst blending the components into one image.

I could try to free-hand draw these images with my mouse but gently caress that, this is all about leveraging the tools available to me. So I download Daz Studio and throw together a quick scene where I pose my character and decide on the structure of the image. All of the assets I used were things that you get for free with Daz. This is what I end up with:



Perhaps not an amazing composition, but a composition nonetheless, a target to aim for. I also do a separate image of just the archway, zoomed out slightly; I figure if I make my archway image a bit wider in scope then it will give me more leeway when I'm stitching them back together:



Next piece of free software is Krita. I grab the airbrush tool and trace on top of my archway image:



Very simple stuff, I just make sure to throw in some really basic details like the greenery around the edges of the images, the cracked walls and stonework, and an indication of sunlight coming in from the top right.

Now this goes into img2img, with a size of 512x768. I won't go into detail of exactly why I'm choosing the particular prompts as that would take more time than I have, but I will list them all (also for a lot of them I'm just leaving in the same prompts as previous iterations until a change is really needed so they're not totally specific to each image anyway). Likewise with the denoise and CFG, I'll include the numbers but most of my thinking on these is based on "feel" more than anything else.

Main Prompt: masterpiece concept art, a mysterious ancient archway, eldritch horror, intricate stonework, cracks, vines, sunlight, flowers and rocks on ground, detailed, sharp-focus, hyperrealism
Negative Prompt: messy drawing, octane render
Denoising strength at 0.6 and CFG around 12-15. After about 10 rounds I got something I liked enough:



Now for the character. Again, I just trace directly over my Daz image in Krita with the airbrush and rough brush tools:



This goes into img2img:

Main Prompt: masterpiece concept art, white background, a haggard adventurer, bloody, bruised, sword, light leather armor, detailed, sharp-focus, hyperrealism
Negative Prompt: messy drawing, octane render
Denoising at 0.6 and CFG at 12.
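
For anyone who would rather script these runs than click through the UI, here is a minimal sketch of the same settings as a request body. The post itself only used the Automatic1111 interface; the field names below are what that web UI's `/sdapi/v1/img2img` endpoint accepts as far as I know (it needs the UI started with `--api`), so treat them as an assumption and check them against your version. `build_img2img_payload` and the placeholder image bytes are hypothetical, made up for illustration.

```python
import base64
import json


def build_img2img_payload(image_bytes, prompt, negative_prompt,
                          denoising_strength, cfg_scale,
                          width=512, height=768,
                          steps=20, sampler_name="DPM++ 2M Karras"):
    """Build a JSON-serialisable body for one img2img request.

    Hypothetical helper: the keys are assumed to match the web UI's
    img2img API and may differ between versions.
    """
    return {
        # The init image is sent as a base64-encoded string.
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "denoising_strength": denoising_strength,
        "cfg_scale": cfg_scale,
        "width": width,
        "height": height,
        "steps": steps,
        "sampler_name": sampler_name,
    }


payload = build_img2img_payload(
    # Stand-in for the traced character sketch exported from Krita.
    image_bytes=b"placeholder bytes, not a real PNG",
    prompt=("masterpiece concept art, white background, a haggard adventurer, "
            "bloody, bruised, sword, light leather armor, detailed, "
            "sharp-focus, hyperrealism"),
    negative_prompt="messy drawing, octane render",
    denoising_strength=0.6,
    cfg_scale=12,
)

# This string is what would be POSTed to the endpoint.
body = json.dumps(payload)
```

Keeping the settings in one dict like this also doubles as a record of what was used for each round, which is handy when you are 30 generations in.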

This took around 30 generations or so before I got something I was happy to use:



As I suspected might happen, it's not handling the sword well at all. So when I put the images together in Krita, I just remove it for the time being. Again, this is all being done very quick and rough, so I'm not worrying about making things neat at this point. The white background makes it simple to roughly cut out the soldier and put him on top of the archway image, which I've resized somewhat. I also draw in the missing part of his foot and use the airbrush on his left hand side to emphasise him moving out of the darkness. This is what we now have:



Back into img2img:

Main Prompt: masterpiece concept art, a haggard adventurer, bloody, bruised, light leather armor, exiting mysterious archway, eldritch horror, crumbling stonework, cracks, vines, sunlight, grass and rocks on ground, detailed, sharp-focus, hyperrealism
Negative Prompt: messy drawing, octane render, impressionism, lowres
Denoising at 0.55, CFG 10.

After another 30 or so rounds I got something that I liked well enough:



There are a few issues with the image I want to fix before moving on, being the vines, the attempt at a bow in his hand and the piece of light fur-looking stuff by his boot. These are simply removed with the Smart Patch tool in Krita:



Back into img2img, and I'm now bumping the size up to 720 x 1080.

Main Prompt: masterpiece oil painting, chiaroscuro, a haggard adventurer, bloody, bruised, light leather armor, exiting mysterious archway, eldritch horror, crumbling stonework, cracks, vines, sunlight, grass and rocks on ground, detailed, sharp-focus, hyperrealism, gritty, 8k
Negative Prompt: anime, sketch
Denoising 0.6, CFG 12.5
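
The sizes used in this walkthrough (512x768, then 720x1080, later 800x1200) all keep the same 2:3 aspect ratio, which is what stops the composition from being stretched or cropped between rounds. A small sketch of that arithmetic follows. `scale_keep_aspect` is a hypothetical helper, and snapping to multiples of 8 is an assumption based on Stable Diffusion's latent image being 8 times smaller per side than the output.

```python
def snap_to_multiple(value, multiple):
    """Round value to the nearest multiple, never going below one multiple."""
    return max(multiple, round(value / multiple) * multiple)


def scale_keep_aspect(width, height, new_width, multiple=8):
    """Scale (width, height) to new_width while keeping the aspect ratio.

    Both sides of the result are snapped to a multiple of `multiple`.
    """
    new_height = height * new_width / width
    return (snap_to_multiple(new_width, multiple),
            snap_to_multiple(new_height, multiple))


first_bump = scale_keep_aspect(512, 768, 720)
second_bump = scale_keep_aspect(512, 768, 800)
print(first_bump, second_bump)
```

Both results match the sizes chosen by hand in the post.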

Went through a lot of generations on this one, about 70 or so. Ended up with this:



Now I'm pretty happy with this as essentially the final image in terms of the structure and most of the details of the armor, stonework, greenery, etc. However, he still needs his sword, so we jump back into Krita and grab the polygonal lasso and fill bucket:



Into img2img:

Main Prompt: fantasy medieval sword, white background
Negative Prompt: cartoon, clipart
Denoising 0.6, CFG 9

Only about 5 generations to get something half-decent:



Back to Krita, cut out the sword and kludge it into his hand. A bit of airbrushing on the hilt and generally reduce the luminosity of the sword to blend it in somewhat:



Back to img2img, bump the size up again to 800 x 1200:

Main Prompt: masterpiece oil painting, chiaroscuro, a haggard adventurer holding a sword, bloody, bruised, light leather armor, exiting mysterious archway, eldritch horror, crumbling stonework, cracks, vines, sunlight, grass and rocks on ground, detailed, sharp-focus, hyperrealism, gritty, 8k
Negative Prompt: digital photo, messy drawing, bokeh
Denoising at 0.15 and CFG at 18; quite different to previous numbers as I want to preserve as much of the image as possible but just blend in the sword.
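
A rough way to see why 0.15 preserves so much more than 0.6: in many img2img implementations (the diffusers library is one), the denoising strength decides how far back into the noise schedule the source image is pushed, so only about steps x strength of the sampling steps actually run. The sketch below assumes that behaviour. `effective_steps` is a hypothetical helper, and a given UI may be configured differently.

```python
def effective_steps(steps, denoising_strength):
    """Approximate number of sampling steps img2img actually runs.

    Assumes the common rule of truncating steps * strength, capped at steps.
    """
    return min(int(steps * denoising_strength), steps)


# The 20-step setting used throughout, at three of the strengths from the post.
for strength in (0.6, 0.25, 0.15):
    print(strength, effective_steps(20, strength))
```

Under that assumption, 0.6 leaves around 12 steps to repaint the picture while 0.15 leaves around 3, which is only enough to nudge edges and lighting so the pasted sword sits in the scene.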

Around 20 or so re-rolls and we get:



I like this but the hand is a shiter. So now we go to Inpaint and draw a rough mask around the hand, hilt and a bit of the blade. Masked Content = Original, I used Inpaint at Full Resolution and padding pixels at 32 (the default value for me).

Main Prompt: masterpiece oil painting, chiaroscuro, a hand holding a sword, bloody blade, detailed, sharp-focus, hyperrealism, gritty, 8k
Negative Prompt: digital photo, messy drawing, bokeh
Denoise 0.45, CFG 9
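
As a sketch of what "Inpaint at Full Resolution" with padding is doing, as I understand it: the tool finds the box around the masked pixels, grows it by the padding, works on just that crop at the generation size, then pastes the result back. The code below only covers the box calculation and is a simplification. `padded_mask_box` and the hand position are hypothetical, and the real implementation also adjusts the box to fit the target aspect ratio.

```python
def padded_mask_box(mask, padding=32):
    """Return (left, top, right, bottom) around the non-zero mask pixels.

    `mask` is a list of rows of 0/1 values. The box is grown by `padding`
    pixels on every side and clamped to the image; right and bottom are
    exclusive. Returns None when nothing is masked.
    """
    height, width = len(mask), len(mask[0])
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    left = max(min(cols) - padding, 0)
    top = max(min(rows) - padding, 0)
    right = min(max(cols) + 1 + padding, width)
    bottom = min(max(rows) + 1 + padding, height)
    return left, top, right, bottom


# An 800x1200 image with a made-up mask over the hand and hilt.
mask = [[0] * 800 for _ in range(1200)]
for y in range(700, 800):
    for x in range(500, 580):
        mask[y][x] = 1

box = padded_mask_box(mask, padding=32)
print(box)
```

The point of the padding is to give the model some surrounding context (sleeve, blade, background) so that the repainted hand matches its neighbourhood.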

About 20 or so generations until I get something half-decent. Not great, but I think at this resolution it's as good as I'm gonna get unless I want to get into serious manual editing:



You'll probably have noticed that I haven't been using restore faces so far. As the face and the emotion is one of the vital parts of the image, I figured that I was going to have to deal with that separately, so have left that until now that I've got everything else sorted out.

I started off with trying to use Inpaint directly on top of this image but really wasn't getting anywhere. So I cropped a 512x512 image of our guy:



Then back to img2img (with restore faces on, I used CodeFormer at 0.5 strength):

Main Prompt: masterpiece oil painting, chiaroscuro, adventurer, scared face, wide eyes, shock, sharp-focus, hyperrealism, gritty, 8k
Negative Prompt: digital photo, messy drawing, bokeh
I forgot to make a record of the Denoise and CFG at this point but I think it was around 0.65 and 12, as we have a quite different image from the source:



This isn't exactly what I want for the final image but I think it captures enough of the feel that I'm looking for and I figure the more extreme the better, because I want SD to be able to pick it up after I've scaled it down and pasted it on top of our character in Krita:



Now I put this image into Inpaint and masked out the head and shoulders, with a good bit of the background. Same inpaint settings as before, and the same prompts as above. 20 or so generations give us our final guy:



Almost done but he's looking a bit too clean. Back into Krita and grab the paint splatter brush on a multiply layer:



img2img:
Main Prompt: masterpiece oil painting, chiaroscuro, a haggard adventurer holding a sword, bloody, scared, wide-eyed, shocked, horror, detailed, sharp-focus, hyperrealism, gritty, 8k
Negative Prompt: digital photo, messy drawing, bokeh
Denoise 0.25 to preserve as much of the detail as possible, CFG 9

Just a handful of re-rolls until we're almost at our final image:



I'm pretty happy with the way it's blended the blood into the clothing, however it's cleaned up the face and changed the expression, and I'm not too happy with the hand. All I do to fix this is to put this image as a layer on top of the previous one in Krita, then use a soft eraser over the face and the hand until those parts come through enough:
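
The soft-eraser trick is just a per-pixel blend between two layers, with the eraser strength acting as the mix amount. Here is a minimal pure-Python sketch of that idea, not what Krita does internally; `soft_erase` and the tiny three-pixel "images" are hypothetical.

```python
def soft_erase(top, bottom, erase):
    """Composite `top` over `bottom`, letting `bottom` show where erased.

    `top` and `bottom` are lists of rows of (r, g, b) tuples. `erase` holds
    floats from 0.0 (top layer untouched) to 1.0 (top layer fully erased).
    """
    result = []
    for top_row, bottom_row, erase_row in zip(top, bottom, erase):
        row = []
        for t, b, e in zip(top_row, bottom_row, erase_row):
            # Linear mix of each colour channel by the eraser strength.
            row.append(tuple(round(tc * (1 - e) + bc * e)
                             for tc, bc in zip(t, b)))
        result.append(row)
    return result


new_layer = [[(200, 0, 0), (200, 0, 0), (200, 0, 0)]]   # latest generation
old_layer = [[(0, 0, 100), (0, 0, 100), (0, 0, 100)]]   # previous image
eraser = [[0.0, 0.5, 1.0]]                              # soft brush falloff

blended = soft_erase(new_layer, old_layer, eraser)
print(blended)
```

A soft brush gives intermediate values like the 0.5 in the middle, which is why the old face and hand fade in without a hard seam.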



And we're done!

Overall, I'm pretty pleased with the result in terms of sticking to the composition that I had in mind and being able to include pretty much all of the elements that I wanted. There are certainly lots of things that could be tidied up, such as the general discrepancy in style between the background and the character, the odd geometry in the top left, and the sword could definitely be better; but from going through this process, I'm pretty confident that these could be achieved with the application of a bit more time. The entire process took about 6 hours, although that included me making all the notes and such along the way.

I consider myself a mediocre artist and I think it's pretty amazing that in just a few hours and without spending any money on professional software I can produce quite a controlled result with this. I think in the hands of truly skilled artists we are going to see some incredible stuff.

Lemon fucked around with this message at 09:54 on Jan 5, 2023


Sedgr
Sep 16, 2007

Neat!

Lemon posted:

Warning: Big Effortpost Incoming

...artists we are going to see some incredible stuff.

Excellent info. :hmmyes:

Rutibex
Sep 9, 2001

by Fluffdaddy

IShallRiseAgain posted:

Goblin in the style of a romance novel cover


"Romance novel cover" is getting added to the spellbook :cheers:

Rutibex
Sep 9, 2001

by Fluffdaddy
There are two kinds of "hidden" modes in Midjourney v4 and I have finally got around to testing them with some "optical illusion paradox". To activate the hidden mode you have to add

"--style 4a"



"--style cursed"



The 4a style is apparently the original v4 model, and cursed mode is some kind of AI that was traumatized during training or something. Everything it produces is bad and cursed! The 4a model is interesting and different, I am going to have to run that prompt a bunch to see what else it can produce!

AARD VARKMAN
May 17, 1993

Lemon posted:

Warning: Big Effortpost Incoming

outstanding post dude, well done.

Inspired me to immediately start figuring out installing on my own 8gb 1070 so I can tackle some of the more specific ideas I have that haven't come out well just through prompting.

Analytic Engine
May 18, 2009

not the analytical engine
thanks Lemon

themon

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry
"One paw shall be sufficient..."

Moongrave
Jun 19, 2004

Finally Living Rent Free
Automatic1111 github just got nuked by the looks

Megazver
Jan 13, 2006

Lemon posted:

Warning: Big Effortpost Incoming

That was interesting!

busalover
Sep 12, 2020

BARONS CYBER SKULL posted:

Automatic1111 github just got nuked by the looks



why?

Megazver
Jan 13, 2006
The new repo:

https://gitgud.io/AUTOMATIC1111/stable-diffusion-webui

Moongrave
Jun 19, 2004

Finally Living Rent Free

as far as anyone can tell: mass reports? probably?

mobby_6kl
Aug 9, 2009

by Fluffdaddy
Didn't he turn out to be some sort of rear end in a top hat or something? I remember we already talked about cancelling him a month or two ago.

r u ready to WALK
Sep 29, 2001

https://news.ycombinator.com/item?id=34257818 has lots of juicy gossip but it sounds like it was most likely racism-related

Rutibex
Sep 9, 2001

by Fluffdaddy
gently caress this guy

quote:

A possible reason are the apparently racist game mods he'd also been creating on the same account, including removing any non-white characters from Rimworld and a mod called 'peaceful protests' which is seemingly criticizing the valid upset about George Floyd's murder by the state.

https://web.archive.org/web/20221013110153/https://github.com/AUTOMATIC1111/PeacefulProtests

https://web.archive.org/web/20221013110023/https://github.com/AUTOMATIC1111/WhiteOnly

It seems every artist in his artist's file tagged with an 'n' is black.

https://gitgud.io/AUTOMATIC1111/stable-diffusion-webui/-/blob/master/artists.csv?plain=1#L207

e.g.

Howardena Pindell,0.7686921,n

Barkley L. Hendricks,0.69986427,n

Carrie Mae Weems,0.6645416,n

RETNA (Marquis Lewis),0.47963,n

Wangechi Mutu,0.6394607,n

Bruce Onobrakpeya,0.42588046,n

Moongrave
Jun 19, 2004

Finally Living Rent Free

mobby_6kl posted:

Didn't he turn out to be some sort of rear end in a top hat or something? I remember we already talked about cancelling him a month or two ago.

seems so, but also it's back up because if github killed off every racist programmer's account they wouldn't have a site anymore

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


BARONS CYBER SKULL posted:

seems so, but also it's back up because if github killed off every racist programmer's account they wouldn't have a site anymore

Stable Diffusion Dev every time automatic was mentioned prior to him being kicked from that discord would remind people about python code being really easy to slip malicious payloads in. Stable Diffusion Dev isn't clean either it could be pure greed or fear. SD Dev might also know something more.

As they aren't getting money for it I don't care about using their product it's fine. As a possible entry point for being owned since it also will update well I'm not sure I trust him there and I'm not sure how many eyes are really on this project since it's basically just cobbling together a bunch of other projects.

GitHub isn't going to remove you for being a nazi unless you start adding nazi references into other projects just because you can "startup tip ... nothing wrong" might be enough. He did had to have done something. (or mass report from art community but why Automatic and not the pieces of it?)

Moongrave
Jun 19, 2004

Finally Living Rent Free
I'm simply not updating since I only did it when there was a new actually useful change

lunar detritus
May 6, 2009


pixaal posted:

GitHub isn't going to remove you for being a nazi unless you start adding nazi references into other projects just because you can "startup tip ... nothing wrong" might be enough. He did had to have done something. (or mass report from art community but why Automatic and not the pieces of it?)

Apparently it was because there were some questionable links in the wiki.
https://github.com/AUTOMATIC1111/st...824e05cfab5cb98

Tunicate
May 15, 2012

Wow. What a shithead. What're the alternative repos for SD local? Even if it comes back, don't wanna use it

lunar detritus
May 6, 2009


Tunicate posted:

Wow. What a shithead. What're the alternative repos for SD local? Even if it comes back, don't wanna use it

https://github.com/invoke-ai/InvokeAI is the biggest one. They are slower in adding features but it's more professional (it actually has a changelog and versions!)

Tunicate
May 15, 2012

Cool i'll try that.

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:
Looks like some people are just finding out that Adobe can analyze (i.e. train on) any files you have with them, in any software, in the cloud through their services, anything. Turns out they are also not happy about it, but like cmon now, we all knew this was how it was going to go.

Tree Reformat
Apr 2, 2022

by Fluffdaddy

KakerMix posted:

Looks like some people are just finding out that Adobe can analyze (i.e. train on) any files you have with them, in any software, in the cloud through their services, anything. Turns out they are also not happy about it, but like cmon now, we all knew this was how it was going to go.

Artists learning the faceless, soulless corporations they rely on for tools for their workflow are in fact horrible exploitative monsters and treating this as some sort of completely out of nowhere betrayal has been an amusing B-plot in all this.

I'm reminded of that one The Onion article after Steve Jobs died that talked about the dude who felt like his own father had died, but completely ignores his actual real alive father.

Ruffian Price
Sep 17, 2016

https://www.youtube.com/watch?v=C9AKL328jSU
this is five years old

like with sites responding to GDPR guidelines, people only react when the danger slowly closing in is already here

Tree Reformat
Apr 2, 2022

by Fluffdaddy

The comments full of people excited and "shut up and take my money" for it is just :kiss:. Literally only one dude down on the idea.

Twitter galvanization is loving brain poison for everyone.

Cabbages and VHS
Aug 25, 2004

Listen, I've been around a bit, you know, and I thought I'd seen some creepy things go on in the movie business, but I really have to say this is the most disgusting thing that's ever happened to me.
Github has reinstated the repo and also made it clear that the only reason they killed it was because... the old wiki linked to NSFW models.

That's pretty loving stupid, dude's lovely regressive racist bullshit other projects aside. Doubly stupid when it's just a normal commit so the "offending" comment is now permanent history.



Dude may be kind of a regressive shithead but if he's cranking out good useful OSS, well, I assume that when I do an npm install or w/e I am probably installing a lot of code written by nazis, the world being what it is.

This would definitely keep me away from giving him money, though! I haven't spun up invoke in a bit, maybe I will update to latest and see the comparison and if the stuff I think I like better about webui has caught up.

edit: lol here's a normie or two on YC arguing with nazis about this guy and it's pretty yikes



I am not going to 4chan. I will accept on faith that someone accused of having a long history of posting racist screeds on 4chan as "Voldemort" is definitely a regressive shithead though.

It's really not that hard to not be a nazi. I did some kind of cringe stuff online when I was loving 16 that never approached this level of dedication to being a lovely edgelord, had completely moved on from that poo poo by the time I was 18, and if my employer or future employer ever data mines my SA account and NSA/FBI files, they're just gonna say "wow, this dude used to write a LOT about LSD and DMT back in the mid '00s. Like a LOT. Like loving BOOKS"*. And plays too much magic, has too many synths, and maybe posts his consistent 2.9 posts per day during business hours a little frequently.

(*that quote is close to verbatim from the only take on my posting I was ever able to find in helldump....)

Cabbages and VHS fucked around with this message at 17:57 on Jan 5, 2023

Ruffian Price
Sep 17, 2016

I'm sure the owner of GitHub putting https://bing.com/create out today was coincidental

(that aside, if it's half as good as Azure's neural text to speech is, that's one to watch)

AARD VARKMAN
May 17, 1993
bold to hang up a hand demo like this as one of your examples, Microsoft Bing

AARD VARKMAN
May 17, 1993
Oh hang on this is a DALL-E collaboration?

quote:

Image Creator is a product to help users generate AI images with DALL-E. Given a text prompt, our AI will generate a set of images matching that prompt.

[...]
We take our commitment to responsible AI seriously. We are working together with our partner OpenAI, who developed DALL-E 2, to deliver an experience that encourages responsible use of Image Creator. To that end, we have incorporated OpenAI's safeguards and additional protections to Image Creator.

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

AARD VARKMAN posted:

Oh hang on this is a DALL-E collaboration?

of course Microsoft shows up to the party late and out of touch already, lol.

Rutibex
Sep 9, 2001

by Fluffdaddy
sailing the seas of paradox



MikeJF
Dec 20, 2003




Ruffian Price posted:

I'm sure the owner of GitHub putting https://bing.com/create out today was coincidental

(that aside, if it's half as good as Azure's neural text to speech is, that's one to watch)

It seems largely to be a tool to get people curious about it to click agree on Microsoft Rewards making GBS threads up their Windows and Inbox.

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


AARD VARKMAN posted:

bold to hang up a hand demo like this as one of your examples, Microsoft Bing


I'm actually impressed if the descriptions are the prompts and not cherry picked. The language interpretation seems pretty good. Image fidelity seems way easier to fix up with testing than that part.

e: oh if this is just Dalle it hasn't progressed at all I guess, that was pretty good too. Was excited as this being a starting point, unless they dropped it between launch and now this is probably a dead end.

Moongrave
Jun 19, 2004

Finally Living Rent Free

TIP
Mar 21, 2006

Your move, creep.



the AI art haters have started to eat their own :lmao:


https://nichegamer.com/art-subreddit-bans-artist-style-ai/

the most hilarious part is the justification given:
I don’t believe you. Even if you did “paint” it yourself, it’s so obviously an AI-prompted design that it doesn’t matter. If you really are a “serious” artist, then you need to find a different style, because A) no one is going to believe when you say it’s not AI, and B) the AI can do better in seconds what might take you hours.
Sorry, it’s the way of the world.

busalover
Sep 12, 2020
You paint like an AI. Tough poo poo, rear end in a top hat. Bye.

Tree Reformat
Apr 2, 2022

by Fluffdaddy
Welcome to the global, eternal Turing Test. The Problem of Other Minds is now an actual problem.

Tree Reformat fucked around with this message at 21:10 on Jan 5, 2023

Moongrave
Jun 19, 2004

Finally Living Rent Free
that "i taught myself to draw like the AI to troll people" post but it's real


AARD VARKMAN
May 17, 1993

TIP posted:

the AI art haters have started to eat their own :lmao:


https://nichegamer.com/art-subreddit-bans-artist-style-ai/

the most hilarious part is the justification given:
I don’t believe you. Even if you did “paint” it yourself, it’s so obviously an AI-prompted design that it doesn’t matter. If you really are a “serious” artist, then you need to find a different style, because A) no one is going to believe when you say it’s not AI, and B) the AI can do better in seconds what might take you hours.
Sorry, it’s the way of the world.


this poo poo has been easily available for what, 4 months? you have to wonder how long they expect to claim they can eyeball decide what's AI or not. good loving luck
