null_pointer
Nov 9, 2004

Center in, pull back. Stop. Track 45 right. Stop. Center and stop.

Roman posted:

Trying to mentally plan making a film around all of this rapidly moving tech right now is weird.

I think making any sort of creative project around our current toolset is seriously hair-pulling. Either you commit to using a given environment, knowing it's going to be surpassed by The Next Amazing Thing, or you end up chasing that tail, and have to throw away / regen work every time you upgrade.

I'm sort of in that position with a casual project I've been playing with. I abandoned a sci-fi novel years ago, and have been using Midjourney to sketch out some scenes and see if I can get the sort of vibe I'm aiming for. If I could get a consistent visual style and characters, I could see the story resurfacing as a Simon Stålenhag "Tales From the Loop" visual novel ...

... If I could get the Midjourney V6 alpha model to consistently recreate the characters' features and the overall look-and-feel, that is. So far it's been pretty much just burning through my fast hours and praying (I know someone here has said "AI Art" can't be art, because there's no struggle on the part of the creator, but holy poo poo, getting the model to do what you want is like trying to use dudes off the street as actors in a serious film)

Lead male:





Not too terrible. MJ really struggles with lighting, costuming, hair styling, and general facial expression. Using a very specific and consistent prompt ("John Doe from the movie Blah with his hair in a ponytail and wearing a fashionable trench coat...") does alright, but elements can and will vary greatly.

Lead female:




The first shot is amazing, and if I could nail that look and expression for the whole project I would be incredibly satisfied with it. By the third shot, though, we've started to drift (granted, this is a climactic scene and a look of violent disgust is perfect, but it's not quite right).

Alright, let's try a two shot:





Eh? I guess? The first of the two-shots in the lab has a great, cluttered feel and you can feel the tension between the two; by the second shot, her hair has changed and the background has drifted. I could probably fix this by using reference characters, but I haven't figured out how to use a separate reference image for each character. Maybe generate the scene and characters separately, then combine them in photoshop? But that sounds like a pain in the dick, and I don't know if the next update to MJ will make that unnecessary.

The lead male is incredibly inconsistent in terms of hair style and costume, and the entire art style is flopping between "dark and cluttered" and "open and airy". Both are good, but I can't control it.

Other random poo poo to prove a point:


Uhhh, I guess Val was unavailable this day and Robert Pattinson subbed in?


Nope, Val's back on set. Thanks RPatt? I guess?


Wow, this shot of the female [pro/an]tagonist is great. Can I use the same art style and background, but get her into some sleek, black, cyberpunk armor?


No, I guess not. One more try?


Nope. It's just missing something that the first shot had.

I could go on forever about how difficult this is. I've burned through probably 10+ hours of MJ fast time with this pseudo-process: generate some images, tweak the prompt and parameters, regen with variations, maybe flip back and forth between a couple of models, cut out some reference character templates for one-shots, burn votive candles for two-shots, and on and on. I'm practically ready to pay someone to teach me how to inpaint and use photoshop to composite shots ...

... but by then, MJ v6 will have updated and I'll have more control over the image, and all of that time and effort will be throwaway.

Still, it's been fun, and I've been more motivated to create than I have in years, but it's also incredibly frustrating.


feedmyleg
Dec 25, 2004
I've more or less got a workflow that accomplishes what I want 90% of the time, but I'm also a graphics professional so I don't expect that it's as accessible for everyone. I generate a base image with ChatGPT, roughly edit it in Photoshop using both traditional methods and Firefly/Generate Fill, subtract and add new elements as needed, adjust any colors or proportions, then rerun select portions of the image through SD to clean them up or further modify them. It still requires a solid base image from GPT, but sometimes I'm able to collage multiple elements together from multiple images and eventually unify them into a seamless style. If I'm not happy with the end result, I'll try plugging that image back into MJ with a heavy image weight to see if it comes out with something better that I can start the process over with. It requires a lot more effort than push-button-get-art, but it really helps it feel like you're a part of the process rather than just a miner down in the image mines searching in the dark for gold.

Every time a new tool comes out, it just makes the workflow easier. I just wish the GPT img2img was anywhere near as good as the MJ one.
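If anyone's curious what that SD cleanup step actually does, it boils down to a low-strength img2img pass. Here's a minimal sketch using the diffusers library (not my actual setup, since I mostly use getimg.ai and the webui; the model name, file names, and settings are stand-ins):

code:

# One low-strength img2img "cleanup" pass: SD repaints the image just
# enough to smooth seams and unify collaged elements without changing
# the composition. Assumes a CUDA GPU; all names are placeholders.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

rough = Image.open("rough_composite.png").convert("RGB")

# strength is the knob: ~0.2-0.4 keeps the layout and just cleans up;
# higher values let SD reinvent more of the image.
clean = pipe(
    prompt="gouache painting of a 1940s London street market",
    image=rough,
    strength=0.3,
).images[0]
clean.save("cleaned_composite.png")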

feedmyleg fucked around with this message at 17:29 on Jan 1, 2024

null_pointer
Nov 9, 2004

Center in, pull back. Stop. Track 45 right. Stop. Center and stop.

Mind posting a series of images showing how things progress through your workflow? Like the output of each step from beginning to end? I'd love to see it!

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry
I've been using Midjourney and PS to produce paperback covers for a friend of mine. The amount of photoshopping used is pretty impressive to get everything to work.





Each one of the characters was generated separately, then they all had to be color-balanced to match, and hands and arms needed to be repositioned.

This is the original "Katrina" image that MJ kicked out with the prompt... 'woman, mercenary, beautiful, young, short dark hair pixie cut, suit, action pose, victorian illustration --ar 2:3'



And this is the final version used on the cover...



That cover has 14 layers, of which 9 are smart objects with 3 to 6 layers of their own each.

Vasa (the big bald guy) was also horizontally flipped for better composition, but that meant I also had to flip the buttons on his vest and his topcoat back so they would be correct, and I had to replace his legs with another version because they just looked weird.

Humbug Scoolbus fucked around with this message at 18:09 on Jan 1, 2024

feedmyleg
Dec 25, 2004

null_pointer posted:

Mind posting a series of images showing how things progress through your workflow? Like the output of each step from beginning to end? I'd love to see it!

Since I'm mostly working on a goofy throwaway Jurassic Park fan project right now, I'm more or less working destructively so I don't have a ton of examples, but this is a recent one. I wanted to depict the scene of young John Hammond with a flea circus on Petticoat Lane. I found images of old flea circuses which show them to be inside of steamer trunks, so I told it to make a young Richard Attenborough in a three-piece suit gesturing toward a trunk in a 1940s London market.

Here's the original generated image:



I felt that it looked enough like him to work with. There's some wonkiness with the tent in the background, with his foot and the trunk, with everyone's faces, and with the fact that GPT always wants to make anything in the medium of gouache or oil paint have a lot of physical artifacts such as cracks or paper bleeding through. In retrospect, I should have left "flea circus" out of the trunk entirely, but it worked out okay.

I erased him from the image using Generate Fill to get a clean background plate, and cleaned up the tent:



I ran that through SD focusing mostly on the tent and buildings through img2img:



Then once I was satisfied, I started using img2img on the people in the background, which I wanted to nudge toward being a family with kids:



Then I worked on Hammond's face, trying to get a balance of likeness and matching texture:



Then I wanted to work on the flea circus itself, so I cleaned up the trunk manually and made it shallow:



Then I made some rough vector shapes to match a reference image I found:



I made those shapes more painterly with SD:



Then piece by piece added the elements I wanted in order to match the dialogue in the film, which I generally roughed-out in PS and ran through SD:



I tightened up the paint work in a few places, threw a sepia tone on it, and called it a day:



If this was for a less silly project I would have done a lot more. It's got a lot of rough edges still, but ultimately I didn't want to spend too much time on it. Overall, this probably took 3 hours? Maybe a bit more, since I kept going back and touching up elements here and there.

feedmyleg fucked around with this message at 18:05 on Jan 1, 2024

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry
90% of the work takes 10% of the time, but that last 10% is a killer because you're always spotting something that can be tweaked.

null_pointer
Nov 9, 2004

Center in, pull back. Stop. Track 45 right. Stop. Center and stop.

feedmyleg posted:

Overall, this probably took 3 hours? Maybe a bit more, since I kept going back and touching up elements here and there.

God drat. Gotta really sweat for each and every scene, huh?

Thank you so much for posting this, though. It was incredibly insightful. I really appreciate it.

Should I start with the latest version of Photoshop and play with generative fill? Or is there another tool and/or approach that you would recommend?

feedmyleg
Dec 25, 2004

Humbug Scoolbus posted:

90% of the work takes 10% of the time, but that last 10% is a killer because you're always spotting something that can be tweaked.

Yeah, every time I get rid of the obvious problems in any given image, 10 more pop up. I didn't even notice the weirdness with the foot and the trunk until after everything else was done, then sighed and went back in for one more pass. I'm sure plenty of the images I consider "final" at the moment still have weird little GPT goblins hiding in the background that I haven't noticed yet.

feedmyleg fucked around with this message at 18:15 on Jan 1, 2024

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry

feedmyleg posted:

Yeah, every time I get rid of the obvious problems in any given image, 10 more pop up. I didn't even notice the weirdness with the foot and the trunk until after everything else was done, then sighed and went back in for one more pass. I'm sure plenty of the images I consider "final" at the moment still have weird little GPT goblins hiding in the background that I haven't noticed yet.

:smith::hf::smith:

feedmyleg
Dec 25, 2004

null_pointer posted:

God drat. Gotta really sweat for each and every scene, huh?

That's definitely my most elaborate one; most images take 5-10 minutes to clean everything up. Here's another one, which probably took 10 minutes. Generated:



Final:



And one that took probably 45 minutes. Generated:



Final:



But even then, looking at that second final one, I know with a bit of elbow grease I can really fix that aviary dome structure and sky to make them less muddy and sloppy, which I probably will when I do a final pass at all the images in the project. And even in the previous one, it would probably be a lot more fun if I made his right arm flail in the air rather than grab onto the rock.

null_pointer posted:

Should I start with the latest version of Photoshop and play with generative fill? Or is there another tool and/or approach that you would recommend?

Yeah, definitely get the newest Photoshop Beta and just play around with generative fill. It's pretty straightforward stuff, without any bells and whistles like negative prompting. I also sometimes use getimg.ai since I have a bunch of credits there and my machine takes forever to run SD properly. They not only have a good img2img with lots of models, but they also have some pretty decent editing tools of their own.

feedmyleg fucked around with this message at 18:24 on Jan 1, 2024

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:
Yeah like, these AI tools rule, but they are just that: tools. Earlier last year this was a workflow that I had, though since then I've abandoned Krita, moved to Photoshop, now have both Photoshop and Krita working differently than before, and now MidJourney and Stable Diffusion XL are both out, which minimizes more of this required tweaking. This workflow isn't valid anymore because things are different, but it's similar to what has already been posted and illustrates (heh) the amount of tweaking needed to get something decent. I've got a photo editing background, so like the others have said, this is a tool you use, a means to an end, not the end itself. We are getting ever closer. Like anything though, the details matter.

KakerMix posted:

Cross-post from from the traditional games thread on a rough outline of ~workflow~ I've kinda got making Shadowrun images.

I don't have a like, teachable good flow yet, all this stuff is bleeding edge and nobody knows what they are doing, especially me. I do have a background in art and especially photo editing, which I'd say is the nearest analog to this. The analogy of the content-aware tool works really well for this. I should mention that the Krita plugin presents every new image you ask it for as a new layer. If you ask for a new txt2img, you get that over whatever image is underneath, ignoring the picture below. If you ask it to img2img, it will take whatever is visible in your image and render a new one based off that with whatever settings you have set. Inpainting works the same way as img2img, but will only render the masked area. This allows you to paint directly on the image (or an entirely new layer), then render a new image on top of those, informed by whatever is underneath. Here is what I've had good luck with on these Shadowrun images.

Tools:
MidJourney account, though I don't always use it. For these Shadowrun ones though each one started out as MidJourney.
Automatic1111 with a big pile of models; for these, combinations of Protogen Infinity, Analog Diffusion, and various others.
Krita
Auto SD Paint Extension, which allows Stable Diffusion to be run inside of Krita itself.
I've got a pretty powerful computer with a 3090 and a whole heap of RAM as well. I'd like to upgrade to a 4090 or depending on how far I go into this, maybe a non-videogame render card with even more VRAM.

I've also got an XP-Pen Artist 24 Pro along with two other monitors. I'll have Krita with the plugin on the drawing tablet, with Auto1111's interface in a different window on another monitor. You can use both at different times to generate using Stable Diffusion; you just have to take turns and also share the model: if you switch the model back and forth within Krita or the Auto1111 interface, it switches it for the other as well. However, Auto1111 recently added an option to keep models in memory, and I'll do 4 at once. This means that switching between models is very quick. It ALSO means that I never use just one model, no reason to. Different models are better at different things, switching as needed. With how precise inpainting can get I don't find an absolute need for inpainting-specific models; however, I do want to explore that and see if they help avoid seams when I have to cut across a seamless expanse. They might be a lot better at that aspect, I'll have to see.
I'll have Auto1111 on the one monitor because sometimes I want to generate something outside of Krita quickly, then just pull the generation into Krita by dragging and dropping directly in. Having two work surfaces like this helps me work. Likewise the other monitor can be used for other stuff. More monitors is always better, pretty much.

I'll go into MidJourney and just kinda generate some images and see if a vague idea comes about. I did the more somber ones first, which is cool, but I wanted more emotion. I thought a more relaxed setting with a running team during downtime laughing with each other would be cool, so I tried 'telling jokes' and built a prompt around that. Then I thought about a troll telling jokes and altered the prompt toward that. Lucky for me I checked to see if MidJourney knew what a 'Shadowrun troll' is, since for Shadowrun a troll is just a big human with maybe tusks and maybe horns, but with normal human skin colors. MidJourney for some reason has the trolls as more goblin-like and blue, probably because classically trolls are more like that in other universes. Still, I think I can work with this.
Prompt: 'shadowrun troll'


Right, let's try this prompt then:
Year 1985, Shadowrun, cyberpunk 2077, intimate conversation, troll metahuman telling jokes, laughing, humor, smiling, joyous, dynamic camera angle, film grain, movie shot, ektachrome 100 photo

Lucky for me, the first roll I did after I learned it vaguely knew what I meant by 'shadowrun' and 'troll' yielded this output.


The first image seemed like a good direction to go, kinda big monstrous sorta guy; blue, sure, but that's easy enough to change in Krita. Click upscale and see what we get.


Cool, I can see a future with this image. I'm sure most people can't easily tell, but I can usually spot a MidJourney image; it has a style all its own. Into Auto1111 to make a few passes first to de-MidJourney the style: img2img with Protogen, de-noising at a low setting of 0.3, a few passes to see what comes out. The trick is to make slow changes over multiple passes, so you don't radically alter the base image yet still steer the results where you want to go. It's going to be a dramatic change from MidJourney to Stable Diffusion anyway. Changed the prompt to this and looped it into itself two or three times:
intimate conversation, troll telling jokes, laughing, humor, smiling, joyous, cyberpunk, 1985, ektachrome 100 movie still
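(Side note: if you want to reproduce that self-loop outside Krita/Auto1111, it's just feeding each img2img output back in as the next input. A rough sketch with the diffusers library; the model name here is a stand-in for the Protogen/Analog checkpoints I actually switch between:)

code:

# The "loop it into itself" trick: repeated low-denoise img2img passes.
# Each pass at strength 0.3 nudges the style toward the new model
# without wrecking the composition underneath.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # stand-in; swap in your checkpoint
    torch_dtype=torch.float16,
).to("cuda")

prompt = ("intimate conversation, troll telling jokes, laughing, humor, "
          "smiling, joyous, cyberpunk, 1985, ektachrome 100 movie still")
image = Image.open("midjourney_upscale.png").convert("RGB")

for _ in range(3):  # two or three passes; stop when it looks right
    image = pipe(prompt=prompt, image=image, strength=0.3).images[0]
image.save("de-midjourneyed.png")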


I work on the troll first. I want a horn, so I swipe a picture of some plastic horns, trim them out, flip and warp them around, and place them over the guy's head in a way that might work.


I also gotta un-blue the guy, so I mask the face and gently caress with the colors and get it reasonably closer to human skin tones, AND draw in the pointed ear. You can see that the horn isn't perfectly trimmed and there is a bunch of outline noise, and that the top of the ear is literally me using the airbrush tool, swiping altered colors from the bottom of the ear, and just vaguely suggesting a pointed ear.


I then use the airbrush tool and mask over the face and inpaint the mask (using neon green so I can see the mask easier, you can use whatever color you'd like though) with this new, altered prompt:
dark skinned native american troll with horns and pointed ears, telling jokes, laughing, humor, smiling, joyous, cyberpunk, 1985, ektachrome 100 movie still



Awesome result, but what the gently caress, the ear isn't pointed at all! :argh:
I redraw the ear like last time with the airbrush tool, sampling the colors as needed, and mask ONLY it, then change the prompt to:
pointed ears on a dark skinned native american troll with horns and pointed ears, telling jokes, laughing, humor, smiling, joyous, cyberpunk, 1985, ektachrome 100 movie still




Good, looks great. You might notice that his mouth is hosed up with a weird bit of flesh, or the outline around where the mask is. These are things I will take care of when the rest of the image is done and I'm doing the last bit of tweaking to finalize the whole image. Learned my lesson with the ear. I do this sort of thing to all sorts of bits of the image. The troll's hands holding the bottle, for example: that's me painting out whatever stuff the guy was holding and mashing a bit of hand together, then cloning the bottle from the center of the image, roughly putting it in his hand, masking it, and changing the prompt to compensate.
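(For the non-Krita crowd: masked inpainting is the same idea in any frontend; only the masked pixels get re-rendered, everything else is preserved. A minimal sketch with the diffusers library, model and file names as stand-ins. Note that in diffusers the mask is white-for-repaint rather than my neon green:)

code:

# Masked inpainting: the pipeline repaints only where the mask is
# white and leaves the rest of the image untouched.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("troll_scene.png").convert("RGB")
mask = Image.open("face_mask.png").convert("RGB")  # white = repaint

result = pipe(
    prompt=("dark skinned native american troll with horns and pointed "
            "ears, telling jokes, laughing, humor, smiling, joyous, "
            "cyberpunk, 1985, ektachrome 100 movie still"),
    image=image,
    mask_image=mask,
).images[0]
result.save("troll_scene_inpainted.png")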

Here is what I did to the woman's hand holding the cup:





I then started working on her overall, but came back to the hand and tweaked it later. I still wasn't fully aware I needed to work broad-to-fine, but the hand was not in her main mask so it was ok. Speaking of the lady: mask her off and change the prompt yet again. I iterate a few times till I get a result I like. New prompt:
gorgeous techno disco queen, telling jokes, laughing, humor, smiling, joyous, cyberpunk, 1985, ektachrome 100 movie still






Last one is THE one. It's her smile; it feels more full, more joyous and real. She's not being polite, she can't help but crack up. Combined with her body's stance, it's like she's heaving with a deep laugh. As with the other masking seams, I'll attempt to tackle that towards the end of messing with the image.

The rest of the work is variations on that: masking out things, changing the prompt, refining things down. It still does have that thing where you'll be working towards something, then go "NOPE!" and just back right out of the last 40 minutes of work you put in to go a completely different direction. This is how the plate of food ended up on the table; I got rid of all the extra cups and the hosed up hands. The plate of food was a stock image of wings.

For getting rid of the tell-tale signs of masking I will mask like normal, but not bind it with the selection tool. Finally I'll switch to analog_diffusion, then strip the prompt down to the original prompt, set denoising to 0.1, and run it through a couple times to add in some nice noise and help even out everything so it looks more like an 80s movie.

Not entirely happy with this specific image, but it was the first one where I did a combination of a lot of things to make it. Lots of errant halos still, and the hands are hosed up, if less so than normal AI stuff. I doubt I'll come back to this image, but it's all in a single Krita file with all its layers at least!

Big takeaways are:
1. Like drawing, work towards details. If I did that going in I wouldn't have wasted my time with drawing that ear twice. Work in broad strokes and 'layer' up as you finish parts.
2. Smaller the better. If you can work in smaller areas within Krita you get the whole resolution for just the area you are working on. This means that two major AI tells for me, car wheels and people's faces, are rendered 'full res' when you draw a bounding box around your mask and inpaint only within there. The overall resolution doesn't change but you render that detail into the small area, like super sampling. Maybe it IS super sampling, idk. The exception to this is when I'm cleaning up errant seam lines from masking. Those I'll paint on the image directly to obfuscate the seam behind some brush strokes, then inpaint without the bounding box (thus giving a lower resolution result because it's doing the whole image even though the result is just the mask) and adjust the layer opacity as needed. Easy!
3. Using actual pictures is better than drawing details yourself. This is because the AI wants noise; that's built into any image you take off the internet. In the Native American troll guy above, the horn ended up turning into some sort of fuzzy talisman-like horn thing, but that's cool by me because it gives character that I like. It worked that I had a picture, even of cheap plastic horns, because the noise was in the image, so Stable Diffusion could use that in a realistic way. If I drew a flat-shaded horn it wouldn't work as well, because the model is trying to put detail where there isn't any. A worse image with more noise is better than a well-drawn, flat one when it comes to realistic photo-like images. You can almost completely ignore watermarks too, as that is noise that will certainly get lost in the first pass, or if needed you can roughly paint them out using the surrounding colors and the AI is smart enough to know what you are going for, depending on the prompt.
4. Leave the overall bit of your prompt alone and change as needed. I'd keep the prompt the same, but add whatever I was inpainting to the front for that specific inpaint: pointed ears, holding a coffee cup, etc.
5. It is easier to have a vague idea about the emotion of whatever image you want, rather than being specific. In making these I'm not saying to myself 'Ok, I am going to have a troll on the left and then a woman on the right and another person just out of frame and...'; instead it's 'a troll telling a joke and everyone legit enjoying it would be cool' and see what comes up. Working with the things the system provides makes it a lot easier to build a cohesive image. The hands are hosed up, but there are fingers there for the woman, so I used those and built her hand in a way that didn't immediately make my brain go "AI HANDS" and notice it to a fault.
6. SAVE YOUR PROMPTS. You can alleviate this a bit if you keep your generations and can toss them back into the 'PNG INFO' tab in Auto1111, but the prompts are NOT saved within Krita or the file itself at all. I've taken to just putting the prompts I use in a .txt file where I keep the main Krita file (a rough script for automating that is below).
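The rough script I mean, for anyone who wants it: Auto1111 by default embeds the generation settings in a PNG text chunk named "parameters" (that's what the PNG INFO tab reads), so you can dump every prompt out next to its image. Sketch only; adjust the folder path to wherever your generations live:

code:

# Rescue prompts from saved Auto1111 generations by reading the
# "parameters" text chunk A1111 embeds in its PNGs by default.
from pathlib import Path
from PIL import Image

for png in Path("outputs").glob("**/*.png"):
    params = Image.open(png).text.get("parameters")  # missing if not from A1111
    if params:
        png.with_suffix(".txt").write_text(params)
        print(f"saved prompt for {png.name}")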

Final image:

KakerMix fucked around with this message at 01:16 on Jan 2, 2024

AARD VARKMAN
May 17, 1993
I look forward to doing more project style stuff in the future once the tools can lock in a character or style better. Feels like we can't be all that far off at this point from one of the big competitors finally offering at least one of those in a consistent format.

MJ tried the style side with tuners but I had difficulty with subject changes on those and they aren't out on v6 yet

Tarkus
Aug 27, 2000

Yeah, right now when it comes to AI it's going to hinge on technique rather than increasing amounts of data or compute. I think being able to direct the actions in finer ways is going to be the thrust of development for the next few years, and I'm totally here for it. This poo poo is amazing and is moving at amazing speed.

null_pointer
Nov 9, 2004

Center in, pull back. Stop. Track 45 right. Stop. Center and stop.


:stonklol:

Holy Christ. While I can follow your steps in a very abstract way, there's so much art and effort within each one of them. Kudos. It's both impressive and anxiety-inducing to think that I would need to do something similar.

The thin-lined comic book/anime style I'm going for, though, makes this both easier and harder, I think. I really don't know of any tools that could take a photo and apply a reasonably consistent art style that doesn't automatically scream "AI generated". Also, I have almost no illustration skills, so trying to manually paint in details is probably a non-starter for me.

What makes this harder is that I simply don't know what tools are available to me. For example, I've heard of img2img, but the only thing I know about it is that you can use it to turn goatse into a pastoral scene. That whole thing about using stable diffusion to "denoise" and perform other work? Right over my head.

I think, as someone mentioned above, getting Photoshop with generative fill might be a decent first step for creating reference images and compositing.

Since I think that using Krita to hand-paint stuff is a non-starter, is there another tool I could use that might be able to keep characters consistent from scene to scene? That might be a decent first step.

Thanks again to everyone for sharing their workflows and techniques. It's been eye-opening in both a positive and negative way :smith:

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry
I'm a high school creative writing teacher for my day job and teaching the kids how to use ChatGPT, Claude, or Bard to aid in their own projects was a large part of last semester (you know they're going to do it, at least show them why it's a bad idea to just fling whatever it spits out into a story or essay). These new tools are absolutely incredible, but currently their output is never going to be exactly what you want so you will have to edit and change, and that takes effort and practice.

Tarkus
Aug 27, 2000

Humbug Scoolbus posted:

I'm a high school creative writing teacher for my day job and teaching the kids how to use ChatGPT, Claude, or Bard to aid in their own projects was a large part of last semester (you know they're going to do it, at least show them why it's a bad idea to just fling whatever it spits out into a story or essay). These new tools are absolutely incredible, but currently their output is never going to be exactly what you want so you will have to edit and change, and that takes effort and practice.

I like this point of view. These are powerful tools; they don't need to be a crutch.

Moongrave
Jun 19, 2004

Finally Living Rent Free
doing D&D stuff for friends using the bing one takes so much effort to get around the dalle filters

but the quality is very good

GABA ghoul
Oct 29, 2011

AARD VARKMAN posted:

I look forward to doing more project style stuff in the future once the tools can lock in a character or style better. Feels like we can't be all that far off at this point from one of the big competitors finally offering at least one of those in a consistent format.

MJ tried the style side with tuners but I had difficulty with subject changes on those and they aren't out on v6 yet

I think we are already there if you are willing to use training/dreambooth/lora. Every time I look into it, it seems to get better understood, easier, faster, and requires less memory. I think we might already see an easy-to-use one-click solution for SD by the end of the year.
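If you do go the lora route, actually using a trained one is the easy part; training it is the real work. A minimal sketch with the diffusers library (the lora file and trigger token here are made up):

code:

# Using a trained character LoRA for consistency: load the base model,
# attach the LoRA weights, and include the trigger token in prompts.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("./loras/my_character.safetensors")  # hypothetical file

# "mycharacter" is whatever trigger token the LoRA was trained on.
img = pipe("photo of mycharacter standing in a cluttered lab").images[0]
img.save("consistent_character.png")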

AARD VARKMAN
May 17, 1993

GABA ghoul posted:

I think we are already there if you are willing to use training/dreambooth/lora. Every time I look into it, it seems to get better understood, easier, faster, and requires less memory. I think we might already see an easy-to-use one-click solution for SD by the end of the year.

oh sure, it's doable, but it's a lot of effort if you don't have a specific project you're already doing in mind. since I do this purely for my own enjoyment, right now I'm content with the "finding gems in the random generation mines" since I know I'll be able to play with full project style stuff later on a lot easier.

more power to any of you that can put in the kind of effort we've seen the last couple pages though!!!

Swagman
Jun 10, 2003

Yes...all was once again peaceful in River City.




















KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

AARD VARKMAN posted:

I look forward to doing more project style stuff in the future once the tools can lock in a character or style better. Feels like we can't be all that far off at this point from one of the big competitors finally offering at least one of those in a consistent format.

MJ tried the style side with tuners but I had difficulty with subject changes on those and they aren't out on v6 yet

Character consistency is consistently the winner in the micro polls Midjourney posts every few days, right behind in-painting for v6. It's coming. Right now with v6 you can kind of keep a character around by using a headshot of someone as the initial image prompt, then describing in text whatever you want them to be doing. I ran into it trying unrelated skull stuff. I should see how viable that is right now, actually.


null_pointer posted:

:stonklol:

Holy Christ. While I can follow your steps in a very abstract way, there's so much art and effort within each one of them. Kudos. It's both impressive and anxiety-inducing to think that I would need to do something similar.

The thin-lined comic book/anime style I'm going for, though, makes this both easier and harder, I think. I really don't know of any tools that could take a photo and apply a reasonably consistent art style that doesn't automatically scream "AI generated". Also, I have almost no illustration skills, so trying to manually paint in details is probably a non-starter for me.

What makes this harder is that I simply don't know what tools are available to me. For example, I've heard of img2img, but the only thing I know about it is that you can use it to turn goatse into a pastoral scene. That whole thing about using stable diffusion to "denoise" and perform other work? Right over my head.

I think, as someone mentioned above, getting Photoshop with generative fill might be a decent first step for creating reference images and compositing.

Since I think that using Krita to hand-paint stuff is a non-starter, is there another tool I could use that might be able to keep characters consistent from scene to scene? That might be a decent first step.

Thanks again to everyone for sharing their workflows and techniques. It's been eye-opening in both a positive and negative way :smith:

Sorry :(
I take for granted that I can just go 'lol just change it in pshop ez pz'; I don't mean to dissuade. As someone that's done this sort of stuff digitally for about 25 years now, this AI image thing rules. When this all popped off I was going :hmmyes: and was very surprised when there was such a backlash against this stuff on twitter and the internet in general. Making art (or ~'art'~) should be as easy as possible, and it is! The digital realm should be no different. I sure did watch a number of my contemporaries become luddite boomers right before my very eyes though!

Brawnfire
Jul 13, 2004

🎧Listen to Cylindricule!🎵
https://linktr.ee/Cylindricule

Swagman posted:


Well that was one hell of an aesthetic journey

null_pointer
Nov 9, 2004

Center in, pull back. Stop. Track 45 right. Stop. Center and stop.

It's a major pain in the rear end, but I think I've found a way to use a multi-person reference image to influence the MJ v6 model in a strong way.

Render lab background in MJ:


Use stock photo to generate lead male reference in MJ:


Use stock photo to generate lead female reference in MJ:


Use Photoshop to cut out reference figures (did the first one by hand before I found the "Remove Background" function :dumbgun: ) and clumsily scale and place them:


Use this as a reference image, making sure to use "on the left" and "on the right" descriptions first, before going into the background details.

Final MJ v6 image:


What amazes me is that, using the final photoshopped reference image, MJ understood which sentence was referring to who and got it right in all four initial images.

I'm wondering if all that detail in the lab background is helping or hurting MJ, and if there's a way I can save time. Still, it's super encouraging. I might even be able to save my MJ credits and not use it to generate the "cut out" character images!

Javid
Oct 21, 2004

:jpmf:

KakerMix posted:

I sure did watch a number of my contemporaries become luddite boomers right before my very eyes though!

The generational inversion of the reactions to AI images has been hilarious. Younger people are making GBS threads their entire rear end about it, whereas the facebook olds are just "lol magic picture machine go brr"

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

null_pointer posted:

It's a major pain in the rear end, but I think I've found a way to use a multi-person reference image to influence the MJ v6 model in a strong way.

Render lab background in MJ:


Use stock photo to generate lead male reference in MJ:


Use stock photo to generate lead female reference in MJ:


Use Photoshop to cut out reference figures (did the first one by hand before I found the "Remove Background" function :dumbgun: ) and clumsily scale and place them:


Use this as a reference image, making sure to use "on the left" and "on the right" descriptions first, before going into the background details.

Final MJ v6 image:


What amazes me is that, using the final photoshopped reference image, MJ understood which sentence was referring to who and got it right in all four initial images.

I'm wondering if all that detail in the lab background is helping or hurting MJ, and if there's a way I can save time. Still, it's super encouraging. I might even be able to save my MJ credits and not use it to generate the "cut out" character images!

I would honestly try just literally scribbling raw shapes down in Photoshop or Paint or whatever, using the colors you want them to be, MS Paint-style; just make sure to make full shapes. Then when you have that, use that image plus your text in the prompt and see what you get. I've not tried it, but v6 is quite good at lots of things; maybe it's more like img2img in Stable Diffusion.
To answer your sorta-question earlier, what I just described is img2img. I think of it like a scaffolding that you then prompt an image onto.

Tarkus
Aug 27, 2000

Javid posted:

The generational inversion of the reactions to AI images has been hilarious. Younger people are making GBS threads their entire rear end about it, whereas the facebook olds are just "lol magic picture machine go brr"

I think that for a lot of us olds, we don't think it will affect us as much and I think most people my age think it's just some quick fad since we've been promised countless other "WORLD CHANGING PARADIGMS!" all our lives. I've been talking to people at work about it and most people young and old seem to know very little about it and don't know how to use it. Most people think it's something for doing homework or making bad images but otherwise know very little.

Personally, for me, I view it as an amazing tool that's capable of a lot of good things and bad things. I believe that the extremes of "AI is just garbage and generates garbage" and "AI is going to bring us AGI and allow us to build a god and deliver us from suffering!" are both wrong. I think it's important to convey to people that it is a tool and should be regarded as such, and should be, in some ways, regulated but embraced, because it's not going away.

null_pointer
Nov 9, 2004

Center in, pull back. Stop. Track 45 right. Stop. Center and stop.

KakerMix posted:

I would honestly try just literally scribbling raw shapes down in Photoshop or paint or whatever, using colors you want them to be, MS-paint style, just make sure to make full shapes. Then when you have that, use that image plus your text in the prompt and see what you get. I've not tried it, but v6 is quite good at lots of things, maybe it's more like img2img in Stable Diffusion.
To answer your sorta-question earlier, what I just described as img2img. I think of it like a scaffolding that you then prompt an image onto.

Got it. Is there a one-click installer version of Stable Diffusion that could help me in-paint / img2img / scribble my way to reference photos? Generating them in MJ is hella costly in terms of processing time.

moist banana bread
Dec 17, 2023

banana Jake!

null_pointer posted:

Got it. Is there a one-click installer version of Stable Diffusion that could help me in-paint / img2img / scribble my way to reference photos? Generating them in MJ is hella costly in terms of processing time.

I don't know if there's a one-click installer but I can tell you what I did.

This page is gonna be Automatic1111, the web-ui for running Stable Diff.

quote:

Automatic Installation on Windows
Install Python 3.10.6 (Newer version of Python does not support torch), checking "Add Python to PATH".
Install git.
Download the stable-diffusion-webui repository, for example by running git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git.
Run webui-user.bat from Windows Explorer as normal, non-administrator, user.

If you need help with any of this I can help out.

Tarkus posted:

I think that for a lot of us olds, we don't think it will affect us as much and I think most people my age think it's just some quick fad since we've been promised countless other "WORLD CHANGING PARADIGMS!" all our lives. I've been talking to people at work about it and most people young and old seem to know very little about it and don't know how to use it. Most people think it's something for doing homework or making bad images but otherwise know very little.

Personally, for me, I view it as an amazing tool that's capable of a lot of good things and bad things. I believe that the extremes of "AI is just garbage and generates garbage" and "AI is going to bring us AGI and allow us to build a god and deliver us from suffering!" are both wrong. I think it's important to convey to people that it is a tool and should be regarded as such, and should be, in some ways, regulated but embraced, because it's not going away.

I've had at least 3 customers, spanning generations, go on rants about how AI is going to ruin everything. The oldest one was just angry she had to use a kiosk to order McDonald's, though. I mean, I get it. I don't wanna do that sh*t either.

moist banana bread fucked around with this message at 04:20 on Jan 2, 2024

Javid
Oct 21, 2004

:jpmf:
I also wonder how many olds have been automated out of a job before, or know people who have. Watching millennials shriek about AI taking over artists' jobs, you'd think it was the first time technology had threatened to do this.

Another hobby of mine is wood engraving. I can spend a few hours fastidiously cutting nice, grippy checkering into a gun stock or hammer handle, or I can pay a laser place $20 to zap an identically functional grid of the same general design onto it. For most people there's no discernible difference, but the guys paying $gently caress for custom rifles will preferentially purchase the services of a human craftsman to perform this operation - to such a degree that human errors in the finished product add value rather than subtract it, so there's still plenty of guys out there making money doing it.

I suspect art is going to settle into an equilibrium like that once the first few years of change-based hysterics are behind us.

Sucrose
Dec 9, 2009
The last 2(?) years of AI development are a legit technological breakthrough and I think anyone who doesn’t think so is kidding themselves.

mazzi Chart Czar
Sep 24, 2005

Javid posted:


Another hobby of mine is wood engraving. I can spend a few hours fastidiously cutting nice, grippy checkering into a gun stock or hammer handle, or I can pay a laser place $20 to zap an identically functional grid of the same general design onto it. For most people there's no discernible difference, but the guys paying $gently caress for custom rifles will preferentially purchase the services of a human craftsman to perform this operation - to such a degree that human errors in the finished product add value rather than subtract it, so there's still plenty of guys out there making money doing it.


One of the most hypnotic things ever created was a Japanese top-spinning competition, where the two best contestants were the super-high-tech cutting-edge scientists vs. the master craftsmen.


The end of the competition is at 21:39
https://www.youtube.com/watch?v=-q-hcidtjiM

The craftsmen win, but only by a modest margin.
The scientist guys can re-create everything they did to make the item, and it will get cheaper every year.
The craftsmen will probably always make a better item, by a hair's breadth, but it will still take a lot of skill, effort, time, and money.

But the average consumer won't care about that hair's-breadth improvement, and will be excited to get the scientists' item for a lower price.

I don't think the specialized craftsman will ever disappear.
We still have people who make watches.
And now more amateurs than ever who want to put together watches.

mazzi Chart Czar fucked around with this message at 05:05 on Jan 2, 2024

Moongrave
Jun 19, 2004

Finally Living Rent Free

Javid posted:

The generational inversion of the reactions to AI images has been hilarious. Younger people are making GBS threads their entire rear end about it, whereas the facebook olds are just "lol magic picture machine go brr"

Especially funny is them calling everything AI even when it’s very obviously just a lovely photoshop

It’s loving stupid and everyone sucks

maxwellhill
Jan 5, 2022
Probation
Can't post for 5 hours!

KakerMix posted:

unrelated skull stuff

sounds like a username

feedmyleg
Dec 25, 2004

null_pointer posted:

Got it. Is there a one-click installer version of Stable Diffusion that could help me in-paint / img2img / scribble my way to reference photos? Generating them in MJ is hella costly in terms of processing time.

As someone who is dumb and bad at all this technical stuff, if you have a Mac, Draw Things is the one-click install that does it all. Well, except Automatic1111 as far as I can tell. But it's a fantastic gateway.

The Sausages
Sep 30, 2012

What do you want to do? Who do you want to be?

null_pointer posted:

Got it. Is there a one-click installer version of Stable Diffusion that could help me in-paint / img2img / scribble my way to reference photos? Generating them in MJ is hella costly in terms of processing time.

It's not one-click but Stability Matrix is as close as it gets, my go-to for an installer. You install Stability Matrix and choose the SD packages you want it to install; you can also download models and loras directly from civitAI and they'll be shared between the different SD packages.

For what you're asking, you'd probably want to try using controlnet rather than (or in conjunction with) img2img; the easiest way to get that up and running is to install SD.Next, which has controlnet support built in.
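To give an idea of what controlnet actually buys you: it pins the composition with a control image (a scribble, a pose, a depth map) while the prompt fills in everything else. SD.Next wraps this in a UI so you'd never write the code yourself, but here's a minimal sketch with the diffusers library and the public SD 1.5 scribble model (file names and prompt are just examples):

code:

# ControlNet scribble: a white-on-black sketch fixes where things go;
# the text prompt decides what they look like.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

scribble = Image.open("two_figures_scribble.png").convert("RGB")

img = pipe(
    "two scientists arguing in a cluttered lab, comic book style",
    image=scribble,
).images[0]
img.save("controlled_composition.png")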

GABA ghoul
Oct 29, 2011

Tarkus posted:

I think that for a lot of us olds, we don't think it will affect us as much and I think most people my age think it's just some quick fad since we've been promised countless other "WORLD CHANGING PARADIGMS!" all our lives.

Yeah, it's this. A major concern isn't even what you can do with it now, but in 10 or 20 years. Almost nobody expected something like dalle-2 would be possible this decade (or, for a lot of people, at all, ever). Then we got from that to Controlnet and Lora in under a year (which I would have expected around ~2025 at the earliest, even after seeing dalle-2). It certainly feels like a loss of safety and trust in the future for a lot of young middle-class people who are just starting out and who were promised from early childhood that creative/intellectual roles in society would be the last to go. The floor can give way under your feet at any moment if there is a new breakthrough.

Javid posted:

I also wonder how many olds have been automated out of a job before, or know people who have. Watching millennials shriek about AI taking over artists' jobs you'd think it was the first time technology had threatened to do this.

I think there were certain implicit societal promises made that if you work with your mind you are going to be safe for a long time, and that all the truck drivers would be out of work by 2018 and trying to learn Photoshop at community college. Now the truck drivers are making 80k a year and are one of the most in-demand jobs, and artists are going to truck driving schools. I think it's a feeling of betrayal for middle-class, college-educated people. The working class has been living with the automation Damocles sword over their heads forever, so it's not such a huge issue for them, yeah.

Stupid_Sexy_Flander
Mar 14, 2007

Is a man not entitled to the haw of his maw?
Grimey Drawer
Is there a software package to install for MJ or any of the others? I know exactly nothing about Python but everything I see about data sets and making your own generator basically involves programming.

Just hoping there's like a Windows exe where it'll let me just double click and boom, it's installing and I can type in a lil box and make imaginary pictures happen.

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


Stupid_Sexy_Flander posted:

Is there a software package to install for MJ or any of the others? I know exactly nothing about Python but everything I see about data sets and making your own generator basically involves programming.

Just hoping there's like a Windows exe where it'll let me just double click and boom, it's installing and I can type in a lil box and make imaginary pictures happen.

MidJourney is like DALL-E: it's a private product, you can only use it on their servers. It's a Discord! You get to post all your prompts in a public chat!

MJ has a website version coming "soon"; I might check it out at that point. It's not a free product, and if I'm paying I don't want it to suck to use.

null_pointer
Nov 9, 2004

Center in, pull back. Stop. Track 45 right. Stop. Center and stop.

I'm way out of touch, apparently, but MidJourney is being sued into the ground? What are the chances it'll still be alive and useful a year from now, or will it be neutered to the point of uselessness like DALL-E?

