Darchangel
Feb 12, 2009

Tell him about the blower!


Yeah, NovelAI is the one.

You did end up with some decent stuff there. Nice!

mobby_6kl
Aug 9, 2009

by Fluffdaddy
Thanks, yeah, it's not like Louvre-level stuff, but I can see making a few prints out of it to spice up the cave.

Anyway, here's what the experiment was about. It really goes back to the origin of this thread, with the blending of different cars. It's a relatively new feature.

Stable Diffusion Demos - AND + NOT (a.k.a. negative prompts)
Compositional Generation using Stable Diffusion. Our proposed Conjunction (AND) and Negation (NOT) can be applied to conditional diffusion models for compositional generation. Both operators are added into Stable Diffusion WebUI! Corresponding pages are as follows: Conjunction (AND) and Negation (NOT).

https://energy-based-model.github.io/Compositional-Visual-Generation-with-Composable-Diffusion-Models/

To be honest it's still a bit opaque to me what exactly it does; I probably need to read it again when it's not like 1 AM. But tl;dr: it's a way of controlling how SD composes different concepts. The horse example didn't really work for me, but this one does:


a cat AND a dog
a cat and a dog
a cat dog
a cat, a dog

"cat AND dog" clearly tries to make a hybrid (which is what we want), though "cat dog" kind of does too but less successfully.

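As far as I understand the linked page, AND works by getting a separate noise prediction for each concept and adding each one's offset from the unconditional prediction, scaled by its weight. Here's a minimal sketch of just that combination step, with plain lists standing in for the model's noise tensors. The function name and toy numbers are made up for illustration; this is not the WebUI's actual code.

```python
def compose_and(eps_uncond, eps_conds, weights):
    """Combine per-concept noise predictions: uncond + sum of w_i * (cond_i - uncond)."""
    combined = list(eps_uncond)
    for eps_c, w in zip(eps_conds, weights):
        for i, (c, u) in enumerate(zip(eps_c, eps_uncond)):
            combined[i] += w * (c - u)
    return combined


# Toy stand-ins for what the model would predict for each prompt.
eps_uncond = [0.0, 0.0, 0.0]
eps_cat = [1.0, 0.0, 0.5]
eps_dog = [0.0, 1.0, 0.5]

print(compose_and(eps_uncond, [eps_cat, eps_dog], [1.0, 1.0]))
```

With the dog's weight set to 0.0 its term drops out completely, which is roughly why sweeping a weight from 1.0 to 0.0 fades one concept out of the image.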

The other aspect in play here is the "Script" functionality in the Automatic1111 build, which lets you generate hundreds of images to see how it reacts to different parameters. So I ran this with the same seed and a query like "photograph of (cadillac brougham:1.0) (porsche 911:0.0)" where 1.0 and 0.0 are weights that get substituted from a list.
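
For anyone who'd rather build the prompt list outside the UI, the substitution boils down to something like the sketch below. The helper name and the three-value weight list are my own choices for illustration, not anything the Automatic1111 script exposes.

```python
from itertools import product


def weighted_prompts(template, weights):
    """Yield (cadillac_weight, porsche_weight, prompt) for every pair of weights."""
    for a, b in product(weights, repeat=2):
        yield a, b, template.format(a=a, b=b)


# Same query shape as above, with the two weights left as placeholders.
template = "photograph of (cadillac brougham:{a:.1f}) (porsche 911:{b:.1f})"
weights = [0.0, 0.5, 1.0]

for a, b, prompt in weighted_prompts(template, weights):
    print(prompt)
```

Each prompt would then be rendered with the same seed, so the only thing changing between grid cells is the pair of weights.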


photograph of cadillac_1.0 AND 911_0.0


Clearly not the best idea :v: both Cadillac and 911 are too generic, so you sometimes get an Escalade or 9/11 lol.

photograph of (cadillac brougham_1.0) (porsche 911_0.0).


photograph of (cadillac brougham_1.0) AND (porsche 911_0.0)


photograph of (cadillac brougham_1.0) AND (porsche 911_0.0)


photograph of (cadillac_brougham_1.0) AND (porsche_911_0.0)


They all work and produce interesting results, but I think the last one is the most obvious and predictable result, since it seems to treat "cadillac brougham" as one thing and doesn't turn the query into just "Cadillac" when it's at 0.0.

I'll probably need to do another run with photograph of "(cadillac_brougham_1.0) (porsche_911_0.0)" or something. :negative:


It's also of course possible to do just one variable (weight of the 911) and make a moving picture out of it

https://i.imgur.com/6QSILfk.mp4

mobby_6kl fucked around with this message at 12:59 on Oct 18, 2022

tinned owl
Oct 5, 2021
That last ones a nervous transformer

sharkytm
Oct 9, 2003

Ba

By

Sharkytm doot doo do doot do doo


Fallen Rib

tinned owl posted:

That last ones a nervous transformer

That last one's a fever dream.

Darchangel
Feb 12, 2009

Tell him about the blower!


mobby_6kl posted:

It's also of course possible to do just one variable (weight of the 911) and make a moving picture out of it

https://i.imgur.com/6QSILfk.mp4

The other component is clearly not a Cadillac. Funny how it ends up looking like a Jag XJS somewhere in the middle.


sharkytm posted:

That last one's a fever dream.

You ever seen "What Dreams May Come"?
Yeah, it's that, but cars.

mobby_6kl
Aug 9, 2009

by Fluffdaddy

Darchangel posted:

The other component is clearly not a Cadillac. Funny how it ends up looking like a Jag XJS somewhere in the middle.
I started the animation at like 80% Cadillac because it looked like it resulted in a more interesting progression. You can check the huge grid to see what the actual starting point was, which to be fair doesn't look like a platonic ideal of a Cadillac either. Not sure why, but I just had to go with something for the experiment. I'll try running it from 100% in smaller steps next time, it just takes hours :)

It seems that it can't just invent completely new things, so it goes through various cars it already knows and steals elements that would satisfy both Cadillac and 911 to some degree, so at some point it does go through the two-headlight XJS looking thing.



So regarding the impressionist Miata picture, I've been running into this quite a lot. It's probably a similar issue to trying to do something like a "mona lisa cave painting", where the model has a very strong opinion of what the Mona Lisa looks like and refuses to adjust it. On the other hand, the Japanese snow pictures overpowered it and the cars aren't recognizable Miatas either. This can be balanced with weights, but it still ends up being either the wrong car or not matching the rest of the art style:



A pretty big issue turned out to be negative prompts, where generic stuff like "ugly, deformed, watermark" significantly changed what the results looked like. Same exact prompt and seed, just removing the negative ones:
With negative prompts:

Without:



E: I'm not much of an artist but this poo poo is just fascinating. Apparently 100%-45% "GT-R" is an R35 but below that it's more like the R34

https://i.imgur.com/pyl8wOc.mp4

I forgot a few extra frames there but if you stop it at like 13s, that's what happens when you crank up "Monet" to 1.3, it just tries to write it on the sign :)

mobby_6kl fucked around with this message at 15:45 on Oct 21, 2022

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:
There have been a few dramatic updates to these models. Stable Diffusion 2.0 is fantastic for machines. Less so for boobs I guess idk that was never my interest. MidJourney's V4 model along with their anime Niji that they are testing are good too, but really restrictive in how they won't blend nearly as well. Every time I use Midjourney I can feel the walls of the system somehow, like I can run up against its capability really quick. It spits out really good results with little input so I think most people are cool with that, but I digress. Cars!

These are directly out of Stable Diffusion 2.0, the 768 checkpoint which means these haven't been upscaled at all. The prompt is:

positive:
(year, in the '1985' or equivalent format) (usually just a manufacturer, 'Toyota' or 'Pontiac', but sometimes a specific model), designed by Giorgetto Giugiaro, wild design, crazy neon colors, impractical. ektachrome, volumetric lighting, f8 aperture, cinematic, studio photo

negative (I realize most of this has nothing to do with cars or machines but it works, promise. More art than science):
cartoon, 3d, (disfigured), (bad art), (deformed), (poorly drawn), (extra limbs), strange colors, blurry, boring, sketch, repetitive, cropped, hands

Euler a sampler at 50 steps.

Every one of these is an img2img as well at first to get the whole car in frame, but later I started using previous generations from months back as the seed image and went from there.

KakerMix fucked around with this message at 18:43 on Nov 30, 2022

kill me now
Sep 14, 2003

Why's Hank crying?

'CUZ HE JUST GOT DUNKED ON!
It is incredible to me how awesome and detailed every single one of those is until you look at the AI's attempt at lug nuts.

It's like a human that can't draw hands.

Hadlock
Nov 9, 2004

KakerMix posted:

There have been a few dramatic updates to these models. Stable Diffusion 2.0 is fantastic for machines. Less so for boobs I guess idk that was never my interest. MidJourney's V4 model along with their anime Niji that they are testing are good too, but really restrictive in how they won't blend nearly as well. Every time I use Midjourney I can feel the walls of the system somehow, like I can run up against its capability really quick. It spits out really good results with little input so I think most people are cool with that, but I digress. Cars!




These are directly out of Stable Diffusion 2.0, the 768 checkpoint which means these haven't been upscaled at all. The prompt is:

positive:
(year, in the '1985' or equivalent format) (usually just a manufacturer, 'Toyota' or 'Pontiac', but sometimes a specific model), designed by Giorgetto Giugiaro, wild design, crazy neon colors, impractical. ektachrome, volumetric lighting, f8 aperture, cinematic, studio photo

negative (I realize most of this has nothing to do with cars or machines but it works, promise. More art than science):
cartoon, 3d, (disfigured), (bad art), (deformed), (poorly drawn), (extra limbs), strange colors, blurry, boring, sketch, repetitive, cropped, hands

Euler a sampler at 50 steps.

Every one of these is an img2img as well at first to get the whole car in frame, but later I started using previous generations from months back as the seed image and went from there.

Very cool looks a lot like a citroen DS?

For negative prompts I would add "not more than 5 lug nuts" and "deformed windshield wipers" there are some very droopy wipers in there. Looks great!

axolotl farmer
May 17, 2007

Now I'm going to sing the Perry Mason theme

1959 DeSota Miata



e: KakerMix, those are some sick nonexistent cars!

Darchangel
Feb 12, 2009

Tell him about the blower!


Wow, that is a lot better.
It seems to have gotten better at sharp/straight lines as well.


KakerMix posted:

There have been a few dramatic updates to these models. Stable Diffusion 2.0 is fantastic for machines. Less so for boobs I guess idk that was never my interest. MidJourney's V4 model along with their anime Niji that they are testing are good too, but really restrictive in how they won't blend nearly as well. Every time I use Midjourney I can feel the walls of the system somehow, like I can run up against its capability really quick. It spits out really good results with little input so I think most people are cool with that, but I digress. Cars!



I think you've just turned an S15 Jimmy into a hot hatch, here.

quote:



Something like a '73-'77 "Colonnade" GM A-body.

quote:



Hey, a modernized Citroen Deux-Chevaux

bsamu
Mar 11, 2006

KakerMix posted:

Every one of these is an img2img as well at first to get the whole car in frame, but later I started using previous generations from months back as the seed image and went from there.

How do you use img2img to get an uncropped version?

e: potentially using the inpaint tab? i'll poke around there

bsamu fucked around with this message at 02:02 on Dec 4, 2022

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

bsamu posted:

How do you use img2img to get an uncropped version?

e: potentially using the inpaint tab? i'll poke around there

Yeah, img2img.

You can use actual pictures of cars too. We'll take this, an actual picture of a 1980 Ford Pinto because it's cool and also it's centered:



Now drag that into the img2img zone, and we'll slap down the prompts I used before, with the 'Just Resize' option and sized roughly to match the ratio of the picture. Denoising is at default of .75 and the lowest res we will want to run is 768 since that's the size the 768 model was trained at:


When you adjust the width and height sliders a red overlay will appear on the picture itself, giving you an idea of what portion will actually be rendered. Since I've selected 'Just Resize' the final image will be slightly wider. If I elected to drop the Width down one notch to 960, the final rendering will be slightly taller. You can also just use 'Crop and resize' to cut the image where the bounding box shows. Likewise the prompts I'm using can be changed; it's less that they have a clear effect on the final image and more that they steer the image. Yeah there aren't any limbs in the positive prompt, but the words being there at all change the output. This is why people doing this stuff get very tribalistic and mystic about their prompts. You can remove words or add them just to see what happens. Same with the positive prompts.
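
If you'd rather work out the slider values than eyeball the overlay, the sizing comes down to roughly the sketch below. It assumes the sliders move in steps of 64 pixels (which is what "one notch to 960" suggests) and that the short side should be 768 for the 768 model. The function name and the 1600x1200 source size are made up for the example.

```python
def fit_dimensions(src_w, src_h, short_side=768, step=64):
    """Pick a width and height close to the source aspect ratio, snapped to slider steps."""
    # Scale so the shorter source side lands on short_side.
    scale = short_side / min(src_w, src_h)

    def snap(value):
        # Round to the nearest slider step, but never go below short_side.
        return max(short_side, round(value * scale / step) * step)

    return snap(src_w), snap(src_h)


# Hypothetical 4:3 source photo.
print(fit_dimensions(1600, 1200))
```

Snapping to the step size means the result can be slightly off the true ratio, which is the same "slightly wider" or "slightly taller" effect described above.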

Press the button:


Not bad for the first run. Now I'm going to hit generate as it is a few times and see if a better result comes out.





I like that last one so I'm going to put that in the img2img source spot the Pinto was in, and run it again and see what comes out.



Hell yeah, now I'll put THIS image in the img2img source spot and change up the prompt a bit and see:


"1980 off road rally prepped Toyota MA70 Supra for Dakar rally flying through the desert, off-road rally footage, wild design, crazy neon colors. ektachrome, volumetric lighting"

Constantly feeding the generated image back in and tweaking the prompt as you go gives you a ton of options. Plus when the car is in motion you get away from the wheels looking weird since they are spinning.

Now let's change the prompt pretty dramatically in subject and see what it does, we'll feed the image into itself as we go. Starting with the above rally image let's use this prompt, using all the same options. Prompt is:

"1968 rocket-powered speed boat speeding across smooth water, wild design, crazy neon colors. ektachrome, volumetric lighting"


one pass


two passes


10 passes
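
The whole feed-it-back-in routine is basically a loop. Here's a rough sketch of its shape, where `generate` is a placeholder for whatever img2img call you actually have (the web UI button, an API, a script) and `fake_generate` only records what was asked for. All of the names here are hypothetical.

```python
def feedback_loop(start_image, prompts, generate):
    """Run img2img repeatedly, feeding each result back in as the next source image."""
    history = [start_image]
    image = start_image
    for prompt in prompts:
        image = generate(image, prompt)
        history.append(image)
    return history


def fake_generate(image, prompt):
    # Stand-in for a real img2img call; it just builds a label showing the chain.
    return f"{image} -> [{prompt}]"


steps = feedback_loop(
    "pinto.jpg",
    ["1980 rally Supra", "1968 rocket speed boat"],
    fake_generate,
)
print(steps[-1])
```

Keeping the history around is handy because you can go back to any earlier pass and branch off it with a different prompt.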

bsamu
Mar 11, 2006

very cool. makes sense to just give it a picture of a car wholly in frame instead of trying to be a machine whisperer to make it happen. thanks for walking through your process more explicitly!

mobby_6kl
Aug 9, 2009

by Fluffdaddy
Lug nuts are the Achilles' heel human hands of car rendering

Very cool results though! I'm out traveling for another week or so but I'll have to check out the updated models.

Hadlock
Nov 9, 2004

For cars can you do like, negative prompts "wrong lug pattern", "more than five lug nuts", "less than five lug nuts"

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

Hadlock posted:

For cars can you do like, negative prompts "wrong lug pattern", "more than five lug nuts", "less than five lug nuts"

You can add whatever you'd like to the negative prompt, but it doesn't work like "No lovely pictures!" and it won't give you lovely pictures. All it does is negate out training data that's been tagged with that prompt. You'd be better off tagging things like "10 lug, 6 lug" or whatever in the negative and then doing "5 lug" or whatever in the positive, but that really only matters if the data has those tags.

"wrong lug pattern" would mean there would have to be images in the training data tagged with "wrong lug pattern" for it to have the desired effect, and it would simply not call upon that portion of training tagged with it. It would still have an effect because anything there does, you can't exactly prompt engineer your way out of rough looking images. The reason these systems have problems with hands isn't that there isn't ample data out there about hands, the anatomy, the motion, all that, it's that an image of a hand is wildly complex. These models are just patterns, it just diffuses out what it's been trained on. It doesn't know what a hand is and what it can do, it only knows images with 'hand' and will mash em all up. You get arm wrestling, hand shakes, hands in pockets, knuckle cracking, painted nails, all sorts of poo poo mixed in all tagged with "hand". For lug nuts the data would have to be in there to matter, and it might be, but an image of a car usually never mentions the lug nuts so the data might not even be called up, you know? When you take a picture of a car the lug pattern usually isn't a part of the image's data like the color or type or brand.

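For the mechanics, as far as I know the common Stable Diffusion front ends handle a negative prompt by using it in place of the empty prompt in classifier-free guidance, so each step is pushed away from whatever the model associates with those words. A toy sketch of that one step, with plain lists standing in for noise tensors (the function name and numbers are made up):

```python
def guided_noise(eps_positive, eps_negative, scale=7.0):
    """Classifier-free guidance with the negative prompt's prediction as the baseline."""
    return [n + scale * (p - n) for p, n in zip(eps_positive, eps_negative)]


# Toy predictions for the positive and negative prompts.
print(guided_noise([1.0, 2.0], [0.0, 1.0], scale=2.0))
```

That fits the point above: if the model has no real association for "wrong lug pattern", the baseline it steers away from is mostly noise, so the words still change the output but not in the way you asked for.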
Hadlock
Nov 9, 2004

Totally agree on all points. Just asking because those are similar to prompts I've seen to fix hands in the other thread

PBCrunch
Jun 17, 2002

Lawrence Phillips Always #1 to Me
What is the cheapest graphics card that will let me run this locally? Bonus points for not being made by Nvidia (Linux luser).

Hadlock
Nov 9, 2004

There are several implementations that are CPU only. It's slower than using a GPU, but slower in a "50 seconds instead of 8 seconds" way, not a "10 minutes instead of 8 seconds" way.

https://github.com/bes-dev/stable_diffusion.openvino

I haven't used the link above personally but it apparently is popular

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

PBCrunch posted:

What is the cheapest graphics card that will let me run this locally? Bonus points for not being made by Nvidia (Linux luser).

I think for your case AMD cards work fine because of Linux. As for the card you'd need I couldn't help you there. On Windows in the NVIDIA space, 8 gigs is the lowest I ever hear of people using; obviously more = better.

Nidhg00670000
Mar 26, 2010

We're in the pipe, five by five.
Grimey Drawer
It also says it'll only work on 2000 series or newer Nvidia cards somewhere in there, but it works just fine on my 1080ti so I dunno, just download and try? :shrug:

axolotl farmer
May 17, 2007

Now I'm going to sing the Perry Mason theme

the prompt was "vw beetle buning on fire flames"

Hadlock
Nov 9, 2004

No prompt, just ran across this one

axolotl farmer
May 17, 2007

Now I'm going to sing the Perry Mason theme

Hadlock posted:

No prompt, just ran across this one



rad

Hadlock
Nov 9, 2004

Facebook seems to just be feeding me a steady diet of Citroen DS content. Please enjoy a DS Porsche (not mine)

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

I ended up buying a drawing tablet, learning Krita, and becoming proficient at 'fixing' details, essentially photo editing but entirely within AI generations. This is why the wheels are dramatically better because with the tablet I can get razors-edge close to various lines and inpaint with insane precision, which also allows me to roughly sketch and paint things out and then have the AI refine whatever garbage I slapped down and bring it up to a part of the whole picture. It's been months and I STILL feel like I did when I had an empty pad of paper and a box of colored pencils when I was a kid. Just now I get to use my words too :toot:

Most of these are generations from MidJourney, but then brought into Stable Diffusion by me where I clean them up and tweak them. Still not perfect but I bet I could just toss these out to the internet and most people would believe they were real objects.

Suburban Dad
Jan 10, 2007


Well what's attached to a leash that it made itself?
The punchline is the way that you've been fuckin' yourself




The ones you're doing are great. I'm not motivated enough to try it myself but I enjoy seeing them immensely.

axolotl farmer
May 17, 2007

Now I'm going to sing the Perry Mason theme

Very cool! Some of them look like you crossbred an AMC Pacer with a Citroën DS.

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:


Neat design, they elected to put the lights in the upper camper portion, though the side visibility sucks.





I didn't notice that hosed up electrical cable thing sticking off the roof :negative: oh well



This one is my personal favorite and I couldn't tell you exactly why.





Hadlock
Nov 9, 2004

Can someone do an Art Deco styled Citroen H-Van? Like, mostly photorealistic, but styled art deco?



Also while I'm wishing for ponies, how about an Art Deco aston martin DB5, and maybe also an Art Nouveau DB5. Bonus points if you can render it in a James Bond 60s movie poster style

edit: because they're relatively rare, this is my mental model for an art deco car (the other being the bugatti type 57sc atlantic):




This one really speaks to me. Maybe make the C (B?) pillar a little less concave to pull the 1980 Buick Riviera out of it, put a 60s Lotus front grille on, and you'd have a very nice 1978 Toyota 2000 GT

edit: use imgur, puu.sh does weird things when you click through on the image if you don't sanitize the link first (which you did, thanks)

Hadlock fucked around with this message at 07:58 on Feb 8, 2023

Hadlock
Nov 9, 2004

Not mine




quote:

"a vintage single-seat dirt track speedster racecar with open wire wheels, a ford grille and a Massey Fergusson bonet, being worked on in an old timber workshop surrounded by tools and with dim lighting and old vintage signs on the wall"

Hadlock
Nov 9, 2004

photograph of Art Deco styled Aston Martin, photorealistic


a vintage single-seat dirt track speedster racecar with open wire wheels, a ford grille and a Massey Fergusson bonet, being worked on in an old timber workshop surrounded by tools and with dim lighting and old vintage signs on the wall


a vintage single-seat dirt track Aston Martin db5 racecar, being worked on in an old timber workshop surrounded by tools and with dim lighting and old vintage signs on the wall


a side view of vintage racecar front end looks like a Porsche 356 and back half looks like a Citroen ds, being worked on in an old timber workshop surrounded by tools and with dim lighting and old vintage signs on the wall


Citroen DS mixed with a Tesla cybertruck


Citroen DS rally car mixed with a Porsche Cyber 356


TIP
Mar 21, 2006

Your move, creep.



Edit: lol wrong thread, just general AI image stuff

another AI video test

https://i.imgur.com/XRAB1eY.mp4

TIP fucked around with this message at 11:33 on Mar 23, 2023

Darchangel
Feb 12, 2009

Tell him about the blower!


It kinda went off the rails with the Cybertruck.

Love that last 2CV.

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

These don't invoke any specific manufacturer, but instead the era they are from, all described with language vs. names. Pretty great results from MidJourney v5 alpha.

KakerMix fucked around with this message at 04:04 on Apr 1, 2023

Darchangel
Feb 12, 2009

Tell him about the blower!


KakerMix posted:

These don't invoke any specific manufacturer, but instead the era they are from, all described with language vs. names. Pretty great results from MidJourney v5 alpha.

God drat.
poo poo's getting good.

That first bike, the red one, says "retro version of Kaneda's bike" to me.
I'm in love with the orange and blue bikes, and those M1-looking cars.

Hadlock
Nov 9, 2004

axolotl farmer
May 17, 2007

Now I'm going to sing the Perry Mason theme

A bottle of AUDI branded blinker fluid.

Dagen H
Mar 19, 2009

Hogertrafikomlaggningen

axolotl farmer posted:

A bottle of AUDI branded blinker fluid.



Try to count the rings
