WhiteHowler
Apr 3, 2001

I'M HUGE!
Is there a secret to making images in SD bigger than 512x512, other than having more VRAM?

Ideally I'd like to do 16x9 aspect ratio, though I think I read that since SD works best with square images, a 2:1 aspect ratio will give better results?

Boba Pearl
Dec 27, 2019

by Athanatos

WhiteHowler posted:

Is there a secret to making images in SD bigger than 512x512, other than having more VRAM?

Ideally I'd like to do 16x9 aspect ratio, though I think I read that since SD works best with square images, a 2:1 aspect ratio will give better results?

It really depends on what you want to render, but Automatic now supports this little fuckeroo https://github.com/JingyunLiang/SwinIR which is the best drat image upscaler I have ever used.

mobby_6kl
Aug 9, 2009

by Fluffdaddy

WhiteHowler posted:

Using default settings it works fine with 8 GB of VRAM to create 512x512 images.

On my 8 GB 2070 Super, using DDIM or Euler-A with 30 steps takes around 5 seconds per image.
Well, 5 seconds per image isn't gonna happen, but as long as it fits in memory it's fine

WhiteHowler posted:

Is there a secret to making images in SD bigger than 512x512, other than having more VRAM?

Ideally I'd like to do 16x9 aspect ratio, though I think I read that since SD works best with square images, a 2:1 aspect ratio will give better results?

512x768 works OK even with 8 GB in my experience; the problem is that the algorithm goes nuts if it loses track of whatever object it already drew.

Maybe check out whatever they're talking about here:

BARONS CYBER SKULL posted:

the newest feature is the High res fix:




Objective Action
Jun 10, 2007



Yeah, unfortunately right now the best way to get a higher resolution image is to get one that has a composition you like and then use img2img, inpainting, and one of the upscalers on chunks of it to get progressively larger versions. It's slow, some details inevitably shift around, and it requires a lot of manual fuckery merging things back together, but you can usually get decent results by the end of it.
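The chunk-based part of that workflow can be sketched roughly like this. This is a stand-in, not anyone's actual script: `upscale_chunk` is a placeholder (a real pipeline would run each crop through img2img or an upscaler like SwinIR instead of plain Lanczos), and overlapping chunks are just averaged, which is the crudest possible seam blend:

```python
from PIL import Image
import numpy as np

def upscale_chunk(chunk: Image.Image, factor: int = 2) -> Image.Image:
    # Placeholder: a real workflow would send this crop through img2img
    # or a learned upscaler (SwinIR, ESRGAN) instead of Lanczos resize.
    return chunk.resize((chunk.width * factor, chunk.height * factor), Image.LANCZOS)

def upscale_in_chunks(img: Image.Image, chunk: int = 256,
                      overlap: int = 32, factor: int = 2) -> Image.Image:
    # Accumulate upscaled chunks into a float canvas and average overlaps.
    out = np.zeros((img.height * factor, img.width * factor, 3), dtype=np.float32)
    weight = np.zeros_like(out)
    step = chunk - overlap  # slide by less than a full chunk so edges overlap
    for y in range(0, img.height, step):
        for x in range(0, img.width, step):
            box = (x, y, min(x + chunk, img.width), min(y + chunk, img.height))
            big = np.asarray(upscale_chunk(img.crop(box), factor), dtype=np.float32)
            y0, x0 = y * factor, x * factor
            out[y0:y0 + big.shape[0], x0:x0 + big.shape[1]] += big
            weight[y0:y0 + big.shape[0], x0:x0 + big.shape[1]] += 1.0
    return Image.fromarray((out / np.maximum(weight, 1.0)).astype(np.uint8))
```

The manual fuckery the post mentions lives in that averaging step: in practice you'd feather the overlap or fix seams by hand in an editor.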

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


Boba Pearl posted:

It really depends on what you want to render, but Automatic now supports this little fuckeroo https://github.com/JingyunLiang/SwinIR which is the best drat image upscaler I have ever used.

My SwinIR notebook broke and I haven't been able to figure out how to fix it. Having this local will be snazzy.

Objective Action
Jun 10, 2007



pixaal posted:

My SwinIR notebook broke and I haven't been able to figure out how to fix it. Having this local will be snazzy.

ChaiNNer, https://github.com/joeyballentine/chaiNNer, also supports SwinIR models now too if you want a GUI option.

Dr. Video Games 0031
Jul 17, 2004

I've been using Stable Diffusion with the Web UI from Automatic1111, and it seems like almost no matter what I do, it gives me duplicate faces/hellish lumps of flesh whenever I try to have it generate characters or people. For example, this is what I get when using the hulk hogan prompt posted up thread (Hulk Hogan portrait, intense stare, by Franz Xaver Winterhalter):



It's very persistent across all prompts. Did I install or configure something wrong?

Tunicate
May 15, 2012

From what I understand (and I could be wrong), it can only see a 512x512 chunk of the image at a time, so if it can't see the old face anymore it thinks a face is missing and starts making a new one.
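For reference, SD v1 dimensions need to be multiples of 64 (it works in a downsampled latent space), and the model was trained at 512x512, so pushing much past 512 on either axis is where the duplicate-subject behavior tends to kick in. A quick sanity-check sketch (the constants reflect SD v1 as discussed in the thread, nothing else):

```python
TRAIN_RES = 512        # SD v1 training resolution
LATENT_MULTIPLE = 64   # width/height must divide cleanly through the U-Net downsampling

def check_sd_dims(width: int, height: int) -> list[str]:
    """Return warnings for a requested txt2img resolution."""
    warnings = []
    for name, v in (("width", width), ("height", height)):
        if v % LATENT_MULTIPLE:
            warnings.append(f"{name} {v} is not a multiple of {LATENT_MULTIPLE}")
        if v > TRAIN_RES:
            warnings.append(f"{name} {v} exceeds the {TRAIN_RES}px training size; "
                            "expect duplicated subjects/faces")
    return warnings
```

So 512x768 passes the multiple-of-64 check but still trips the training-size warning, which matches what people are seeing with extra faces at tall resolutions.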

Dr. Video Games 0031
Jul 17, 2004

Oh, that actually makes sense. And I can see that is the case when I mess with the size settings.

Hadlock
Nov 9, 2004

Under negative prompts can you set something like "multiple faces, deformed bodies"

Dr. Video Games 0031
Jul 17, 2004

Hadlock posted:

Under negative prompts can you set something like "multiple faces, deformed bodies"

That's not super effective unfortunately. I guess if it can only look at 512x512 chunks at a time, it can't know there's another face in a zone it's not looking in.

edit: using negative prompts in images that are ~768x768 seems effective though. That seems like a good compromise resolution.

Dr. Video Games 0031 fucked around with this message at 01:46 on Sep 23, 2022

sigher
Apr 22, 2008

My guiding Moonlight...



Are people using this to make porn yet?

Objective Action
Jun 10, 2007



sigher posted:

Are people using this to make porn yet?

Search your heart.

Objective Action
Jun 10, 2007



Dr. Video Games 0031 posted:

I've been using Stable Diffusion with the Web UI from Automatic1111, and it seems like almost no matter what I do, it gives me duplicate faces/hellish lumps of flesh whenever I try to have it generate characters or people. For example, this is what I get when using the hulk hogan prompt posted up thread (Hulk Hogan portrait, intense stare, by Franz Xaver Winterhalter):



It's very persistent across all prompts. Did I install or configure something wrong?

The better move here (for now) is to generate a 512x512 of the person's face and then generate a torso, stitch that together, smooth out the seam manually or with img2img, etc. Or start with a zoomed out picture and then inpaint specific regions to get details. Trying to go straight to 2048x2048 or something will have the problem you are running into now, unfortunately.

It's a little tedious and manual, but you can still get very good results, and it's still much faster than hand painting/photographing/sketching something.

Comfy Fleece Sweater
Apr 2, 2013

You see, but you do not observe.

sigher posted:

Are people using this to make porn yet?

On the internet? Who would do that? It's public, you know.

Anyway, I'm trying to generate a kickass Wizard using a computer, you know those 70's style illustrations, if anyone has tips for a good prompt

I used epic wizard using a computer, linux code background, in space, 70's style and it did pump out some nice ones, but I feel it can do much better, nothing close to what I want (these are some interesting ones from first batch anyway)





lol


Also man this thing just doesn't get fingers does it

I think this looks freaking cool! Wonder how I can get that multiple face style... 3 wolfs incoming

mcbexx
Jul 4, 2004

British dentistry is
not on trial here!



I have been using the Lstein repo and I have hardly any trouble with extra faces when generating 512x704 images and using the built in upscaler - both 2x and 4x look very clean (2070S 8GB).

It has a web UI as well, but I hardly ever bother, I'm mostly feeding text files with a list of prompts and variations to the CLI.
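Driving a CLI from a prompt list like that is a few lines of scripting. Here's a hedged sketch as a dry run that just builds the command strings: the script name (`generate.py`) and its flags are hypothetical stand-ins, not the actual lstein repo's interface:

```python
import shlex

def build_commands(prompt_file: str, script: str = "generate.py",
                   extra_args: str = "--steps 30 -W 512 -H 704") -> list[str]:
    """Read one prompt per line (skipping blanks and # comments) and
    build one CLI invocation per prompt. Script name and flags are
    placeholders for whatever repo you're actually driving."""
    cmds = []
    with open(prompt_file) as f:
        for line in f:
            prompt = line.strip()
            if not prompt or prompt.startswith("#"):
                continue
            cmds.append(f"python {script} {extra_args} --prompt {shlex.quote(prompt)}")
    return cmds
```

To actually run them you'd feed each string to `subprocess.run(cmd, shell=True)` or, better, build argument lists and skip the shell entirely.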

Moongrave
Jun 19, 2004

Finally Living Rent Free
512x704 works well but you'll still get the occasional double face

If you're trying to do huge images without it melting, use:



and set your res to an appropriate size/aspect ratio for Big

Vlaphor
Dec 18, 2005

Lipstick Apathy
I really do love how well SD gets what makes a spooky forest at night so spooky.

Dr. Video Games 0031
Jul 17, 2004

Comfy Fleece Sweater posted:

I think this looks freaking cool! Wonder how I can get that multiple face style... 3 wolfs incoming

In my experience, you get it just by setting the image to be large, if you have enough vram.

Anyway, you inspired me, so...



Prompt: Three wolves howling at the moon, Franz Xaver Winterhalter
(512x512, standard settings)

Infill each wolf head individually with the prompt: Hulk Hogan portrait, intense stare, Franz Xaver Winterhalter

Upscaled with 10 CFG, 0.05 denoising strength, prompt: detailed portrait, Franz Xaver Winterhalter

I wanted to make the hulks actually howl too but it really didn't like that request.

frumpykvetchbot
Feb 20, 2004

PROGRESSIVE SCAN
Upset Trowel

WhiteHowler posted:

Automatic1111 is the WebUI that many/most people are using.

It works extremely well and is easy to use, but as of the past week the "loopback" feature of img2img seems to have gone missing.

I have an older version from about 2 weeks ago that still has that but none of the other recent improvements.

frumpykvetchbot
Feb 20, 2004

PROGRESSIVE SCAN
Upset Trowel

your prompt was surely, "author Terry Pratchett having a normal day, writing a new Discworld novel."

Rinkles
Oct 24, 2010

What I'm getting at is...
Do you feel the same way?
I stepped back from SD for a few weeks, and I already feel like I'm lightyears behind.

AUTOMATIC1111 now has a lot of features I wanted but wasn't sure were possible.

Moongrave
Jun 19, 2004

Finally Living Rent Free
this was done through the High Res Fix:

Objective Action
Jun 10, 2007



Apparently the SD guys are saying they are planning to release a 1.5 publicly sometime by the end of the month. It's been in beta on their web-app thing for a few weeks now.

Rinkles
Oct 24, 2010

What I'm getting at is...
Do you feel the same way?

BARONS CYBER SKULL posted:

this was done through the High Res Fix:



sorry, is this upscaled?

Moongrave
Jun 19, 2004

Finally Living Rent Free

Rinkles posted:

sorry, is this upscaled?

kinda yes, kinda no

it makes the small image, then makes the big image from the small image, using an upscaler for that bit... but also not really?

it's... weird.

this was the same seed using the scaled latent option too:

Moongrave fucked around with this message at 05:28 on Sep 23, 2022
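Mechanically, that "kinda yes, kinda no" is a two-pass pipeline: render small so the composition stays coherent, upscale, then run a low-strength img2img pass at the target size to re-add detail. A rough skeleton with stand-in functions (the sampler calls are stubs returning blank images; only the control flow is the point, and the default `denoise` value is a guess, not the webui's):

```python
from PIL import Image

def txt2img(prompt: str, width: int, height: int) -> Image.Image:
    # Stand-in for the real first-pass sampler.
    return Image.new("RGB", (width, height))

def img2img(prompt: str, init: Image.Image, denoise: float) -> Image.Image:
    # Stand-in for the second pass: a real call re-noises `init` by
    # `denoise` (0..1) and denoises it again at full resolution.
    assert 0.0 <= denoise <= 1.0
    return init.copy()

def highres_fix(prompt: str, target_w: int, target_h: int,
                first_pass: int = 512, denoise: float = 0.4) -> Image.Image:
    # Pass 1: compose near the training resolution so we get one coherent subject.
    scale = first_pass / max(target_w, target_h)
    low = txt2img(prompt, int(target_w * scale), int(target_h * scale))
    # Upscale the small image (latent or pixel space; plain Lanczos here).
    up = low.resize((target_w, target_h), Image.LANCZOS)
    # Pass 2: modest denoising adds detail without letting the model
    # re-invent the composition (which is where extra faces come from).
    return img2img(prompt, up, denoise)
```

That's why it's "upscaled but also not really": an upscaler bridges the two passes, but the final pixels come out of the second diffusion pass, not the upscaler.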

Rinkles
Oct 24, 2010

What I'm getting at is...
Do you feel the same way?
the early version of the SD-based upscaler I tried a month ago wasn't great

WhatEvil
Jun 6, 2004

Can't get no luck.

I'm surprised that there isn't some sort of distributed computing AI text-to-img training project going on.

Like, a few hundred goons with high-end graphics cards could train a new model within a month or something, right?

Not that I have the knowhow to set anything remotely like this up, but somebody must have.

Moongrave
Jun 19, 2004

Finally Living Rent Free
You need a minimum of about 40 GB of VRAM for actual training, which no consumer cards have

You’d be better off just paying for some A100 time like the waifu diffusion person did

frumpykvetchbot
Feb 20, 2004

PROGRESSIVE SCAN
Upset Trowel

WhatEvil posted:

I'm surprised that there isn't some sort of distributed computing AI text-to-img training project going on.

Like, a few hundred goons with high-end graphics cards could train a new model with a month or something, right?

Not that I have the knowhow to set anything remotely like this up, but somebody must have.

How do you figure? Like, to address a particular shortcoming of the stuff already out there?

The SD drop is a massive chonker of a free gift that keeps on giving; in just these first few weeks after the drop it has spawned numerous active projects producing tools, front-end improvements, specially trained derivative models, and add-ons like the textual inversion stuff and cool upsizing features.

Interest groups for very specific niches and fetishes will for sure crowdfund training of SD derivatives.

The training dataset they used for the core SD model is wildly inclusive of all kinds of material from the wooly tangle of the LAION image set. A model trained on only "licensed" content is coming, but it may by necessity exclude a lot of the commercial source imagery that makes SD in its current form so versatile.

MJ looks to be imbued with superior baseline aesthetics where even simple prompts generate interesting images. Dall-E looks to be better at comprehending subtle prompts, correctly inferring implicit meanings in requests. I guess those are areas that we may hope to see improvements in the free stuff.

Chainclaw
Feb 14, 2009

I'm trying to figure out how outpainting and upresing works

This wasn't what I intended, but I'm happy with it.

Boba Pearl
Dec 27, 2019

by Athanatos

BARONS CYBER SKULL posted:

You need a minimum of about 40ish gb of vram for actual training which no consumer cards have

You’d be better off just paying for some A100 time like the waifu diffusion person did

You can also do it with 4 3090s, which comes out to 96 GB of VRAM, at about $1.50 an hour.

Also there are some people with cards who are working to basically all train together and split the load over the internet.

Also, anime and furries are currently at epoch 1, and the anime folks are currently uploading five million images to compete with NovelAI's closed tech. Also, don't use NovelAI, they loving suck. Ponies are at, I believe, epoch 8 on 500,000 images or some such.

Boba Pearl fucked around with this message at 06:37 on Sep 23, 2022

Chainclaw
Feb 14, 2009

More experiments with image2image and extracting prompts from an image.

Source image, I've run some of these with the whole box art, some with just the dinosaurs:



Results:







Hadlock
Nov 9, 2004

I think you can also train the model without a GPU, but it's like 10x slower

I can see Nvidia releasing cards with 128 GB if there's demand; I forget exactly, but 64-bit stuff should in theory be able to address up to 4 TB of RAM

I guess you could also just modify your card to have more RAM: desolder the chips on there and reflow larger compatible ones

SD models are going to be wild here in six to 18 months. Eventually we'll hit a plateau, but this has been amazing so far

Boba Pearl
Dec 27, 2019

by Athanatos
Honestly, you're not going to use 1,000 hours of image gen or training, so just go to runpod.io, rent a pod for $1.50 an hour, and save the cash. 500 hours is most of a month straight.

If ten people chip in 20 bucks you can train a light model. And if 100 chip in 10 bucks you can compete with a corp.
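The back-of-envelope math here is simple enough to write down. The $1.50/hr rate is the one quoted above; the $1,600 GPU price is a hypothetical for comparison, not a quote:

```python
def rental_cost(hours: float, rate_per_hour: float = 1.50) -> float:
    # Total cloud bill for a given number of pod-hours.
    return hours * rate_per_hour

def breakeven_hours(gpu_price: float, rate_per_hour: float) -> float:
    # Hours of rental at which buying the card outright becomes cheaper.
    return gpu_price / rate_per_hour

cost = rental_cost(500)                 # 500 hours at $1.50/hr -> $750.00
hours = breakeven_hours(1600.0, 1.50)   # ~1067 hours vs. a hypothetical $1600 card
```

Which is the same point made later in the thread about 20c/hr pods: unless you're generating for over a thousand hours, renting beats buying.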

Boba Pearl
Dec 27, 2019

by Athanatos
The limit isn't actually just compute but tagging. That's why everyone is using danbooru: they're hoping to use its tagging system. That's also why anime and furry stuff is moving so much faster than irl stuff; there aren't many tagged image boards to scrape for irl concepts.

You only need 5 images to teach it a character with textual inversion, but if an image has 29 tags then you need to teach it the differences between those tags, so you need a bigger dataset. Though lmao that they scraped 5 million images off of danbooru. Honestly, artists are kind of hosed if they care about showing up in the datasets and post online; even if they didn't post it themselves, if Pinterest or Google Images got a hold of it, it's going to get scraped. The internet at large is also less scrupulous than data researchers.

Boba Pearl fucked around with this message at 07:00 on Sep 23, 2022

Rinkles
Oct 24, 2010

What I'm getting at is...
Do you feel the same way?

Boba Pearl posted:

Also don't use NovelAI they loving suck.

Why?

Boba Pearl
Dec 27, 2019

by Athanatos
.

Boba Pearl fucked around with this message at 07:47 on Sep 23, 2022

Boba Pearl
Dec 27, 2019

by Athanatos
.

Boba Pearl fucked around with this message at 07:47 on Sep 23, 2022

Boba Pearl
Dec 27, 2019

by Athanatos
They're a text gen service that abandoned text gen to do image gen, except then they suddenly realized that people could make bad images with image gen, which everyone told them. They repeatedly dismissed the idea that an NSFW bot hosted by a company could possibly have downsides. Then it hit them, "Oh right, people can make anything naked with this, not just anime drawings," and so they spent the last 3 months making filters for image gen while consistently pushing back text gen improvements. They can't compete with any of the open source options, and offer half as much for $25 a month. Literally almost every feature of theirs is doable by KoboldAI, and for $25 you can just rent a GPU from one of the like 2 million GPU cloud spaces and run an open source option yourself. They always pivot to the next fad, just like their chat bots, their AI, Modules v2, text to speech; hell, they even abandoned Krake, their flagship AI, which now can't compete with open source NeoX 20B models in text gen. Their image gen doesn't have the high res fix, negative prompting (unless they added that recently), or image 2 image, and they said they had to work their one dev into hyper crunch because hiring people takes too long. Meanwhile they consistently miss deadlines, and when asked about stuff they'll just say their lead dev has health problems so they can't do the work, all while charging $25 for a model that regularly breaks and repeats the same 3 words over and over until you gently caress with a million settings to get it back on track.

Like, how are NovelAI, DreamAI, Midjourney, or Dall-E supposed to compete with Automatic1111? They literally don't have as many features, and work half as well. I know people have computers that can't run the models well, but it's gotten to the point where you just need a gaming computer. Almost any gaming computer. You can run Stable Diffusion on a GTX 680, I believe, at this point, at least at 512x512, and it's only been out for a month. Even if you have literally no option because of your PC, you can rent a GPU for a few quarters an hour, or use one of the million free Google Colabs (which, I will admit, are getting cut back).

And I didn't even mention that textual inversion already has 500 different concepts it's been taught, and there are almost 50 different models and fine tunes that can be used for specific art styles. All of these corporations have to compete with hundreds of thousands of people who are dedicated to this way more than the corps are. Every once in a while some crazy rear end rich person will pop up and fund an entire fleet of GPUs for training, and suddenly you have a new model trained on some insane image set. The fact is, the way this was released to the public is going to make it nearly impossible for the commercial players to keep up. Every day there's new stuff, every day there's a million innovations, and if you're chasing image gen right now trying to get a website up, you're already behind half the pack. Facebook just released an open source 175B, and NovelAI keeps saying they have industry insiders and poo poo that worked on Stable Diffusion and on Facebook's stuff, but unless they release a 175B model (which could only be done with a fleet of TPUs), they've permanently kneecapped themselves in the market.

I seriously doubt they'll have a 175B though, because Sudowrite says they have it, and they have a limit on generations, and still regularly switch you to dumber models without telling you. Their text gen, if you use it for too long, will just get dumber. Like, noticeably dumber, whereas with open source text gen that never happens.

They were the best; then the lead developer wanted to compete with Waifu Diffusion and Furry Diffusion, except they have all the limitations of being a corporation. Which, you know, sounds crazy, but training across 1,500 different people all over America a couple hours at a time on stolen image sets is a lot harder to track than 1 TPU farm in bumfuck nowhere. And even if it wasn't, they're painting a target on their back by making a for-sale NSFW model.

Oh yeah, and Automatic 1111 now works on Radeon AND Apple chipsets.

e: And their lovely website yells at you if you turn off cloud saves; then their cloud saves corrupt and gently caress up your local saves, so you've gotta constantly download .json files and move them to a folder manually, because they don't know how to just have a local save that works.

e2: They also don't know how their own technology works, and will give people advice that actively hinders the text gen. It's literally propped up by a bunch of users who have real heart and haven't realized they're being taken for a ride.

e3: Like, be real, how many hours of image gen are you going to do? How many hours do you have to do at 20c an hour before it's more expensive than a new GPU?



https://www.runpod.io

E: and yes, they have a template that just installs Automatic for you, and you just run the thing.



AI art exists because scientists scraped the entirety of the internet, and everyone who worked on an art piece is a piece of Stable Diffusion and these models. It belongs to the artists, whose permission wasn't asked. It belongs to people, and these companies trying to commoditize AI art are literally trying to take the distilled essence of generations of artists and turn it into a product. In the same way you should use an RSS feed instead of social media, you should support open source image and text gen. Your $25 to NovelAI (or 10, or 15) will mean 10x as much to some coder in Kansas who just really wants to make AI tell cool stories. It's like buying Call of Duty vs donating to Toady1; one's just way more meaningful.

e: 7? Rant over.

Boba Pearl fucked around with this message at 07:55 on Sep 23, 2022
