WhiteHowler
Apr 3, 2001

I'M HUGE!
Is there a secret to making images in SD bigger than 512x512, other than having more VRAM?

Ideally I'd like to do 16x9 aspect ratio, though I think I read that since SD works best with square images, a 2:1 aspect ratio will give better results?

Boba Pearl
Dec 27, 2019

by Athanatos

WhiteHowler posted:

Is there a secret to making images in SD bigger than 512x512, other than having more VRAM?

Ideally I'd like to do 16x9 aspect ratio, though I think I read that since SD works best with square images, a 2:1 aspect ratio will give better results?

It really depends on what you want to render, but Automatic now supports this little fuckeroo https://github.com/JingyunLiang/SwinIR which is the best drat image upscaler I have ever used.

mobby_6kl
Aug 9, 2009

by Fluffdaddy

WhiteHowler posted:

Using default settings it works fine with 8 GB of VRAM to create 512x512 images.

On my 8 GB 2070 Super, using DDIM or Euler-A with 30 steps takes around 5 seconds per image.
Well, 5 seconds per image isn't gonna happen, but as long as it fits in memory it's fine

WhiteHowler posted:

Is there a secret to making images in SD bigger than 512x512, other than having more VRAM?

Ideally I'd like to do 16x9 aspect ratio, though I think I read that since SD works best with square images, a 2:1 aspect ratio will give better results?

512x768 works OK even with 8 GB in my experience; the problem is that the algorithm goes nuts if it loses track of whatever object it already drew.

Maybe check out whatever they're talking about here:

BARONS CYBER SKULL posted:

the newest feature is the High res fix:




Objective Action
Jun 10, 2007



Yeah, unfortunately right now the best way to get a higher resolution image is to get one that has a composition you like and then use img2img, inpainting, and one of the upscalers on chunks of it to get progressively larger versions. It's slow, some details inevitably shift around, and it requires a lot of manual fuckery merging things back together, but you can usually get decent results by the end of it.
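The chunk-based part of that workflow can be sketched roughly like this. This is a stand-in, not anyone's actual script: `upscale_chunk` is a placeholder (a real pipeline would run each crop through img2img or an upscaler like SwinIR instead of plain Lanczos), and overlapping chunks are just averaged, which is the crudest possible seam blend:

```python
from PIL import Image
import numpy as np

def upscale_chunk(chunk: Image.Image, factor: int = 2) -> Image.Image:
    # Placeholder: a real workflow would send this crop through img2img
    # or a learned upscaler (SwinIR, ESRGAN) instead of Lanczos resize.
    return chunk.resize((chunk.width * factor, chunk.height * factor), Image.LANCZOS)

def upscale_in_chunks(img: Image.Image, chunk: int = 256,
                      overlap: int = 32, factor: int = 2) -> Image.Image:
    # Accumulate upscaled chunks into a float canvas and average overlaps.
    out = np.zeros((img.height * factor, img.width * factor, 3), dtype=np.float32)
    weight = np.zeros_like(out)
    step = chunk - overlap  # slide by less than a full chunk so edges overlap
    for y in range(0, img.height, step):
        for x in range(0, img.width, step):
            box = (x, y, min(x + chunk, img.width), min(y + chunk, img.height))
            big = np.asarray(upscale_chunk(img.crop(box), factor), dtype=np.float32)
            y0, x0 = y * factor, x * factor
            out[y0:y0 + big.shape[0], x0:x0 + big.shape[1]] += big
            weight[y0:y0 + big.shape[0], x0:x0 + big.shape[1]] += 1.0
    return Image.fromarray((out / np.maximum(weight, 1.0)).astype(np.uint8))
```

The manual fuckery the post mentions lives in that averaging step: in practice you'd feather the overlap or fix seams by hand in an editor.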

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


Boba Pearl posted:

It really depends on what you want to render, but Automatic now supports this little fuckeroo https://github.com/JingyunLiang/SwinIR which is the best drat image upscaler I have ever used.

My SwinIR notebook broke and I haven't been able to figure out how to fix it. Having this local will be snazzy.

Objective Action
Jun 10, 2007



pixaal posted:

My SwinIR notebook broke and I haven't been able to figure out how to fix it. Having this local will be snazzy.

ChaiNNer, https://github.com/joeyballentine/chaiNNer, also supports SwinIR models now too if you want a GUI option.

Dr. Video Games 0031
Jul 17, 2004

I've been using Stable Diffusion with the Web UI from Automatic1111, and it seems like almost no matter what I do, it gives me duplicate faces/hellish lumps of flesh whenever I try to have it generate characters or people. For example, this is what I get when using the hulk hogan prompt posted up thread (Hulk Hogan portrait, intense stare, by Franz Xaver Winterhalter):



It's very persistent across all prompts. Did I install or configure something wrong?

Tunicate
May 15, 2012

From what I understand (and I could be wrong), it can only see a 512x512 chunk of the image at a time, so if it can't see the old face anymore it thinks a face is missing and starts making a new one.
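For reference, SD v1 dimensions need to be multiples of 64 (it works in a downsampled latent space), and the model was trained at 512x512, so pushing much past 512 on either axis is where the duplicate-subject behavior tends to kick in. A quick sanity-check sketch (the constants reflect SD v1 as discussed in the thread, nothing else):

```python
TRAIN_RES = 512        # SD v1 training resolution
LATENT_MULTIPLE = 64   # width/height must divide cleanly through the U-Net downsampling

def check_sd_dims(width: int, height: int) -> list[str]:
    """Return warnings for a requested txt2img resolution."""
    warnings = []
    for name, v in (("width", width), ("height", height)):
        if v % LATENT_MULTIPLE:
            warnings.append(f"{name} {v} is not a multiple of {LATENT_MULTIPLE}")
        if v > TRAIN_RES:
            warnings.append(f"{name} {v} exceeds the {TRAIN_RES}px training size; "
                            "expect duplicated subjects/faces")
    return warnings
```

So 512x768 passes the multiple-of-64 check but still trips the training-size warning, which matches what people are seeing with extra faces at tall resolutions.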

Dr. Video Games 0031
Jul 17, 2004

Oh, that actually makes sense. And I can see that is the case when I mess with the size settings.

Hadlock
Nov 9, 2004

Under negative prompts can you set something like "multiple faces, deformed bodies"

Dr. Video Games 0031
Jul 17, 2004

Hadlock posted:

Under negative prompts can you set something like "multiple faces, deformed bodies"

That's not super effective unfortunately. I guess if it can only look at 512x512 chunks at a time, it can't know there's another face in a zone it's not looking in.

edit: using negative prompts in images that are ~768x768 seems effective though. That seems like a good compromise resolution.

Dr. Video Games 0031 fucked around with this message at 01:46 on Sep 23, 2022

sigher
Apr 22, 2008

My guiding Moonlight...



Are people using this to make porn yet?

Objective Action
Jun 10, 2007



sigher posted:

Are people using this to make porn yet?

Search your heart.

Objective Action
Jun 10, 2007



Dr. Video Games 0031 posted:

I've been using Stable Diffusion with the Web UI from Automatic1111, and it seems like almost no matter what I do, it gives me duplicate faces/hellish lumps of flesh whenever I try to have it generate characters or people. For example, this is what I get when using the hulk hogan prompt posted up thread (Hulk Hogan portrait, intense stare, by Franz Xaver Winterhalter):



It's very persistent across all prompts. Did I install or configure something wrong?

The better move here (for now) is to generate a 512x512 of the person's face and then generate a torso, stitch that together, smooth out the seam manually or with img2img, etc. Or start with a zoomed out picture and then inpaint specific regions to get details. Trying to go straight to 2048x2048 or something will have the problem you are running into now, unfortunately.

It's a little tedious and manual, but you can still get very good results, and it's still much faster than hand painting/photographing/sketching something.

Comfy Fleece Sweater
Apr 2, 2013

You see, but you do not observe.

sigher posted:

Are people using this to make porn yet?

On the internet? Who would do that? It's public, you know.

Anyway, I'm trying to generate a kickass Wizard using a computer, you know those 70's style illustrations, if anyone has tips for a good prompt

I used epic wizard using a computer, linux code background, in space, 70's style and it did pump out some nice ones, but I feel it can do much better, nothing close to what I want (these are some interesting ones from first batch anyway)





lol


Also man this thing just doesn't get fingers does it

I think this looks freaking cool! Wonder how I can get that multiple face style... 3 wolfs incoming

mcbexx
Jul 4, 2004

British dentistry is
not on trial here!



I have been using the Lstein repo and I have hardly any trouble with extra faces when generating 512x704 images and using the built in upscaler - both 2x and 4x look very clean (2070S 8GB).

It has a web UI as well, but I hardly ever bother, I'm mostly feeding text files with a list of prompts and variations to the CLI.
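Driving a CLI from a prompt list like that is a few lines of scripting. Here's a hedged sketch as a dry run that just builds the command strings: the script name (`generate.py`) and its flags are hypothetical stand-ins, not the actual lstein repo's interface:

```python
import shlex

def build_commands(prompt_file: str, script: str = "generate.py",
                   extra_args: str = "--steps 30 -W 512 -H 704") -> list[str]:
    """Read one prompt per line (skipping blanks and # comments) and
    build one CLI invocation per prompt. Script name and flags are
    placeholders for whatever repo you're actually driving."""
    cmds = []
    with open(prompt_file) as f:
        for line in f:
            prompt = line.strip()
            if not prompt or prompt.startswith("#"):
                continue
            cmds.append(f"python {script} {extra_args} --prompt {shlex.quote(prompt)}")
    return cmds
```

To actually run them you'd feed each string to `subprocess.run(cmd, shell=True)` or, better, build argument lists and skip the shell entirely.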

Moongrave
Jun 19, 2004

Finally Living Rent Free
512x704 works well but you'll still get the occasional double face

If you're trying to do huge images without it melting, use:



and set your res to an appropriate size/aspect ratio for Big

Vlaphor
Dec 18, 2005

Lipstick Apathy
I really do love how well SD gets what makes a spooky forest at night so spooky.

Dr. Video Games 0031
Jul 17, 2004

Comfy Fleece Sweater posted:

I think this looks freaking cool! Wonder how I can get that multiple face style... 3 wolfs incoming

In my experience, you get it just by setting the image to be large, if you have enough vram.

Anyway, you inspired me, so...



Prompt: Three wolves howling at the moon, Franz Xaver Winterhalter
(512x512, standard settings)

Infill each wolf head individually with the prompt: Hulk Hogan portrait, intense stare, Franz Xaver Winterhalter

Upscaled with 10 CFG, 0.05 denoising strength, prompt: detailed portrait, Franz Xaver Winterhalter

I wanted to make the hulks actually howl too but it really didn't like that request.

frumpykvetchbot
Feb 20, 2004

PROGRESSIVE SCAN
Upset Trowel

WhiteHowler posted:

Automatic1111 is the WebUI that many/most people are using.

It works extremely well and is easy to use, but as of the past week the "loopback" feature of img2img seems to have gone missing.

I have an older version from about 2 weeks ago that still has that but none of the other recent improvements.

frumpykvetchbot
Feb 20, 2004

PROGRESSIVE SCAN
Upset Trowel

your prompt was surely, "author Terry Pratchett having a normal day, writing a new Discworld novel."

Rinkles
Oct 24, 2010

What I'm getting at is...
Do you feel the same way?
I stepped back from SD for a few weeks, and I already feel like I'm lightyears behind.

AUTOMATIC1111 now has a lot of features I wanted but wasn't sure were possible.

Moongrave
Jun 19, 2004

Finally Living Rent Free
this was done through the High Res Fix:

Objective Action
Jun 10, 2007



Apparently the SD guys are saying they are planning to release a 1.5 publicly sometime by the end of the month. It's been in beta on their web-app thing for a few weeks now.

Rinkles
Oct 24, 2010

What I'm getting at is...
Do you feel the same way?

BARONS CYBER SKULL posted:

this was done through the High Res Fix:



sorry, is this upscaled?

Moongrave
Jun 19, 2004

Finally Living Rent Free

Rinkles posted:

sorry, is this upscaled?

kinda yes, kinda no

it makes the small image, then makes the big image from the small image, using an upscaler for that bit... but also not really?

it's... weird.

this was the same seed using the scaled latent option too:

Moongrave fucked around with this message at 05:28 on Sep 23, 2022
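Mechanically, that "kinda yes, kinda no" is a two-pass pipeline: render small so the composition stays coherent, upscale, then run a low-strength img2img pass at the target size to re-add detail. A rough skeleton with stand-in functions (the sampler calls are stubs returning blank images; only the control flow is the point, and the default `denoise` value is a guess, not the webui's):

```python
from PIL import Image

def txt2img(prompt: str, width: int, height: int) -> Image.Image:
    # Stand-in for the real first-pass sampler.
    return Image.new("RGB", (width, height))

def img2img(prompt: str, init: Image.Image, denoise: float) -> Image.Image:
    # Stand-in for the second pass: a real call re-noises `init` by
    # `denoise` (0..1) and denoises it again at full resolution.
    assert 0.0 <= denoise <= 1.0
    return init.copy()

def highres_fix(prompt: str, target_w: int, target_h: int,
                first_pass: int = 512, denoise: float = 0.4) -> Image.Image:
    # Pass 1: compose near the training resolution so we get one coherent subject.
    scale = first_pass / max(target_w, target_h)
    low = txt2img(prompt, int(target_w * scale), int(target_h * scale))
    # Upscale the small image (latent or pixel space; plain Lanczos here).
    up = low.resize((target_w, target_h), Image.LANCZOS)
    # Pass 2: modest denoising adds detail without letting the model
    # re-invent the composition (which is where extra faces come from).
    return img2img(prompt, up, denoise)
```

That's why it's "upscaled but also not really": an upscaler bridges the two passes, but the final pixels come out of the second diffusion pass, not the upscaler.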

Rinkles
Oct 24, 2010

What I'm getting at is...
Do you feel the same way?
the early version of the SD-based upscaler I tried a month ago wasn't great

WhatEvil
Jun 6, 2004

Can't get no luck.

I'm surprised that there isn't some sort of distributed computing AI text-to-img training project going on.

Like, a few hundred goons with high-end graphics cards could train a new model within a month or something, right?

Not that I have the knowhow to set anything remotely like this up, but somebody must have.

Moongrave
Jun 19, 2004

Finally Living Rent Free
You need a minimum of about 40 GB of VRAM for actual training, which no consumer cards have

You’d be better off just paying for some A100 time like the waifu diffusion person did

frumpykvetchbot
Feb 20, 2004

PROGRESSIVE SCAN
Upset Trowel

WhatEvil posted:

I'm surprised that there isn't some sort of distributed computing AI text-to-img training project going on.

Like, a few hundred goons with high-end graphics cards could train a new model with a month or something, right?

Not that I have the knowhow to set anything remotely like this up, but somebody must have.

How do you figure? Like, to address a particular shortcoming of the stuff already out there?

The SD drop is a massive chonker of a free gift that keeps on giving; in just these first few weeks after the drop it has spawned numerous active projects producing tools, front-end improvements, specially trained derivative models, and add-ons like the textual inversion stuff and cool upsizing features.

Interest groups for very specific niches and fetishes will for sure crowdfund training of SD derivatives.

The training dataset they used for the core SD model is wildly inclusive of all kinds of material from the wooly tangle of the LAION image set. A model trained on only "licensed" content is coming, but it may by necessity exclude a lot of the commercial source imagery that makes SD in its current form so versatile.

MJ looks to be imbued with superior baseline aesthetics where even simple prompts generate interesting images. Dall-E looks to be better at comprehending subtle prompts, correctly inferring implicit meanings in requests. I guess those are areas that we may hope to see improvements in the free stuff.

Chainclaw
Feb 14, 2009

I'm trying to figure out how outpainting and upresing works

This wasn't what I intended, but I'm happy with it.

Boba Pearl
Dec 27, 2019

by Athanatos

BARONS CYBER SKULL posted:

You need a minimum of about 40ish gb of vram for actual training which no consumer cards have

You’d be better off just paying for some A100 time like the waifu diffusion person did

You can also do it with 4 3090s, which comes out to 96 GB of VRAM, at about $1.50 an hour.

Also there are some people with cards who are working to basically all train together and split the load over the internet.

Also, anime and furries are currently at epoch 1, and the anime folks are currently uploading five million images to compete with NovelAI's closed tech. Also, don't use NovelAI, they loving suck. Ponies are at, I believe, epoch 8 on 500,000 images or some such.

Boba Pearl fucked around with this message at 06:37 on Sep 23, 2022

Chainclaw
Feb 14, 2009

More experiments with image2image and extracting prompts from an image.

Source image, I've run some of these with the whole box art, some with just the dinosaurs:



Results:







Hadlock
Nov 9, 2004

I think you can also train the model without a GPU, but it's like 10x slower

I can see Nvidia releasing cards with 128 GB if there's demand; I forget exactly, but 64-bit stuff should in theory be able to address up to 4 TB of RAM

I guess you could also just modify your card to have more RAM: desolder the chips on there and reflow larger compatible ones

SD models are going to be wild here in six to 18 months. Eventually we'll hit a plateau, but this has been amazing so far

Boba Pearl
Dec 27, 2019

by Athanatos
Honestly, you're not going to use 1,000 hours of image gen or training, so just go to runpod.io, rent a pod for $1.50 an hour, and save the cash. 500 hours is most of a month straight.

If ten people chip in 20 bucks you can train a light model. And if 100 chip in 10 bucks you can compete with a corp.
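The back-of-envelope math here is simple enough to write down. The $1.50/hr rate is the one quoted above; the $1,600 GPU price is a hypothetical for comparison, not a quote:

```python
def rental_cost(hours: float, rate_per_hour: float = 1.50) -> float:
    # Total cloud bill for a given number of pod-hours.
    return hours * rate_per_hour

def breakeven_hours(gpu_price: float, rate_per_hour: float) -> float:
    # Hours of rental at which buying the card outright becomes cheaper.
    return gpu_price / rate_per_hour

cost = rental_cost(500)                 # 500 hours at $1.50/hr -> $750.00
hours = breakeven_hours(1600.0, 1.50)   # ~1067 hours vs. a hypothetical $1600 card
```

Which is the same point made later in the thread about 20c/hr pods: unless you're generating for over a thousand hours, renting beats buying.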

Boba Pearl
Dec 27, 2019

by Athanatos
The limit isn't actually just compute but tagging. That's why everyone is using danbooru: they're hoping to use its tagging system. That's also why anime and furry stuff is moving so much faster than irl stuff; there aren't many tagged image boards to scrape for irl concepts.

You only need 5 images to teach it a character with textual inversion, but if an image has 29 tags then you need to teach it the differences between those tags, so you need a bigger dataset. Though lmao that they scraped 5 million images off of danbooru. Honestly, artists are kind of hosed if they care about showing up in the datasets and post online; even if they didn't post it themselves, if Pinterest or Google Images got a hold of it, it's going to get scraped. The internet at large is also less scrupulous than data researchers.

Boba Pearl fucked around with this message at 07:00 on Sep 23, 2022

Rinkles
Oct 24, 2010

What I'm getting at is...
Do you feel the same way?

Boba Pearl posted:

Also don't use NovelAI they loving suck.

Why?

Boba Pearl
Dec 27, 2019

by Athanatos
.

Boba Pearl fucked around with this message at 07:47 on Sep 23, 2022

Boba Pearl
Dec 27, 2019

by Athanatos
.

Boba Pearl fucked around with this message at 07:47 on Sep 23, 2022

Boba Pearl
Dec 27, 2019

by Athanatos
They're a text gen service that abandoned text gen to do image gen, except then they suddenly realized that people could make bad images with image gen, which everyone told them. They repeatedly dismissed the idea that an NSFW bot hosted by a company could possibly have downsides. Then it hit them, "Oh right, people can make anything naked with this, not just anime drawings," and so they spent the last 3 months making filters for image gen while consistently pushing back text gen improvements. They can't compete with any of the open source options, and offer half as much for $25 a month. Literally almost every feature of theirs is doable by KoboldAI, and for $25 you can just rent a GPU from one of the like 2 million GPU cloud spaces and run an open source option yourself. They always pivot to the next fad, just like their chat bots, their AI, Modules v2, text to speech; hell, they even abandoned Krake, their flagship AI, which now can't compete with open source NeoX 20B models in text gen. Their image gen doesn't have the high res fix, negative prompting (unless they added that recently), or image 2 image, and they said they had to work their one dev into hyper crunch because hiring people takes too long. Meanwhile they consistently miss deadlines, and when asked about stuff they'll just say their lead dev has health problems so they can't do the work, all while charging $25 for a model that regularly breaks and repeats the same 3 words over and over until you gently caress with a million settings to get it back on track.

Like, how are NovelAI, DreamAI, Midjourney, or Dall-E supposed to compete with Automatic1111? They literally don't have as many features, and work half as well. I know people have computers that can't run the models well, but it's gotten to the point where you just need a gaming computer. Almost any gaming computer. You can run Stable Diffusion on a GTX 680, I believe, at this point, at least at 512x512, and it's only been out for a month. Even if you have literally no option because of your PC, you can rent a GPU for a few quarters an hour, or use one of the million free Google Colabs (which, I will admit, are getting cut back).

And I didn't even mention that textual inversion already has 500 different concepts it's been taught, and there are almost 50 different models and fine tunes that can be used for specific art styles. All of these corporations have to compete with hundreds of thousands of people who are dedicated to this way more than the corps are. Every once in a while some crazy rear end rich person will pop up and fund an entire fleet of GPUs for training, and suddenly you have a new model trained on some insane image set. The fact is, the way this was released to the public is going to make it nearly impossible for the commercial players to keep up. Every day there's new stuff, every day there's a million innovations, and if you're chasing image gen right now trying to get a website up, you're already behind half the pack. Facebook just released an open source 175B, and NovelAI keeps saying they have industry insiders and poo poo that worked on Stable Diffusion and on Facebook's stuff, but unless they release a 175B model (which could only be done with a fleet of TPUs), they've permanently kneecapped themselves in the market.

I seriously doubt they'll have a 175B though, because Sudowrite says they have it, and they have a limit on generations, and still regularly switch you to dumber models without telling you. Their text gen, if you use it for too long, will just get dumber. Like, noticeably dumber, whereas with open source text gen that never happens.

They were the best; then the lead developer wanted to compete with Waifu Diffusion and Furry Diffusion, except they have all the limitations of being a corporation. Which, you know, sounds crazy, but training across 1,500 different people all over America a couple hours at a time on stolen image sets is a lot harder to track than 1 TPU farm in bumfuck nowhere. And even if it wasn't, they're painting a target on their back by making a for-sale NSFW model.

Oh yeah, and Automatic 1111 now works on Radeon AND Apple chipsets.

e: And their lovely website yells at you if you turn off cloud saves; then their cloud saves corrupt and gently caress up your local saves, so you've gotta constantly download .json files and move them to a folder manually, because they don't know how to just have a local save that works.

e2: They also don't know how their own technology works, and will give people advice that actively hinders the text gen. It's literally propped up by a bunch of users who have real heart and haven't realized they're being taken for a ride.

e3: Like, be real, how many hours of image gen are you going to do? How many hours do you have to do at 20c an hour before it's more expensive than a new GPU?



https://www.runpod.io

E: and yes, they have a template that just installs Automatic for you, and you just run the thing.



AI art exists because scientists scraped the entirety of the internet, and everyone who worked on an art piece is a piece of Stable Diffusion and these models. It belongs to the artists, whose permission wasn't asked. It belongs to people, and these companies trying to commoditize AI art are literally trying to take the distilled essence of generations of artists and turn it into a product. In the same way you should use an RSS feed instead of social media, you should support open source image and text gen. Your $25 to NovelAI (or 10, or 15) will mean 10x as much to some coder in Kansas who just really wants to make AI tell cool stories. It's like buying Call of Duty vs donating to Toady1; one's just way more meaningful.

e: 7? Rant over.

Boba Pearl fucked around with this message at 07:55 on Sep 23, 2022
