Roman
Aug 8, 2002

thanks for the info. yeah I'm probably going to have to switch to SD, although maybe I should try writing the script first lol.

Roman
Aug 8, 2002

KakerMix posted:

Stable Diffusion is a lot harder to get going but truly is limitless.
I see what you mean. I put a photoreal MJ prompt in and it looked all messed up. BUT, it still had a better idea of the actual person I want to use as a reference. I had to use anna kendrick on MJ to get something close. I will definitely look into it.

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

Roman posted:

I see what you mean. I put a photoreal MJ prompt in and it looked all messed up. BUT, it still had a better idea of the actual person I want to use as a reference. I had to use anna kendrick on MJ to get something close. I will definitely look into it.

Right yeah, and when you do get SD going it still won't guide you like Midjourney. There are negative prompts, framing issues (which are all but eliminated with ControlNet and the posing options), deep rabbit holes that are lined with porn and anime and anime porn, all mixed with cargo culting weirdness as we are all throwing bones and claiming we see something in the summoning procedure.

There is no shame either in getting outputs and running them through SD to guide it to whatever you want. I did that for all my Shadowrun images I've done so far, because Midjourney is much better at coherence without needing to put a whole lot in.

You can also just take real actual images already out there, and not just photos but drawings, digital art, movie stills, whatever, and wring them through and get something that looks totally different, even if you used an already-made image to get there. Mirror it, flip it, bring it into a photo editor and roughly cut out parts you don't want or draw pointy ears. Take a picture of yourself even doing whatever pose you want and use that.

Never before has visual art been so liquid and able to be morphed, molded and changed. It's truly remarkable and I hope more and more people realize this.

BrainDance
May 8, 2007

Disco all night long!

ThisIsJohnWayne posted:

Knowledge is always* good. You could also make a specific thread for it and link to it here if it'll get very large

Ok, so here's a very trimmed down version of the guide. Some of it's probably obvious, but the goal was to do it literally step by step so even a beginner could use it. Some of the commands can't be posted on SA because Cloudflare's WAF thinks they're some kind of script injection, apparently anything with wget and then https://whatever. So I stripped the https:// out of the wget commands to get them through; they should still work as posted, or you can add the https:// back in yourself.

I have a long intro to the guide on my website, but I'll cut that here. Basically, this is a guide for finetuning GPT-Neo locally. GPT-Neo 1.3B is probably trainable on 12GB of VRAM. We have to use DeepSpeed because otherwise, even with a 4090, we can't finetune 1.3B. Gotta do it in Linux, though WSL2 on Windows 11 works.

Part 1: Environment Setup

Miniconda
We’re going to be using miniconda, especially because we’re going to need to use a specific version of Python. So, start by installing miniconda
code:
wget repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
Run the installer, agree to the terms and install where you want. The default location is probably fine
code:
./Miniconda3-latest-Linux-x86_64.sh
You should also install pip, which we will use for some packages, and git which we will use to clone repos, if you don’t have them already
code:
sudo apt install python3-pip git
You'll also need the proprietary Nvidia driver if you're on bare-metal Linux; nouveau won't work for this

CUDA
Usually CUDA version mismatches are not a big deal, but here we want to be careful and make sure we get the right version of CUDA installed, a version that matches the version of CUDA our PyTorch is for. CUDA mismatches often cause DeepSpeed to fail to build

Since there are PyTorch builds for CUDA 11.7, that is probably what we should go with. CUDA 11.8 with the nightly PyTorch (which has a build for CUDA 11.8) will probably be faster on Lovelace GPUs, but since it isn't the stable branch there are more likely to be unexpected problems
code:
wget developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu2204-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt -y install cuda-11-7 cuda-toolkit-11-7
If you're using WSL you should replace
wget developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu2204-11-7-local_11.7.0-515.43.04-1_amd64.deb
with the WSL Ubuntu installer
wget developer.download.nvidia.com/compute/cuda/11.7.1/local_installers/cuda-repo-wsl-ubuntu-11-7-local_11.7.1-1_amd64.deb
and adjust the dpkg and cp lines that follow to match the new file name and repo directory (cuda-repo-wsl-ubuntu-11-7-local instead of cuda-repo-ubuntu2204-11-7-local).
Now that CUDA is installed we probably have to set the environment variables for it. In my experience installing a specific version of CUDA doesn't do that automatically. All we need to do is set them with these lines (they only apply to the current shell session, so add them to your ~/.bashrc if you want them to stick)
code:
export PATH="/usr/local/cuda-11.7/bin:$PATH"
export CUDA_HOME="/usr/local/cuda-11.7"
export LD_LIBRARY_PATH="/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH"
Once that’s all done we can make sure it’s working and CUDA 11.7 specifically is the version we are using with
code:
nvcc -V
Creating the Environment
We're going to create and activate an environment with conda that specifically uses Python 3.9. We need to use this version or lower because DeepSpeed specifically requires triton 1.0.0, and that version of triton is not available from pip for newer versions of Python. You can name the environment whatever you want; I just went with p39 for "Python 3.9"

code:
conda create -n p39 python=3.9
conda activate p39
Now that we're in our environment we need to install PyTorch. Again, since we installed CUDA 11.7, we're going to use PyTorch stable (1.13.1) for CUDA 11.7. If you went with CUDA 11.8 for Lovelace you will need to get the nightly version of PyTorch and install for CUDA 11.8. There are other ways to install this (you can see the exact commands at https://pytorch.org/get-started/locally/), but we're going to use conda with the following command. Make sure the environment is activated when you do all this

code:
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
Now install triton 1.0.0. Again, it's very important that it's exactly this version: DeepSpeed will not build with any other, and if you are using a newer version of Python, pip will complain that it can't find that release

code:
pip install triton==1.0.0
DeepSpeed
Now move to a directory where you want to clone DeepSpeed. It doesn't matter where, wherever you feel like putting it. At the time of writing this guide DeepSpeed is at commit 349f845; if things change in the future, you can pin that exact commit (git checkout 349f845 inside the cloned repo) and it will work for this

code:
git clone https://github.com/microsoft/DeepSpeed
Change to the DeepSpeed repo directory we just cloned
We need to build DeepSpeed without async_io. We won't use aio at all, and odds are you won't be able to build with async_io support without doing more work anyway. Other than that, we want all the other ops installed.
code:
DS_BUILD_OPS=1 DS_BUILD_AIO=0 pip install .
This will take a while. Hopefully, after it's finished building, we'll have all ops installed besides async_io. Once it's done, check what got installed with
code:
ds_report
Finetuning Repo
Leave the DeepSpeed directory and move to another directory where you want to clone a copy of the actual finetuning repo. This is where our models will end up, and where we will put our training data. Clone the repo with
code:
git clone https://github.com/Xirider/finetune-gpt2xl
Change into the finetune directory and install the remaining requirements for this repo with pip
code:
pip install -r requirements.txt
Part 2: Training Data

From here we could already train; the repo includes sample data from the works of Shakespeare. But we want to train on our own data. We'll probably be working with a massive amount of data, so we'll need to be able to edit it all. Doing this by hand would likely take an incredible amount of time, but we can edit it all pretty quickly using either a combination of awk, sed, and grep, or by writing scripts to do it for us in Python

What exactly you will have to do depends on what your training data starts out as in the first place; some things will be easier to edit than others. I'm going to start with chat logs from the chat program WeChat. The first part of this will be WeChat specific, but the later parts may be useful more generally

WeChat Specific Text Formatting

The first thing we need to do is actually get the chat logs out of WeChat. This is a problem because WeChat logs are encrypted. The only way I found to get them in an unencrypted, workable format involved proprietary, commercial software; there was no open source way that I could find. And, worse yet, this software only works on Windows and costs 200 RMB. Regardless, depending on whether you're extracting logs from a phone or the WeChat PC version, and whether you're OK with Chinese or not, you can get software to do this at

https://www.coksoft.com/wechatextractor.htm
or
https://www.louyue.com/pcwx.htm

Luckily for most other kinds of logs you won’t need to do anything as drastic. They are probably already in plaintext, like IRC logs for example.
Now we have an htm file with our chat logs. We need to convert this htm file to plain text before we can use it to train our model. There are a couple of ways to do this.
We can do it in Python with the Beautiful Soup package, which lets Python parse HTML.

(I cut the python here, it's long and probably not necessary)
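If you do want to go the Python route, a rough sketch of that conversion with Beautiful Soup might look something like this (to be clear, this is not the original script; the file names are placeholders and you'll need the beautifulsoup4 package from pip):
code:
# rough sketch, not the original script; "chatlog.htm" and "chatlog.txt" are placeholder names
from bs4 import BeautifulSoup

with open("chatlog.htm", "r", encoding="utf-8") as f:
    soup = BeautifulSoup(f.read(), "html.parser")

# get_text() strips the HTML tags; the newline separator keeps each message on its own line
with open("chatlog.txt", "w", encoding="utf-8") as f:
    f.write(soup.get_text(separator="\n"))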

Or, if we don’t want to mess with Python, the easiest way is to just open the text in a browser, copy it, and paste it into a text file.
Now we have all of the chat logs, maybe from a group chat. The problem is that they include usernames, user IDs, and timestamps for everyone. We don't want those in the training data, and I wanted to train on the messages from just one person. Also, in WeChat logs the username and timestamp are on one line, with the person's message on the next.
So, what I did was use awk to look for any line with a specific username or user ID, go to the very next line, copy that line to another text file, and then do the same throughout the rest of the document. This is the awk to do that

code:
awk '/username/ {f=NR} f && NR==f+1' file  > train.txt
It is probably better to use user IDs instead of usernames; if you use usernames it will also find lines that merely mention that user and copy the line after those. Those lines will almost definitely be lines with another username, user ID, and timestamp, so they're easy enough to clean up anyway if we have to.
So, now we’re working from another file. Most of these commands have a way to update the original file, like the -i option in sed, but that’s a little risky. If you mess it up then you have to start over. It’s up to you
Now, since we probably have some of those extra lines because of mentions we have to clean it up further. One way to do this is to look for any line with a timestamp and delete that line. We can do this with either grep or awk
With grep it looks like this

code:
grep -vE "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}" file.txt > output.txt
We can do the same thing with awk. If we want to update the original file we have to send it to a temp file and then move the temp file over the original, which I think is kind of gross, but either way it would work like this

code:
awk '!/20[0-9][0-9]-[0-1][0-9]-[0-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/' file.txt > temp.txt && mv temp.txt file.txt
Now, we want to remove stickers. This is not going to work if the logs you plan to train on are written in Chinese; for me, though, the logs I wanted are all in English. Stickers in WeChat show up as a line of Chinese, so I figured the best way to remove the stickers was to remove any line with Chinese in it. You can use grep to do this

code:
grep -P -v '[\x{4e00}-\x{9fa5}]' train.txt > train1.txt
General Text Formatting

The following parts aren’t just useful on WeChat logs but probably useful to more kinds of logs or training data in general. Still, other kinds of text might not work exactly like this so you’ll have to do whatever fits how your data is formatted.
The logs are pretty clean now but there are still some empty lines. There is no reason for them to be there, and they will cause problems later when we try to add a training-specific terminator in between lines, so it's best to just get rid of them here. We can use grep to do this

code:
grep -v '^[[:space:]]*$' train1.txt > train2.txt
Some lines have URLs in them. I don't really consider URLs part of how the person we're training on "talks", so I removed those with sed, too

code:
sed -i 's/http[^ ]*//g' train2.txt
Now, the WeChat logs have a few leading spaces before each line. This is also going to cause problems when we add our trigger word, so we can get rid of them too with sed

code:
sed -i 's/^[ ]*//g' train2.txt
At this point we could use this as our training data but there are still a few things left that would probably make it not work how we wanted it to work. If we train it on what we have now it would learn what we trained it on, but it wouldn’t know what exactly to do with it. We want it to speak like the person in our training data.
Transformer models are text completion models. That is, you send them a prompt and they try to figure out the next word. For example, if you send it the prompt
I am a
it will respond with something like
I am a boy
So, how do you get it to actually act like a chatbot? You send it an example of a conversation as the prompt, including the user's input and then an empty chat response for it to complete, like

user_input = input

prompt:
A: Hi
B: Nice to meet you
A: user_input
B:

And there the AI “completes” for “B:”
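Just to illustrate (this is plain completion with the stock model, nothing to do with our finetuning yet, and the conversation is made up), that looks roughly like this in code with the transformers library:
code:
# sketch: plain text completion with the stock GPT-Neo 1.3B model
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")

prompt = "A: Hi\nB: Nice to meet you\nA: How are you?\nB:"
result = generator(prompt, max_new_tokens=30, do_sample=True)
print(result[0]["generated_text"])  # the model just keeps writing after "B:"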
But what does “B” sound like? “B” is just general and it has no idea of what “B” is like so it will respond however. But what if we put someone else there? If we replace “B” with “Einstein” and give it an example of Einstein saying Einstein stuff? Then, it will try to sound like Einstein.
But the AI has no idea who the person we're training it on is; it has no idea that the training data even belongs to a single person or anything. So then what can we do? We need to create a "person" for it to speak as, a person's messages for it to "complete"
We do this by creating a trigger word that we will put in front of each message in our training data. That way, when we use that trigger word in our prompt, the AI knows what style it needs to take on in its output. There aren't really any rules for what your trigger word needs to be. I feel like you should pick something that doesn't already exist as a concept it knows, to avoid its existing memory interfering with the output, but I haven't tested that and I don't know for sure. That's how it works with training image generator AIs, so it is possibly the same here. For this example I chose the word "responseuser1:" and I'm going to use awk to put it in front of each line in the training data

code:
awk '{print "responseuser1: " $0;}' train2.txt > train3.txt
Now we have one more thing to deal with. How does the AI know where one message ends and another begins? What's to stop it from outputting "I'm fine how are you? responseuser1: I'm fine too, thank you"? At this point, absolutely nothing.
There is a terminator, though, that we can add in between each line: <|endoftext|>. It tells the model "this is where one block of text ends and another begins." This is only used in training.
Since, in these chat logs, each message is on its own line, we can just add <|endoftext|> to the end of each line. We can do this with sed. This command also puts spaces around <|endoftext|>, which may be necessary for some kinds of training data: if there isn't a space between the last character of a line and the token, the AI may interpret the token as part of the word, and it will show up in your output. The terminator should also go at the very beginning of the document and at the very end

code:
sed -e 's/$/ <|endoftext|> /' -i train3.txt
The training data may have some special, unprintable characters in it that we don't actually want. For example, my Daodejing data had a lot of — characters in it. These can be artifacts of converting data in some formats to plain text. We can remove these with sed to make the data cleaner

code:
sed 's/[^[:print:][:space:][:punct:]]//g' train3.txt > train4.txt
Other Types of Training Data

What we just did really only works for chat logs, and really only chat logs formatted in a certain way. What if you have text where one "message" covers multiple lines? Or where the terminator needs to go after a block of text and not after every line? Here is an example: text extracted from the Stanley Lombardo translation of the Daodejing. Since it's in verse, each verse covers multiple lines, and there are lots of empty lines and a number in between each verse.
What I did with this was first remove the empty lines with the same command I used earlier. Then, since the numbers sit in between the verses, I figured I could use them to edit it. I replaced each number with an <|endoftext|> terminator and then, because each terminator is placed right before the next verse starts, I added the trigger "daosays:" on the line below each <|endoftext|>. I used sed to do this and it looked like this

code:
sed -e '/[0-9]/{s/^.*$/<|endoftext|>/;p;s/^.*$/daosays:/}' file.txt
I still had to go in and manually add a few, but it was not a lot; if it would have saved any time I could have just as easily done those with sed, too.
This can now be saved in the finetune repo directory as train.txt, or really anything we want to call it, as long as we point the scripts at the right file later

Validation File

The validation file is just a small text file with a sample of the text we're training the model on; for the Daodejing example, a verse or two. We save this as validation.txt in the finetune repo directory (or again, whatever we want, as long as we point to it in the training script).
If you are training on something with a lot of small samples, you will likely have to include several examples in your validation file! One verse of the Daodejing on its own is too short, and one message from a chat log is much too short. This is not documented anywhere as far as I can find, so I don't know exactly how long it has to be, but if it's too short, datasets will give an "index out of bound" error when you attempt to train.

Convert train.txt and validation.txt to csv

Technically we can train straight from a text file. The problem is that the dataloader this uses doesn't interpret line breaks correctly and so reads it all as one very long line. It doesn't have this problem with CSVs.
The finetune repo contains a Python script to convert train.txt and validation.txt. Just make sure those two files are in the finetune-gpt2xl directory (you'll have to modify the script if you named them something else) and run

code:
python text2csv.py
This should return “created train.csv and validation.csv files” and you’re good to go

Part 3: Training

Training Script

We're going to need to make a shell script now in the finetune-gpt2xl directory that calls DeepSpeed and sets the configuration for our training. A basic training script for finetuning GPT-Neo 1.3B will look like this

code:
deepspeed --num_gpus=1 run_clm.py \
--deepspeed ds_config_gptneo.json \
--model_name_or_path EleutherAI/gpt-neo-1.3B \
--train_file train.csv \
--validation_file validation.csv \
--do_train \
--do_eval \
--fp16 \
--overwrite_cache \
--evaluation_strategy="steps" \
--output_dir finetuned \
--num_train_epochs 1 \
--eval_steps 15 \
--gradient_accumulation_steps 2 \
--per_device_train_batch_size 4 \
--use_fast_tokenizer False \
--learning_rate 5e-06 \
--warmup_steps 10
This would also work on GPT-2 or probably most models by just changing the model name.

Some of these are self-explanatory
"--fp16" tells the training process to use mixed-precision training with 16-bit floating point numbers instead of 32-bit floating point numbers. This reduces the amount of VRAM we need to finetune the model and speeds training up by quite a bit. Because we're using consumer hardware, we want this.
"--evaluation_strategy="steps"" evaluates the model after the number of steps specified by "--eval_steps 15". This doesn't affect the quality of the output at all and is really useful for tracking the model's progress while training. The other options here besides "steps" are "no" and "epoch".
"--output_dir finetuned" just says where the model will be built and end up after we're done finetuning.
"--num_train_epochs 1" This one is important and has a large impact on our final model, as well as on how long it takes to train; higher values take longer. Specifically, this is the number of times each example in the training data will be iterated over: with 1 epoch, the model sees each example in the training data once. This needs to be higher for smaller models, which have more trouble learning from smaller samples. We can also set it higher if we really want to force the model to speak like our training data, but if we set it too high we'll end up overfitting, and the model will have a lot of trouble deviating from exactly what's in the training data.
"--gradient_accumulation_steps 2" While training, the model is updated by calculating changes (these are the gradients). This setting tells the trainer how many batches of data to accumulate before updating the model. In general, raising it lets you reach a larger effective batch size without needing more VRAM, at the cost of slower training. Larger effective batches can also stabilize training, since gradients computed over them are less sensitive to noise, and can improve the model's ability to generalize.
"--per_device_train_batch_size 4" This is how many examples the model will process at a time while training, multiplied by the number of GPUs. Larger values here can make training faster, but require more VRAM. What you set this to depends on how much VRAM you have, how large the model you're finetuning is, and how big your training data is. Combined with gradient accumulation, the script above works out to an effective batch size of 4 × 2 = 8 examples per model update on a single GPU.
"--use_fast_tokenizer False" When you send words to a model it doesn't actually use the words: it converts the words to tokens, then it generates tokens, and the tokens are converted back into words. The fast tokenizer is a newer implementation of this process. It is, as the name implies, faster, but it's also less accurate, so we're probably not going to want to set this to True unless accuracy is unimportant or the training data is very simple. (There's a quick tokenizer illustration right after these notes.)
"--learning_rate 5e-06" This sets how fast or how slow the model learns during finetuning. If it's set high, we'll be making bigger changes to the model; if it's set low, we'll only be making minor changes. This is another setting where it really depends on our training data and on which model we're training. Setting it too high may cause the model to start diverging and making a lot of errors. It's generally best to start small and gradually increase it until you get the results you want, though since we're training on consumer hardware that might be a long process. 5e-06 is scientific notation; the number comes out to 0.000005.
"--warmup_steps 10" The model will increase the learning rate gradually from a very small number to our final learning rate over 10 steps. This can help prevent the model from getting stuck in a poor state. The larger your model and the larger your training data, the higher the number of warmup steps you need, though if you set it too high that can possibly lead to overfitting.
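Since the tokenizer came up, here's a quick illustration of that words-to-tokens-and-back round trip with the transformers library (just an illustration, not part of the training script):
code:
# quick illustration of tokenization, not part of the training script
from transformers import AutoTokenizer

# use_fast=False matches the --use_fast_tokenizer False setting above
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B", use_fast=False)

ids = tokenizer.encode("I am a")
print(ids)                    # the token IDs the model actually sees (a short list of integers)
print(tokenizer.decode(ids))  # back to "I am a"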

Whatever your settings, save the script as something.sh in the finetune-gpt2xl directory. If for some reason you're editing it in Windows, make sure you save it with Unix line endings; you'll get weird errors otherwise. I'm going to go with training.sh, and make it executable

code:
chmod +x training.sh
And now just run your script and let it train. Depending on your hardware and the size of your training data this can take a while and your computer will not be able to do anything well while it’s running.

code:
./training.sh
If you don’t already have the model, or you haven’t put it where transformers caches its models, it will download it before training
With any luck things won't break and eventually it will ask us if we want to use wandb to visualize our results. This is unnecessary, but if you're interested you can check out what it does at wandb.ai
Otherwise, just type 3 and continue. We will still get an output showing our results from wandb after we train regardless.
Eventually you will get a progress bar and a time estimate; it's at this point that the model is actually training. So, just let it run and try not to touch anything
Your model will now be in the finetuned directory inside the finetune-gpt2xl directory. You can move it to a more convenient place to use it with your project. You will have to empty out the finetuned directory to train another model; it won't train if there's already data in that directory.

Using the Model
To use it in your GPT script you can specify the model’s location with

code:
model_path = "path/to/model"
This can be an absolute or relative path, but with the obvious limitation that comes from a relative path (it needs to be relative to where you’re running the script from)

Now, run the script with your new model. You can use the trigger word we finetuned the model with to tell it "generate text in this style." As mentioned earlier, have your script pass a prompt to the model like:
conversation example
users prompt
trigger word:
And then filter those parts out of the output to hide them from the user.
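A minimal sketch of what that could look like in your script (this is just an illustration with the transformers library, not code from the finetune repo; the conversation example, user input, and the responseuser1: trigger word are the ones from earlier):
code:
# minimal sketch of using the finetuned model; not part of the finetune-gpt2xl repo
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/model"  # wherever you put the finetuned model directory
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

user_input = "How are you?"
# conversation example + the user's input + the trigger word for the model to complete
prompt = f"A: Hi\nresponseuser1: Nice to meet you\nA: {user_input}\nresponseuser1:"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
# strip the prompt, and cut off anything after a stray trigger word, before showing it to the user
reply = text[len(prompt):].split("responseuser1:")[0].strip()
print(reply)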

You can leave your training environment with
code:
conda deactivate
And enter it again to do more finetuning with

code:
conda activate p39

BrainDance
May 8, 2007

Disco all night long!

And, more of the Daoism model trained that way. It's very good. I really think someone could easily be convinced these are legitimate, and it does much better at this one task after being trained than GPT-Neo 1.3B generally does at anything else. The only tell in one is that it uses information ancient Chinese people wouldn't have had or that didn't exist then. It makes its point by talking about how Europe used to have kings but now doesn't, which wasn't true 2500 years ago, and Zhuangzi probably didn't even know what Europe was. But the point is still a good point; it's like if Zhuangzi were alive in 2023.




TIP
Mar 21, 2006

Your move, creep.



controlnet is amazing



truly anything is now possible

Confusedslight
Jan 9, 2020

TIP posted:

controlnet is amazing



truly anything is now possible

Amazing.

Ruffian Price
Sep 17, 2016

KakerMix posted:

all mixed with cargo culting weirdness as we are all throwing bones and claiming we see something in the summoning procedure.
love it when I download an example image and the negative prompt is filled with outright begging

TIP
Mar 21, 2006

Your move, creep.



I dressed it up for my rear end thread

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

TIP posted:

I dressed it up for my rear end thread

lmfao


Using ControlNet this time with the combination of inpainting with Krita. I took this real picture:


Got this as the first pass with the prompt I was using:


And went through and hand tweaked a whole lot of stuff within Krita



ControlNet? It's good. Combine that with the intricate stuff you can do with inpainting and Krita, why you have yourself quite the powerful creative tool.

Doctor Zero
Sep 21, 2002

Would you like a jelly baby?
It's been in my pocket through 4 regenerations,
but it's still good.

Hey, thanks everyone for the love on Jim Henson's Alien. It really was an odd fugue state creation. Once I had the tools down, I worked on it every spare second for a week. First I was going to do a gallery, then a video gallery, then make it look like a VHS dub, then add a little narration, then just loving narrate the whole goddamn thing. I had to stop myself from making a soundtrack for it. :haw:

I had recorded home video, so I used bits of that for the VHS artifacts.

I tried to use ChatGPT to write the script, but surprisingly everything it wrote sucked, so I ended up just writing it myself. Alternate history faux documentary script writers are safe for now.

I need to get Stable Diffusion and Control Net working on my Mac if there is a "next time". That poo poo is next level.



Roman posted:

thanks for the info. yeah I'm probably going to have to switch to SD, although maybe I should try writing the script first lol.

There's an upside and a downside to this. Perhaps less so now that this Control Net stuff exists. I started with the pictures and tried to write a script around it, but that was mostly because it was hard to get the tools to give me exact details. I ended up making a bunch of images, arranging them how I wanted and then writing the narration, so it doesn't really flow like a logical documentary.

The better way would be to write the script first, as you say. I imagine that's going to take a lot more images to get what you want.

Doctor Zero fucked around with this message at 13:59 on Feb 20, 2023

Sedgr
Sep 16, 2007

Neat!

BrainDance posted:

Ok, so here’s a very trimmed down version of the guide.

tldr ;)

:lol:

Thanks for the effort post. Good stuff in there.

Lucid Dream
Feb 4, 2003

That boy ain't right.
Someone used the command !chron World War 2 Air Dogs and a minute later we watched this:
https://clips.twitch.tv/AnimatedTardyNightingaleWoofer-cn_K4Z0xm2t3NPh6

LASER BEAM DREAM
Nov 3, 2005

Oh, what? So now I suppose you're just going to sit there and pout?

Lucid Dream posted:

Someone used the command !chron World War 2 Air Dogs and a minute later we watched this:
https://clips.twitch.tv/AnimatedTardyNightingaleWoofer-cn_K4Z0xm2t3NPh6

I had ChatGPT tell me a little more. RIP Chips

ChatGPT tell me a story in the form of a WW2 documentary about the Air Dog Corps posted:

The Air Dog Corps saw action in many theaters of the war. In the Pacific, dogs and their handlers served on airfields in Australia, New Guinea, and the Philippines. In Europe, they were stationed in England, France, and Germany. And in the Mediterranean, they worked from airfields in North Africa and Italy.

One of the most famous Air Dog Corps missions took place in the Pacific theater. In 1943, a dog named Chips and his handler, Pvt. John Rowell, were stationed on a beach in Sicily. When a group of enemy soldiers approached, Chips broke free from his leash and attacked them, biting several before being shot himself. The incident earned Chips a Purple Heart and a Silver Star, and helped to raise the profile of the Air Dog Corps.

Despite their important role in the war, the Air Dog Corps was disbanded soon after the end of World War II. Many of the dogs were either adopted by their handlers or given to civilian homes, while others were put down. But the legacy of the Air Dog Corps lives on, and their service is remembered as an important part of the war effort.

Lucid Dream
Feb 4, 2003

That boy ain't right.

LASER BEAM DREAM posted:

I had ChatGPT tell me a little more. RIP Chips

Went out as a good boy.

LASER BEAM DREAM
Nov 3, 2005

Oh, what? So now I suppose you're just going to sit there and pout?
reddit is really gross. This is from the SD sub right now.



Where else can I go to stay up to date with the latest?

LASER BEAM DREAM fucked around with this message at 00:41 on Feb 21, 2023

Telsa Cola
Aug 19, 2011

No... this is all wrong... this whole operation has just gone completely sidewaysface
That's very on brand for homelander.

RPATDO_LAMD
Mar 22, 2013

🐘🪠🍆
old news, people were already white-itizing the little mermaid previews months ago

Duck and Cover
Apr 6, 2007

LASER BEAM DREAM posted:

reddit is really gross. This is from the SD sub right now.



Where else can I go to stay up to date with the latest?

Watch out for those sandworms.

Roman
Aug 8, 2002

Just wanted to post this one because it was cool.
Non-bold part in the MJ prompt was stuff I copied and pasted from someone else's to see what it would do:

dark metallic pyramid among misty green mountains, glowing red highlights, fog, daytime, rainy, mountain range, grand canyon, in the style of crysis, movie still, cinematic, photorealistic, cgsociety, light and space, reimagined by industrial light and magic, criterion collection, #vfxfriday, behance, sharp focus, high quality, photographed on grainy medium format Kodak Portra 800 film SMC Takumar 105mm f/2.8 c 50 --ar 16:9 --v 4

Roman fucked around with this message at 04:57 on Feb 21, 2023

Analytic Engine
May 18, 2009

not the analytical engine

Alan Smithee
Jan 4, 2005


A man becomes preeminent, he's expected to have enthusiasms.

Enthusiasms, enthusiasms...

Roman posted:

I might have to switch to that. I made a bunch of (slightly less) photorealistic stuff in MJ but my problem is figuring out how to make it fit the actual vibe of the project more.

Like the thing I'm making is supposed to be a live action MIB/Rick & Morty kinda thing but the shots look more like some NCIS crime drama on Paramount Plus.



*Anna Kendrick horror movie that's in theaters for 2 weeks

BrainDance
May 8, 2007

Disco all night long!

Anyone hear anything or know anything about this? https://github.com/FMInference/FlexGen

The first real commit was less than a day ago, but if what it claims is true this seems like a very big deal for text generation with large language models. But I don't wanna get hyped until people smarter than me have weighed in on it.

But these have been my thoughts on language models that I might have ranted about in here: large language models aren't nearly as exciting until we can distill or optimize them in some major way to run on consumer hardware, like what happened with Stable Diffusion, and the future of them is more in training individual models to be very good at one specific thing rather than really general models.

The open source models are super capable, but it doesn't mean as much until people can just run them without having to pay google a bunch of money to do it.

LASER BEAM DREAM
Nov 3, 2005

Oh, what? So now I suppose you're just going to sit there and pout?
I've only been playing with Stable Diffusion since Friday and just splurged for a 4090. The 3080 it's replacing is going into a server for batch tasks. I'm mildly concerned that video cards are going to become scarce again as more people start to see how powerful AI tools are. Up until last week, I had no intention of replacing the 3080 till it died, since it really does everything I need it to for games. I've never bought a top series card before, either.

LASER BEAM DREAM fucked around with this message at 17:49 on Feb 21, 2023

AARD VARKMAN
May 17, 1993

Alan Smithee posted:

*Anna Kendrick horror movie that's in theaters for 2 weeks

and is recommended to you on Netflix for years but you always look it up and see it has a 17% on rotten tomatoes

lunar detritus
May 6, 2009


LASER BEAM DREAM posted:

I've only been playing with Stable Diffusion since Friday and just splurged for a 4090. The 3080 it's replacing is going into a server for batch tasks. I'm mildly concerned that video cards are going to become scarce again as more people start to see how powerful AI tools are. Up until last week, I had no intention of replacing the 3080 till it died, since it really does everything I need it to for games. I've never bought a top series card before, either.

I'm somewhat tempted to replace my 3080 10GB with a 3090, I really wish I had more VRAM.

Tagichatn
Jun 7, 2009

Bud, I'm on a 1070 here. It seems like a bad time to upgrade though with Nvidia going crazy with the prices of this generation.

Tree Reformat
Apr 2, 2022

by Fluffdaddy

Tagichatn posted:

Bud, I'm on a 1070 here. It seems like a bad time to upgrade though with Nvidia going crazy with the prices of this generation.

It's been bad for years thanks to cryptobros and the pandemic-worsened chip shortage. I'm desperately keeping my Radeon R9 390 alive because I know I can't afford to replace it when it dies.

IShallRiseAgain
Sep 12, 2008

Well ain't that precious?

BrainDance posted:

Anyone hear anything or know anything about this? https://github.com/FMInference/FlexGen

The first real commit was less than a day ago, but if what it claims is true this seems like a very big deal for text generation with large language models. But I don't wanna get hyped until people smarter than me have weighed in on it.

But these have been my thoughts on language models that I might have ranted about in here: large language models aren't nearly as exciting until we can distill or optimize them in some major way to run on consumer hardware, like what happened with Stable Diffusion, and the future of them is more in training individual models to be very good at one specific thing rather than really general models.

The open source models are super capable, but it doesn't mean as much until people can just run them without having to pay google a bunch of money to do it.

I tested it out, still not that great for consumer hardware. You need at least a 3090 Ti to have it be somewhat usable, except for the really simple models.

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


IShallRiseAgain posted:

I tested it out, still not that great for consumer hardware. You need at least a 3090 Ti to have it be somewhat usable, except for the really simple models.

Early Stable Diffusion was pretty rough, we now use half the VRAM and get larger images.

I'm excited for what this year brings. I don't think my 3070 mobile has any hope of ever running this, but it opens the doors for so many cool projects.

StarkRavingMad
Sep 27, 2001


Yams Fan
Newbie question - say I have a bunch of negative prompts that I'm habitually using. Is there a way to combine those into something rather than cutting and pasting them every time?

IShallRiseAgain
Sep 12, 2008

Well ain't that precious?

StarkRavingMad posted:

Newbie question - say I have a bunch of negative prompts that I'm habitually using. Is there a way to combine those into something rather than cutting and pasting them every time?

You can use styles if you are using AUTOMATIC1111.

KwegiboHB
Feb 2, 2004

nonconformist art brut
Negative prompt: amenable, compliant, docile, law-abiding, lawful, legal, legitimate, obedient, orderly, submissive, tractable
Steps: 32, Sampler: DPM++ 2M Karras, CFG scale: 11, Seed: 520244594, Size: 512x512, Model hash: 99fd5c4b6f, Model: seekArtMEGA_mega20

StarkRavingMad posted:

Newbie question - say I have a bunch of negative prompts that I'm habitually using. Is there a way to combine those into something rather than cutting and pasting them every time?

If you have the hardware you can train them into a negative textual embedding like https://huggingface.co/datasets/Nerfgun3/bad_prompt then you only need the keyword in the negative prompt.


In other news, I'm on day 2 of trying to get the latest webui to work and I'm thinking of just flattening and reinstalling instead of messing with python more... if I can't get this to work by the end of the day I'm going to post the general engineering idea behind this Block Merge project and make a detailed write up of it later. When the cutting edge becomes the bleeding edge...

StarkRavingMad
Sep 27, 2001


Yams Fan

IShallRiseAgain posted:

You can use styles if you are using AUTOMATIC1111.

Aha! Just what I was looking for. I should have moused over those little buttons I never use.

Davethulhu
Aug 12, 2003

Morbid Hound
https://twitter.com/sweaty_goblins/status/1628116870015352833

Megazver
Jan 13, 2006

StarkRavingMad
Sep 27, 2001


Yams Fan
truly we live in blessed times

mobby_6kl
Aug 9, 2009

by Fluffdaddy
Is that it, the perfect woman?

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:
I know we like to joke around, have a good time, a few laughs in this thread but do not be surprised at what most AI image generation is being used for right now.

Porn. It's porn.

StarkRavingMad
Sep 27, 2001


Yams Fan

KakerMix posted:

I know we like to joke around, have a good time, a few laughs in this thread but do not be surprised at what most AI image generation is being used for right now.

Porn. It's porn.

StarkRavingMad posted:

truly we live in blessed times
