AARD VARKMAN
May 17, 1993
god drat that poo poo is crazy :staredog:


Mola Yam
Jun 18, 2004

Kali Ma Shakti de!
well that got good faster than i was expecting

we're not even 24 months out from Dall-E 2

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

Mola Yam posted:

well that got good faster than i was expecting

This was me a year ago, it's all going so fast that I just expect it to make huge leaps all the time now. In this instance I'm not surprised at all that text to video is so good now.

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


Mola Yam posted:

well that got good faster than i was expecting

these are only 10s or so; given how other image-to-video models work (I know this is text-to-video), they tend to fall apart around the 16s mark. I'd be a bit surprised if it was coherent over a minute.

I'm saying this is in line with my expectation of being able to make a 15-minute silent video by end of year.

Brown Moses
Feb 22, 2002

The bird video is insane; the shape and detail of the feathers don't change at all, even when it moves around. The coherence of the images is really incredible. Sam Altman is currently taking requests from Twitter users, and the results also look incredible:
https://twitter.com/sama/status/1758219575882301608

TIP
Mar 21, 2006

Your move, creep.



granny's got t-1000 spoon powers

Captain Hygiene
Sep 17, 2007

You mess with the crabbo...



e: ^^^ I swear I came up with that analogy independently



This one's wild in a couple ways. I thought it looked good at first, but then the kid's hand pops up and reverse morphs itself like the T-1000 changing direction.

And then I went back to rewatch the beginning - I thought the main folks were on a balcony at first, but I think the AI's getting confused and the crowd off to the side is actually just made up of tiny people :psyduck:

KwegiboHB
Feb 2, 2004

nonconformist art brut
Negative prompt: amenable, compliant, docile, law-abiding, lawful, legal, legitimate, obedient, orderly, submissive, tractable
Steps: 32, Sampler: DPM++ 2M Karras, CFG scale: 11, Seed: 520244594, Size: 512x512, Model hash: 99fd5c4b6f, Model: seekArtMEGA_mega20

moist banana bread posted:

The first time I tried to generate something pornographic out of curiosity it output this bizarre warped mangled pile of bodies looking thing. It felt kinda like I violated the computer. It's hard not to personify this stuff and we're definitely gonna have a crowd of dudes in love with a Tamagotchi soon. I know I'm just ranting now but maybe we should be buying Hasbro stock. They just need to release an advanced enough Furby to join the bubble.

I was looking at these services, you guys are really paying $8 for Midjourney or $20 for ChatGPT4+Dall-e? I don't mean that in a disparaging way, I just mean like, yeah, wow, it's good enough to be "worth it."

Does anyone have an LLM augmenting SD? I think options might include a ChatGPT API plugin, or I guess there are local LLM models? I tried https://github.com/turboderp/exllama, but it requires a different version of torch, which breaks xformers.

Edit: there's also Petals, the distributed-computing approach to LLMs (their demo).

I'm doing more than just running a local LLM; I'm also setting up new hardware and going over the fast.ai course so I can create my own multi-modal model that generates both text and images. https://course.fast.ai/ It's slow going but I keep making progress and I'm happy with that. I'm big on free, open-source, locally run software. I hope to release the model to the world when I'm done.

My goal is to be able to directly ask the image-generation model why it produced six fingers, just to see what the response actually is. The answer will be a fascinating insight no matter what it says. The closest I've found so far is LLaVA, which I've linked in the past. It's a text generator with machine vision; it doesn't generate images on its own, but it can view them and talk about them. You can try a demo yourself here: https://llava.hliu.cc/.

I can make a longer write-up about LLMs in general but it's honestly easier to just point you at how to run your own because it's so dead simple now. You can ask the AI itself if you want to know more.

https://github.com/LostRuins/koboldcpp/releases If you're on Windows there's no need to install anything: download the .exe, run the program, load the model, point your browser at localhost, and chat. That's it. No mess, no fuss. The new Dynamic Temperature setting is amazing too; self-adjusting sampling offers more than an illusion of responding to you.
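For the curious, the idea behind Dynamic Temperature can be sketched in a few lines. This is a simplified stand-in for KoboldCPP's actual sampler, not its real code; the temp_min/temp_max bounds and the linear entropy mapping here are illustrative:

```python
import math

def dynamic_temperature(logits, temp_min=0.5, temp_max=1.5):
    """Scale sampling temperature with the entropy of the next-token
    distribution: confident (peaked) steps sample near temp_min,
    uncertain (flat) steps sample near temp_max."""
    # softmax over the raw logits
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Shannon entropy, normalized by its maximum (log of vocab size)
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    max_entropy = math.log(len(probs))
    norm = entropy / max_entropy if max_entropy > 0 else 0.0
    # interpolate linearly between the temperature bounds
    return temp_min + (temp_max - temp_min) * norm

# one dominant logit: near-greedy sampling (close to temp_min)
print(dynamic_temperature([10.0, 0.0, 0.0, 0.0]))
# a flat distribution: maximum entropy, so full temp_max (1.5)
print(dynamic_temperature([1.0, 1.0, 1.0, 1.0]))
```

The point is that the model stops being forced to gamble on steps where it's already sure, while staying creative where many continuations are plausible.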
Especially neat: you can split a larger model between GPU and CPU at the cost of slower generation. I run a 13B-parameter model on a GTX 1660 Super, for example; if you wanted, you could run on CPU only, so it's relatively accessible. Yes, LLMs can run on some phones.
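To put rough numbers on that GPU/CPU split: a back-of-the-envelope sketch, assuming ~4.5 bits per weight (roughly a Q4_K_M quant), a 40-layer 13B model, and a guessed 1 GB of VRAM headroom for the KV cache and display:

```python
def quantized_model_gb(n_params_billion, bits_per_weight=4.5):
    """Rough size of a quantized model: parameters times bits per
    weight, converted to gigabytes."""
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1e9

def layers_on_gpu(model_gb, n_layers, vram_gb, reserve_gb=1.0):
    """How many of the model's layers fit in VRAM, keeping some
    headroom free for the KV cache and the display."""
    usable = max(vram_gb - reserve_gb, 0.0)
    per_layer = model_gb / n_layers
    return min(n_layers, int(usable / per_layer))

# a 13B model at ~4.5 bits/weight is a bit over 7 GB...
size = quantized_model_gb(13)
print(f"{size:.1f} GB")
# ...so a 6 GB GTX 1660 Super holds only part of the 40 layers,
# and the remainder runs on the CPU
print(layers_on_gpu(size, n_layers=40, vram_gb=6.0))
```

Which is exactly why a 13B quant is workable on a 6 GB card: most of it offloads to the GPU and the rest just costs you tokens per second.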

https://huggingface.co/TheBloke is where you find LLM models that have been quantized (shrunk) to fit on your hardware; GGUF files are what you want for KoboldCPP. LLM models are currently being mixed, mashed, and merged together in much the same way Stable Diffusion models were early on, so it's a wild, wonderful, chaotic mess right now. No one really knows what's best; we're all kind of throwing spaghetti at the wall to see what sticks. There are thousands of mixes for all sorts of specific fine-tunings. If you find one you like, be sure to share with the class!
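At its simplest, that mixing and merging is just interpolating two checkpoints' tensors. A toy sketch with made-up weights (real merge tools operate on full tensor files, and fancier schemes like SLERP or task-vector merges exist):

```python
def linear_merge(weights_a, weights_b, alpha=0.5):
    """Linearly interpolate two checkpoints sharing an architecture:
    merged = (1 - alpha) * A + alpha * B, tensor by tensor."""
    if weights_a.keys() != weights_b.keys():
        raise ValueError("checkpoints must have identical tensor names")
    return {
        name: [(1 - alpha) * a + alpha * b
               for a, b in zip(weights_a[name], weights_b[name])]
        for name in weights_a
    }

# two tiny stand-in "checkpoints", one flattened tensor each
base = {"layer0.weight": [0.0, 2.0, 4.0]}
finetune = {"layer0.weight": [1.0, 1.0, 1.0]}
# alpha=0.5 lands exactly halfway between the two
print(linear_merge(base, finetune, alpha=0.5))
```

That it works at all (and often produces a model better than either parent) is the part nobody fully understands, hence the spaghetti-at-the-wall era.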

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard Open LLM Leaderboard
https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard Chatbot Arena
Two good sources for getting a relative idea of what model to try. I can give a personal thumbs-up recommendation for anything based on Llama 2. Mixtral 8x7B is also incredibly popular but I haven't tried it myself yet. Things are moving incredibly fast, and anything I tell you about today could be outdated by tomorrow, literally. We live in exciting times!

XYZAB
Jun 29, 2003

HNNNNNGG!!

BoldFace posted:

OpenAI announces their new text-to-video model, Sora.

https://openai.com/sora

edit: Some highlights
bunch of videos

I need to see some otherworldly imaginative poo poo, otherwise it might as well just be a video of a loving bird. Which is simultaneously amazing and points to the fact that we've crossed a threshold, but why are these all milquetoast renders of realistic scenes when the fun happens when you bend reality and try to break the technology. :confused:

Edit: I'm scrolling through what's available on the site and there are a few imaginative ones, but they're burying the lede by hiding them and showcasing the realistic poo poo imo.

edit: here we loving go

"A gorgeously rendered papercraft world of a coral reef, rife with colorful fish and sea creatures."

https://cdn.openai.com/sora/videos/origami-undersea.mp4

XYZAB fucked around with this message at 23:18 on Feb 15, 2024

Seven Force
Nov 9, 2005

WARNING!

BOSS IS APPROACHING!!!

SEVEN FORCE

--ACTIONS--

SHITPOSTING

LOVE LOVE DANCING

XYZAB posted:

I need to see some otherworldly imaginative poo poo, otherwise it might as well just be a video of a loving bird. Which is simultaneously amazing and points to the fact that we've crossed a threshold, but why are these all milquetoast renders of realistic scenes when the fun happens when you bend reality and try to break the technology. :confused:

Edit: I'm scrolling through what's available on the site and there are a few imaginative ones, but they're burying the lede by hiding them and showcasing the realistic poo poo imo.

gotta crawl before you can walk/run

gotta know the rules before you can break em

this post was generated by shitpostAI

wedgie deliverer
Oct 2, 2010

Lol I'm not gonna ever find a job again

BoldFace
Feb 28, 2011
A lot more videos on their blog post.

https://openai.com/research/video-generation-models-as-world-simulators

KakerMix
Apr 8, 2004

8.2 M.P.G.
:byetankie:

pixaal posted:

Does it support Image to video so I can say seed it? When can I, a paying customer play with it? (I know this isn't public yet but I can't be the only one interested)

Looks like it, follow BoldFace's link:


OpenAI posted:

Animating DALL·E images
Sora is capable of generating videos provided an image and prompt as input. Below we show example videos generated based on DALL·E 2 and DALL·E 3 images.
https://cdn.openai.com/tmp/s/prompting_1.mp4

https://cdn.openai.com/tmp/s/prompting_3.mp4

Roman
Aug 8, 2002

BoldFace posted:

OpenAI announces their new text-to-video model, Sora.

I am really impressed by the walking. It's the first video model I've seen that seems to understand how legs work. And seeing it work in crowds is impressive. There are glitches, but you have to actually look for them.

The model also seems to have an impressive grasp of physics. It handles video-to-video and image-to-video as well. Here's an impressive image-to-video example:

https://twitter.com/angrypenguinPNG/status/1758296824484663678

Roman
Aug 8, 2002

Some video to video

https://twitter.com/bilawalsidhu/status/1758308160488566925

LifeSunDeath
Jan 4, 2007

still gay rights and smoke weed every day
those videos are nuts

oh poo poo this just dropped
https://www.youtube.com/watch?v=NXpdyAWLDas

LifeSunDeath fucked around with this message at 04:23 on Feb 16, 2024

Tunicate
May 15, 2012

LifeSunDeath posted:

those videos are nuts

oh poo poo this just dropped
https://www.youtube.com/watch?v=NXpdyAWLDas

look at the legs at 2:39ish, it's amazing how it often gets walking right and then sometimes does something wacky

Soulhunter
Dec 2, 2005

AARD VARKMAN posted:

Lmao combining minions + sref + weird is fun as hell
e: Christ alive

For some strange reason all I hear is banjos when I look at this one.

Solefald posted:

While my probe against the horrific Minion Tits is a joke;
I hear you, Minion Cleavage is Too Far in this, the thread that brought us highbrow AI art like "Sonic in a Jar of Cum", "Shrek Takes a Big Diarrhea poo poo on a Stump," "Donald Duck loving Mickey Mouse," and "Robocop pisses himself in a diaper".

This is a place of refined tastes and delicate sensibilities. All minions will be generated with clothing that promotes modesty from now on, so as to not invite any further unclean, impure thoughts in anyone who found themselves stirred by the forbidden allure of Minion Tits.

Without further ado, here's some Burka / Hijabi Minions I made today:




Some Francisco Goya / Stephen Gammell / Junji Ito minion mash-ups





These are loving amazing and I cannot wait to see what becomes available to the public in the next year. OpenAI remains at the top of their game.

credburn
Jun 22, 2016
President, Founder of the Brent Spiner Fan Club
edit: nevermind

I wish human-edited videos didn't do obnoxious zoom-cuts or weird edits to cut out dead air. It's annoying, jarring, and distracting.

edit 5: this guy's videos are mostly well edited, but there are times he just cuts out dead air and it fuckin sucks. Make the effort to have meaningful transitions!

credburn fucked around with this message at 07:36 on Feb 16, 2024

GABA ghoul
Oct 29, 2011

Goddamn, that's like an order of magnitude better than the ~3-month-old Stable Video Diffusion. Guess those couple billion in additional funding for OpenAI are starting to show more and more.

Nigmaetcetera
Nov 17, 2004

borkborkborkmorkmorkmork-gabbalooins
Oh man, DALL-E has no clue how to depict someone playing cat’s cradle.





It also has no idea how to depict a middle-aged transgender woman of southern Mexican ancestry in the first one. I said to do it in the style of Vermeer and it spat out a picture of Girl with a Pearl Earring playing with what looks like a sensory toy for small children.

I like the middle one though because it reminds me of trying to use a smartphone on mescaline. Especially the skin, that’s like 100% mescaline visuals, at least the way I get them.

Why the gently caress do I pay monthly for this stupid thing? Oh yeah because it kills time.

LifeSunDeath
Jan 4, 2007

still gay rights and smoke weed every day
some sora glitching. goddamn, it mostly looks real though. ugh, this stuff was all pretty fun till these hyper-realistic videos started; now I'm certain we're all gonna be enslaved.
https://packaged-media.redd.it/5bus...168ae091896#t=0

RIP Syndrome
Feb 24, 2016

Sora looks cool, but may be less useful than still images. What're the use cases for a few seconds of imperfect video, apart from meme gifs? It's still at the level where it couldn't create anything like an actual movie (no object permanence, for starters - things disappear when briefly obscured). And it's nigh impossible for a human to edit out the weirdness.

naem
May 29, 2011

wedgie deliverer posted:

Lol I'm not gonna ever find a job again

I’ve never found a job in the first place

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


RIP Syndrome posted:

Sora looks cool, but may be less useful than still images. What're the use cases for a few seconds of imperfect video, apart from meme gifs? It's still at the level where it couldn't create anything like an actual movie (no object permanence, for starters - things disappear when briefly obscured). And it's nigh impossible for a human to edit out the weirdness.

This is the first step, it wasn't all that long ago that this was a good image for "Frog in a gamestop"


anyway I'm very excited to shitpost with Sora because I already shitpost with Pika and that gets me like 4 a month.

Nigmaetcetera
Nov 17, 2004

borkborkborkmorkmorkmork-gabbalooins
Does anyone remember the name and/or url for any of those old, primitive AIs that would generate what looks like a cluttered desk full of junk at first glance, but when you look at it closer you realize you can’t identify a single object?

Sab669
Sep 24, 2009

RIP Syndrome posted:

Sora looks cool, but may be less useful than still images. What're the use cases for a few seconds of imperfect video, apart from meme gifs? It's still at the level where it couldn't create anything like an actual movie (no object permanence, for starters - things disappear when briefly obscured). And it's nigh impossible for a human to edit out the weirdness.

Obviously it's only going to get better and better.



Just found this thread this morning as I never venture into GBS; those Minion titties on the other page are the most cursed things I've ever seen.

RIP Syndrome
Feb 24, 2016

pixaal posted:

This is the first step, it wasn't all that long ago that this was a good image for "Frog in a gamestop"

Oh, I totally get that, and it's cool. But both the image and video generators are still in the "make image look plausible" hallucination stages. There's no executive function with any understanding of what's going on, and that seems like it'd be a more pronounced issue in moving pictures. Still image generators haven't really moved past that first step yet either.

It'll happen, though. I think there've been some promising moves with long-term memory and reasoning in text generators, and some of those lessons are likely transferable.

Soulhunter
Dec 2, 2005

LifeSunDeath posted:

some sora glitching. goddamn, it mostly looks real though. ugh, this stuff was all pretty fun till these hyper-realistic videos started; now I'm certain we're all gonna be enslaved.

Man, even recognizing that it's glitching out, there's a weird quality to it that's kind of convincing / not completely setting off the uncanny valley "THIS IS WRONG" effect to me when I watch it. I'd compare it to watching someone do sleight of hand magic I guess?

Yes, simply pull the plastic sheet out of the hole and unfold it into a molded plastic hoverchair. This is normal. Everything is fine.

Midjourney allows the word "boudoir" past the filter now, and so I did the natural thing and made a bunch of boudoir photos of vegetables. Prompts are variations on the below:

brick walls, a(n) [vegetable], paint me like one of your french girls, in the style of oil paint on canvas, boudoir photography, grungy, the [vegetable] is lightly misted with water, the [vegetable] is tangled in bedsheets --ar 3:2 --c 50 --w 1000
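That prompt is a fill-in-the-blank template, so batching the whole vegetable drawer is a one-liner. A throwaway sketch (the vegetable list here is made up; the Midjourney flags are passed through untouched):

```python
# Soulhunter's prompt with the [vegetable] slot made substitutable
TEMPLATE = (
    "brick walls, a(n) {veg}, paint me like one of your french girls, "
    "in the style of oil paint on canvas, boudoir photography, grungy, "
    "the {veg} is lightly misted with water, the {veg} is tangled in "
    "bedsheets --ar 3:2 --c 50 --w 1000"
)

# fill the slot once per vegetable
prompts = [TEMPLATE.format(veg=v) for v in ("carrot", "leek", "eggplant")]
for p in prompts:
    print(p)
```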








A couple oddities that came out of the set:


:nws: vegetables that came out slightly different when I tried similar prompts in Bing / DALL-E :nws:

Tunicate
May 15, 2012

Nigmaetcetera posted:

Does anyone remember the name and/or url for any of those old, primitive AIs that would generate what looks like a cluttered desk full of junk at first glance, but when you look at it closer you realize you can’t identify a single object?

Artbreeder still has the older ones, last I saw

GABA ghoul
Oct 29, 2011

RIP Syndrome posted:

Sora looks cool, but may be less useful than still images. What're the use cases for a few seconds of imperfect video, apart from meme gifs? It's still at the level where it couldn't create anything like an actual movie (no object permanence, for starters - things disappear when briefly obscured). And it's nigh impossible for a human to edit out the weirdness.

On-demand "stock" gifs for web content? Improve your ChatGPT-generated article on how to eat a pineapple with funny gifs of pineapples! There are rumors going around in CEO circles that those drat genZ kids of today don't like to look at static pictures anymore and only want to see moving things.

Also, as someone else already said: that's only like 1-2 years of progress from Lovecraftian nightmare visions of Will-Smith-Thing eating pizza. And they've got infinite funding now, which they didn't have before.

RIP Syndrome
Feb 24, 2016

Pre-DALL-E image gens were hot poo poo. Made these in 2019 or 2020, don't remember how:



"trump fursona"



"trump fursona drinking honey polaroid"

Brawnfire
Jul 13, 2004

🎧Listen to Cylindricule!🎵
https://linktr.ee/Cylindricule

I was using those sorts of images for my CYOA back in the day; I found they stimulated a lot of imagination. You don't need to do nearly as much mental filling-in of blanks with modern AI images, which are fun to look at but don't inspire me to write anything weird like the old ones did.

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


Sab669 posted:


Just found this thread this morning as I never venture into GBS; those Minion titties on the other page are the most cursed things I've ever seen.

If you play with the fire you'll see far more cursed things. One of my first tests was Danny DeVito kicking a football with a ControlNet poser; I'm very bad at posing human anatomy. SD decided he looked more like he was taking a poo poo, so it made a wrinkly rear end birthing a football. So cursed.

Swagman
Jun 10, 2003

Yes...all was once again peaceful in River City.

LifeSunDeath
Jan 4, 2007

still gay rights and smoke weed every day

Soulhunter posted:

:nws: vegetables that came out slightly different when I tried similar prompts in Bing / DALL-E :nws:


hell yeah put them titties on the glass, carrot.

mcbexx
Jul 4, 2004

British dentistry is
not on trial here!



Finally, I feel seen.

https://www.youtube.com/watch?v=mcYl70vq_Ns

ymgve
Jan 2, 2004


:dukedog:
Offensive Clock

LifeSunDeath posted:

those videos are nuts

oh poo poo this just dropped
https://www.youtube.com/watch?v=NXpdyAWLDas

He points out that in the Tokyo video the reflections move at a different framerate than the rest of the scene, and I guess this might mean that they self-produce some of the training data by just recording random fly-throughs of scenes in Unreal Engine or Unity.

I gotta revise my prediction for the future a little bit. I've previously said that the future would be billions of insect-sized camera drones flying around the world, because the all-knowing AI is drowning in so much generated poo poo that it has to use drones it controls to know what the real world looks like.

Now my prediction is the AI will use the billions of drones to generate training data for OpenAI.

Roman
Aug 8, 2002

ymgve posted:

He points out that in the Tokyo video the reflections move at a different framerate than the rest of the scene, and I guess this might mean that they self-produce some of the training data by just recording random fly-throughs of scenes in Unreal Engine or Unity.
Thread on how UE5 is probably being used in the training data
https://twitter.com/ralphbrooks/status/1758230974700130445


Earwicker
Jan 6, 2003

pixaal posted:

This is the first step, it wasn't all that long ago that this was a good image for "Frog in a gamestop"


anyway I'm very excited to shitpost with Sora because I already shitpost with Pika and that gets me like 4 a month.

that's still a pretty good image of a frog in a gamestop imo
