dumb.
Apr 11, 2014

-=💀=-

XYZAB posted:

I posed this question in the D&D AI thread thinking it would be more appropriate there, but it doesn't seem to be getting any traction so I'll ask here instead:

Can anyone here recommend an “audio file to text” AI transcription GitHub project that I can install locally to chew through a bunch of .wav files I’ve got kicking around, one that doesn’t require me to upload those files to a third-party service for processing? Think hundreds of hours of noisy lecture audio in raw 24-bit 48 kHz *.wav format that I can let an RTX card chug away at, and that also isn’t the newest version of Microsoft Word. That’s what I’m looking for. Does such a thing exist?

I've had decent luck with this Windows build of Whisper:

https://github.com/Purfview/whisper-standalone-win/

It whipped through a bunch of hour-plus audio files on my 4080, and the results were pretty accurate. Be sure to grab the cuBLAS/cuDNN libraries too, since it needs them to run on the GPU.
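If you'd rather script the batch job than drive the standalone exe, here's a rough sketch. It assumes the faster-whisper Python package is installed with working CUDA libraries; as far as I know the linked build is based on the same engine, but treat that as an assumption. The helper names (find_wavs, fmt_ts, transcribe_folder) and the "large-v3" model choice are mine, not anything from the repo.

```python
from pathlib import Path


def find_wavs(root):
    """Return every .wav file under root (recursive), sorted for a stable order."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".wav"
    )


def fmt_ts(seconds):
    """Format a time offset in seconds as HH:MM:SS."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def transcribe_folder(root, model_size="large-v3"):
    """Write a .txt transcript next to each .wav file found under root."""
    # Imported here so the helpers above work even without the package installed.
    from faster_whisper import WhisperModel

    # Assumption: an NVIDIA GPU with enough VRAM for the chosen model in float16.
    model = WhisperModel(model_size, device="cuda", compute_type="float16")

    for wav in find_wavs(root):
        out = wav.with_suffix(".txt")
        if out.exists():
            continue  # already transcribed on an earlier run

        # vad_filter skips long silences, which tends to help with lecture audio.
        segments, _info = model.transcribe(str(wav), vad_filter=True)

        # Write to a temporary file first so an interrupted run never leaves
        # a half-finished .txt that would be skipped next time.
        tmp = wav.with_suffix(".txt.part")
        with tmp.open("w", encoding="utf-8") as fh:
            for seg in segments:
                fh.write(f"[{fmt_ts(seg.start)}] {seg.text.strip()}\n")
        tmp.replace(out)
```

Point transcribe_folder at the folder of lectures and leave it running; since finished files are skipped, you can stop and restart it without redoing work.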

dumb.
Apr 11, 2014

-=💀=-

snorch posted:

I'd watch any one of those.

Also, PaleFigure and swagman, what are y'all using to get those images? They seem unusually crisp and coherent, or is it just the Art of the Prompt at work? I get decent results with SDXL but I'd say it only ever reaches about 80% of that fidelity.



I've found lowering the CFG scale helps.
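For anyone wondering what that knob does: roughly, at each step the sampler makes two noise predictions, one with the prompt and one without, and the CFG scale sets how far to extrapolate past the unprompted one. Here's a toy sketch with made-up two-element vectors standing in for the real predictions (the function name and the numbers are mine, not from any particular UI):

```python
def apply_cfg(uncond, cond, scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond), per element."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Made-up stand-ins for the model's two noise predictions at one step.
uncond = [0.0, 1.0]  # prediction without the prompt
cond = [1.0, 3.0]    # prediction with the prompt

for scale in (1.0, 4.0, 7.5):
    print(scale, apply_cfg(uncond, cond, scale))
```

At a scale of 1 you just get the prompted prediction back; the higher you go, the further the result gets pushed beyond it, which is where the overcooked look tends to come from.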

dumb.
Apr 11, 2014

-=💀=-

namlosh posted:

Sorry if this is the wrong thread, but what are some of the best models that can be run locally?
I’ve got Stable Diffusion running locally from this git repo:
https://github.com/CompVis/stable-diffusion.git
And it’s generating images, but they aren’t that great. I’m not trying to generate stuff as good as what’s in this thread, but I am trying to generate the best I can locally for free. I want to see what the limit is for offline generation. I have a beefy machine with a 4090, if that matters.
I’m using the “default” stable-diffusion-v1/model.ckpt file that’s around 7.5 GB. Should I be grabbing a different one from somewhere?

Sorry if this is all obvious or the wrong place to ask. Just point me in a direction and I’ll head that way.

I'm not familiar with that SD UI, but it looks pretty old... can it only do SD v1? If so, you might want to check out a more modern interface that can run SDXL models, like Fooocus, AUTOMATIC1111 or ComfyUI.

As for XL models, I'm a fan of RealVisXL, Juggernaut XL, Colossus Project and ZavyChromaXL.

I don't use 1.5 very often anymore, but some 1.5 models I dig are Analog Madness, Photon, Realistic Stock Photo and Epic Photogasm.
