mondomole
Jun 16, 2023

pangstrom posted:

This is a factoid from a previous life but you used to hear that backprop was biologically implausible except for maybe kinda-sorta in the cerebellum.

You may be interested in this paper! https://arxiv.org/abs/2202.08587

quote:

Using backpropagation to compute gradients of objective functions for optimization has remained a mainstay of machine learning. Backpropagation, or reverse-mode differentiation, is a special case within the general family of automatic differentiation algorithms that also includes the forward mode. We present a method to compute gradients based solely on the directional derivative that one can compute exactly and efficiently via the forward mode. We call this formulation the forward gradient, an unbiased estimate of the gradient that can be evaluated in a single forward run of the function, entirely eliminating the need for backpropagation in gradient descent. We demonstrate forward gradient descent in a range of problems, showing substantial savings in computation and enabling training up to twice as fast in some cases.

tl;dr: it's possible to do a single directional-derivative forward pass and get an unbiased estimate of the gradient. I'm not sure if this makes it biologically plausible per se, but it at least gets rid of the need for bidirectional synapses.
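The core trick is small enough to sketch in plain numpy (a toy example with a hand-written gradient standing in for the forward-mode JVP, not the paper's code): sample a random direction v, take the exact directional derivative along v, and scale v by it. Since E[vvᵀ] = I for standard normal v, the estimate is unbiased.

```python
import numpy as np

def grad_f(x):
    # Analytic gradient of f(x) = sum(x**2); in the paper the
    # directional derivative comes from one forward-mode AD pass.
    return 2.0 * x

def forward_gradient(x, rng):
    """One unbiased 'forward gradient' sample: (grad_f(x) . v) * v."""
    v = rng.standard_normal(x.shape)   # random direction
    ddir = grad_f(x) @ v               # exact directional derivative
    return ddir * v

rng = np.random.default_rng(0)
x = np.array([1.0, -2.0, 3.0])

# One sample is noisy but unbiased; the mean converges to 2*x.
est = np.mean([forward_gradient(x, rng) for _ in range(20000)], axis=0)
```

In actual forward-gradient descent you'd take a step with a single sample; the averaging here is only to show the bias is zero.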


duck monster
Dec 15, 2004

Rahu posted:

I've been trying to learn some ML stuff lately and to that end I've been reading over Andrej Karpathy's nanoGPT.

I think I have a pretty good grasp on how it works but I'm curious about one specific bit. The training script loads a binary file full of 16-bit ints that represent the tokenized input. It has a block of code that looks like this

https://github.com/karpathy/nanoGPT/blob/7fe4a099ad2a4654f96a51c0736ecf347149c34c/train.py#L116

code:
import os

import numpy as np
import torch

# data_dir, block_size, and batch_size are defined earlier in train.py
data = np.memmap(os.path.join(data_dir, 'train.bin'), dtype=np.uint16, mode='r')
ix = torch.randint(len(data) - block_size, (batch_size,))
x = torch.stack([torch.from_numpy((data[i:i+block_size]).astype(np.int64)) for i in ix])
What I'm curious about is: what is the purpose of doing `astype(np.int64)` here? The data is written out as 16 bit uints, then loaded as 16 bit uints, then reinterpreted as 64 bit ints when converting from numpy to pytorch and I just don't see what that achieves.

Transformers are probably *not* the best place to start with learning how ML actually works. They may well be the god algorithm for the modern robobrain, but they're built on a bunch of simpler, still quite useful ideas that are worth learning first.
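On the actual question, a guess (I haven't dug through the repo's discussions, so treat this as my reading, not Karpathy's stated rationale): the cast does two jobs. PyTorch historically has no uint16 tensor dtype, so `torch.from_numpy` can't wrap the raw slice, and token indices going into `nn.Embedding` / `F.cross_entropy` are expected to be int64 anyway. As a bonus, `astype` copies the slice out of the read-only memmap into a normal in-memory array. A numpy-only sketch of what the cast does:

```python
import numpy as np

# Stand-in for a slice of the uint16 token file (GPT-2's vocab
# of 50257 tokens fits in uint16, which is why the file uses it).
tokens_u16 = np.array([0, 50255, 13], dtype=np.uint16)

# astype(np.int64) allocates a fresh int64 buffer: a dtype that
# torch.from_numpy can wrap and that embedding lookups expect.
tokens_i64 = tokens_u16.astype(np.int64)

assert tokens_i64.dtype == np.int64
assert tokens_i64.flags["OWNDATA"]            # a real copy, not a view
assert tokens_i64.tolist() == [0, 50255, 13]  # values preserved
```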

Mata
Dec 23, 2003
Bumping this thread with more of a practical question about LLMs rather than the scientific aspect of ML...

Has anyone had luck with a self-hosted LLM for specific programming languages?
I'm a little new to the scene, but I want to see if I can use (and gradually improve) a tool to make my day-to-day work easier; specifically, if I never had to write another line of typescript by hand again I would be pretty satisfied. GPT-4 works great, but it's not suitable for commercial work: even if OpenAI claims not to train on your input anymore, it's still a sensitive topic for most of my clients.

For this reason, I'm looking into whether I can make my working day easier using self-hosted open-source models. Tabby looks like a fun place to start, to see if I can get some sensible results.

As for the model, I quickly glanced at Hugging Face and saw that CodeLlama looks popular, but I wonder if a more specialized model might be more effective; typescript-specific LLMs aren't getting nearly as much attention, understandably.

Evis
Feb 28, 2007
Flying Spaghetti Monster

I don’t think the legal issues are any clearer if you’re using an open source model. Maybe if you were using a model you created yourself with only code you or your company wrote, it would be, but I don’t know if you can train an LLM on that little data.

Mata
Dec 23, 2003
The issue isn't so much where the training data comes from (though I would hope open-source models and datasets would respect licenses) but more the prompts. It's hard to imagine using e.g. GitHub Copilot without your client's codebase leaving their corporate intranet.
If the initial results are promising I would like to train and fine-tune the models as much as possible with what little data I have, but as you say it's not enough on its own.

Entropist
Dec 1, 2007
I'm very stupid.
The Llama models released by Meta are tunable, and people have been using them to make things such as Open Assistant. The main problem is whether you have enough data to tune on (and also that it's still non-trivial in terms of implementation and computational resources). In particular, you can't really do RLHF, as that requires tons of human labor.

nelson
Apr 12, 2009
College Slice
If the company I work for exclusively used our own codebase to train models they would produce a lot of crappy code.

duck monster
Dec 15, 2004

Mata posted:

Bumping this thread with more of a practical question about LLMs rather than the scientific aspect of ML...

Has anyone had luck with a self-hosted LLM for specific programming languages?
I'm a little new to the scene, but I want to see if I can use (and gradually improve) a tool to make my day-to-day work easier; specifically, if I never had to write another line of typescript by hand again I would be pretty satisfied. GPT-4 works great, but it's not suitable for commercial work: even if OpenAI claims not to train on your input anymore, it's still a sensitive topic for most of my clients.

For this reason, I'm looking into whether I can make my working day easier using self-hosted open-source models. Tabby looks like a fun place to start, to see if I can get some sensible results.

As for the model, I quickly glanced at Hugging Face and saw that CodeLlama looks popular, but I wonder if a more specialized model might be more effective; typescript-specific LLMs aren't getting nearly as much attention, understandably.

You can, but mostly just the low-end ones. You really want the meatiest GPU you can get, with the real sticking point being how much VRAM it has (there's no point having a fire-breathing GPU if it can't fit the model in its brain). 3090s with their 24GB seem to be the sweet spot.

You *can* run them on a CPU too, which can solve the memory issue, but they're slow as balls, though there's a lot of ongoing research into reducing both the RAM and compute requirements.

But for reference, I can happily run GPT-2 on my MacBook on CPU. It's a bit slow, and kinda dumb, but it's neat to gently caress about with. I can also run Llama 7B quantized to 4-bit weights, but it behaves a little drunk with that much quantization.
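For a feel of what 4-bit does to the weights, here's a rough numpy sketch of symmetric round-to-nearest quantization with one scale per row (llama.cpp's actual formats group weights into blocks with per-block scales, so this is a simplification): with only 16 levels, every weight can be off by up to half a quantization step.

```python
import numpy as np

def quantize_4bit(w):
    """Symmetric round-to-nearest 4-bit quantization, one scale per row."""
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0  # int range [-7, 7]
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 256)).astype(np.float32)

q, scale = quantize_4bit(w)
err = np.abs(dequantize(q, scale) - w).max()  # bounded by scale/2 per row
```

Accumulate that error across every matrix in a 7B-parameter model and "a little drunk" is about what you'd expect.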

Jo
Jan 24, 2005

:allears:
Soiled Meat
I was able to fine-tune GPT-2 on a 3090 without problems, but I suspect anything of real utility will require much larger hardware. Perhaps instead it's worth taking StarCoder and self-hosting it without any tuning?

Ihmemies
Oct 6, 2012

Is there some way to get to use Copilot X? Mainly I'm interested in the GPT-4 model; GPT-3.5 is not very good and mostly feels like a waste of my time. I got into the Copilot Chat beta back in May, but it's still powered by the old GPT. Using GPT-4 in a browser is not the same as having it bolted straight onto the IDE.

duck monster
Dec 15, 2004

I'm seeing some scuttlebutt around the net about AMD's new Instinct GPUs being actually... good for AI? Apparently the A100s kick them to the curb on a lot of the TensorFlow metrics, but a lot of that uses 32-bit operations, and AMD's offering thrashes the A100 on 64-bit ops, at about $14K. So if the models can be adapted to 64-bit math rather than 32-bit (I'm not sure what they'd gain from that, but hey), the Instincts might be a real contender at a significantly lower price.

Jo
Jan 24, 2005

:allears:
Soiled Meat
I don't know how much drive there is for 64-bit ops. If anything, there's a move in the opposite direction to 16-bit floats. Even setting that aside, I'd be concerned about the market dominance of CUDA and all the special operators that are made for it. nVidia has a frustratingly huge lead on pretty much every front.
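For context on why the push toward 16-bit even works, and where it bites: float16 has only a 10-bit mantissa, so not every integer above 2048 is representable and tiny gradient updates can vanish entirely, which is why mixed-precision training keeps float32 master weights (bfloat16 instead trades mantissa bits for float32's exponent range). A quick numpy illustration:

```python
import numpy as np

# float16 has 10 mantissa bits: integers above 2048 start getting rounded.
assert np.float16(2048) + np.float16(1) == np.float16(2048)

# A tiny gradient update vanishes entirely in float16...
w16 = np.float16(1.0)
assert w16 + np.float16(1e-4) == w16

# ...but survives in a float32 master weight.
w32 = np.float32(1.0)
assert w32 + np.float32(1e-4) > w32
```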

QuarkJets
Sep 8, 2008

It depends on the context. Nvidia's high-end Tesla cards started out being *required* if you wanted performant 64-bit operations on a GPU; that was a niche they filled for many years, and as big as AI is, it's far from the only domain using CUDA. But for AI specifically... yeah, what Jo said.

Haramstufe Rot
Jun 24, 2016

duck monster posted:

I'm seeing some scuttlebutt around the net about AMD's new Instinct GPUs being actually... good for AI? Apparently the A100s kick them to the curb on a lot of the TensorFlow metrics, but a lot of that uses 32-bit operations, and AMD's offering thrashes the A100 on 64-bit ops, at about $14K. So if the models can be adapted to 64-bit math rather than 32-bit (I'm not sure what they'd gain from that, but hey), the Instincts might be a real contender at a significantly lower price.

Why would you need 64-bit for machine learning?

16-bit and 8-bit are the precisions that matter imo.


Edit: Also, AMD's problem isn't hardware. It's that their software stack is absolute garbage and has been for years, and there is no drive at AMD to fix it.

BAD AT STUFF
May 10, 2012

We choose to go to the moon in this decade and do the other things, not because they are easy, but because fuck you.

Mata posted:

Bumping this thread with more of a practical question about LLMs rather than the scientific aspect of ML...

Has anyone had luck with a self-hosted LLM for specific programming languages?
I'm a little new to the scene, but I want to see if I can use (and gradually improve) a tool to make my day-to-day work easier; specifically, if I never had to write another line of typescript by hand again I would be pretty satisfied. GPT-4 works great, but it's not suitable for commercial work: even if OpenAI claims not to train on your input anymore, it's still a sensitive topic for most of my clients.

For this reason, I'm looking into whether I can make my working day easier using self-hosted open-source models. Tabby looks like a fun place to start, to see if I can get some sensible results.

As for the model, I quickly glanced at Hugging Face and saw that CodeLlama looks popular, but I wonder if a more specialized model might be more effective; typescript-specific LLMs aren't getting nearly as much attention, understandably.

Github Copilot for Business is the way you'd want to go for an out-of-the-box coding assistant that doesn't feed prompts or suggestions back into the model (assuming you trust Microsoft): https://docs.github.com/en/enterpri...usiness-collect

It does require an enterprise account, so I'm not sure how that would work if you're contracting. :shrug:

Copilot isn't perfect, but I'm not sure if there's anything better out there right now. I was unimpressed by the demo I got of Databricks' assistant. Apparently StackOverflow has one now, too. Haven't seen it in action.

Waffle House
Oct 27, 2004

You follow the path
fitting into an infinite pattern.

Yours to manipulate, to destroy and rebuild.

Now, in the quantum moment
before the closure
when all become one.

One moment left.
One point of space and time.

I know who you are.

You are Destiny.


Just wondering if anyone itt has used Aquarium:

https://github.com/fafrd/aquarium

Diva Cupcake
Aug 15, 2005

Wow.

https://x.com/tomwarren/status/1725613011157549116?s=46&t=DcBXErlGIUJUj8quAgYfkQ

CarForumPoster
Jun 26, 2013

⚡POWER⚡

They fired the face of the company a few days after a major public update?

BAD AT STUFF
May 10, 2012

We choose to go to the moon in this decade and do the other things, not because they are easy, but because fuck you.
His sister made some allegations that resurfaced recently. If I'm being cynical, though, I don't know that enough people paid attention that the board would care.

Diva Cupcake
Aug 15, 2005

lol. lmao.
https://x.com/satyanadella/status/1726516824597258569?s=20

hyphz
Aug 5, 2003

Number 1 Nerd Tear Farmer 2022.

Keep it up, champ.

Also you're a skeleton warrior now. Kree.
Unlockable Ben
As OpenAI runs on Azure, this is now a massive conflict of interest.

Macichne Leainig
Jul 26, 2012

by VG
Microsoft has had a 49% stake in the company for a while now so it is realistically no different than it was before imho (that is to say, Capitalism Sucks)

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


https://twitter.com/stokel/status/1726502623967392060

Diva Cupcake
Aug 15, 2005

Macichne Leainig posted:

Microsoft has had a 49% stake in the company for a while now so it is realistically no different than it was before imho (that is to say, Capitalism Sucks)
The majority of the OpenAI employee base, leadership included, is likely just walking over to Microsoft, along with the interim CEO they had just named. Satya effectively made this an acquisition.

This whole situation is crazy.

https://x.com/sama/status/1726594398098780570?s=20
https://x.com/karaswisher/status/1726599700961521762?s=20

Macichne Leainig
Jul 26, 2012

by VG
Unions kick rear end.

Lol, and even lmao to the board today. They're gonna need it

CarForumPoster
Jun 26, 2013

⚡POWER⚡

I can't really figure out Twitter so forgive me... is there... a controversial part of this or something? Seems like a dude asking *presumably* genuine questions about sex topics. If the questions are being asked in bad faith or something, it's not shown in this tweet, and I can't find the context easily by clicking, or I hit a login wall.

CarForumPoster fucked around with this message at 05:46 on Nov 21, 2023

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

CarForumPoster posted:

I can't really figure out Twitter so forgive me... is there... a controversial part of this or something? Seems like a dude asking *presumably* genuine questions about sex topics. If the questions are being asked in bad faith or something, it's not shown in this tweet, and I can't find the context easily by clicking, or I hit a login wall.

You can't figure out what's controversial about saying 40-60% of women have rape fantasies? Like this is just a normal discussion you'd have with workmates around the water cooler?

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
He cited a statistic that is very commonly used for rape apologism to explain the existence of TikTok videos by women that were endorsing rape fantasies. (The alternative explanation being psychological damage.)

That's not a good look, but it's not really doing anything wrong in context.

Nobody is going to care though, so his resignation is probably imminent.

OneEightHundred fucked around with this message at 09:23 on Nov 21, 2023

CarForumPoster
Jun 26, 2013

⚡POWER⚡

Jabor posted:

You can't figure out what's controversial about saying 40-60% of women have rape fantasies? Like this is just a normal discussion you'd have with workmates around the water cooler?

I don’t think statements of fact derived from a credible source and presented in good faith on a public discussion on that topic should be particularly controversial. It’d be extremely inappropriate to say this in a work environment, especially when you’re a CEO or manager, but Twitter isn’t the workplace and I don’t have any reason to know they work together.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

CarForumPoster posted:

I don’t think statements of fact derived from a credible source and presented in good faith on a public discussion on that topic should be particularly controversial. It’d be extremely inappropriate to say this in a work environment, especially when you’re a CEO or manager, but Twitter isn’t the workplace and I don’t have any reason to know they work together.

it's a weird thing to talk about and he's being weird about it. especially as a prominent corporate guy who's in the news

it's only half about what he's saying about it, it's at least that much about the fact he's talking about it at all. and that's partly because people infer from the fact he's talking about it, that he might have strongly held and off-putting opinions in that area

like, I don't know how many women have rape fantasies. And I don't particularly care, so I wouldn't get into an animated discussion about it online. The fact that he did, and under his IRL name, makes his behaviour Weird with a capital W.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
Anyway has anything at all come out about why they decided to boot Altman in the first place? They must surely have had some compelling reason to do it.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

CarForumPoster posted:

I don’t think statements of fact derived from a credible source and presented in good faith on a public discussion on that topic should be particularly controversial. It’d be extremely inappropriate to say this in a work environment, especially when you’re a CEO or manager, but Twitter isn’t the workplace and I don’t have any reason to know they work together.

If you had that debate on the Something Awful forums, you'd get a redtext that would follow your account around and colour people's interactions with that alias forever. Or until you decided to make a clean break with a parachute account.

It's the same thing on Twitter, except to a broader audience and it's under your real name. It's a completely inappropriate discussion to have in that context, because it might as well be around the water cooler with all your work colleagues at the same time as it's with all your friends and private acquaintances.

Keisari
May 24, 2011

Hammerite posted:

it's a weird thing to talk about and he's being weird about it. especially as a prominent corporate guy who's in the news

it's only half about what he's saying about it, it's at least that much about the fact he's talking about it at all. and that's partly because people infer from the fact he's talking about it, that he might have strongly held and off-putting opinions in that area

like, I don't know how many women have rape fantasies. And I don't particularly care, so I wouldn't get into an animated discussion about it online. The fact that he did, and under his IRL name, makes his behaviour Weird with a capital W.

Yeah, and what the hell does he even mean by a "genuine" rape fantasy? That one strikes me as "some women really enjoy/want to be raped" type poo poo. No one, no one wants to be raped, as rape by definition is unwanted. If someone "wants" to be raped, what they actually have is a taboo fetish or something.

Moreover, if we presume that the studies cited were done properly and aren't hackjobs, the conclusion to draw isn't ":actually: 40-60% of women WANT to be raped", but something more along the lines of "this study would indicate that about half of the population, men included, might have a particular taboo fetish. How surprising." How he just glossed over the ~55% of men who also have these fantasies and presented it as somehow unique to women is odd.

Where this study would be appropriate to bring up is a good question. Maybe some university psychology class on sexuality, sexual violence or some fetish site. But weird, weird thing to bring up on Twitter, triply so as an executive of a big corporation.

EDIT:

Yeah managed to open the twitter thread and yup, it's weird. Getting some Elon Musk level weird vibes.

Keisari fucked around with this message at 11:57 on Nov 21, 2023

Diva Cupcake
Aug 15, 2005

Hammerite posted:

Anyway has anything at all come out about why they decided to boot Altman in the first place? They must surely have had some compelling reason to do it.

There’s been speculation that Adam D’Angelo was a driving force. For some reason they put the CEO of Quora on their board, and that cursed website’s only glimmer of hope was Poe, an AI chatbot interface where users can create their own bots using external APIs: the same thing OpenAI announced at their DevDay with GPTs.

Macichne Leainig
Jul 26, 2012

by VG
The CEO can have those kinds of thoughts, I can’t control his brain. But for the love of god you do not need to share every opinion you have online

QuarkJets
Sep 8, 2008

CarForumPoster posted:

I don’t think statements of fact derived from a credible source and presented in good faith on a public discussion on that topic should be particularly controversial. It’d be extremely inappropriate to say this in a work environment, especially when you’re a CEO or manager, but Twitter isn’t the workplace and I don’t have any reason to know they work together.

The CEO is a public figure, Twitter is part of the workplace when he's using the account that is publicly associated with him

QuarkJets fucked around with this message at 17:42 on Nov 21, 2023

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
e: Nevermind, this is a bad derail.

OneEightHundred fucked around with this message at 19:05 on Nov 21, 2023

CarForumPoster
Jun 26, 2013

⚡POWER⚡
For the record, I agree it's in bad taste to discuss publicly, particularly when you have hundreds of employees. But "a guy said some dumb but not hateful internet stuff" isn't something I'd want to be hung out to dry for.


On to AI:

Last year I made an easy FastAI image recognition API to classify and grade American coins, as well as pull out a few attributes. Lil hobby project, but definitely put a couple weeks into it. Worked well enough to be useful.

I tried a customized GPT on 10 or so samples via ChatGPT, and in like 6 or 7 prompts it gave nearly comparable results, including now being an API that returns JSON. It was better at extracting dates but generally worse (though not uselessly so; I'd probably use it in a voting system) at classifying the type of coin. Absolutely incredible results for how little effort there was. It has the big problem, though, of not being able to feed its mislabeled data back in to improve it; you can seemingly only tune the prompts.

Macichne Leainig
Jul 26, 2012

by VG
The custom GPT stuff is pretty impressive given that the requirements for training a decent LLM with any accuracy are pretty god damned ridiculous in my experience. Abstracting all that away to yet another GPT prompt web interface is very useful and saves a ton of time. Naturally, though, we even had a discussion at work today about not relying on OpenAI stuff due to the recent turbulence, so we're seeing what our company will allow us to afford in AWS :(


Xun
Apr 25, 2010

When you look at a GitHub page for a research paper, what do you guys want to see in the readme? I'm updating mine for a publication, and sadly the only advice I'm getting from my labmates is "the code is all there and there's a bibtex citation, why are you worrying :confused:"

Honestly I'm usually pretty happy with just "type this to run the model" but idk if that's the norm lol
