Mata
Dec 23, 2003
Bumping this thread with more of a practical question about LLMs rather than the scientific side of ML...

Has anyone had luck with a self-hosted LLM for specific programming languages?
I'm a little new to the scene, but I want to see if I can use (and gradually improve) a tool to make my day-to-day work easier. Specifically, if I never had to write another line of TypeScript by hand again, I would be pretty satisfied. GPT-4 works great, but it's not suitable for commercial work: even if OpenAI claims not to train on your input anymore, it's still a sensitive topic for most of my clients.

For this reason, I'm looking into whether self-hosted open-source models can make my working day easier. Tabby looks like a fun place to start, to see if I can get some sensible results.

As for the model, I quickly glanced at Hugging Face and saw that Code Llama looks popular. I wonder whether a more specialized model might be more effective, but TypeScript-specific LLMs aren't getting nearly as much attention, understandably.


Mata
Dec 23, 2003
The issue isn't so much where the training data comes from (though I would hope open-source models and datasets respect licenses) but more so the prompts. It's hard to imagine using e.g. GitHub Copilot without your client's codebase leaving their corporate intranet.
If the initial results are promising, I would like to train and fine-tune the models as much as possible with what little data I have, but as you say, it's not enough on its own.
