 
BAD AT STUFF
May 10, 2012

We choose to go to the moon in this decade and do the other things, not because they are easy, but because fuck you.

Mata posted:

Bumping this thread with more of a practical question about LLMs rather than the scientific side of ML...

Has anyone had luck with a self-hosted LLM for specific programming languages?
I'm a little new to the scene but want to see if I can use (and gradually improve) a tool to make my day-to-day work easier. Specifically, if I never had to write another line of TypeScript by hand again, I'd be pretty satisfied. GPT-4 works great, but it's not suitable for commercial work; even if OpenAI claims not to train on your input anymore, it's still a sensitive topic for most of my clients.

For this reason, I'm looking into whether I can make my working day easier using self-hosted open-source models. Tabby looks like a fun place to start, to see if I can get some sensible results.

As for the model, I quickly glanced at Hugging Face and saw that CodeLlama looks popular, but I wonder if a more specialized model would be more effective. TypeScript-specific LLMs aren't getting nearly as much attention, though, understandably.

Github Copilot for Business is the way you'd want to go for an out-of-the-box coding assistant that doesn't feed prompts or suggestions back into the model (assuming you trust Microsoft): https://docs.github.com/en/enterpri...usiness-collect

It does require an enterprise account, so I'm not sure how that would work if you're contracting. :shrug:

Copilot isn't perfect, but I'm not sure if there's anything better out there right now. I was unimpressed by the demo I got of Databricks' assistant. Apparently StackOverflow has one now, too. Haven't seen it in action.


BAD AT STUFF
May 10, 2012

His sister made some allegations that resurfaced recently. If I'm being cynical, though, I don't know that enough people paid attention for the board to care.

BAD AT STUFF
May 10, 2012


Keisari posted:

Has anyone had luck fiddling with the custom GPTs, and more specifically the "knowledge" part of them? You can upload all kinds of poo poo to be their knowledge base. I tried to make one to help me build a program against a certain API, and uploaded a bunch of JSON files that describe the API. Another was inspired by the built-in board game explainer: I made one focused on explaining and clarifying the rules, and uploaded some game manuals.

The API GPT basically errored out completely and either ignored all the knowledge or just crashed.

Did you give it an OpenAPI spec? And were you asking it to write code that queries an API or setting that up as an action for a custom GPT?

I got good results with regular ChatGPT when uploading a basic JSON file describing a service I wanted it to create using FastAPI. I had much less success trying to set up an action for my own GPT. It gave an unhelpful error message and crashed each time. I think the root of the issue was that I gave it the full JSON file that I pulled from Swagger. I want to go back with a cut down file to try adding endpoints one at a time. That might help with finding the source of the error, but also I think it should perform better if I can limit extraneous context.

If anyone has found things that work well for cleaning up knowledge inputs to GPTs, I'd love to hear about that.
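To show what I mean by cutting the file down: something like this little script is the plan. The helper and the toy spec here are made up for illustration, not the actual Swagger file I pulled, but the idea is just to keep a handful of paths at a time so the GPT isn't drowning in extraneous context.

```python
import json

def slim_openapi(spec, keep_paths):
    """Return a copy of an OpenAPI spec containing only the chosen paths.

    Everything outside "paths" (info, servers, etc.) is kept as-is;
    endpoints not in keep_paths are dropped.
    """
    slim = {k: v for k, v in spec.items() if k != "paths"}
    slim["paths"] = {p: ops for p, ops in spec.get("paths", {}).items()
                     if p in keep_paths}
    return slim

# Toy spec standing in for the real Swagger export
full = {
    "openapi": "3.0.0",
    "info": {"title": "demo", "version": "1.0"},
    "paths": {
        "/users": {"get": {"summary": "List users"}},
        "/users/{id}": {"get": {"summary": "Get one user"}},
        "/admin/reset": {"post": {"summary": "Dangerous"}},
    },
}

# Start with a single endpoint, then add more one at a time
slim = slim_openapi(full, {"/users"})
print(json.dumps(slim, indent=2))
```

Then it's just re-uploading the slimmed file after each addition until something breaks, which should at least point at the offending endpoint.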

...also this OpenAI drama makes me feel better about the dumb politics poo poo that happens at my company. :munch:

BAD AT STUFF fucked around with this message at 20:48 on Nov 22, 2023

BAD AT STUFF
May 10, 2012


Insurrectionist posted:

- What kinds of network designs are favored for generative NNs? Is there a focus on width (nodes per layer) over depth (# of layers) or vice versa? Do they utilise input-reducing layers like CNNs do (pooling etc.)?

GPT stands for Generative Pre-trained Transformer, and the "transformer" bit is what to look at if you want to understand how these new generative models are different. The paper that really kicked things off is called "Attention Is All You Need". I don't think I understand attention and transformers enough to do it justice, but that's a good starting point.
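The core trick in that paper is scaled dot-product attention: softmax(QKᵀ/√d_k)·V, where each output row is a weighted mix of the value rows. Here's a toy pure-Python version just to show the mechanics; the matrices are tiny made-up examples with no learned weights, so don't read anything into the numbers:

```python
import math

def softmax(xs):
    # subtract the max for numerical stability
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def matmul(a, b):
    # a is n x k, b is k x m; both are lists of lists
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def attention(Q, K, V):
    # scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    d_k = len(K[0])
    scores = matmul(Q, transpose(K))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return weights, matmul(weights, V)

# 3 "tokens" with d_k = 2; every output row is a convex
# combination of the rows of V, weighted by query-key similarity
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights, out = attention(Q, K, V)
```

Real transformers stack many of these (multi-head, with learned projection matrices producing Q, K, and V from the input embeddings), but the weighted-mixing step is the part that replaced recurrence.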
