|
Bumping this thread with more of a practical question about LLMs rather than the scientific side of ML... Has anyone had luck with a self-hosted LLM for specific programming languages? I'm a little new to the scene but want to see if I can use (and gradually improve) a tool to make my day-to-day work easier. Specifically, if I never had to write another line of TypeScript by hand again, I would be pretty satisfied with that. GPT-4 works great but it's not suitable for commercial work. Even if OpenAI claims not to train on your input anymore, it's still a sensitive topic for most of my clients. For this reason, I'm looking into self-hosted open-source models. Tabby looks like a fun place to start, to see if I can get some sensible results. As for the model, I quickly glanced at Hugging Face and saw that Code Llama looks popular, but I wonder if a more specialized model might be more effective. TypeScript-specific LLMs aren't getting nearly as much attention, understandably.
|
# ¿ Sep 9, 2023 21:54 |
|
|
# ¿ May 19, 2024 01:37 |
|
The issue isn't so much where the training data comes from (though I would hope open-source models and datasets respect licenses) but more so the prompts. It's hard to imagine using e.g. GitHub Copilot without your client's codebase leaving their corporate intranet. If the initial results are promising, I would like to fine-tune the models as much as possible with what little data I have, but as you say, it's not enough on its own.
|
# ¿ Sep 9, 2023 22:08 |