I was able to fine-tune GPT-2 on a 3090 without problems, but I suspect anything of real utility will require much larger hardware. Perhaps it's worth taking StarCoder and self-hosting it without any tuning instead?
# Sep 10, 2023 18:54
# May 21, 2024 01:53
I don't know how much demand there is for 64-bit ops. If anything, the move is in the opposite direction, toward 16-bit floats. Even setting that aside, I'd be concerned about the market dominance of CUDA and all the specialized operators written for it. NVIDIA has a frustratingly large lead on pretty much every front.
# Sep 27, 2023 19:22