duck monster
Dec 15, 2004

Hammerite posted:

This is a very pedestrian question, but I wanted to try out the ChatGPT demo and when I try to log in it just says "The email you provided is not supported". This occurs consistently when trying to log in in several ways:

- with my personal microsoft account
- with my work microsoft account
- trying to create an account with my work email address

For the microsoft account ones, I had to click through a page granting access to my profile to OpenAI, and I got notifications that the app was associated with my account. But even after granting that access, I still got the "email not supported" message when trying to log in.

Has anyone encountered this problem and got past it?

I googled 'chatGPT "The email you provided is not supported"' and got a load of SEO spam. I also googled 'chatGPT "The email you provided is not supported" reddit' and found a couple of Reddit threads where people were complaining about having the same problem, but none of them could provide consensus on what a solution might be. There was one guy who described a crazy procedure involving connecting to VPNs and using Tor which, if that's what I need to do to get to this thing, then gently caress that.

ChatGPT can be weird to sign up for. As an alternative, perhaps sign up for poe.com. It's got an interface to a bunch of these models, in limited capacities. I'm rather fond of Claude from Anthropic; it's about on a level with GPT-4/ChatGPT+, perhaps a little behind, but it's ever so well behaved and seems pretty good at explaining itself. The only problem is one it shares with GPT, which is hallucinating its brain out if you ask for citations. A lot of folk in the AI research community are fond of Claude, as its "Constitutional AI" training (which is apparently slightly different from the usual RLHF method of politeness training) seems to work really well, so it's kind of a helpful, friendly little dude without Bing's freakouts.

duck monster fucked around with this message at 06:12 on Apr 26, 2023


duck monster
Dec 15, 2004

Rahu posted:

I've been trying to learn some ML stuff lately and to that end I've been reading over Andrej Karpathy's nanoGPT.

I think I have a pretty good grasp on how it works but I'm curious about one specific bit. The training script loads a binary file full of 16-bit ints that represent the tokenized input. It has a block of code that looks like this

https://github.com/karpathy/nanoGPT/blob/7fe4a099ad2a4654f96a51c0736ecf347149c34c/train.py#L116

code:
data = np.memmap(os.path.join(data_dir, 'train.bin'), dtype=np.uint16, mode='r')
ix = torch.randint(len(data) - block_size, (batch_size,))
x = torch.stack([torch.from_numpy((data[i:i+block_size]).astype(np.int64)) for i in ix])

What I'm curious about is: what is the purpose of doing `astype(np.int64)` here? The data is written out as 16-bit uints, then loaded as 16-bit uints, then reinterpreted as 64-bit ints when converting from numpy to pytorch, and I just don't see what that achieves.
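For what it's worth, the likely reason for the cast: PyTorch (at least in older versions) has no uint16 tensor dtype, so `torch.from_numpy` on the raw uint16 slice wouldn't work, and token ids used as embedding indices or cross-entropy targets are expected as int64 anyway. A minimal numpy-only sketch of the same load-and-cast step, using a made-up temp file standing in for nanoGPT's train.bin:

```python
import os
import tempfile
import numpy as np

# Write a small stand-in for nanoGPT's train.bin: token ids stored as uint16
# (half the disk/RAM of int32, and vocab_size fits comfortably in 16 bits).
tmp = os.path.join(tempfile.mkdtemp(), "train.bin")
np.arange(1000, dtype=np.uint16).tofile(tmp)

# Load it the same way train.py does: a read-only memory map of uint16s.
data = np.memmap(tmp, dtype=np.uint16, mode="r")

block_size = 8
chunk = data[0:block_size]        # still uint16 here; cheap, just a view of the mmap
widened = chunk.astype(np.int64)  # the cast in question: copies + widens to int64

print(chunk.dtype, widened.dtype)  # uint16 int64
```

So the uint16 is purely a compact on-disk/in-mmap representation, and the `astype(np.int64)` is the bridge to what the PyTorch side of the pipeline can actually consume.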

Transformers are probably *not* the best place to start with learning how ML actually works. They may well be the god algorithm for the modern robobrain, but they're built on a bunch of simpler, but still quite useful, ideas that are worth learning first.

duck monster
Dec 15, 2004

Mata posted:

Bumping this thread with more of a practical question about LLM rather than the scientific aspect of ML...

Has anyone had luck with a self-hosted LLM for specific programming languages?
I'm a little new to the scene but want to see if I can use (and gradually improve) a tool to make my day-to-day work easier, specifically if I never had to write another line of typescript by hand again I would be pretty satisfied with that. GPT4 works great but it's not suitable for commercial work, even if OpenAI claims not to train on your input anymore, it's still a sensitive topic for most of my clients.

For this reason, I'm looking into seeing if I can make my working day easier using self-hosted open source models. I found Tabby looks like a fun place to start out with, to see if I can get some sensible results.

As for the model, I quickly glanced at huggingface and saw that codellama looks popular, but I wonder if a more specific model might not be more effective at achieving results. Typescript-specific LLMs aren't getting nearly as much attention, understandably.

You can, but mostly just the low-end ones. You really want to get the meatiest GPU you can, with the real sticking point being how much RAM it has (there's no point having a fire-breathing GPU if it can't fit the model in its brain). 3090s with 24GB seem to be the sweet spot.

You *can* run them on a CPU too, and that can solve the memory issue, but they are slow as balls, though there's a lot of ongoing research into reducing both the RAM and the compute requirements.

But for reference, I can happily run GPT-2 on my MacBook on CPU. It's a bit slow, and kinda dumb, but it's neat to gently caress about with. I can also run Llama 7B quantized to 4-bit weights, but it behaves a little drunk with that much quantization.
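The "does it fit" question above comes down to simple arithmetic: weight memory is roughly parameter count times bytes per weight (ignoring KV cache and activations, which add more on top). A back-of-the-envelope sketch, with my own illustrative numbers rather than anything from a particular library:

```python
def model_memory_gb(n_params_billion: float, bits_per_weight: int) -> float:
    """Rough decimal GB needed just to hold the weights.

    Ignores activation memory and the KV cache, which grow with
    context length, so treat this as a lower bound.
    """
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# Llama 7B: full fp16 weights vs 4-bit quantized weights.
print(round(model_memory_gb(7, 16), 1))  # 14.0 -- tight or impossible on most consumer cards
print(round(model_memory_gb(7, 4), 1))   # 3.5  -- fits easily in a 24GB card
```

That factor-of-four shrink is exactly why 4-bit quantization is what makes these models runnable on consumer hardware at all, drunkenness and all.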

duck monster
Dec 15, 2004

I'm seeing some scuttlebutt around the net about AMD's new Instinct GPUs being actually...... good for AI? Apparently the A100s kick them to the curb on a lot of the TensorFlow metrics, but a lot of that uses 32-bit operations, and AMD's offering thrashes the A100 on 64-bit ops, at about $14K. Thus if the models can be adapted to be optimized for 64-bit math rather than 32-bit math (I'm not sure what they'd gain out of that, but hey) the Instincts might be a real contender at a significantly lower price.
