Rahu
Feb 14, 2009
I've been trying to learn some ML stuff lately and to that end I've been reading over Andrej Karpathy's nanoGPT.

I think I have a pretty good grasp of how it works, but I'm curious about one specific bit. The training script loads a binary file full of 16-bit ints that represent the tokenized input. It has a block of code that looks like this:

https://github.com/karpathy/nanoGPT/blob/7fe4a099ad2a4654f96a51c0736ecf347149c34c/train.py#L116

code:
# (imports added for context; the repo has these at the top of train.py)
import os

import numpy as np
import torch

data = np.memmap(os.path.join(data_dir, 'train.bin'), dtype=np.uint16, mode='r')
ix = torch.randint(len(data) - block_size, (batch_size,))
x = torch.stack([torch.from_numpy((data[i:i+block_size]).astype(np.int64)) for i in ix])
What I'm curious about is: what is the purpose of doing `astype(np.int64)` here? The data is written out as 16-bit uints, then loaded as 16-bit uints, then converted to 64-bit ints when going from numpy to pytorch, and I just don't see what that achieves.
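For anyone following along, here's a numpy-only sketch of the dtype round trip in the snippet above (the file path and token values are made up for illustration; nanoGPT's real `train.bin` is produced by its `prepare.py` scripts):

```python
import os
import tempfile

import numpy as np

# Stand-in for nanoGPT's train.bin: fake token ids written as 16-bit uints.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "train.bin")
tokens = np.array([3, 50256, 17, 42], dtype=np.uint16)
tokens.tofile(path)

# Load them back read-only via memmap, as train.py does.
data = np.memmap(path, dtype=np.uint16, mode="r")
print(data.dtype)  # uint16

# The conversion in question: the values are unchanged, only widened to 64-bit.
batch = data[0:4].astype(np.int64)
print(batch.dtype)  # int64
assert (batch == tokens).all()
```

So the `astype` isn't changing any values, just the integer width of the array handed to `torch.from_numpy`.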

Rahu
Feb 14, 2009

Stubear St. Pierre posted:

The forward method of their GPT model feeds that input through an nn.Embedding layer which requires torch.long (int64) input, so they're doing the conversion on the batch code because that will generally run on the CPU, or at the very least can be precomputed/queued, whereas a conversion further down the actual network in the Embedding layer will happen on the GPU.

Ah, didn't realize it required a specific input type like that. Thanks :tipshat:
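The constraint Stubear describes is easy to see directly in PyTorch (a small sketch; the embedding sizes and token ids here are arbitrary, not nanoGPT's):

```python
import numpy as np
import torch
import torch.nn as nn

# Arbitrary sizes for illustration; nanoGPT's vocab/embedding dims differ.
emb = nn.Embedding(num_embeddings=100, embedding_dim=8)

ids = np.array([1, 2, 3], dtype=np.uint16)

# PyTorch historically had no uint16 tensor dtype, so the widening to int64
# happens on the numpy side before the tensor is created.
x = torch.from_numpy(ids.astype(np.int64))
print(x.dtype)   # torch.int64, i.e. torch.long

out = emb(x)     # the lookup accepts long index tensors
print(out.shape) # torch.Size([3, 8])
```

Doing the conversion in the batching code also means it happens on CPU before the tensors are moved to the GPU, which is the point Stubear was making.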
