Kazinsal
Dec 13, 2011


it's DEC's least interesting ISA but you could run Windows NT on it which was neat. it was a 64-bit RISC that they were hoping to make competitive for 20+ years but then DEC ran out of money and got bought by Compaq who promptly went all-in on Itanium (lmao) and killed Alpha off. this kind of pissed microsoft off because unlike the Itanium it was actually possible to write software for Alpha and run it on real hardware that you could just buy and plug in.

in a time when ARM was all 32-bit low power designs and 64-bit POWER didn't exist yet it was a 64-bit machine that checked the RISC boxes on the wave-of-the-future checklist, was from a big name in the minicomputer world, and had a VMS port as well as DEC's own Mach and BSD derived Unix and the aforementioned Windows NT port

Kazinsal
Dec 13, 2011


Captain Foo posted:

go on about VAX architecture plz

the VAX ISA is fuckin wild. it takes the idea of a complex instruction set computer and adds a dash of purestrain late 70s "dude this blow is amazing we need to fly to colombia for the weekend more often". it's a 16-register machine with a similar register layout to ARM, a base ISA that's more or less the PDP-11 ISA extended to 32 bits, a four-ring protection level system, four one-gigabyte segments, and a whole bunch of insane extra instructions for making assembly programming "easier". want instructions to implement doubly-linked queue (ring buffer) operations with optional multiprocessor-safe interlocking in a single instruction? VAX has those. want two- and extended three-operand versions of the C standard library string functions as microcoded instructions? VAX has that. need an instruction to do CRC16 and/or CRC32 on an arbitrary length string of bytes? VAX has one. have you ever wanted an instruction that's a whole implementation of a stream editor like sed? look no further than VAX! and if that's not enough and you want to emulate your own instruction set extensions in software, VAX has a mechanism for that.

later versions of the VAX ISA describe an alternate larger-exponent-smaller-mantissa 64-bit floating point format and a super large 128-bit floating point format, and extend all the instructions to allow use of those anywhere the normal 32-bit and 64-bit floats are usable. there's also a full set of instructions to convert every type of float to every other type, and to convert different sized integers to different sized floats and back, with different rounding/truncation options, and each option is its own opcode.

oh and of course since it's a full 32-bit minicomputer it's got all the nice fancy memory management features you'd want out of something that will have dozens of simultaneous users and hundreds of processes running like paging and per-page privilege level checking and segment limits so the OS can do things like auto-grow process stacks. and separate machine-level registers for setting the base and limit of the system segment on a context switch so you could implement syscall trampolines if you wanted to

Kazinsal fucked around with this message at 09:31 on Jan 13, 2022

Kazinsal
Dec 13, 2011


other awesome wild VAX stuff: since DEC was the only company making VAXen the spec has a whole bunch of standard requirements for VAX machines built in that the operating system can expect. things like clocks, built-in timers, PDP-11 emulation, and a proto-"lights out console" attached to the system firmware that can be used to debug the machine and single-step through code in the event of a crash. the first multiprocessor VAX came out in 1982 and the first line of SMP VAXes was released in 1986. by the early 90s you could get VAXes with up to 8 processors and support for a gig of RAM if you wanted to drop some serious cash. and all your software from 1978 would conveniently still work!

Kazinsal
Dec 13, 2011


Good Sphere posted:

when you say segments, what are you referring to exactly? i'm not up to par on this processor lingo

basically separate sections of the virtual address space that are mapped to different regions for different purposes. in the VAX world the 4 GB virtual address space is split up into four segments of a max size of 1 GB that each have their own permissions and addressing limit. if code reaches the segment limit a fault fires and the kernel can deal with it in an appropriate manner (eg. map more stack if it's the stack segment or map more heap if it's the heap region, or terminate the process if it's some kind of fatal segmentation fault).

on VAX the implicit segments are P0 (starts at 0x00000000 and goes to 0x3FFFFFFF, intended for user-space code and data, grows upwards), P1 (0x40000000 to 0x7FFFFFFF, intended for user-space stack and data, grows downwards because that's how the stack works), SYSTEM (0x80000000 to 0xBFFFFFFF, intended for system code and data, grows upwards), and the System Reserved area (0xC0000000 to 0xFFFFFFFF). each segment has its own page table. paging is a virtual address mapping method where every access to any given virtual address is checked against the page table entry for the page (VAX pages are 512 bytes; x86 pages are 4096 bytes for standard sized pages -- large and huge pages don't exist in VAX) to determine if there's memory there, if the code is allowed to access that memory, and what physical address to translate the virtual address to. there's also a few extra bits in page table entries that are used by the OS to implement things like copy on write etc.
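to make that split concrete, here's a rough C sketch of how a 32-bit VAX virtual address carves up, going by the 2-bit region select and 512-byte pages described above -- the function and field names are just made up for illustration:

code:
#include <stdint.h>
#include <stdio.h>

/* rough sketch: carving up a 32-bit VAX virtual address.
 * bits 31:30 select the region (P0/P1/system/reserved), bits 29:9 are the
 * virtual page number within that region, bits 8:0 are the byte offset
 * into a 512-byte page. names are made up for illustration. */
#define VAX_PAGE_SIZE 512u

static void vax_decode(uint32_t va)
{
    uint32_t region = va >> 30;              /* 0=P0, 1=P1, 2=system, 3=reserved */
    uint32_t vpn    = (va >> 9) & 0x1FFFFF;  /* 21-bit virtual page number */
    uint32_t offset = va & (VAX_PAGE_SIZE - 1);

    printf("va=%08X region=%u vpn=%06X offset=%03X\n", va, region, vpn, offset);
}

int main(void)
{
    vax_decode(0x00001200);  /* P0 space (code/data) */
    vax_decode(0x7FFFFDF0);  /* near the top of P1 (stack) */
    vax_decode(0x80004000);  /* system space */
    return 0;
}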

in the x86 world there's also segmentation but it's a lot more freeform and was originally designed to deal with the fact that the 8086 was a 16-bit machine with a 20-bit physical address space. 32-bit x86 still has segmentation but it was more intended for separating out address spaces by privilege level. 64-bit x86 doesn't allow you to have any segments with bases other than 0 and limits other than 0xFFFF[...]FFFF, but there are special model-specific registers (internal processor registers only accessible via kernel-exclusive read/write instructions) for setting the FS and GS segment base addresses on the fly to make thread-local storage really easy. page tables on x86 are also more or less global; you slap the physical address of your highest level page table pointer (you need a lot of tables for a modern virtual address space so it's practically a page table pointer pointer pointer pointer these days) into a control register and the processor invalidates the translation lookaside buffers and boom, your virtual address space mappings are refreshed.

paging on VAX gets a bit more complex sometimes because some I/O devices on VAX use the VAX MMU which means they're subject to paging, whereas on x86 the device bus traditionally has direct access to the physical address space (at least until IOMMUs showed up)

Kazinsal
Dec 13, 2011


Neslepaks posted:

incredible effortposting itt. 5.

:tipshat:

weird low level computer poo poo is one of the few things I can just blather on about for hours. I’ll probably get baked and write something about the insanity of x86 memory management later tonight

Kazinsal
Dec 13, 2011


okay, time to talk about x86 memory management. this post is going to get a bit out of hand so I'm breaking it into sections.


way back before the 8086 the 8080 family had a 16-bit address space but no internal mechanism to manage it, so if you wanted more than 64KiB of RAM/ROM/MMIO you needed an external chip that you could frob to switch what a certain range of the address bus actually pointed at. this sucked but a lot of 8-bit CPU families did this and the chips are fairly simple. you hook them up to say a 16 KiB window and expose a couple I/O ports to the CPU so software can switch that 16 KiB bank to a different slice of whatever RAM chip is hooked up to it through the bank selector chip.

so when Intel designed the 8086 they realized how badly this sucked and even though they were still making a 16-bit CPU they stuck 4 more address lines onto it so they could basically do a sort of pseudo bank switching system in the standard address decoding logic. the 8086 has four 16-bit segment registers that are used in the address decoding logic to create a 20-bit address from a 16-bit segment and a 16-bit offset (usually formatted in documentation as eg. 1234:CDEF, where the word before the colon is the segment and the word after it is the offset). the logic is pretty simple: the segment word is shifted four bits to the left and the offset is added to it. in the above example we'd get this:

code:
segment << 4 | 12340
    + offset |  CDEF
------------ | -----
   = address | 1F12F
this is super advantageous over the bank switching chip thing for two reasons: one, you get full access to the whole 20-bit address space of the system at all times, and two, your code can always assume it starts at 0000 and goes to however much memory it needs in a 16-bit space since the OS can just give you a block of memory and say "your segment is 0x1234". all your indexes and offsets are from 0, no matter where in the 20-bit address space you actually are! and you get separate segment registers for code (CS), data (DS and ES), and stack (SS), and segment override prefixes so you can access up to four 64K segments at any time without reloading the segment registers
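if you want to poke at the math yourself, here's a tiny C sketch of that address decoding (the helper name is mine, not from any Intel manual) -- note the 20-bit wraparound mask, which becomes relevant later when we get to the A20 gate:

code:
#include <stdint.h>
#include <stdio.h>

/* 8086 real mode address decoding: 20-bit linear address from a 16-bit
 * segment and a 16-bit offset. the mask models the fact that a real 8086
 * only has 20 address lines, so FFFF:0010 wraps back around to 00000. */
static uint32_t linear_8086(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFF;
}

int main(void)
{
    printf("%05X\n", linear_8086(0x1234, 0xCDEF)); /* 1F12F, as in the example above */
    printf("%05X\n", linear_8086(0xFFFF, 0x0010)); /* 00000 -- the wraparound the A20 gate is about */
    return 0;
}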

now, there's no memory protection on the 8086 so any code can run over any memory at any time, and there's no concept of privilege levels so any code can use any instructions including mov Sreg, r/m16 to change what's in your segment registers, which is one of the main reasons why no one really ever cared too much about multi-user multitasking OSes for the 8086. the other main issue is that the most common 8086 out there was actually the 8088, which only had an 8-bit data bus instead of a 16-bit one so full word memory and I/O accesses needed two cycles instead of one, but that's something for a different post or a sidebar

sidebar: the 8086 was also interesting compared to a lot of other general-purpose 16-bit CPUs in that it carried over the separate I/O port space from the 8080/Z80. it uses its own in and out instructions and has a 16-bit address space of its own. it was relatively fast on real 8086es because it only took a couple clock cycles but it's still about the same speed on modern machines because the I/O port space is emulated in System Management Mode these days and anything it's accessing is an ISA device which is also either emulated by SMM, by a SuperIO chip, or by an actual 4.77 MHz ISA device going through half a dozen different bridge chips


so now we come to the 80286. it was still a 16-bit CPU but it had a 24-bit address bus and added a new CPU mode called protected mode that changed how memory management worked significantly. when you're in protected mode, instead of pointing directly to a segment base address, you point at an index into the Global Descriptor Table, which is an array of 48-bit fields aligned to 64 bits that describe the 24-bit base address for the selector (the name for the index value you load into a segment register), the segment limit (up to 0xFFFF; it's still a 16-bit machine at this point), and a flags field for things like minimum privilege level and some things mostly related to hardware task switching. the only problem with protected mode was that on the 286, you couldn't *leave* protected mode without resetting the CPU, and the BIOS's built-in driver routines were all only available in real address mode (the retronym for the 8086's operating mode). in order to go back you'd have to set the reset vector to your 16-bit real mode trampoline code and then intentionally catastrophically fault the CPU (by, say, setting the GDT or the IDT -- Interrupt Descriptor Table; same idea as the GDT but for interrupt vectors -- to a null pointer and causing an interrupt/exception, which would cause a recursive exception and "triple fault" the CPU, resetting it).

needless to say the 286 protected mode didn't see a lot of use on common systems like DOS and early Windows. Windows/286 2.x existed but no one really wrote programs for it because RAM was still pretty expensive and by that point there were bank switching systems for using more than 640KiB in real mode so you could sorta use big chunks of RAM 64K at a time in DOS. when the PC/AT debuted with an 80286, most of the computing world treated it as just a faster 8086 (as it ran at either 6 or 8 MHz depending on what options you bought from IBM instead of 4.77 MHz).

however this did technically give the 286 the capability for multitasking with process isolation! the GDT could technically have up to 8192 entries if you really wanted to, and because the 8-byte-aligned nature of GDT entries gave Intel 3 bits at the bottom of the selector index to work with, they assigned two of those bits to the "requested privilege level" -- if the privilege you're asking with is weaker (numerically higher, further from kernel mode) than the minimum privilege level the GDT entry allows, the CPU would generate a protection fault -- and one to "use Local Descriptor Table". you could then give each process its own LDT, load it into the LDT pointer register before switching to that process, and then the process could use that LDT for all of its selectors if it wanted to instead of using global selectors. this meant you could assign multiple chunks of memory to a process without needing to perform a costly system call back to kernel mode and rewrite the contents of a GDT entry every time a process wanted to access a different 64K chunk.


a couple years later the 80386 came out. this was a big fuckin deal as it was a full fat 32-bit machine with a 32-bit physical address bus, a 32-bit virtual address size, and a 32-bit data bus (16-bit on the 80386SX). to fit in this 32-bit setup, Intel used the 16 bits of padding in GDT entries for flags like the 16/32-bit default size bit, shoved in another 8 bits to make the base address 32-bit instead of 24-bit, and another 4 bits for the segment limit. why only 4 bits? well, when you set the new granularity flag, the limit gets shifted 12 bits to the left and the bottom 12 bits are filled with 1s, so the limit counts in 4 KiB units instead of bytes. this seems like a bit of a granularity issue but it's because the 386 was really designed to be used less as a machine with segmentation-based memory management and more like a real machine with full paging capabilities. the 386 did add two more segment registers (FS and GS) however, which ended up being used to easily implement thread-local storage when multi-threaded application programming became common.
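for the curious, here's a hedged C sketch of how a 386 descriptor packs the 32-bit base and 20-bit limit and how the granularity bit stretches the limit; the struct and field names are mine, but the layout follows the base/limit split described above:

code:
#include <stdint.h>

/* 8-byte 386 segment descriptor as it sits in the GDT. the 32-bit base and
 * 20-bit limit are scattered across the entry because the low 48 bits had
 * to stay compatible with the 286 layout. field names are illustrative. */
struct gdt_descriptor {
    uint16_t limit_low;        /* limit bits 15:0 */
    uint16_t base_low;         /* base bits 15:0 */
    uint8_t  base_mid;         /* base bits 23:16 */
    uint8_t  access;           /* present bit, DPL, segment type, etc. */
    uint8_t  limit_high_flags; /* limit bits 19:16, then G/D/L/AVL flags */
    uint8_t  base_high;        /* base bits 31:24 */
} __attribute__((packed));

/* effective limit: with the granularity (G) flag set, the 20-bit limit is
 * in 4 KiB units, i.e. shifted left 12 with the low 12 bits filled with 1s */
static uint32_t effective_limit(uint32_t raw_limit_20bit, int g_flag)
{
    return g_flag ? (raw_limit_20bit << 12) | 0xFFF : raw_limit_20bit;
}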

x86 has a base page size of 4096 bytes, and the 80386 had a two-level page table system wherein your top-level page table pointer (held in CR3 -- Control Register 3) would point at a "page directory" composed of pointers to first-level page tables plus flags for how they're allowed to be accessed (minimum privilege level, read/write or read only, cache enable/disable, and a few other flags that were later added), and then each page table was composed of entries holding the upper 20 bits of the physical address to translate the matching virtual address to, combined with another set of the same general flags, just this time for that specific page. the granularity on this is awesome and it's part of how x86 became a real 32-bit workhorse in the late 80s. the formula for determining how to translate a virtual address to a physical address is fairly simple: take the top 10 bits of the virtual address; this is your index into the page directory. dereference that to get your page table (or throw a page fault if the access flags don't match your current system state). take the next 10 bits of the virtual address; this is your index into the page table. dereference that to get the physical address to replace the top 20 bits with (or throw a page fault if the access flags don't match your current system state). the remaining 12 bits are just the byte offset into the 4 KiB page.
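here's roughly what that two-level walk looks like in C if you pretend you can read physical memory directly -- a sketch only, with a made-up phys_read32 helper and most of the permission checking waved away:

code:
#include <stdint.h>
#include <stdbool.h>

#define PTE_PRESENT 0x1u

/* pretend helper: read a 32-bit word from a physical address.
 * a real kernel would go through whatever physical-memory mapping it has. */
extern uint32_t phys_read32(uint32_t paddr);

/* 386-style two-level translation: 10 bits of page directory index,
 * 10 bits of page table index, 12 bits of offset into the 4 KiB page.
 * returns false where real hardware would raise a page fault. */
static bool translate(uint32_t cr3, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t pde = phys_read32((cr3 & 0xFFFFF000) + ((vaddr >> 22) * 4));
    if (!(pde & PTE_PRESENT))
        return false;                       /* #PF: no page table mapped here */

    uint32_t pte = phys_read32((pde & 0xFFFFF000) + (((vaddr >> 12) & 0x3FF) * 4));
    if (!(pte & PTE_PRESENT))
        return false;                       /* #PF: page not present */

    *paddr = (pte & 0xFFFFF000) | (vaddr & 0xFFF);
    return true;
}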

paging was great for simplifying memory protection and mapping because instead of each process needing a bunch of selectors, as far as they were concerned they owned the whole 32-bit address space except where the kernel was (and they couldn't read/write kernel space). every process could use the same ring 3 (privilege level) selectors and the OS would just slap the process's page directory address into CR3 before a context switch. and of course, you can just grab whatever free slab of memory you find first when you need to allocate memory to a process because with virtual address translation the actual physical location of a page doesn't matter to the process at all. Intel finally had a real multitasking x86 processor and it was still backwards compatible with original 8086 real mode code as well as 286 16-bit protected mode if you really wanted it (with the added bonus of allowing you to switch between all these modes at will instead of needing to reset the CPU).

sidebar: yes, you could mix paging with arbitrary segment bases and limits. pretty much no one ever did this, and while I don't have the 80386 programmer's manual handy I'm pretty sure it says that it's a bad idea and the 386 can't efficiently cache translations if you gently caress around with weird segment bases and paging at the same time. the only user of it I can think of off the top of my head is OpenBSD/i386, which uses the segment limit on code selectors to implement W^X on pre-64-bit machines.

along with paging came a couple of extra instructions and control bits for dealing with the caches, because the 386 cached a bunch more stuff than the 286 did. if you modified a page table entry you'd need to reload the CR3 register with the same pointer (eg. mov eax, cr3; mov cr3, eax) and the 386 would invalidate the whole translation lookaside buffer. later, the 486 added an instruction to invalidate a single page; executing an invlpg address instruction would remove the cached translation for the page that address is in.
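both flavors of TLB invalidation are a couple of lines of inline assembly each; a minimal sketch assuming GCC/Clang-style inline asm in kernel context:

code:
#include <stdint.h>

/* 386-era full TLB flush: write CR3 back to itself */
static inline void tlb_flush_all(void)
{
    unsigned long cr3;
    __asm__ volatile("mov %%cr3, %0" : "=r"(cr3));
    __asm__ volatile("mov %0, %%cr3" :: "r"(cr3) : "memory");
}

/* 486-and-later single-page invalidation */
static inline void tlb_flush_page(void *vaddr)
{
    __asm__ volatile("invlpg (%0)" :: "r"(vaddr) : "memory");
}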


pushing onwards a bit past the 486 and Pentium we reach the Pentium Pro, which is our next stop here because it added a new awesome thing: Physical Address Extension (PAE). Intel looked at the 286's address bus compared to the 8086's, said "let's do that again", and slapped on another four bits. now, the problem here is that the page entries were already chock full of bits so what Intel did was add a flag to a control register to enable PAE mode, in which page table and directory entries became 64 bits wide instead of 32 bits wide to fit the extra address bits and some additional flags for the future, and the CR3 register now pointed at a four-entry Page Directory Pointer Table where each entry would point to a page directory that controlled 1 GiB of the 32-bit virtual address space.

if you're having a "wait, this reminds me of the VAX post" moment, welcome to the magic of x86. it was basically designed stealing all the best parts of other architectures and steadfastly refusing to throw away any of your old legacy cruft.

another cool thing that was added on the original Pentium but wasn't used too often at the time was the page size flag. in a page directory entry, you could flip the page size flag (formerly a must-be-zero reserved bit) from 0 to 1 and instead of pointing at a page table, that page directory entry would directly map a 4 MiB slab of the address space (2 MiB in PAE mode), effectively making the 10 bits (9 in PAE mode) normally used as an index into a page table just part of the virtual address that wasn't translated by the memory management unit. now you could either slam a huge chunk of RAM into a process's address space or map a block as "not present" and use the fault it generated as a way to signal the kernel to do some other operation like disk I/O or whatnot.


shoving on ahead to 64-bit x86 we now have 64 bits of virtual address space as an extension of PAE mode, right?

kinda.

we actually have 48 bits. in 64-bit pointers. when you enter 64-bit long mode, CR3, which was extended to being 64 bits wide, now points at the Page Map Level 4 (PML4), which has 512 (9 bits' worth) entries, each of which points at a page directory pointer table, which has 512 entries, each of which points at a page directory, which has 512 entries, each of which points at a page table, which has 512 entries, each of which points at a page. in order to make an address "canonical" though, the upper 16 bits have to be a sign extension of bit 47. so, your canonical addresses suddenly jump from 0x00007FFFFFFFFFFF straight to 0xFFFF800000000000. this is fine, whatever, you're basically splitting a 256 TiB address space into two 128 TiB address spaces. this is still an enormous amount of address space.
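the canonical rule is literally just "bits 63:48 must match bit 47", which you can check with a shift pair; a quick sketch:

code:
#include <stdint.h>
#include <stdbool.h>

/* 4-level paging: an address is canonical if bits 63:48 are a sign
 * extension of bit 47, so shifting the low 48 bits up and then doing an
 * arithmetic shift back down must reproduce the original address. */
static bool is_canonical_48(uint64_t va)
{
    return (uint64_t)((int64_t)(va << 16) >> 16) == va;
}
/* examples: 0x00007FFFFFFFFFFF and 0xFFFF800000000000 pass;
 * 0x0000800000000000 is non-canonical and dereferencing it raises #GP */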

also, at this point, segmentation is dead. hooray! you are literally not allowed to use segments with bases other than 0 and limits other than "all of it" in 64-bit mode. but to make thread-local storage easier, AMD and Intel agreed that it would be a good idea to have hidden processor registers called FSBASE and GSBASE that... readded a base address to any memory accesses made against the FS and GS selectors.
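setting those bases boils down to a wrmsr (or the newer wrfsbase/wrgsbase instructions); a hedged ring-0 sketch using the standard IA32_FS_BASE/IA32_GS_BASE MSR numbers, with the helper names being mine:

code:
#include <stdint.h>

#define MSR_FS_BASE 0xC0000100u
#define MSR_GS_BASE 0xC0000101u

/* wrmsr takes the MSR number in ECX and the 64-bit value split across
 * EDX:EAX. kernel (ring 0) only. */
static inline void wrmsr(uint32_t msr, uint64_t value)
{
    __asm__ volatile("wrmsr" :: "c"(msr),
                     "a"((uint32_t)value), "d"((uint32_t)(value >> 32)));
}

/* point FS at a per-thread TLS block, e.g. on a context switch */
static void set_tls_base(uint64_t tls_block)
{
    wrmsr(MSR_FS_BASE, tls_block);
}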

a lot of OSes use this for convenient separation of user and kernel space. user space gets the bottom half, and the kernel gets the upper half. this is, even as of 2022, still enough on all but the most ridiculously big hypercomputing systems to map every byte of physical RAM and every byte of memory-mapped I/O space into kernel space if you wanted to (and most do! you've got all that space and it's advantageous to be able to easily access any physical address by just slapping a prefix onto it to make it a virtual address). so, naturally, Intel went a step further a couple years ago with the Ice Lake processor generation and added a sub-mode of long mode called 5-level paging mode wherein CR3 points at the Page Map Level 5 (PML5), which has 512 entries, each of which points at a PML4. the rest of the virtual address translation works the same as in normal 4-level paging long mode, but this does change the way canonical addresses are generated so the kernel needs to be aware of how to work in 5-level paging mode. I think most kernels do support it at this point because it doesn't require too much extra work to implement, but depending on the underlying kernel space memory mapping implementation it may require a kernel linked to a different address -- this is how Windows handles it; instead of ntoskrnl.exe you'd see your kernel image being ntkrla57.exe.

later 64-bit x86 microarchitectures added huge pages (the cowards at intel refuse to call it this, but everyone else including their competitors with similar implementations do), wherein a PDPT entry points at a 1 GiB block of memory like the 2 MiB large page. this simplifies and speeds up address translation when you know you can just pre-allocate an enormous block of RAM to a process. database servers and the like love huge pages and heavily randomly-accessed databases can genuinely sometimes be more performant with huge pages because each gigabyte of RAM only needs one TLB entry, so the TLB isn't constantly being rewritten and flushed with all the accesses. there's also a security/caching feature called process-context identifiers (PCID) that lets the kernel tag a paging structure with a 12-bit value of its choice, which will propagate to the TLB entries the processor creates, allowing it to only do lookups based on the PCID of the current paging structure. this both speeds up lookups since the TLB is built on content-addressable memory and the PCID just becomes part of the TLB entry address, and is good for security because the processor isn't reading through other threads' TLB entries to get to the right one.
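mechanically the PCID just lives in the low 12 bits of the CR3 value you load (once CR4.PCIDE is set), with bit 63 controlling whether the load flushes that PCID's TLB entries; a small sketch, helper name mine:

code:
#include <stdint.h>

/* build a CR3 value for a process when PCIDs are enabled (CR4.PCIDE=1):
 * bits 11:0 carry the PCID, the rest is the physical address of the
 * top-level paging structure. setting bit 63 asks the CPU *not* to flush
 * TLB entries tagged with this PCID when the new CR3 is loaded. */
static inline uint64_t make_cr3(uint64_t pml4_phys, uint16_t pcid, int no_flush)
{
    uint64_t cr3 = (pml4_phys & ~0xFFFULL) | (pcid & 0xFFF);
    if (no_flush)
        cr3 |= 1ULL << 63;
    return cr3;
}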


there's some deeper stuff that's pretty much irrelevant these days that I could talk about in more detail like hardware task switching and call gates but those haven't been used since around the 486 era on account of them becoming gradually too slow to be worth it compared to the kernel saving and loading thread/process contexts in software. there's also the I/O permission bitmap that lives in the hardware task switching system's Task State Segment (each process in a hardware task switching OS had its own TSS that its context was stored in) and lets you grant specific ISA I/O ports to processes in user mode. the TSS does still exist (you need one per CPU so the processor knows which kernel stack to switch to on a user/kernel mode transition, but the kernel just keeps the stuff that actually matters in its own thread structures). I know there's also a few additional things like nested paging for virtualization but I honestly don't know much about the VMX instructions and their associated stuff used for building hypervisors so I won't butcher all that.

anyways there's my giant effortpost on x86 memory management. if I think of anything else re: the x86 that could be fun and/or I could link back to DEC hardware (sorry for fuckin up your thread OP) I'll put together a batch of thoughts and post 'em. but for now I'm going to stop typing and realize that this kind of poo poo is exactly the reason I'm single

Kazinsal
Dec 13, 2011


Tankakern posted:

surely you mean segments

:haw:

gently caress

Kazinsal
Dec 13, 2011


glad to see my awful ramblings are appreciated. I'll do my best to not add anything else onto the thread title (partially because I don't know any other architectures well enough, but mostly because stuff like the history of interrupt routing on x86 isn't nearly as fascinating as the myriad memory management modes, though it is pretty batshit)

epitaph posted:

lost in the shuffle: virtual 8086 mode which was introduced with the 386. because people didn't want to rewrite their real mode apps intel introduced a mode with similar addressing only translated by the mmu. this enabled things like ems (remember emm386.sys?) where a small window into higher memory ("page frame") was introduced and could be shifted as needed. allegedly even bill gates called it a terrible hack, but oh well.

thought I had forgotten something! I never actually wrote a v86 monitor but I know a couple people who implemented one for the purpose of doing BIOS calls from within a 32-bit kernel and really, it's a bad idea, don't do it ever. common wisdom is that if you *have* to use the BIOS for anything eg. getting a memory map, do it before you move to protected or long mode and just shove the structs the BIOS hands you somewhere in memory and let your kernel parse them early on.

Cybernetic Vermin posted:

intel (and to some extent microsoft) were geniuses for realizing just how much real value exists in "legacy" software, at every stage dragging every little thing along keeping peoples and businesses things ticking along. which on the theme of the thread is interesting, because as great as the alpha was, maybe the world would have looked really different if dec had done a pentium pro for vax, or motorola had done a pentium pro for the 68k.

afaik there is no real reason it wasn't perfectly doable, dec had some of the pieces already in the rather performant nvax, and the 68k had different challenges than x86 but i don't see that they were worse.

the 68020 was kind of motorola's "okay guys it's time to do things a bit differently wait what's all this weird hacky code you're writing" moment because the 68000 and 68010 had a 24-bit address bus and iirc it just ignored the upper 8 bits of pointers so you could store tag information and stuff in there. in the apple world this became known as "32-bit dirty". a couple early '020 macintosh models claimed to have 32-bit clean ROMs but actually didn't so someone wrote a system extension to patch that and it was so useful that apple just bought the rights to it and made it free lol

I think the 68k's big challenge was that most of motorola's customers dried up (Atari died, Sun invented SPARC, Amiga died, SGI moved on to MIPS) so they teamed up with Apple and IBM to bring POWER to the desktop. there's a few post-Apple 68ks that are interesting like the 68060 which brought it up to roughly Pentium-class microarchitectural performance but at that point the writing was on the wall for CISC designs. x86 powered through on sheer brute force and dumb luck (and in a few cases, by implementing the architecture as a virtual machine in custom RISC microcode)

VAX could have lived longer if DEC just extended it to 64 bits and kept cranking out microarchitectural improvements. process improvements on their own would have pushed clock speeds up and iirc NVAX was pushing close to 200 MHz at the end of its life. I guess it just wasn't cost-competitive compared to a Pentium Pro

Kazinsal
Dec 13, 2011


AMD spun their fabs off in 2008 and sold a majority share in the resulting company to get rid of like a billion dollars in debt and it kept them afloat long enough to develop Zen, demand for which is high enough that AMD probably wishes they still had their own fabs lol


working on a post about interrupt routing in x86 over the years. not sure what I'll write about after that, kinda running out of actually interesting stuff

Kazinsal
Dec 13, 2011


Quebec Bagnet posted:

you could always post about something cursed in x86 instead

it's x86. all its components are cursed.

teaser: this one involves embedded bytecode in firmware that it's up to the OS to implement a VM for

echinopsis posted:

fast ram sucks it seems

fast ram was good for throwing code into because the CPU would never stall for a few bus cycles fetching instructions like it could if the code was in chip RAM (because the RAM may be busy being DMAed to/from or the video chip could be reading it to redraw a scanline or what have you).

Kazinsal
Dec 13, 2011


alright, interrupts on x86. I was hoping to get to this post sooner but instead of editing my draft last night I spent five hours in the ER but now I have antibiotics and painkillers so it's time to :justpost:

x86 chips have only a single general-purpose interrupt pin but they support up to 256 interrupt vectors -- you can invoke these in software manually with the int imm8 instruction, and hardware can invoke a specific one by raising the INTR pin and then supplying the vector number on the low 8 bits of the data bus when the CPU acknowledges the interrupt (generally an interrupt controller does this and other devices ask the bus to tell the interrupt controller to do it for them).

back in the early days of the 8086, the PC had a really simple Programmable Interrupt Controller, the Intel 8259A (an improved version of the 8-bit-era 8259 with support for 8086-style vectoring and optional bus buffering). it had eight interrupt pins that would be hooked up to devices, an interrupt output pin that was hooked up to the CPU's interrupt pin, and 8 data pins for communicating with the CPU and software. during bootup you'd tell the 8259 what its base interrupt vector was and it would add that to whatever interrupt pin number it was signalling for when it told the processor to raise an interrupt. super convenient! except at that time Intel hadn't really codified the reserved set of interrupt numbers and the 8086/8088 only had a handful of faults/traps built in so IBM's BIOS hooked up the 8259 to vectors 0x08 through 0x0F (and used the next 16 vectors for BIOS service routines, the cheeky fuckers).

this was alright for a bit, but suddenly PC users had a shitload of ISA cards in their systems and ran out of IRQs pretty quickly, especially because IBM used IRQ 0 for the Programmable Interval Timer, IRQ 1 for the keyboard controller, and 3 and 4 for serial ports. add in a floppy drive (usually IRQ 6) and there's only three IRQs free for other cards. sooooo in classic hack-something-together fashion, IBM bodged another handful of interrupts into the 286-based PC/AT by throwing *another* 8259A on the board, hooking its interrupt output pin to the first 8259's IRQ 2 pin, and letting IRQs 8-15 cascade through two interrupt controllers to get to the CPU. this *worked* but it also meant that you now needed to keep track of whether your interrupt was being invoked by the primary PIC or the secondary PIC, and acknowledge the IRQ on the primary PIC in both cases and on the secondary PIC as well if you're on that one.
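the bookkeeping for the cascaded pair looks roughly like this in C -- port numbers are the standard PC ones, and the outb helper is assumed rather than defined here:

code:
#include <stdint.h>

#define PIC1_CMD 0x20  /* primary 8259A command port */
#define PIC2_CMD 0xA0  /* secondary 8259A command port */
#define PIC_EOI  0x20  /* non-specific end-of-interrupt command */

/* assumed port I/O helper (wraps the out instruction) */
extern void outb(uint16_t port, uint8_t value);

/* acknowledge IRQ 0-15 on the PC/AT's cascaded PICs: IRQs 8-15 live on the
 * secondary PIC, which needs its own EOI, but the primary always needs one
 * too since the cascade comes in on its IRQ 2 pin. */
static void pic_send_eoi(unsigned irq)
{
    if (irq >= 8)
        outb(PIC2_CMD, PIC_EOI);
    outb(PIC1_CMD, PIC_EOI);
}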

older yosposters will remember having to janitor IRQs. this is why. there simply weren't enough, even though all the way as far back as the 8086, the processor itself could happily address a full 256 interrupt vectors. and at the time the 8259A was one of the most advanced interrupt controllers available as a single integrated circuit, what with how reprogrammable it was. as plug and play capability started to show up with cards that had the ability to signal "yes, this is actually me raising an interrupt", driver and kernel developers got used to checking that in their interrupt routines to allow multiple devices to safely share a single IRQ.


enter the 486. at this point people have realized that multiprocessor systems are neat but expensive, so Intel baked in some multiprocessor support. they also released a new interrupt controller to support this called the 82489DX Advanced Programmable Interrupt Controller (APIC), which was a single chip that could handle either of two roles. one, the IOAPIC, took the place of the 8259s (and in fact, would emulate them by default, so a freshly booted IBM-compatible system would still behave like a PC/AT). the other, the Local APIC, was attached to each individual CPU socket, and handled local external interrupt routing and inter-processor interrupts (IPIs). they were physically the same chip, just configured with fuses to do one job or the other, and they had their own local bus for communication and interrupt routing separate from the system bus. the local APIC can raise interrupts on the CPU it's attached to on vectors 32 through 255, meaning we were finally free of interrupt routing catastrophes, right?

haha you're like a third of the way through the post you know where this is going

so for starters, ISA cards were still in common use, and they still could only talk to the emulated 8259 side of the IOAPIC. more advanced PnP EISA cards and, by the Pentium, PCI cards could be set up to route through the IOAPIC to arbitrary interrupts, but this had a few issues. first, the Multi-Processor Tables generated by the system's firmware would have to be parsed to see if each device could actually do that properly (mostly for the legacy devices; I've never heard of a PCI card that *needs* legacy PIC interrupt routing). if so, you now had to turn on APIC mode on each local APIC and the IOAPIC, tell the device (or its configuration space on the PCI bus) that you're moving to APIC mode, and that it needs to generate its interrupts on a specific line. now your kernel needs to keep tabs on what interrupts are going where because sometimes you'd STILL have multiple PCI cards on the same interrupt and they couldn't be reassigned for dumb and awful reasons like cheap bus controllers on the cards or whatnot. also you'd need your interrupt trampoline to tell the handler what interrupt it came from to make guessing the device easier. but it made it less likely you'd need to overload your IRQs, so it sucked less. right?


in the mid 90s power management became a concern and the Advanced Configuration and Power Interface spec came about to replace the previous Advanced Power Management spec as well as the Multi-Processor Spec that the aforementioned MP Tables were part of and the Plug and Play BIOS spec that made IRQ and I/O port reconfiguration work on PnP-aware ISA cards. ACPI is this gigantic unwieldy beast of a spec that is composed of dozens of tables, some of which are easily parseable, like the base system information ones and the APIC list so you can tell how many CPUs you have and start them up. then there's the PCI routing table. the PCI routing table is not actually a table. it's bytecode.

see, ACPI has its own virtual machine spec called ACPI Machine Language (AML). AML bytecode was designed with the intent of being CPU-agnostic, so in theory the same ACPI code could be used on x86 or PowerPC or ARM or any other system that had ACPI compatibility. in practice, it's pretty much x86 only, but we're stuck with it, so OSes that wanted to properly support full multiprocessing with interrupt routing through different processors needed to have an AML virtual machine, because lol gently caress you the firmware doesn't have one for you. and AML is *dense*. it's not a general purpose virtual machine, but it's pretty close. it's a whole CISC machine in its own right, and it's part of the reason that power and multiprocessor routing on Linux sucked so much for so long -- the only people who got a whole AML virtual machine done were Microsoft and Intel. Intel later open sourced theirs as the ACPI Component Architecture, but it's an enormous library (I'm pretty sure at this point it's well north of a million lines of code) and integrating it into a kernel is just a nightmare, but if you want to do APIC routing properly, you gotta either import ACPICA and hook it up to your kernel or write your own subset of an AML virtual machine. once you've got that all ready to go, you use it to find and execute the bytecode that informs the ACPI-aware firmware that you're moving to IOAPIC mode, find and execute the bytecode to see what virtual interrupt pins (INTA# through INTD#) are being used on every device, and then finally program a redirection entry into the IOAPIC with the information you got from the ACPI tables so that the IRQs for those pins actually go to the correct interrupt vectors.

if it's such a pain, why would you use the IOAPIC for interrupt routing in a multiprocessor system? well, at this point in history you have to. the 8259 PIC emulation mode only connects to CPU 0, so to implement multiprocessor interrupt routing with it you would have to dedicate CPU 0 to always being in charge of every interrupt that comes in and then use IPIs to tell another CPU to actually do the work. motherboard designers could also add more IOAPICs onto the system if it was expected that you'd be shoving enough devices in there to result in more than 16 IRQs (the max an 82489DX could handle) being needed. also, the IOAPIC has about one third the interrupt latency that the PIC/emulated PIC does, so you get significant performance improvements from it.

in the Pentium 4 the APIC was extended as the xAPIC (Intel is not exactly a creative bunch; it stands for Extended APIC), increasing the number of supported CPUs and using the high-speed system bus to communicate between local APICs instead of a separate, much slower (by the P4 era) out of band bus. the x2APIC showed up in Nehalem and theoretically increased the number of supported CPUs in a machine to 2**32-1. more importantly though the x2APIC reduced inter-processor interrupt latency and complexity, and made virtualization of IPIs much faster as well. the x2APIC also makes the "clustered mode" method of addressing batches of CPUs/cores as a single logical destination much less annoying to use; instead of needing to parse a hierarchy of clusters to figure out what cluster mapped to what set of local APICs, you just check one of the CPUID functions (I think it's EAX=1F, ECX=0) and it tells you how to interpret cluster IDs.


but what if we don't want to do this? well, in some cases, we're in luck! the thoughtful shits at PCI-SIG added something called Message Signalled Interrupts in the PCI 2.2 spec and made it mandatory in PCI Express. MSI works by ignoring the IOAPIC altogether and letting you tell a device to simply write a data word to an arbitrary address on the physical address bus when there's an interrupt. since every local APIC in a system is exposed to the address bus, you can skip the IOAPIC and a driver can tell a device that supports MSI to just tell a specific CPU's local APIC to raise an interrupt on whatever vector you want it to. this is even faster than IOAPIC routing, so your latency is cut down to between 1/7 and 1/10 of emulated PIC routing, *and* you can just blat the interrupt exactly where it needs to go. if only PCI-SIG was also thoughtful enough to not charge thousands of dollars just to read the specifications for the system bus that practically every desktop, laptop, and server on the planet uses.
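on x86 the MSI "message" is just a magic physical address plus a 16-bit data word; a sketch of the standard encoding, with the struct/helper names made up:

code:
#include <stdint.h>

/* x86 MSI encoding: the device does a plain memory write of 'data' to
 * 'addr' and the local APICs pick it up. 0xFEE00000 is the fixed MSI
 * address window, with the destination APIC ID in bits 19:12. the data
 * word carries the vector in its low 8 bits (fixed delivery, edge
 * triggered, when the other fields are left zero). */
struct msi_msg {
    uint32_t addr;
    uint16_t data;
};

static struct msi_msg msi_compose(uint8_t dest_apic_id, uint8_t vector)
{
    struct msi_msg m;
    m.addr = 0xFEE00000u | ((uint32_t)dest_apic_id << 12);
    m.data = vector;
    return m;
}
/* a driver would then stash m.addr/m.data into the device's MSI capability
 * registers in PCI config space */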

we can do better, though. MSI doesn't require support for 64-bit addressing, and a device can only allocate up to 32 interrupts with MSI. so the extended MSI-X spec fixes this by requiring 64-bit addressing support and up to 2048 interrupts per device through a table that supports sending different interrupts to different destinations (so you could have different queues on a NIC go to different processors, for example). sure, each core can only support 224 non-reserved interrupt vectors, but that's fine! you've got loads of interrupts, and modern systems have lots of cores! go ham!

as a fun note, PCI Express doesn't actually support ISA and PCI style interrupt pins/lines. it just emulates them by using special in-band messages to tell the bus controller "hey, pretend I raised #INTA" etc.


so now we're at the modern state of interrupt routing on x86: the x2APIC being primarily used for local APIC functionality (which since the Pentium has been integrated into the CPU core), MSI/MSI-X being used to actually route interrupts wherever possible, and the IOAPIC for edge cases like legacy devices. but there's a few other neat features that local APICs provide.

since the local APIC is integrated into the CPU core it can benefit from the core's high precision multi-GHz timing. this means that the local APIC is technically capable of providing extremely high resolution timers, down to sub-microsecond scale. early local APICs had a few timer bugs and calibrating the local APIC timer is a bit of a pain (it doesn't actually tell you its frequency; you have to estimate it at bootup) but these days Windows uses it for internal kernel timing and Linux uses it to provide tickless real-time kernel functionality. each local APIC timer can be configured differently, and the timer supports three modes: one-shot, where it counts down for a specified period and then raises an interrupt; periodic, where it counts down for a specified period, raises an interrupt, then starts counting down again; and TSC deadline, where it waits until the CPU's internal cycle counter reaches a certain value and then raises an interrupt.
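programming the local APIC timer in one-shot mode is basically three MMIO writes; a sketch using the usual xAPIC register offsets and assuming the kernel has already mapped the APIC's MMIO page somewhere:

code:
#include <stdint.h>

/* assumed: the xAPIC MMIO page (default physical 0xFEE00000) is already
 * mapped at this virtual address by the kernel */
extern volatile uint32_t *lapic;

#define LAPIC_LVT_TIMER   (0x320 / 4)
#define LAPIC_TIMER_INIT  (0x380 / 4)
#define LAPIC_TIMER_DIV   (0x3E0 / 4)

/* arm a one-shot local APIC timer: it counts 'ticks' down at the bus clock
 * divided by 1, then raises 'vector' on this CPU. periodic mode would set
 * bit 17 of the LVT entry instead. */
static void lapic_oneshot(uint8_t vector, uint32_t ticks)
{
    lapic[LAPIC_TIMER_DIV]  = 0xB;      /* divide by 1 */
    lapic[LAPIC_LVT_TIMER]  = vector;   /* one-shot mode, not masked */
    lapic[LAPIC_TIMER_INIT] = ticks;    /* writing this starts the countdown */
}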

the local APICs also let you send arbitrary interrupts to other CPUs. this is excellent for telling other CPUs to synchronize to whatever one CPU just did. because translation lookaside buffers are per-CPU, if one CPU modifies a page table and invalidates the appropriate TLB entry, any possible TLB entries for that page will be unaffected on the other CPUs. so you assign an interrupt vector on all CPUs to "invalidate this TLB entry" or "invalidate all your TLB entries" and when you need to do a TLB shootdown globally you send an IPI to all processors except the originating one for that vector. for anything you can think of where you need a fast way to get another CPU into kernel mode and have it do something, you can assign a known interrupt vector and just issue an IPI whenever needed. there's pre-defined interrupt destination shorthands for "broadcast", "broadcast except to self" and "self-IPI" if you need that for some reason, so most of the time you only need to issue one IPI and they should all theoretically arrive at all CPUs at the exact same time.
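a TLB shootdown IPI then ends up being a write or two to the interrupt command register; another sketch against the xAPIC MMIO registers, same assumed mapping as above:

code:
#include <stdint.h>

extern volatile uint32_t *lapic;   /* assumed xAPIC MMIO mapping, as above */

#define LAPIC_ICR_LOW   (0x300 / 4)
#define LAPIC_ICR_HIGH  (0x310 / 4)

/* broadcast an IPI to every CPU except this one. bits 19:18 = 11b is the
 * "all excluding self" destination shorthand, so the destination field in
 * ICR high doesn't matter; the write to ICR low is what actually sends it. */
static void ipi_broadcast_others(uint8_t vector)
{
    lapic[LAPIC_ICR_HIGH] = 0;
    lapic[LAPIC_ICR_LOW]  = (3u << 18) | vector;   /* fixed delivery, edge */
}
/* e.g. ipi_broadcast_others(TLB_SHOOTDOWN_VECTOR) after editing a shared
 * page table, where TLB_SHOOTDOWN_VECTOR is whatever vector the kernel
 * reserved for that handler */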

and the best part is that you don't even need to gently caress with ACPI's garbage virtual machine in order to use them.

Kazinsal
Dec 13, 2011


Farmer Crack-rear end posted:

were 8088s really substantially cheaper to manufacture than 8086s, or was that mostly a market segmentation thing?

not particularly, because the pinout is the same and the die is the same size; the 8088 just has an 8-bit external data bus interface where the 8086 has a 16-bit one, so maybe it was a couple bucks less per unit at the "buying by the thousands" scale. I couldn't find good reliable numbers for 8086 per-unit launch prices but the 8088 was released at a list price of $124.80 per unit without volume discounts.

also the 8-bit data bus of the 8088 made wiring it up to MCS-85 family chips super easy so IBM basically slapped together a motherboard containing an 8088 and a bunch of 8080/8085 support chips and it just magically worked right. the IBM PC used the following MCS-80/85 components: 8237 DMA controller, 8253 programmable interval timer, 8255A programmable peripheral chip (basically a GPIO controller), 8259A programmable interrupt controller, and 8288 bus controller (the 8086-family counterpart of the 8080's 8228 system controller). the keyboard controller was just firmware running on an 8048 microcontroller on its own 400 kHz clock and everything else was an add-in card.

I've written code for a PC/XT and it's wonderfully quaint. everything's so slow you don't really have to worry about timing and synchronization all that much. kinda thinking about doing a quick and lovely unix clone for the 8088 as an extremely elaborate and niche april fools joke

Kazinsal
Dec 13, 2011


Sweevo posted:

the 68000 was also just really expensive at the time

also this. in 1979 the 68000 was $500 a piece, compared to the 8088's $125 and the 8086 being probably not terribly much more than that.

Kazinsal
Dec 13, 2011


Gun Metal Cray posted:

oh man, I wish I had the time to spare to do something weird like this

what would be a good simulation setup to get started?

SIMH is a good simulator for all sorts of vintage mainframes and minicomputers. PCem is pretty much the gold standard for reasonably accurately emulating vintage x86 machines through the 80s and into the early 90s

currently fighting with openwatcom to try to get it to actually link a 16-bit flat binary where I want it to but its linker script format is completely different than standard ld and frankly really loving sucks.

Kazinsal
Dec 13, 2011


JawnV6 posted:

DEC teams survived long after the company itself went under, there were specific projects where decades later there was significant resistance to the intel way of doing things

idk how A20 looms in my head as a giant complication on everything ever, but it wouldn't surprise me if practitioners rarely had to gently caress with it

if you're using a full UEFI boot you don't have to frob the 8042 to turn the A20 gate on, but if you're using legacy BIOS boot or UEFI CSM you still do

epitaph posted:

wasn’t NT developed by ex-DEC people? i remember one of the architects expressing particular scorn at unix for its i/o model (DEC had famously hitched their wagon to VMS)

Dave Cutler was lead on both VMS and NT and he fuckin hates unix with an ahabian passion

Kazinsal
Dec 13, 2011


Lady Radia posted:

i gotta hear stories come ONNnn

I've never met the guy but DEC folks who worked with him said he thought the entire I/O and userspace processes/daemons model of Unix was obsolete before it even left Bell Labs and it held back small-scale computing for years, and a book that interviewed some folks who worked on NT with him described him as considering Unix system design as his Moriarty a la Sherlock Holmes as well as considering it "a junk operating system designed by a committee of PhDs". steve ballmer pretty much got him on board by telling him it was a chance to write a microcomputer OS that scaled up to servers and minicomputing that would displace Unix. he doesn't do interviews really so it's hard to get first-hand accounts from the man

he's probably a huge dick to work with/for but he did some phenomenal design and implementation work from the 70s through the 90s and even when he was managing the NT team he was still writing kernel code himself because he wanted to be hands-on. I think he still works at Microsoft on the Hyper-V team

Kazinsal fucked around with this message at 08:26 on Feb 1, 2022

Kazinsal
Dec 13, 2011


some other dumb legacy PC poo poo:

- the IBM PC/AT used an Intel 8042 microcontroller to replace the much more expensive Intel 8255 peripheral interface as a keyboard controller because they dropped the cassette interface that the 8255 was also responsible for. at that point the 2KiB of ROM in the 8042 was more than enough to handle decoding signals from the keyboard and converting them to scancodes, and there was a bunch of room left over in the ROM so they also shoved the A20 gate in there as well as a bit that was directly latched to the machine's RESET signal. the A20 gate was literally a "should this bit be respected or should it always be 0" gate on address line 20 (the A20 pin) of the 80286. by default it was "always 0", so when a machine booted up its physical memory bus would behave exactly like an 8086/8088's. if you turned the gate on, the pin would be respected, but code that relied on the address space wraparound wouldn't work properly anymore and would actually gain access to another 65520 bytes of memory -- this became known in DOS parlance as the "high memory area". (there's a sketch of the classic 8042 enable sequence after this list.)

- modern systems either emulate the 8042 and a bunch of other legacy peripherals in System Management Mode or farm them off to a dedicated Super I/O chip. Super I/O chips are basically a PS/2-era peripheral chipset on a single low-cost ASIC and include a keyboard/mouse controller, parallel and serial ports, a floppy disk controller, IDE controllers, GPIO pins, and usually stuff like physical sensors for case temperature and case intrusion detection and fan speed control. National Semiconductor made most Super I/O chips and then sold that business off to Winbond, who still makes them to this day.

- CD drives use SCSI internally, but SCSI bus controllers were goddamned expensive when CD-ROMs were becoming a thing, so the ATA/IDE bus standard got an extension called the ATA Packet Interface that was basically an encapsulation of the SCSI command set inside packets delivered inside ATA commands. a lot of early ATA/IDE CD drives were literally just SCSI CD drives with a chip on them to de-encapsulate the ATA command containing the ATAPI packet and send it to the SCSI controller on the drive, basically simulating a point-to-point SCSI "bus" in the process. the same packet interface ended up being used for tape drives, zip disks, magneto-optical drives, and other 90s-tastic removable media formats. SATA CD/DVD/BD drives still speak ATAPI over SATA, too, so decades later we're still using the same nasty hack that we were when CD-ROMs were new and saving $50 per machine by not implementing a proper SCSI bus was a worthwhile tradeoff

- there's a reverse standard of the above called SCSI/ATA Translation but the only place it's used is when connecting SATA disks to SAS buses so even most people who know about the weird ATAPI poo poo and how it works don't even know it's a thing

- anecdotally, the Windows kernel team told Intel to make resets via triple faulting on the 80286 faster after getting their hands on some engineering samples because they were using it as a hack to enable multi-tasking with MS-DOS programs under Windows/286, since the 286 didn't actually have a native way to return to real mode from protected mode. the Intel engineers didn't believe them at first until they showed them how it was working and they were reportedly both impressed and horrified enough to go back and do exactly that, so the Windows 2.x and 3.x kernels in 286 protected mode are constantly resetting the CPU. there's a vague reference to this in the 80286 programmer's manual and Larry Osterman has reported it to have been a meeting that actually happened so I'm inclined to believe it
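since the A20 bullet above promises a sketch: the classic 8042 enable sequence is just a couple of port writes. this is a sketch only (real code usually checks whether A20 is already on first, since plenty of firmware enables it for you), and the inb/outb helpers are assumed:

code:
#include <stdint.h>

/* assumed port I/O helpers */
extern uint8_t inb(uint16_t port);
extern void outb(uint16_t port, uint8_t value);

/* classic A20 enable via the 8042 keyboard controller: wait for its input
 * buffer to drain, send the "write output port" command, then write an
 * output-port value with the A20 bit (bit 1) set. */
static void enable_a20_kbc(void)
{
    while (inb(0x64) & 0x02) ;   /* wait: controller input buffer full */
    outb(0x64, 0xD1);            /* command: write output port */
    while (inb(0x64) & 0x02) ;
    outb(0x60, 0xDF);            /* A20 on, keyboard enabled, no reset */
}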

Kazinsal
Dec 13, 2011


echinopsis posted:

i read female dating strategies so I know lvm means low value male (or man) but llvm? no idea

Low Level Virtual Machine, it's basically a bytecode of sorts that frontends like clang (a C/C++ compiler), llgo (a Go compiler), rustc (rust reference compiler) output instead of directly spitting out architecture-specific assembly, and then the LLVM core optimizes it and turns it into native executable code

Kazinsal
Dec 13, 2011


sometimes I can't tell sincere echi posts from joke echi posts. poe's law in action

Kazinsal
Dec 13, 2011


it's only going to be worth something to the kind of people who are posting in a thread about DEC hardware in a subforum of a subforum on a dead gay comedy forum from 1999

so, unless you're in the greater vancouver area and you think it can all fit in the back of my BRZ, no, it's not really "worth" anything

Kazinsal
Dec 13, 2011


someone talk me out of buying a vax

https://www.ebay.ca/itm/304204811759

Kazinsal
Dec 13, 2011


hundred bucks since I'm in canada :(

Kazinsal
Dec 13, 2011


scsi2sd will be here next week, AUI transceiver the week after, and the VAX and console cable should be here the week after that

dear ebay seller: sned the yosvax

Kazinsal
Dec 13, 2011


eschaton posted:

Kazinsal, are you going to set up VMS on SIMH initially and then transfer the disk images to an SD card to boot on the VAX?

if so be aware that SIMH will spend an extra block or two to the image with its own metadata rather than put that in a separate file, but it’s easy to deal with since it’s just an append

this way you can get VMS 7.3 set up with all the layered products etc. and maybe even get MONSTER-Helsinki set up if you want to run the YOSMUD

yeah, that's basically the plan, that or set the SCSI2SD up to present two devices; one hard disk, one CD drive containing the VMS image. eventually I'd like to have multiple hard disks on one card so I can boot into different OSes and so I can fiddle with my own low-level VAX code

Kazinsal
Dec 13, 2011


you definitely cannot disable canonical address checking in x86-64. it always raises a general protection fault (or a stack fault if it's a stack reference).

you *could* intentionally dereference non-canonical addresses anyways and then trap on that in your #GP handler but the context switching/interrupt latency cost would suck so you'd want to use it extremely sparingly and also as soon as AMD or Intel decides to implement PML6 you can't do that anymore because the whole 64-bit virtual address space will be available

Kazinsal fucked around with this message at 04:04 on Feb 16, 2022

Kazinsal
Dec 13, 2011


eschaton posted:

MPE/iX normally doesn’t boot on a system that’s not specifically enabled for it by a few values in EEPROM (“stable storage” in HP terms) which let HP sell the same hardware badged as HP 3000 for two to five times what they charged for it badged as HP 9000 and running HP-UX

I had to interrupt the boot process, which then let me access a secret tool in the Service menu off the main boot menu, and run that to change my hardware model and part number, some model strings that are checked at boot, and the bit that says whether MPE is allowed to run

here’s an example of how the process works

that's friggin cool!


vax update: still in ebay purgatory (also known as erlanger, kentucky) :argh:

Kazinsal
Dec 13, 2011


Itanium would have been a phenomenal architecture if compilers were as magic as the Itanium design team thought they were, but for the general case you just can't magically optimize everything for VLIW (very long instruction word) architectures -- the most successful VLIW architecture I can think of is ATI/AMD TeraScale, which is the underlying architecture for the Radeon HD 2000 through 6000 series GPUs

the Itanium architecture also had some absolutely insane, end-developer-hostile stuff like having 128 general-purpose registers + 128 float registers, with register windowing, exposing branch prediction registers to the programmer, and 128 "special purpose" registers for doing stuff like telling the hardware how to spill rotated register window contents onto a stack if needed because the instruction set lets you rotate the register windows in blocks of 32 for hyper-parallelism

intel straight up designed it with the idea that the compiler would just batch instructions together into giant instruction word "bundles" (three instructions per bundle, with later cores issuing two bundles -- six instructions -- per cycle) and throw branch prediction hints in the mix. in theory this meant that you could slam a shitload of fused multiply-adds into a single execution cycle. in practice, everything else sucked and five years into Itanium's life, GPGPU started to show up and now you could do hundreds of FMA instructions per cycle on a $500 consumer PCIe card that had its own ultra-fast slab of memory

e: the idea of Explicitly Parallel Instruction Computing is cool but it's totally garbage for the average use case and great for highly regular parallelizable operations, which is why modern CPUs just have a whole bunch of mostly independent CPU cores with their own ALUs/branch predictors/etc. and any highly parallel numerical work gets offloaded to GPUs
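
for the curious, an IA-64 bundle is 128 bits: a 5-bit template (which says what unit types the slots dispatch to and where the stop bits fall) plus three 41-bit instruction slots. here's a rough C sketch that just pulls those fields apart to show the layout; the struct and function names are mine, and the actual template-number-to-unit-type table lives in the Intel manuals and isn't reproduced here:

pre:
#include <stdint.h>
#include <stdio.h>

/* an IA-64 bundle, stored as two little-endian 64-bit halves:
 * bits 0..4 = template, bits 5..45 = slot 0, 46..86 = slot 1, 87..127 = slot 2 */
struct ia64_bundle {
    uint64_t lo;   /* bits  0..63  */
    uint64_t hi;   /* bits 64..127 */
};

static unsigned template_of(struct ia64_bundle b) {
    return (unsigned)(b.lo & 0x1f);
}

static uint64_t slot_of(struct ia64_bundle b, int n) {
    unsigned start = 5 + 41 * (unsigned)n;
    uint64_t mask = (1ULL << 41) - 1;
    if (start + 41 <= 64)                     /* slot 0: entirely in the low half */
        return (b.lo >> start) & mask;
    if (start >= 64)                          /* slot 2: entirely in the high half */
        return (b.hi >> (start - 64)) & mask;
    /* slot 1: straddles the halves */
    return ((b.lo >> start) | (b.hi << (64 - start))) & mask;
}

int main(void) {
    struct ia64_bundle b = { 0, 0 };          /* all-zero bundle: template 0 (MII) */
    printf("template=%u slot0=%llx slot1=%llx slot2=%llx\n",
           template_of(b),
           (unsigned long long)slot_of(b, 0),
           (unsigned long long)slot_of(b, 1),
           (unsigned long long)slot_of(b, 2));
    return 0;
}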

Kazinsal fucked around with this message at 10:17 on Mar 4, 2022

Kazinsal
Dec 13, 2011


the VAX is here! I haven't unpacked it yet but I'm currently grabbing some ISO images to shove onto the SCSI2SD, starting with something I'm more familiar with (NetBSD), then getting used to the machine a bit before installing VMS. I've assigned four SCSI IDs on the SD card:
- ID 0: CD-ROM
- ID 1: hard disk for NetBSD (or maybe Ultrix if I feel spicy)
- ID 2: hard disk for VMS
- ID 3: hard disk for possible future hobby project

NetBSD/vax is an active port, so the latest release is from May 2021. the last version of VMS that supported VAX was 7.3, which came out in 2001 (and still had support all the way through 2012).

it's kind of weird that on a machine this old you can run an operating system that's still under active development and that isn't a hobby project. it almost feels like cheating... so naturally I'll be using this as an excuse to learn VAX assembly and architecture in more detail and will be writing an operating system for it for fun at some point :v:

Kazinsal
Dec 13, 2011


feedmegin posted:

It was in use for a long time for graphics card stuff, because VGA is defined in terms of it. Hence why Linux graphics drivers had to do the iopl() thing (yes, I might have written a couple back in the day) as recently as early PCI or even, I think, AGP days, because that was still part of the mode switching process for most hardware, even for better modes than SVGA. And if you didn't have a specific driver for your card, well, that's all you had to work with in that case; hope you enjoy 800x640 resolution.

the fact that every graphics card can still emulate the VGA/CRTC I/O interface in 2022 is painful

at one point AMD was really good about implementing native resolutions in VBE but I don't know if that's still the case. I remember my R9 290 returned a 1440p mode in VBE, so you could get a driver-agnostic but completely unaccelerated framebuffer if you really needed one. not sure about nvidia, and for intel it was pretty irrelevant because every one of their GPUs still follows basically the same modesetting procedure as the i915, so you can pretty easily tell it to do whatever you want and worst case it'll tell you to pound sand

e: a basic modesetting driver for i915 is like, 150 lines of code, with no need to do any sort of v86 or real mode calls to VBE to make it work either
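
for context on the "driver-agnostic but completely unaccelerated framebuffer" part: VBE function 4F01h fills in a 256-byte mode info block, and if bit 7 of ModeAttributes is set, the linear framebuffer's physical address sits at offset 40. rough C sketch of just the fields you need (offsets per the VBE 3.0 spec; the struct and field names are mine):

pre:
#include <stdint.h>

/* subset of the VBE ModeInfoBlock returned by int 10h, AX=4F01h.
 * only the fields needed to find and address a linear framebuffer are
 * spelled out; the _pad arrays just keep the offsets correct. */
#pragma pack(push, 1)
struct vbe_mode_info {
    uint16_t mode_attributes;     /* offset  0: bit 7 = linear framebuffer available */
    uint8_t  _pad0[14];
    uint16_t bytes_per_scanline;  /* offset 16: pitch */
    uint16_t x_resolution;        /* offset 18 */
    uint16_t y_resolution;        /* offset 20 */
    uint8_t  _pad1[3];
    uint8_t  bits_per_pixel;      /* offset 25 */
    uint8_t  _pad2[14];
    uint32_t phys_base_ptr;       /* offset 40: physical address of the framebuffer */
    uint8_t  _pad3[212];          /* block is 256 bytes total */
};
#pragma pack(pop)

/* physical address of pixel (x, y) in a packed-pixel linear framebuffer mode */
static uint32_t pixel_addr(const struct vbe_mode_info *mi, uint32_t x, uint32_t y)
{
    return mi->phys_base_ptr
         + y * mi->bytes_per_scanline
         + x * (uint32_t)(mi->bits_per_pixel / 8);
}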

Kazinsal fucked around with this message at 13:56 on Mar 11, 2022

Kazinsal
Dec 13, 2011




so I'm still having a few issues with the SCSI controller (something about it doesn't like autobooting from the SCSI2SD but it'll let me manually boot off it) but the VAX works! login is a bit slow because it turns out a circa 1990 microvax isn't exactly that fast at doing 256 rounds of blowfish. also it can't handle SSH so I need to build some kind of SSH-to-telnet bridge (maybe automate it somewhat using krb5 telnet authentication? idk). also it's not keeping the time right, but I think that's my fault since the RTC chip/soldered on battery combo is brand new
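
the "SSH-to-telnet bridge" doesn't have to be fancy -- the dumbest possible version is a byte pump you stick behind sshd on a modern box (e.g. as a ForceCommand) that relays the session to the VAX's telnet port. here's a rough sketch of that idea; the hostname is a made-up placeholder, and it does zero telnet option negotiation, so treat it as a starting point rather than anything polished:

pre:
/* relay stdin/stdout (the incoming SSH session) to a telnet port.
 * no telnet option negotiation, no line-ending fixups -- just bytes. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/select.h>

int main(void) {
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof hints);
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo("vax.example.lan", "23", &hints, &res) != 0) {  /* placeholder host */
        fprintf(stderr, "lookup failed\n");
        return 1;
    }
    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0) {
        perror("connect");
        return 1;
    }
    freeaddrinfo(res);

    char buf[4096];
    for (;;) {
        fd_set rd;
        FD_ZERO(&rd);
        FD_SET(STDIN_FILENO, &rd);
        FD_SET(fd, &rd);
        if (select(fd + 1, &rd, NULL, NULL, NULL) < 0) break;

        if (FD_ISSET(STDIN_FILENO, &rd)) {          /* ssh session -> vax */
            ssize_t n = read(STDIN_FILENO, buf, sizeof buf);
            if (n <= 0 || write(fd, buf, (size_t)n) != n) break;
        }
        if (FD_ISSET(fd, &rd)) {                    /* vax -> ssh session */
            ssize_t n = read(fd, buf, sizeof buf);
            if (n <= 0 || write(STDOUT_FILENO, buf, (size_t)n) != n) break;
        }
    }
    close(fd);
    return 0;
}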

full boot log for turbonerds:

pre:
>> NetBSD/vax boot [1.11 Sat Nov  6 19:40:01 UTC 2010] <<
>> Press any key to abort autoboot 0
nfs_open: must mount first.
open netbsd.vax: Device not configured
> boot netbsd
2403628+308444 [194288+184371]=0x2f2c58
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 5.1 (GENERIC) #0: Sat Nov  6 19:48:36 UTC 2010
        builds@b8.netbsd.org:/home/builds/ab/netbsd-5-1-RELEASE/vax/201011061943Z-obj/home/builds/ab/netbsd-5-1-RELEASE/src/sys/arch/vax/compile/GENERIC
MicroVAX 3100/m{30,40}
total memory = 32508 KB
avail memory = 28216 KB
mainbus0 (root)
cpu0 at mainbus0: KA48, SOC, 6KB L1 cache
vsbus0 at mainbus0
vsbus0: 32K entry DMA SGMAP at PA 0x440000 (VA 0x80440000)
vsbus0: interrupt mask 0
le0 at vsbus0 csr 0x200e0000 vec 770 ipl 17 maskbit 1 buf 0x0-0xffff
le0: address 08:00:2b:3a:15:90
le0: 32 receive buffers, 8 transmit buffers
dz0 at vsbus0 csr 0x200a0000 vec 124 ipl 17 maskbit 4
dz0: 4 lines
lkms0 at dz0
wsmouse0 at lkms0 mux 0
asc0 at vsbus0 csr 0x200c0080 vec 774 ipl 17 maskbit 0
asc0: NCR53C94, 25MHz, SCSI ID 6
scsibus0 at asc0: 8 targets, 8 luns per target
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 0 lun 0: <DEC, RRD40, 1> disk fixed
sd0: 2048 MB, 261 cyl, 255 head, 63 sec, 512 bytes/sect x 4194304 sectors
sd0: async, 8-bit transfers
sd1 at scsibus0 target 1 lun 0: <DEC, RZ35, 0016> disk fixed
sd1: 2048 MB, 261 cyl, 255 head, 63 sec, 512 bytes/sect x 4194304 sectors
sd1: async, 8-bit transfers
sd2 at scsibus0 target 2 lun 0: <DEC, RZ35, 0016> disk fixed
sd2: 2048 MB, 261 cyl, 255 head, 63 sec, 512 bytes/sect x 4194304 sectors
sd2: async, 8-bit transfers
sd3 at scsibus0 target 3 lun 0: <DEC, RZ35, 0016> disk fixed
sd3: 2048 MB, 261 cyl, 255 head, 63 sec, 512 bytes/sect x 4194304 sectors
sd3: async, 8-bit transfers
Kernelized RAIDframe activated
label: 0
sd0: no disk label
label: 0
sd1: no disk label
label: 0
sd3: no disk label
boot device: sd2
root on sd2a dumps on sd2b
root file system type: ffs
WARNING: clock gained 2 days
WARNING: CHECK AND RESET THE DATE!
Tue Nov  9 07:30:57 PST 2010
swapctl: setting dump device to /dev/sd2b
swapctl: adding /dev/sd2b as swap device at priority 0
Starting file system checks:
/dev/rsd2a: file system is clean; not checking
Setting tty flags.
Setting sysctl variables:
kern.no_sa_support: 0 -> 1
ddb.onpanic: 1 -> 0
Starting network.
Hostname: ashe.redacted
IPv6 mode: host
Configuring network interfaces:.
Adding interface aliases:.
Starting dhcpcd.
Building databases: dev, utmp, utmpx done
Starting syslogd.
Checking for core dump...
savecore: no core dump
Mounting all filesystems...
Clearing temporary files.
Creating a.out runtime link editor directory cache.
Checking quotas: done.
Setting securelevel: kern.securelevel: 0 -> 1
swapctl: setting dump device to /dev/sd2b
Starting virecover.
Starting local daemons:.
Updating motd.

postfix/postfix-script: starting the Postfix mail system
Starting inetd.
Starting cron.
Tue Nov  9 07:33:39 PST 2010

NetBSD/vax (ashe.redacted) (console)

login:

Kazinsal fucked around with this message at 01:37 on Mar 15, 2022


Kazinsal
Dec 13, 2011


infernal machines posted:

you may have accidentally doxxed yourself with that host name

:tipshat:

eschaton posted:

The ROM may not know about RZ35 and be hesitant to boot from it automatically but trust the operator. You could reconfigure your SCSI2SD to have all the disks start at the same blocks as now, but have the one you boot from claim to be an RZ26 (sized to match, of course) and it should Just Work.

it’s something on the SCSI2SD’s controller about how it handles async transfers. happens even with zero devices being emulated so I think it’s something that the ROM is expecting to work a certain way but the SCSI2SD is handling completely differently. I need to find the right esoteric DEC field service manual for this thing basically
