BobHoward
Feb 13, 2012


karoshi posted:

DEC promised a 1000x improvement in 10 years: 10x from clocks(possible, 150MHz -> 1.5GHz), 10x from arch(possible, in-order -> out-of-order), 10x from stolen socks or sth, I don't remember. There's an article by one of the original engineers, who then went to do the StrongARM processors. He commented on how for Alpha they had to innovate in power delivery, CPUs weren't 300W monsters back then, a DECstation with a 16MHz MIPS R3000 didn't need a 2 pound copper heatsink. Some of the OG Alphas had a weird heatsink connector made of 2 thick prongs coming off the CPU package, as seen on this page: https://www.cpu-world.com/CPUs/21064/index.html. So they were pushing X amps into the CPU which was unheard of, back then. (He then did the opposite for StrongARM).

The ISA was new, designed for the "21st century", 64-bit only when everything else was 32-bit. No legacy. Also designed for high-performance, memory accesses had to be aligned on the first CPUs and so on (this was corrected later on, IIRC, trapping on some poo poo-code is not good for performance). Memory coherency was also quite decoupled, oriented towards multi-core efficiencies. The kernel docs for memory barriers still say: "- And then there's the Alpha." https://www.kernel.org/doc/Documentation/memory-barriers.txt

It was much worse than just requiring aligned memory accesses. Plenty of 1980s and 1990s RISCs did that without fatal problems - you just write your C compilers to lay out structs with padding to maintain alignment for all members, and so on. (Padding for alignment is common even on x86, because even with HW support for misaligned accesses, they're still frequently slower than aligned accesses. This isn't something that can be worked around, really, you're always going to take a hit when a load spans two cache lines.)
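
To make the padding point concrete, here's a minimal C sketch (hypothetical struct, nothing Alpha-specific): the compiler quietly inserts bytes so the 64-bit member stays naturally aligned.

code:
#include <stdio.h>
#include <stdint.h>

/* The compiler pads this struct so 'value' lands on an 8-byte boundary. */
struct sample {
    uint8_t  tag;    /* 1 byte, followed by 7 bytes of padding */
    uint64_t value;  /* naturally aligned 64-bit member */
};

int main(void) {
    /* Typically prints 16, not 9, on a 64-bit ABI. */
    printf("sizeof(struct sample) = %zu\n", sizeof(struct sample));
    return 0;
}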

The real issue was that DEC, in its infinite wisdom, decided that 32- and 64-bit loads and stores were all you got. Alpha had no support for reading and writing 16-bit or 8-bit values. This made certain things very, very difficult to do, most notably C strings.
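
To see what that costs, here's a hedged C sketch of the read-modify-write dance a pre-byte-extension Alpha compiler had to emit for every byte store (plain C standing in for the real ldq_u/mskbl/insbl/stq_u sequence; the function name is made up for illustration):

code:
#include <stdint.h>

/* Store one byte when the hardware only offers aligned 64-bit loads and stores. */
void store_byte(uint8_t *p, uint8_t v) {
    uint64_t *q = (uint64_t *)((uintptr_t)p & ~(uintptr_t)7);  /* aligned quadword holding *p */
    unsigned shift = ((uintptr_t)p & 7) * 8;                   /* byte position, little-endian */
    uint64_t word = *q;                                        /* load the whole quadword */
    word &= ~((uint64_t)0xFF << shift);                        /* clear the target byte */
    word |= (uint64_t)v << shift;                              /* merge in the new byte */
    *q = word;                                                 /* store the quadword back */
}

Every plain char assignment turns into a load, some shifting and masking, and a store, which is a big part of why later Alpha implementations added the BWX byte/word instructions.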

According to another-dave's answer to a question about this, it was probably a blind spot on DEC's part. Their own software stack, with VMS as the OS and most software not written in C, didn't really need native support for 16- or 8-bit loads and stores.

But DEC also wanted to sell the Alpha on the open market, not just use it in their own systems. Doing that meant running C-based operating systems like UNIX and Windows NT, both of which depend heavily on C strings, and C strings demand byte-granularity loads and stores to implement with any kind of efficiency.

So, the first-gen Alpha processors were fairly useless for anything other than VMS. This hurt Alpha's initial appeal quite a bit.


BobHoward
Feb 13, 2012


Kazinsal posted:

  • RISC-V is an open-source, constantly evolving ISA that anyone (with sufficient fabrication capabilities) can implement, manufacture, and market CPUs for. It's only been truly "stable" for a few years now so it's still a newcomer in all spaces, but offerings from companies like SiFive (who freely lets you design and test custom RISC-V cores based on their core IP in their online designer tool) are putting actual working RISC-V boards in the hands of developers and consumers through their own channels as well as through third parties such as the BeagleBoard.org project.
  • MIPS is another 80s RISC design that is still getting updated (the latest release of the MIPS64 ISA spec is from 2014). I don't know much about what MIPS is used for other than in a number of deeply embedded situations such as small networking appliances (Ubiquiti's routers are MIPS-based, for example), but it seems cool!

The status of MIPS is complicated.

MIPS64 ISA ownership has bounced around a bunch of times, and is a little murky. If you want to trawl through news articles on sites like this:

https://www.eejournal.com/?s=mips

maybe you can make sense out of it. Or maybe you can't. I can't.

As you can see from another story there, MIPS Technologies recently emerged from bankruptcy and announced that its new designs would be RISC-V. And in a sense, RISC-V is MIPS TNG: it comes out of the same academic RISC lineage (UC Berkeley, with David Patterson involved in both Berkeley RISC and RISC-V), and it takes a lot of inspiration from MIPS, even though it isn't compatible with MIPS.

There's also Loongson! Loongson is kinda an extension of the Chinese state, and has been on a long program to make China independent of Western CPU IP, since changing Western political whims can affect whether Chinese companies are allowed to use said IP. For a long time, Loongson built MIPS64 CPUs, and some of the MIPS ISA ownership murkiness arises from various deals which seemed aimed at transferring ownership of MIPS to Chinese companies. They seem to have given up on gaining control of the actual MIPS ISA and recently announced LoongArch, which looks a lot like MIPS64 but with a new encoding and a bunch of other changes.

MIPS as we knew it is probably dead. It's technically still out there, so someone could pick it up and push it again, but currently the parties likely to be interested in that are instead pursuing forks which diverge from direct compatibility with the original MIPS ISA.

BobHoward
Feb 13, 2012

Also, on ARM:

It should be noted that 64-bit ARM, aka aarch64 or A64, isn't truly a descendant of the 1980s 32-bit ARM, which has been retroactively named aarch32 or A32. aarch64 is a mostly-clean-sheet redesign.

Why is this important to highlight? Because the other way is what AMD chose for x86-64. AMD used x86 prefix bytes (the REX prefixes) to extend existing opcodes to 64-bit operation, and even when the CPU is in 64-bit mode it's still legal to execute old 32-bit instructions. You can write shims to allow 32-bit code to call into 64-bit libraries, and vice versa.
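
As a concrete illustration (standard x86-64 encodings, just shown as raw bytes): the REX.W prefix 0x48 in front of an existing 32-bit opcode is what widens it to 64 bits, so old and new forms coexist in the same instruction stream.

code:
/* Same MOV opcode (0x89 /r), with and without the REX.W prefix byte. */
const unsigned char mov_ebx_eax[] = { 0x89, 0xC3 };        /* mov ebx, eax - legacy 32-bit form */
const unsigned char mov_rbx_rax[] = { 0x48, 0x89, 0xC3 };  /* mov rbx, rax - REX.W + same opcode */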

With aarch64, the CPU's decoders are either in aarch64 mode where they recognize only the new 64-bit instruction encoding, or in aarch32 mode where they only understand the old 32-bit encoding. The encodings are too incompatible to support both at the same time. Decoder mode switches are only possible as a side effect of privilege level changes - hypervisor entry/exit or kernel entry/exit - so userspace 64-bit code can never call 32-bit libraries, or vice versa.

More importantly, mode support is optional. The ARMv8 spec is written to allow both aarch32-only and aarch64-only implementations. Apple went 64-bit only in their A-series iPhone/iPad chips a long time ago, and hasn't changed course now that they're transitioning the Mac to Arm. Arm itself has made some announcements about future cores transitioning to 64-bit only. So, the future of Arm is a relatively clean break from 1980s Arm.

BobHoward
Feb 13, 2012


feedmegin posted:

You sort of forgot to mention Thumb. The world now is basically AArch64 (big boy processors) or Thumb (microcontrollers), classic ARM is legacy but both of those are going forward. Cortex-M isn't going anywhere.

I did kinda skip it, yeah. It's part of aarch32 in the Arm v8 spec, so for the record, the full set of Arm v8 operating modes and instruction sets is:

aarch32 mode: T32 and A32 ISAs
aarch64 mode: A64 ISA

You're completely right that aarch32 is staying around for applications like microcontrollers.

(and who knows, maybe some of the platforms that use dual-mode CPUs today will find aarch32 too sticky to leave behind. Apple didn't, but they made that transition on iOS where they could just set a sunset date for allowing 32-bit code on the App Store.)

BobHoward
Feb 13, 2012

Let's talk about a couple interesting Intel ISAs which are not x86!

====

The first is incredibly popular, yet also obscure if you aren't in certain segments of industry: MCS-51 aka 8051.

It was a family of 8-bit microcontrollers which Intel originally developed and sold in the 1980s. For its time, Intel did a lot of things right with the 8051, and it won big. They got designed into everything. Intel even licensed it widely, which only increased its reach and popularity. I would guess that most 8051 cores in existence weren't made by Intel.

It is somehow still with us today. An example I happen to know about is that the chips in many USB3-to-SATA devices (drive docks, enclosures, etc) contain a simple USB-SATA data bridge controlled by an 8051, which both directs dataflow and translates USB Mass Storage Class commands to SATA commands.

However, it is slowly but surely fading away. 8051 is this weird old architecture from an era when µCs were almost exclusively programmed with assembly code. It's losing design wins to architectures like the embedded profiles of RISC-V and ARM, which have much bigger and livelier software dev ecosystems. (Those USB-SATA bridge chips I mentioned? The newest gen chips from the same company have gone over to Cortex-M0.)

====

And now, let's talk about one of Intel's greatest failures!

I bet some readers are thinking "aha, he's going to talk about Itanium". Nope! The sad story of the iAPX-432 played out a couple decades before Itanium, and it was arguably a worse flop. Itanium actually had a real software ecosystem and multiple generations, but 432 just kinda sank into the swamp. Nobody wanted it. It was hard to write software for, expensive, and performance was terrible.

The 432 was Intel's take on a popular idea in 1970s computer architecture: closing the "semantic gap". Then and now, successful CPUs run conceptually simple instructions. But high level language (HLL) statements can encode much more complex ideas into even single statements. While it was possible to translate HLL statements into equivalent sequences of low level machine instructions, what if instead you designed your ISA to have very complex machine instructions that offer a more direct map to HLL semantics? Wouldn't that be a good thing? (narrator voice: It is not.)
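
To make the gap concrete, a hedged C sketch (hypothetical function, not from any 432 documentation): one short HLL statement expands into a handful of simple operations on a load/store machine, and the semantic-gap camp wanted a single fat instruction to do all of it directly.

code:
/* a[i] = b[i] + c[i]; on a load/store RISC becomes roughly:
 *   load  r1, [b + i*4]
 *   load  r2, [c + i*4]
 *   add   r3, r1, r2
 *   store r3, [a + i*4]
 * The 432-era dream was one machine instruction per HLL-level operation,
 * with type tags and bounds checks baked in. */
void add_elem(int *a, const int *b, const int *c, int i) {
    a[i] = b[i] + c[i];
}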

The 432 ISA was designed to support Object Orientation and Capabilities. OOP was the fresh new HLL paradigm taking the CS world by storm, and the 432 tried to be the best possible CPU for running a 100% OOP software stack with capability-based protection.

Ironically, this caused one of the central problems of the 432: the architecture is extremely opinionated in favor of the 432 architects' ideas about what OOP should look like at a low level, and that means not all OOP environments need apply. For example, software isn't allowed to directly manipulate pointers to objects, that's limited to hardware and microcode. So if your particular OOP language has some level of friction with the 432's ideas about how pointers work, you are in for trouble.

A full dissection of how iAPX-432 collapsed under its own weight is far more :words: than I'd like to research and write. I'm not even sure the paragraphs I wrote above are that great a description, tbh. Just take a look at this contemporary academic paper on it, and contemplate how difficult it would be to port preexisting software to that thing.

BobHoward
Feb 13, 2012


ConanTheLibrarian posted:

Something that would be cool is a comparison of the different features some of these architecture had and what workloads it made them good/bad at. For instance, what made SPARC worth choosing over other contemporary options, that kind of thing.

SPARC exists because Sun Microsystems saw the Berkeley RISC project and decided "We need that." There wasn't really anything in the SPARC ISA which made it inherently better than the other RISCs. In fact, many would argue it was one of the worst of the early RISCs, because its register window ISA feature turned out to be a bad idea. But it was Sun's RISC, so that's what Sun used, warts and all.

More generally, in the 1980s a whole bunch of things came together. The RISC concept demonstrated that a small team could design a competitive (by 1980s standards) CPU on a low budget. The rise of merchant fab services meant you didn't have to own your own fab to build chips. There was lots of demand for small-ish UNIX workstations, and these RISC CPUs were a killer feature for that type of machine. So everyone in the RISC workstation biz suddenly decided they needed their own private RISC ISA.

But only a couple decades later, this all fell apart. It was no longer enough to put a simple pipelined in-order core on a single chip. There was demand for higher and higher clocks, out-of-order superscalar cores, and so on. A lot of the boutique RISCs became unsustainable as design budgets grew; they just didn't have the sales volume to stay in the game.

So we got exit strategies. To pick two prominent examples, IBM went into partnership with Apple and Motorola in hopes of gaining a much larger consumer-scale market for POWER, its RISC - the resulting "new" architecture was named PowerPC, but it was really just a slightly altered POWER. HP was already working on a post-RISC architecture to replace its PA-RISC, realized they couldn't go it alone, and offered a partnership to Intel -- that's how we got Itanium.

In the end, everyone got steamrolled by the Intel fab tech and x86 engineering budget made possible by the mass market Wintel juggernaut. This included Intel's own Itanium division, even though it was the golden boy of senior executives.

BobHoward
Feb 13, 2012


feedmegin posted:

Not actually EVERYONE as it happens. Your phone isn't running on Intel and its not like they haven't tried. These days, neither is your Mac.

I meant at the time when x86 made it look like things were over, of course.

What's happening now is arguably history repeating itself. The success and huge sales volume of the IBM PC (and its clones) allowed the x86 ISA to invade and ultimately dominate markets way above its humble origins. Today, Arm is in that role: it's the traditionally lower-spec architecture taking advantage of living in a much higher volume market to invade upwards.

You even see people making the same mental mistakes. Back in the day, people used to scoff at the idea that x86 could take on the workstation and server markets. Recently, we've seen many similar opinions about Arm; people assume there's some intrinsic property which makes x86 faster.

It's not inevitable that all PCs will be using Arm CPUs in 5 or 10 years, but I do think we're in an inflection point where that could happen. It depends a great deal on Microsoft having the desire and competence to push it forwards.

BobHoward
Feb 13, 2012


Farmer Crack-rear end posted:

What was it about UNIX workstations that made RISC CPUs particularly well-suited to them?

Speed. Early RISC performance numbers made it clear that the most commonly used CISC UNIX workstation CPU family, Motorola's 680x0 series, was going to be left in the dust.

BobHoward
Feb 13, 2012

I don't think you're going to find any commercial computers which implement exotic number systems in hardware. Most are nightmarish to implement and nearly all have zero practical value, so why bother?

Phobeste posted:

Does this count? http://www.6502.org/users/dieter/bcd2/bcd2_6.htm Or like a 1s-complement system?

By the way, IBM is still big on decimal math right up to this day. They threw their weight around enough to get decimal FP formats into IEEE 754-2008, and POWER9 supports decimal in several places.

A lot of the demand for decimal math is related to IBM's mainstay for big computers, finance. There are tons of trivial and important decimal fractions which cannot be expressed in a finite number of binary digits. (0.1 base 10 = 0.0001100110011001100... binary). Accountants like it better when the computer's results are an exact match to what they'd see with old school manual computations. Most companies have regarded software implementation of decimal fractions as good enough to support such applications, but not IBM.
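
A quick demo of the problem in plain C (ordinary binary doubles, nothing IBM-specific): adding 0.1 ten times does not give exactly 1.0, which is exactly the kind of surprise accountants hate.

code:
#include <stdio.h>

int main(void) {
    double sum = 0.0;
    for (int i = 0; i < 10; i++)
        sum += 0.1;                            /* 0.1 has no exact binary representation */
    printf("%.17g\n", sum);                    /* prints 0.99999999999999989 with IEEE 754 doubles */
    printf("equal to 1.0? %d\n", sum == 1.0);  /* 0 (false) */
    return 0;
}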

BobHoward
Feb 13, 2012


karoshi posted:

Intel fanboys might optimistically say Intel wanted to replace the crusty, dead x86 arch with a new, modern baroque unproven kitchen sink arch.

"Baroque" somehow manages to understate the insanity of Itanium. There's layers. The amazing thing is that it was all justified by claiming that it would be less complicated to implement than OoO RISC - but that clearly wasn't the case in reality.

BobHoward
Feb 13, 2012


priznat posted:

I would like to know how they do the interposer. Is it a separate process with 2 Max dies to attach them? Really wild.

If they're using the word "interposer" the way it's been used in industry before, it'll be something similar to this:

https://www.xilinx.com/support/documentation/white_papers/wp380_Stacked_Silicon_Interconnect_Technology.pdf

The interposer die has to be giant since it's as large as all the logic die put together, but that's relatively OK here. It doesn't need any active circuitry, or any of the expensive fine pitch metal layers - you can make it on old fully amortized process nodes, and yield should be high. The one exciting thing it needs is TSVs.

The animations Apple made hinted at something a bit more like EMIB, where the interconnect die is a thin rectangle covering just the area used for the chip-to-chip bridge. But EMIB requires burying the interconnect die in the package organic substrate, which is a whole other thing, and I wouldn't think they'd use "interposer" to describe that kind of tech. An interposer is something which is completely interposed between the logic silicon and package substrate.

BobHoward
Feb 13, 2012


NewFatMike posted:

Ian Cutress published a video I’m about to watch on it, I’m very excited:

https://youtu.be/1QVqjMVJL8I

I couldn't make it through this, it's as disappointing as his writing for AnandTech always was

BobHoward
Feb 13, 2012


icantfindaname posted:

Why PC clones and not Amiga clones or Mac clones or whatever?

IBM built the PC from off-the-shelf chips anyone could buy, so it was easy to reverse engineer the motherboard and design highly compatible clone hardware. The BIOS was proprietary, but quite simple, making it relatively easy to do cleanroom reverse engineering designed to withstand potential legal challenges. Finally, the PC's MS-DOS operating system was supplied by Microsoft, and they were of course quite interested in selling their OS outside IBM's channel.

Mac hardware was harder to clone - the original 128K Mac used a bunch of PAL programmable logic chips with custom logic inside, and later Macs moved to full custom chips. The Mac ROM was a big chunk of its OS, which was incredibly complex compared to BIOS and not easy to clone. And Apple had little interest (early on, anyways) in licensing the disk-loaded part of its Mac System software to clone companies.

Amiga was much the same story as the Mac, except with a full custom chipset from the start.

So, there were just a lot more technical and legal hurdles involved in cloning Macs and Amigas. The lack of barriers (especially legal ones) to cloning the PC probably happened by accident - the PC was a weird lightly funded maverick division which senior IBM management didn't take seriously at first. IBM had lots of experience fending off cloners in the mainframe world, so if they'd seen the PC as a real thing before it became an extremely real thing, I'm sure they would've taken steps to make it harder to clone.

BobHoward
Feb 13, 2012

Ceramic 64-pin DIP is a chonker of a chip package.

I just did a bit of google searching on "Palo Alto Shipping Company" and apparently that was a little Forth startup. Forth is such a fascinating bit of computing history; it (or maybe Chuck Moore's genius) convinced a small dedicated following that it could and should be the basis for everything, but realistically Forth had no chance of actually doing that.

BobHoward
Feb 13, 2012


phongn posted:

As an aside, I kinda wish that IBM had chosen the M68000 instead of the 8086 for the original IBM PC; it was a much cleaner design with a vastly nicer orthogonal ISA. Some people even made an out-of-order, 64-bit version.

If you're what-ifing that, you have to change how the ISA evolved. 68K lost badly to x86 in the mid-1980s, not just the early 80s when IBM selected the 8088 because it was available and cheap at a time when the 68K was neither.

68K took a sharp turn for the worse with the 68020. Motorola's architects got blinded by that orthogonality and beauty and tried to continue the old "close the semantic gap between assembly and high level languages" CPU design philosophy that had led to what we now call CISC. The changes they made were all very pretty on paper, but made it hard to design chips with advanced microarchitectural features. This played a part in 68K falling well behind instead of keeping pace with x86.

(Apollo manages to be OoO because it's a bunch of Amiga cultists with no completely agreed upon project goal other than making something they think is cool to run AmigaOS on. With no commercial pressures, you don't have to simultaneously worry about things like clock speed and power, which makes it easier to do OoO just because you can.)

You can learn more by finding more old Mashey usenet posts! He had a neat series breaking down what makes a RISC a RISC, down to detailed tables comparing ISA features. x86 ends up being substantially closer to RISC than 68020, and in one of the most important ways (addressing modes).

BobHoward
Feb 13, 2012


phongn posted:

I know there were a lot of sound business reasons for IBM picking the Intel processor, not the least price, second source availability, etc. I just know it was a candidate, and for its later ISA faults it did have a lot going for it that wouldn't really appear on Intel until the 386. Not having to deal with all the different types of memory models on x86 from the start would've been nice (though of course 68K had its own problems with developers using the upper address byte because it "wasn't used" at first). Not having to deal with the weird x87 stack-based FPU would also be nice.

For sure. The original 68000 was so much cleaner than x86!

quote:

While the 68020 started getting over-complex, Intel also made its own mistakes with the 286 (ref. Gates' reference to it being "brain-dead"). Motorola did seemingly realize its mistakes and removed some of those instructions later on, so I think some of these design issues could've been overcome? I don't think it was as complex as, say, VAX or iAPX 432. The 68060 was competitive with P5, at least.

Not sure you can say the 68060 was truly competitive with P5. It was extremely late to market and its clock speed was disappointing.

It wasn't new instructions that were the problem, it was new addressing modes, made available to all existing instructions. They were quite fancy. Stuff like (iirc) double indirection - dereference a pointer to a pointer. For many reasons (which Mashey gets into at some point) it's difficult to make high performance implementations of an ISA which generates anything more than a single memory reference per instruction. Despite all its ugliness, this is something x86 actually got right.
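
As a rough illustration (hypothetical function): the 68020's memory-indirect modes could fold a double dereference like the one below into a single instruction's addressing mode, i.e. one instruction generating two memory reads before it does any actual work.

code:
/* Fetch a pointer from a table, then read through it. On the 68020 this could
 * be a single MOVE using a memory-indirect addressing mode. */
int load_via_table(int **table, int index) {
    return *table[index];
}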

Motorola wasn't able to get rid of this stuff in 68K proper. Instead, they defined a cut-down version and called it a new and incompatible CPU architecture, ColdFire. I think this even extended to removing stuff from the baseline 68000 ISA - the idea was "let's review 68K and remove everything which makes it obviously not-a-RISC". It could not boot unmodified 68K operating systems.

Oddly enough, Intel got away with its 286 mistakes because they were so bad almost nobody tried to use them. The market generally treated the 286 as just a faster 8086. IIRC, OS support was limited to an early (and not very widely used) version range of OS/2. Maybe things would have moved eventually, but the 386 offered an obviously superior alternate idea for extending x86, at which point more or less everyone dropped all plans to use the 286 model.

Still, AFAIK, Intel has kept all the 286 weirdness hiding in the dusty corners of every CPU they've made. I think they've only recently started talking about removing some of the legacy stuff. It's very hard to subtract features from an ISA once they're fielded.

BobHoward
Feb 13, 2012


in a well actually posted:

It isn’t, really (cisc decoders etc. don’t eat that many joules relative to alu or cache.) A lot of the perf diff is that designs on the market for ARM are optimized for power efficiency, and also that Intel process teams ate poo poo for a decade while TSMC and Samsung got ahead.

One of the secrets to Apple’s efficiency is that since they are the purchasers of their cpus they can optimize design for most performant design over density for maximum cores per die.

I think you might be downplaying x86 decoder power a bit too much. One reason why uop caches are popular in x86 designs, and essentially nowhere else, is that cache hits allow decoders to go idle to save power.

But more significant than the decoders themselves are the implications for everything else. An ultra-wide backend would just get bottlenecked by the decoders it's practical to build, so Intel and AMD haven't explored the wider/slower corner of the design space. They've settled on building medium width backends with very high clocks, and that has consequences.

One is that x86 cores are enormous, probably thanks to deep pipelining. Look at these two annotated die photos of M1 Pro and Alder Lake chips, each with eight P cores.

[annotated die photos: M1 Pro vs. Alder Lake]

As you say, Apple is willing to burn area with reckless abandon. Their P cores are the big chunguses of the Arm world. And yet, they're still quite small relative to x86 P cores, even accounting for all the confounding factors in those die photos (m1 pro is about 250mm^2, the alder lake about 210, different nodes, etc).

BobHoward
Feb 13, 2012


hobbesmaster posted:

edit: I suspect this will render many in this thread speechless:

Not surprising to me at all, but then I've done some time working at a place that designed electronics for vehicles. Embedded electronics for harsh environments is a very different world.

For example, how many fast CPUs do you know of which are rated for operation down to -40C and up to at least +100C? These are common baseline requirements for automotive applications, even for boxes which live inside the passenger compartment rather than the engine bay.

Another: most consumer silicon disappears only two or three years after launch. But designing and qualifying electronics for the harsher environment inside a vehicle is expensive, so you don't want to constantly re-do it - you want to design something really solid and just keep making it for five years, or more. That narrows the list of components you can possibly use quite a bit.

BobHoward
Feb 13, 2012


phongn posted:

As I said, I am not arguing for a radical change like IA64, but wondering if something more than the "bolt on 64-bit to IA32" could be done, too. There is a continuum between those options. And sure, lots of ARMv8 processors are running in aarch32 mode. If anything that demonstrates that performant backwards compatibility with legacy code could be maintained while migrating to a nicer future?

Sure, it's technically possible. Lots of things are.

Would a cleaner break from x86 have been a market success? I have doubts. At the time, AMD was a very small player trying to punch above its weight, and Intel was the monopolist pushing a clean break from x86 in the form of Itanium. If AMD proposed its own new thing, it would have been an uphill battle. AMD needed to do something to differentiate their approach from Intel's. Designing it as an extension of IA32 rather than a replacement helped them get their foot in the door.

BobHoward
Feb 13, 2012


eschaton posted:

If Itanium had been a 64-bit RISC, or even a 64-bit equivalent of i860, it probably would have taken off. Instead it was The Bizarro CPU and while it was eventually able to get some serious throughput (my Itanium 2 VMS box does pretty well running FORTRAN) the compiler problem was grossly underestimated as a factor.

And how. Bob Colwell is probably a somewhat biased source, given that he was part of the x86 faction in Intel, but the following has the ring of truthiness because how else could a disaster like Itanium happen?

Robert P. Colwell Oral History posted:

Anyway, for some reason, there was an organizational meaning at which Albert Yu could not appear. He designated Fred Pollack, but Fred could not appear, so Fred designated me, and I showed up. So first of all I am two organizational levels down from who is supposed to be sitting there and I ended up sitting next to Gordon Moore. This was probably about 1994 or so. The presenter happened to be the same guy who was in the front of the car from when I interviewed with the Santa Clara design team; same guy. He's presenting and he's predicting some performance numbers that looked astronomically too high to me. I did not know anything about how they expected to get there, I just knew what I thought was reasonable, what would be an aggressive boost forward and what would be just wishful thinking. The predictions being shown were in the ludicrous camp as far as I could tell. So I'm sitting and staring at this presentation, wondering what are they doing, how is it humanly possible to get what he's promising. And if it is, is it possible for this particular design team to do it. I was intensely thinking about what's happening here. Finally I just couldn't stand it anymore and I put my hand up. There was some discussion, but you have to realize none of these people were really chip designers or computer architects, with the exception of Gelsinger and Dadi Perlmutter.

0:13:53 PE: Sorry Dadi

0:13:54 BC: Dadi Perlmutter, he's one of the executive VPs in charge of all the micros right now.

0:13:58 PE: D A D I

0:14:00 BC: Yeah, his real name is David, he’s an Israeli. Everybody calls him Dadi. And then Pat Gelsinger who was the chip architect, designer in 386 and 486. But most of those guys at this presentation haven't designed anything themselves, they know how to manage complicated large expensive efforts, which is a different animal. Anyway this chip architect guy is standing up in front of this group promising the moon and stars. And I finally put my hand up and said I just could not see how you're proposing to get to those kind of performance levels. And he said well we've got a simulation, and I thought Ah, ok. That shut me up for a little bit, but then something occurred to me and I interrupted him again. I said, wait I am sorry to derail this meeting. But how would you use a simulator if you don't have a compiler? He said, well that's true we don't have a compiler yet, so I hand assembled my simulations. I asked "How did you do thousands of line of code that way?" He said “No, I did 30 lines of code”. Flabbergasted, I said, "You're predicting the entire future of this architecture on 30 lines of hand generated code?" [chuckle], I said it just like that, I did not mean to be insulting but I was just thunderstruck. Andy Grove piped up and said "we are not here right now to reconsider the future of this effort, so let’s move on". I said "Okay, it's your money, if that's what you want."

BobHoward
Feb 13, 2012


Subjunctive posted:

I love this, and choose to believe it. major corporate strategy has been set on grounds much weaker than 30 lines of simulated instructions. where can I read more?

It's from this:

https://www.sigmicro.org/media/oralhistories/colwell.pdf

BobHoward
Feb 13, 2012


in a well actually posted:

Yeah vliw is like the worst unless you’re doing a dsp (or doing hand optimized science code); for general purpose servers you couldn’t choose a worse architecture*. I2 tried to fix the problems with the architecture by putting a shitload of bandwidth in including an astounding for the time 9 mb cache.

* anything that came to a commercial product; I’m sure some academics have done far worse

Have you ever done a deep or shallow dive into itanium? I did a shallow dive once (googled technical docs and skimmed for a while), and I can't say that I came out thinking it even qualifies as a VLIW.

Don't get me wrong, there's aspects which seem VLIW-inspired, but overall it seems like its own thing. They were trying hard to make something novel, I'll give them that much! Like eschaton said though, what they actually made was the Bizarro CPU. Everything's weird or bad or both, and not in a subtle way.

BobHoward
Feb 13, 2012


Rescue Toaster posted:

Are there any non-chinese companies leaning into RISC-V other than SiFive? I was getting really excited about RISC-V from a security perspective, having a new option not plagued by closed 'management engine' processors or whatever in god's name Pluton will be doing once it's integrated in all AMD & Intel x86 machines.

But at the end of the day you need someone to actually fab RISC-V silicon and making a choice not to infect it with sketchy poo poo. With SiFive cozying up to Intel I'm losing hope fast that there will actually be anything of decent performance (like with real virtualization features, for example) and actually be clean of bullshit.

Western Digital, for disk drive controllers. Probably not the answer you're hoping for.

It's not clear to me how RISC-V can escape from the deep embedded world. It's lacking in several areas compared to 64-bit Arm, and perhaps more importantly, the reason 64-bit Arm is making any headway against x86 in desktop computing is the giant boost it got from cell phones.

Pluton paranoia is generally a bit overwrought. https://mjg59.dreamwidth.org/58125.html

If you want a non-x86 personal computer with real virtualization features that's clean of bullshit, it's already here. Just buy an Apple Silicon Mac. There's nothing equivalent to SMM or TrustZone, meaning the OS doesn't have a hypervisor running above it stealing cycles to do random bullshit now and then. And Apple's secure boot is a very clean design with some novel features.

Most notably, every OS on the machine has its own independent boot security state, and the minimum state amounts to the user attesting to the machine "yes you should boot this binary because I, the computer's owner, trust it". When you do this it locally signs the binary and stores secrets in the Secure Enclave (Apple's TPM equivalent), making it possible to check for tampering at boot time.

Since it's able to check the integrity of a binary not signed by Apple, anyone can build a secure boot chain for a third party OS on top of Apple's infrastructure without asking Apple. This is a breath of fresh air compared to Secure Boot on the PC, which requires a vendor public key to be preloaded into the firmware before it will trust anything. Most PCs ship with only Microsoft public keys, so these days Linux distros have to hand Microsoft some money to get their bootloaders signed.

BobHoward
Feb 13, 2012


Kazinsal posted:

Did SiFive give up the ghost or something, because it seems like they actually had enough of a technical foothold to be able to poo poo out plans for RISC-V cores for everyone and their grandma on request five years ago but nobody has come up with a realistically useful RISC-V based appliance board yet.

Two GigE ports on some mysterious prototype from Shenzhen is better than what we've seen until now but it's still basically loving nothing. We're a year out from PC Engines shutting down because they can't get parts for their frankly archaic GX-412TC based board but at least that could do 8Gbps of firewalled throughput for the 4x1GbE ports on the board just fine.

Since they're giving up the ghost and the only apparent successor is fly by night Aliexpress brands making GBS threads out unconfirmed spec-adjacent RV64 boards, I have no idea what the plan is from here on out for low cost open source network edge machines. And as someone who is a network person professionally and an OSS network person personally, that concerns me.

SiFive's website is up and their corp twitter account has posted in the last few days, they seem to have a pulse.

I think you expected way too much if you thought SiFive alone was going to make RISC-V a huge thing in under a decade (or maybe ever). It's a company led by the same academics broadly responsible for the flawed (imo) RISC-V ISA spec. I have no doubt they can do real things, and possibly even good things, but competing head to head with relative giant Arm Holdings was and is a tall order.

I'm not just hot taking on the known deficiencies of the ISA and the tendency of academics to underestimate how difficult it is to ship things in the commercial world. It's just very hard to attack a successful incumbent in this space. Hardened Arm IP cores are available on every process node anyone wants to use, they're all debugged to hell and back, Arm's suite of non-CPU IP is extensive (which helps people bolt together SoCs fast), design technical support is excellent, and the software ecosystem is far more diverse and mature. If you're a SoC design house looking at which to use for an upcoming project, the extra cost of going with Arm buys you a lot less engineering risk, more options, and demonstrably better cores. Only thing RISC-V has going for it is price. Sometimes that's enough to win, sometimes it's not...

BobHoward
Feb 13, 2012


VostokProgram posted:

What are the problems with the ISA

sorry for the delayed reply. Here's a decent writeup someone did a while ago which covers a lot of it:

https://gist.github.com/erincandescent/8a10eeeea1918ee4f9d9982f7618ef68

arm64 is a much better designed 64-bit RISC ISA, IMO.

BobHoward
Feb 13, 2012


feedmegin posted:

Or even a Cortex-M0. Those things are like 1mm square these days, cheap as chips and you can do a lot with 16k of sram or whatever.
(Source: did the software side of this two jobs ago, writing, yes, a fairly complicated FSM sitting on an FPGA to orchestrate some hardware)

This but 1mm^2 is a massive overestimate. Arm says M0 needs 0.11mm^2 in a 180nm process (presumably TSMC, 25 years old and still commercially relevant), 0.03mm^2 at 90nm, and 0.008mm^2 at 40nm.

That's why modern SoCs end up with dozens of microcontroller-class cores - they're tiny and they reduce the risks involved in letting people like me try to write complex state machines in Verilog. Sometimes it's not even logic designer mistakes, it's that the state machine implements a specification defined by an outside standards body, and they decided to change it a bit after your tapeout.

Also, even when you bake a microcontroller's program into mask ROM, it's relatively cheap to patch in a stepping since you only have to change one or two metal layers. Another attraction is that mask ROM is a regular structure that's relatively easy to edit with a FIB (focused ion beam) machine - basically a microscope that can also cut and rewire structures on the die - so when you're bringing a chip up with buggy µC firmware you can test changes before finalizing metal layer changes for the next stepping.

So many choices in chip design are driven by optimizing the time and cost of fixing bugs found after you get first silicon back rather than optimizing purely for gate count, power, etc.

BobHoward
Feb 13, 2012


Hadlock posted:

Also most of the RISC-V base instruction set is based on the patented ARM Cortex M0 instruction set... whose patent has expired.

wat

Where'd you get this insane idea from? Certainly not from reading ISA manuals; there are few points of similarity beyond the bare minimum you'd expect between two ISAs that both get classified as RISCs. (We're talking big differences in some pretty fundamental design choices here - think stuff as basic as "the number of general purpose registers" and "how you decide whether to branch".)

Also the stuff you wrote about China makes no sense to me.

BobHoward
Feb 13, 2012


Hadlock posted:

It's specific to arm3. I'll have to dig it up later today

It's going to be a waste of time.

Two universities in Northern California, UC Berkeley and Stanford, started the big RISC revolution in the early 1980s. RISC-V is named the way it is because its creators see it as the "fifth major RISC ISA design from UC Berkeley" (see the footnote at the bottom of the "Introduction" page).

ARM got its start eight timezones away on another continent. Its designers faced severe constraints which led them to create something that stood apart from the rest of the 1980s RISCs. Early ARM was such an oddball that it hasn't had much influence on modern ISA designs, including Arm's own: the 64-bit Arm ISA is a nearly clean break from its own past.

Do you start to see why I'm reacting like this? The ISAs are observably quite different, nobody sane would even want to crib from ARMv3 (except perhaps in extremely narrow ways), the backgrounds of RISC-V's architects wouldn't lead them to take much from ARM, and said architects have confirmed that their work most closely derives from the Northern California RISC tradition.

BobHoward
Feb 13, 2012

I think I heard basically the same story about Loongson and MIPS, but am equally fuzzy. If wikipedia's article on Loongson is to be trusted they did get some form of official licensing at some point, but are now off doing their own thing again, where "their own thing" still appears to be MIPS with the serial numbers filed off.

I've never gotten the impression that Loongson processors are widely used, even in China. It feels like one of those cases where the Chinese government is throwing money at trying to develop a domestic version of a key technology, but the outfit they're pumping money into is maybe a bit more about graft than doing anything effective.

BobHoward
Feb 13, 2012

Q64-22 is only 69W according to Ampere, so it'd do fine in an ATX tower. I'm guessing this board exists because people working on software to deploy to arm64 cloud servers need development & test workstations.

BobHoward
Feb 13, 2012


Hadlock posted:

RISC-V is almost exclusively supported by Linux, so it's (probably) just a RISC-V Linux binary and talks to the kernel for audio video bindings. Presumably whatever custom kernel was compiled for the device should have adequate GPU support and whatnot. Ubuntu has had official support for RISC-V since the April 2020 release so presumably by the time this ships RISC-V will have been mainline for 4 years

I'm increasingly of the opinion that the US government missed the boat on containing Chinas CPU capacity by about five years

I'm increasingly of the opinion that you have a weird boomer-like obsession with China, including Clancy-esque ideas about somehow preventing them from developing domestic CPU design capability. Also that you just don't understand how CPU design works, which ties in with actually believing the Clancy nonsense.

RISC-V hasn't helped China at all. It's not even a very good ISA, but more importantly, ISA is the least important thing here. x86 may be sitting a bit uneasy on its throne, but for now it still rules the PC world despite being an objectively bad ISA design.

What matters is putting together a team of talented and experienced design engineers who can make a high quality, low power, and fast implementation of any ISA. The experience part of that takes iteration (AKA practice), same as anywhere else in the world.

There is no way to stop China from practicing. Even if it was necessary for them to practice with the best commercially relevant ISA (it's not), they could always just buy an Arm architectural license. By the time they have one or more world class implementation teams, they can worry about designing their own RISC ISA, and retarget for that.

Of course when I say "they" about China, I'm falling victim to the weird narrative you're forcing on things, in which China is a unitary executive. What's actually going on is that China is a capitalist state. Yes, it has an authoritarian single-party government that kinda pretends to still be communist, but it's got a ton of very rich people who privately own a lot of the means of production. Some of these capitalists are trying to break into CPUs. They often get some support from the Chinese government, in fact some of them seem to be entirely propped up by it (something we do for strategic industries in the West too), but ultimately they are capitalists. They want to make chips which people will buy. Lots of the market for CPUs is outside of China. Hence the interest in an up-and-coming ISA which might appeal to the West, yet does not require paying licensing fees or unit royalties to Western companies.

That's it, that's the main reason Chinese companies like RISC-V. It's just markets and money. As far as I know there's no Chinese company with a world-class ISA design or CPU implementation team yet, but it's a fool's errand to try to stop it from happening.

BobHoward
Feb 13, 2012


ConanTheLibrarian posted:

What does and does not contribute to a good ISA design? Is it the inevitable fate of any sufficiently old ISA to accumulate enough cruft to bog it down?

It doesn't have to be inevitable; a good steward can avoid putting dumb things in.

x86 didn't need decades of cruft accumulation, though. It had bad things from the word go. Some of the original design flaws kinda got shoved off into a corner by the 386, so long as you ran only 32-bit software. (Something which took more than the market lifespan of the 386 to actually achieve since Microsoft was so incredibly slow to fully take advantage of the 386.) Others are still with us today.

BobHoward
Feb 13, 2012


Subjunctive posted:

unrelated: did anyone collate all the RISC-V bashing from this thread anywhere? I’d love to have it in one place

Here's two links I've collected, one old and one from a month ago.

https://gist.github.com/erincandescent/8a10eeeea1918ee4f9d9982f7618ef68
https://queue.acm.org/detail.cfm?id=3639445

The second isn't exclusively about RISC-V and contains little outright bashing, but given who the author is, when he critiques RISC-V, it hits hard.

BobHoward
Feb 13, 2012

The Oryon CPU was designed by a team led by Gerard Williams III, Apple's former CPU lead. He left Apple in 2019, after working on the CPU designs up through A14/M1, because he wanted to make a server chip using Apple's CPU cores but got shot down by other Apple execs. So he founded a startup, Nuvia, and managed to recruit some of Apple's CPU engineers. But then he accepted Qualcomm's offer to buy Nuvia and now he's back in consumer products.

There's lots of hot takes out there on this, people think losing Williams to QC means QC's going to blow Apple away. However, great man theory sucks. CPU design teams are a lot more than one senior manager, or even the subset of engineers he got to leave with him.

There's also all the intellectual property - Nuvia obviously didn't get to legally take any of Apple's, and a lot of Apple's success during Williams' tenure resulted from steady incremental improvements rather than every CPU generation being a clean slate. Starting over from scratch implies a lot of extra work.

I wouldn't be surprised if at least some of the brains departed with some :filez: too, just because this always happens to some extent. However, shipping any of it in Qualcomm products is risky, as Apple absolutely will be dissecting everything to see if there's any excuse to sue. They already tried suing Williams for various things, and that was when Nuvia wasn't even going to be a direct competitor.

So there's reasons to expect that QC's new CPUs should be good. Can they beat M3 or M4? I have doubts; reaching parity with something as refined as Apple's Arm core in a single generation is a tall order even if Williams got literally all the key people.

BobHoward
Feb 13, 2012

I wonder how much 88K Macintosh hardware is out there.

(for those who don't know, when Apple was investigating options for transitioning away from 68K, 88K was one of them, and it got as far as them manufacturing a bunch of prototype 88K Macs for software development work.)

Actually I wonder how much 88K hardware is out there at all. Not a wildly successful ISA!

BobHoward
Feb 13, 2012


minidracula posted:

From what I understand when I last looked into it, MVME was probably the most manufactured and deployed m88k form factor? Data General also originally built AViiON systems on m88k, before switching to x86 (Pentium-era, initially, I think). There were some other small scale users, OMRON's LUNA being another, mostly in Japan, and some use in telcos, etc. (Nortel had some use of m88k in some part/version/edition of DMS at one point). Beyond that I'm not sure. I know some CMU folks used m88k for some Mach projects. The sense I got was once AIM "took off" and settled on PowerPC, m88k was well and truly dead inside Motorola, and it had already had a late start compared to SPARC and MIPS in the RISC space of the era, etc., etc.

I didn't know about the prototype Mac m88k HW at all, or that they even did that!

If wikipedia is to be believed, m88k was a product for just 3 years, 1988 to 1991. 1991 was about when the AIM alliance formed up, and yes, PowerPC absolutely killed m88k - it hadn't gotten much adoption and PowerPC had a built-in volume customer.

Take a look at this CHM history page, which is about Gary Davidian's m68k emulator projects at Apple. It has some pictures of m88k Mac hardware - a Mac LC box with the 3-chip original generation m88k stuffed in it.

https://computerhistory.org/blog/transplanting-the-macs-central-processor-gary-davidian-and-his-68000-emulator/

The oral history interviews with Davidian are neat and a big chunk does concern m88k. Hard to judge from it how many 88K machines they actually built, but probably not many - sounds like the project proved itself in that 68K emulation on 88K worked well, but then the AIM deal happened and swept away any need to distribute m88k Macs to a bigger team.

BobHoward
Feb 13, 2012

I've been listening to this 2021 Twitter Spaces recording that was a retrospective / requiem for SPARC, made by a bunch of ex-Sun/Oracle people.

https://www.youtube.com/watch?v=79NNXn5Kr90

It's a bit scattershot but fascinating. Lots of 'lmao our CPUs were so poo poo and doomed'. They confirmed one of my gut reactions in a big way - I've never personally done anything with SPARC, but from a distance the register windows always looked like a terrible idea. Turns out that lots of the insiders think they were bad too.


BobHoward
Feb 13, 2012


Ceyton posted:

edit: the cheaper Artix-7 XC7A200T FPGA will probably work too, but beware of exchange rates between Ultrascale+ LUTs and 7-series LUTs. It would be nice if you had access to Vivado/Quartus so that you could do trial synthesis runs with different FPGAs and find the cheapest one that works. But if you don't have access to them through your work/school, non-locked-down licenses are not cheap

You can generate free Vivado licenses for "WebPack" Xilinx parts, AKA the devices small and cheap enough that they don't want the cost of tools to be a barrier to people designing with them. I don't remember if the XC7A200T is a WebPack part, though.

I don't think there's any significant difference between US+ and 7-series LUTs, btw? It's still a 6-input LUT architecture with roughly the same non-LUT resources. US+ routing and clock trees are much improved, though, which makes timing closure much easier.
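
(For anyone wondering what "6-input LUT" means in practice, here's a purely illustrative C model, not vendor code: a LUT is just a 64-entry truth table indexed by its six inputs, and the synthesis tools pick the init value.)

code:
#include <stdint.h>

/* Model of a 6-input LUT: 'init' is the 64-bit truth table, 'inputs' packs the
 * six input bits into the low bits of an integer. */
static inline int lut6(uint64_t init, unsigned inputs) {
    return (int)((init >> (inputs & 0x3F)) & 1u);
}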
