phongn
Oct 21, 2006

BobHoward posted:

I just did a bit of google searching on "Palo Alto Shipping Company" and apparently that was a little Forth startup. Forth is such a fascinating bit of computing history; it (or maybe Chuck Moore's genius) convinced a small dedicated following that it could and should be the basis for everything, but realistically Forth had no chance of actually doing that.
It did strongly influence PostScript, right?

phongn
Oct 21, 2006

ExcessBLarg! posted:

But what if Intel hadn't announced Itanium and exclusively pushed along x86 designs? Would there have been any significant difference in the outcome? I don't know how much the DEC/Compaq sale was the result of Itanium's announcement--maybe DEC would've held on and tried to compete against Intel in the server market?

It just seems to me at the end of the day that other vendors couldn't afford to build fabs to outbuild Xeons, and by the early 00s the underlying architecture didn't really matter so much as process node and yields.
There's a nice thread by John Mashey on why DEC ultimately abandoned VAX (a very complex instruction set): Intel and AMD had huge volume that fed back into a virtuous cycle, and all the other guys didn't, while each generation of CPU became more and more expensive to design. HP's PA-RISC, SGI's 'big' MIPS, Sun/Fujitsu SPARC, DEC Alpha: not one of them had the volume to compete.

PowerPC tried to hold on a bit with Apple, but the collapse of the PowerPC Reference Platform meant it would be forever volume-constrained, too. Neither Motorola nor IBM could possibly hope to compete with x86/x86-64 on volume. The biggest volume drivers, the consoles (XB360, PS3, GameCube/Wii), were all using relatively constrained processor designs as well, not the high-performance ones Apple demanded. Even the supercomputers were mostly massively parallel machines built from slower designs. Some of the IBM big iron used fast POWER designs, but those had no application to the wider market and there were never that many of them.

Who does have the volume to compete? The same story that let x86 win: something sold in enormous quantities no matter how slow, driving a virtuous investment cycle. That something is ARM.

JawnV6 posted:

you can talk about dedicating resources to the high end, and I distinctly recall Otellini talking up "if they sell 100 smartphones, we sold a $600 server core to the backend" explanation for why getting eaten alive from below was fine and dandy, but that was the big bet in my recollection

in a well actually posted:

That Intel decided the high margin low volume / low margin high volume split model for leading edge fab utilization economics that worked so well for servers/desktops wouldn’t be threatened by someone doing that with phones in volume was puzzling.

Sure, they still sell a lot of desktop chips (at probably higher margins than phone chips) but if TSMC (and Samsung) didn’t have customers and volume they’d have a hard time building leadership fabs.

Otellini considered turning Apple down on mobile to be his biggest mistake. They could've had the 100 smartphones and the $600 server core on the backend. Intel then thought it could make up for lost time with process superiority by shrinking Atom, and was never able to make it work.

Worse, it meant that TSMC would receive a flood of money from Apple, which eventually let them take the lead in process technology (assisted by Intel's failed bet on cobalt in its 10nm process). Intel's commanding position was built entirely on process superiority, and that's now gone. At best they'll probably be able to compete with TSMC again if they can execute well over the next few years.

phongn fucked around with this message at 22:20 on Oct 3, 2022

phongn
Oct 21, 2006

feedmegin posted:

That's not what that post says at all. They could not keep up with RISC chips in the late 80s, which is why they moved to the DEC Alpha. Nobody in DEC was worried about 386's and poo poo in the server market yet. Not least because it would be a good decade and a half before 64-bit x86 was a thing.
Note also he says "INTEL AND AMD CAN MAKE FAST X86S BECAUSE THEY HAVE VOLUME." This applies both to why DEC could not, and would not, scale VAX to a high-speed out-of-order CISC design, and to why all the workstation RISC designs ultimately failed to compete with the ugly duckling x86 (and later x86-64). Does the quoted thread explicitly say it? No. Does the same lesson apply? Yes.

ExcessBLarg! posted:

I think it's generally accepted that RISC designs were outpacing CISC designs until they reached the knee in the curve where building a processor around a RISC-ish internal architecture and an instruction set translator (or even just straight-up software emulation) became competitive again. For x86 that happened with the P6/Pentium Pro? I'm not really an architecture guy.
Yes, it began with the P6. The Pentium Pro was a real shock to the RISC guys, because it was pretty damned fast despite being CISC. Decreasing transistor costs meant that implementing complicated decode stages became feasible (unless you want something really low power).
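
To make the translator idea concrete, here's a minimal sketch in Python (a made-up text form of instructions, not real x86 encodings or actual P6 micro-ops): the front end cracks a register-memory ADD into a load micro-op plus a simple register-register op, and it's those simple micro-ops that the RISC-like core schedules and executes.

code:

# Toy illustration of CISC-to-micro-op cracking (hypothetical formats,
# not real x86 or P6 internals): a register-memory instruction becomes
# a LOAD micro-op followed by a simple register-register micro-op.
def crack(insn: str) -> list[str]:
    op, dst, src = insn.replace(",", "").split()
    uops = []
    if src.startswith("["):                # memory source operand?
        uops.append(f"LOAD  tmp0, {src}")  # bring the operand into a temporary
        src = "tmp0"
    uops.append(f"{op:<5} {dst}, {src}")   # now a simple reg-reg operation
    return uops

print(crack("ADD eax, [ebp+8]"))
# ['LOAD  tmp0, [ebp+8]', 'ADD   eax, tmp0']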

quote:

I'm sure DEC engineers in 1986 considered that and felt it was infeasible and just pushed on with a new architecture. Maybe they didn't expect VAX to have such a long tail in the market.

Now, if they had obstinately stuck with VAX through the 90s, could they have considered that? Or is there something specific about the VAX architecture that makes it wildly difficult to implement that way compared to x86 or 68k?
In the Mashey post I linked earlier, if you search for "PART 3. Why it seems difficult to make an OOO VAX competitive" there's a pretty thorough explanation of why it would be difficult to implement a 'modern' VAX that decoded into micro-ops a la x86. They also suspected that many instructions would still have to be implemented in microcode. DEC just didn't have the resources to tackle all of these problems at once, and went for the hail-mary move of a clean-sheet new architecture.

As ugly as it was, x86 as it stood then was substantially simpler and easier to break down, and it benefited from high sales volume to keep the money firehose going. And of course, AMD grafted on a 64-bit extension that was inelegant but more or less worked and was easy to port a compiler to.

As an aside, I kinda wish that IBM had chosen the M68000 instead of the 8088 for the original IBM PC; it was a much cleaner design with a vastly nicer orthogonal ISA. Some people have even made an out-of-order, 64-bit version.

phongn fucked around with this message at 20:35 on Oct 4, 2022

phongn
Oct 21, 2006

BobHoward posted:

If you're what-ifing that, you have to change how the ISA evolved. 68K lost badly to x86 in the mid-1980s, not just the early 80s when IBM selected the 8088 because it was available and cheap at a time when the 68K was neither.

68K took a sharp turn for the worse with the 68020. Motorola's architects got blinded by that orthogonality and beauty and tried to continue the old "close the semantic gap between assembly and high level languages" CPU design philosophy that had led to what we now call CISC. The changes they made were all very pretty on paper, but made it hard to design chips with advanced microarchitectural features. This played a part in 68K falling well behind instead of keeping pace with x86.

(Apollo manages to be OoO because it's a bunch of Amiga cultists with no completely agreed upon project goal other than making something they think is cool to run AmigaOS on. With no commercial pressures, you don't have to simultaneously worry about things like clock speed and power, which makes it easier to do OoO just because you can.)

You can learn more by finding more old Mashey usenet posts! He had a neat series breaking down what makes a RISC a RISC, down to detailed tables comparing ISA features. x86 ends up being substantially closer to RISC than 68020, and in one of the most important ways (addressing modes).
I know there were a lot of sound business reasons for IBM picking the Intel processor, not least price, second-source availability, etc. I just know the 68K was a candidate, and for all its later ISA faults it had a lot going for it that wouldn't really appear on the Intel side until the 386. Not having to deal with all the different x86 memory models from the start would've been nice (though of course 68K had its own problems, with developers using the upper address byte because it "wasn't used" at first). Not having to deal with the weird x87 stack-based FPU would also have been nice.

While the 68020 started getting over-complex, Intel also made its own mistakes with the 286 (see Gates famously calling it "brain-dead"). Motorola did seemingly realize its mistakes and removed some of those instructions later on, so I think some of these design issues could've been overcome? I don't think it was as complex as, say, VAX or iAPX 432, and the 68060 was competitive with the P5 Pentium, at least. As for Apollo, I know it's made by Amiga fanatics and not a 'real' design with real commercial constraints; it's just kind of a neat project. There are people who dream of the WDC 65C832, too, for the sole reason that they liked the accumulator-style MOS 6502.

(I've read a good amount of Mashey's posts on yarchive; I actually discovered that site first for all its rocketry tidbits).

phongn fucked around with this message at 22:50 on Oct 4, 2022

phongn
Oct 21, 2006

I suppose my fondness for the M68K is because it was my first assembly language (early CS courses; the advanced courses used MIPS) and it powered a bunch of systems I have only good memories of (Macintosh, TI-89, etc.).

I wonder if Intel also avoided some of those mistakes (286 aside) because they were going to do all the super-CISC close-language-coupling in iAPX 432 instead.

phongn
Oct 21, 2006

The variable-length instructions of x86 make decoding more complicated than the fixed-length instructions on ARM. As BobHoward noted, x86 decoders are more of a bottleneck, and this is one reason why.
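
As a rough illustration (a toy encoding I made up, nothing like real x86 or AArch64): with fixed-length instructions every boundary is known up front, so a wide decoder can hand bytes to parallel decoders immediately, while with variable-length instructions you only learn where instruction N+1 starts after you've at least partially decoded instruction N.

code:

# Toy model of instruction-length decode (hypothetical encodings, not real
# x86 or AArch64). Fixed length: every boundary is known immediately.
def fixed_length_boundaries(code: bytes, width: int = 4) -> list[int]:
    return list(range(0, len(code), width))

# Variable-length toy scheme: the low two bits of the first byte give the
# length (1-4 bytes). Real x86 is far messier (prefixes, ModRM, SIB,
# immediates), but the point stands: the next start depends on this decode.
def variable_length_boundaries(code: bytes) -> list[int]:
    boundaries, pc = [], 0
    while pc < len(code):
        boundaries.append(pc)
        pc += (code[pc] & 0b11) + 1  # must look at this byte to find the next
    return boundaries

stream = bytes([0x03, 0, 0, 0, 0x01, 0, 0x00, 0x02, 0, 0])
print(fixed_length_boundaries(stream))     # [0, 4, 8]
print(variable_length_boundaries(stream))  # [0, 4, 6, 7]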

Unfortunately, x86-64 was built to make it easy to port x86 compilers over, and so it kept some of the old ugliness like variable-length instructions, added relatively few named registers, etc. AArch64 instead went with a cleaner slate when ARM rethought the ISA.

phongn
Oct 21, 2006

JawnV6 posted:

ILD is not a big problem. like the theoretical worst case is a bubble or two and there's a tradeoff with some fast path logic. but it's really not That Bad like everyone acts.
And yet M1 has gone wider than any x86-64 microarchitecture, and without much trouble feeding that extra-wide design?

quote:

right, right, it was a total mistake for x86-64 to not burn it all down and start from a totally fresh ISA.

how's itanium doing anyway??
AMD made the right decision given the market at the time, which was to make it easy for everyone to port over existing IA32 compilers. It also meant they brought in a lot of old cruft that could perhaps have been rethought. My (surely obvious) point was that a more aggressive ISA design might have been possible, given the example of ARMv8's change from ARMv7.

Why bring IA64 into this except as a strawman?

phongn
Oct 21, 2006

If anyone wants to see a few fun thoughts, Cliff Maier, who worked on both K6 and K8, more or less bums around here (and sorta on MacRumors) and has nice little insights on how the sausage gets made.

He is (more than a bit) biased about Intel and AMD, so take what he says with some grains of salt, but it's not unlike reading yarchive's CPU section.

phongn
Oct 21, 2006

hobbesmaster posted:

It’s not a strawman, it’s an example of how a big push for a radical change from x86 was unlikely to gain market acceptance.

Even with the example of arm, there’s a lot of armv8 processors running in the aarch32 execution state out there.
As I said, I am not arguing for a radical change like IA64, but wondering if something more than "bolt 64-bit onto IA32" could have been done. There is a continuum between those options. And sure, lots of ARMv8 processors are running in aarch32 mode; if anything, that demonstrates that performant backwards compatibility with legacy code can be maintained while migrating to a nicer future?

phongn
Oct 21, 2006

JawnV6 posted:

hahaha c'mon are you doing fixed-length decode or not?? this is acting like you can do both trivially instead of sharing those resources, either I'm selling a really lovely 64-bit chip with dead transistors leaking power or I'm not running 32-bit programs. neither would have sold well!
Nah, that's fair: got in over my head. BobHoward described things pretty well.

phongn
Oct 21, 2006

eschaton posted:

If Itanium had been a 64-bit RISC, or even a 64-bit equivalent of i860, it probably would have taken off. Instead it was The Bizarro CPU and while it was eventually able to get some serious throughput (my Itanium 2 VMS box does pretty well running FORTRAN) the compiler problem was grossly underestimated as a factor.
Wasn't i860 also a VLIW processor with a bunch of compiler-dependent scheduling and pipelining voodoo? DeMone at RWT wrote 22(!) years ago that the promises of IA64 reminded him of Intel's overblown ones for i860 years before.

in a well actually posted:

Yeah vliw is like the worst unless you’re doing a dsp (or doing hand optimized science code); for general purpose servers you couldn’t choose a worse architecture*. I2 tried to fix the problems with the architecture by putting a shitload of bandwidth in including an astounding for the time 9 mb cache.

I recall reading papers where the architects thought that the enormous transistor budgets going to out-of-order execution could not continue to scale, and that those transistors would be better spent on a huge number of named registers plus magical compiler powers to explicitly schedule highly-threaded code. As you note, it ended up working very well for online transaction processing, various database tasks, and hand-written HPC code, and atrociously for typical branchy, pointer-chasing business logic.

Intel eventually had to back away from pure VLIW/EPIC: the Poulson microarchitecture put dynamic scheduling and out-of-order execution (and SMT) back in, but by then it was rather too late.
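
To give a feel for why the compiler-scheduling bet fell over on that kind of code, here's a toy Python model I made up (nothing like real IA-64 semantics; it ignores issue width, bundles, register pressure, branches, etc.): when a load unexpectedly misses in cache, a compiler-ordered in-order stream stalls everything queued behind the load's consumer, while a dynamically scheduled core keeps the independent work flowing.

code:

# Toy comparison (made up, not IA-64): statically ordered in-order issue vs.
# dynamic out-of-order issue when a load misses in cache at run time.
from dataclasses import dataclass

@dataclass
class Op:
    name: str
    deps: tuple      # names of producer ops this one waits on
    latency: int     # actual latency in cycles at run time

def in_order_cycles(ops):
    # An op can't issue before the op ahead of it in the compiler's order,
    # so a stalled consumer holds up even independent work behind it.
    finish, last_issue = {}, 0
    for op in ops:
        issue = max([last_issue] + [finish[d] for d in op.deps])
        finish[op.name] = issue + op.latency
        last_issue = issue
    return max(finish.values())

def out_of_order_cycles(ops):
    # An op issues as soon as its producers finish, regardless of order
    # (issue width, ROB size, etc. ignored for simplicity).
    finish = {}
    for op in ops:
        start = max((finish[d] for d in op.deps), default=0)
        finish[op.name] = start + op.latency
    return max(finish.values())

prog = [Op("load_x", (), 30),            # pointer chase that misses in cache
        Op("use_x", ("load_x",), 1)]     # compiler put its use right after
prev = None
for i in range(20):                      # a chain of unrelated single-cycle work
    prog.append(Op(f"indep_{i}", (prev,) if prev else (), 1))
    prev = f"indep_{i}"

print(in_order_cycles(prog), out_of_order_cycles(prog))  # 50 vs 31 cycles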

quote:

* anything that came to a commercial product; I’m sure some academics have done far worse
The Mill guys?

phongn fucked around with this message at 05:29 on Jan 21, 2023

phongn
Oct 21, 2006

A bunch of microprocessor guys (and Linus Torvalds) hang out on the forums at https://www.realworldtech.com, but the main site is a shadow of what it once was. Chips and Cheese feels like its spiritual successor.

Ars' deep-dive guys are long gone (including the guy mentioned above). A number of people left to form The Tech Report when Ars shifted to become more mainstream; many of TR's people have since left for industry, and it too withered.

phongn fucked around with this message at 23:55 on Jan 18, 2024
