  • Locked thread
rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
arm64 is a really nice isa. in contrast, risc-v is garbage trash for idiots

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

ate all the Oreos posted:

itanium had this great idea that like, man, what if the processor could do 6 instructions at once mannn

turns out that's actually a bad idea because figuring out which instructions you *can* do simultaneously at any given point in a program is real hard unless you specifically code poo poo with this special snowflake platform in mind which nobody's gonna do

modern cpus are great at doing a million things at once, that's part of the problem. instruction bundles are like branch delay slots, a short-sighted attempt to work around microarchitecture limitations that engineers were already fixing by the time the hardware came out. vliw lets you trivially perform a couple operations at once, but a modern cpu wants to be doing way more in parallel than that, so it's going to end up scheduling arbitrary operations and resolving data dependencies and speculating all over the place, at which point what exactly are you getting from vliw? meanwhile there will always be places where the compiler is just like, well gently caress i need to do a poo poo-ton of loads here and i've got pretty limited flexibility to reschedule them so i guess i'm not filling all these bundles and code size is going to hell

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

The Management posted:

but that's garbage and actively defeats modern microarchitectures.

x86 is a terrible isa but it's not because it's got a million instructions, it's because it's a variable-length unaligned format with a ton of prefixes and suffixes and special case operand encodings many of which vary depending on the current processor mode, so you can't even figure out where the next instruction starts without fully decoding the current one which is extremely complex to do

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

this is a really good read, thank you

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Mr.Radar posted:

if you don't mind could you elaborate on why riscv is trash?

the core isa is super-riscy in that "yay we can run a lot of instructions now which is important because it'll take us twice as many instructions to do anything" sort of way. they made sure they covered all the basic c operations but anything even closely related like add-and-test-overflow or add-with-carry is impossible to do efficiently with the base instructions. but mostly it's like, just read the instruction specifications and you'll see all sorts of bizarre and wasteful crap
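
to make the add-with-carry complaint concrete, here's a rough sketch in python that emulates 64-bit registers and does the add using only a plain add, an unsigned less-than compare and an or, which is roughly what the base instructions give you. the function name and the instruction mapping in the comments are my own illustration, not something taken from the spec or a real compiler

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit registers (assumes RV64)

def add_with_carry(a, b, carry_in):
    """Hypothetical helper: 64-bit add with carry-in and carry-out,
    built only from add, unsigned compare and or."""
    s1 = (a + b) & MASK64          # add: wraps like a 64-bit register
    c1 = 1 if s1 < a else 0        # sltu: the sum wrapped iff it is below an input
    s2 = (s1 + carry_in) & MASK64  # add: fold in the incoming carry
    c2 = 1 if s2 < s1 else 0       # sltu: did adding the carry wrap?
    return s2, c1 | c2             # or: at most one of the two carries can be set

low, carry = add_with_carry(MASK64, 1, 0)
print(hex(low), carry)
```

that's five operations per limb where an isa with a carry flag does it in one add-with-carry instruction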

like the branch-immediate instruction (jal) has a 20-bit immediate operand, but it's stored in a crazy order where 0bTSRQPONMLKJIHGFEDCBA is actually reordered as 0bTJIHGFEDCBAKSRQPONML for, as far as i can tell, no reason at all. instructions are 32-bit so the pc is generally required to be 4-byte-aligned but the immediate offset is only implicitly multiplied by 2 so the instruction only has ±1MB range instead of ±2MB. instead of burning 1 bit on branch vs. branch-and-link it burns 5 bits so that you can use an arbitrary gpr as the link register (but you won't get return-address prediction unless you use x1), which i can kind of imagine ways to use but not for anything important enough to justify dropping 4 bits from the range of this instruction. and really it should be 5 because both of these are super-common instructions and it's worth burning a second opcode on them
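
if the letter soup is hard to follow, here's a small python sketch of the shuffle. it packs a byte offset into instruction bits 31:12 in the order imm[20], imm[10:1], imm[11], imm[19:12] (T, then J..A, then K, then S..L in the letters above) and unpacks it again. the function names are made up for illustration, and it only handles the immediate field, not the opcode or the link register

```python
def encode_jal_imm(offset):
    """Hypothetical helper: pack a byte offset into instruction bits 31:12."""
    # offsets are multiples of 2 and must fit in a signed 21-bit value (±1MB)
    assert offset % 2 == 0 and -(1 << 20) <= offset < (1 << 20)
    imm = offset & 0x1FFFFF                  # 21-bit two's complement, bit 0 is 0
    return ((((imm >> 20) & 0x1) << 31)      # imm[20]    -> bit 31
            | (((imm >> 1) & 0x3FF) << 21)   # imm[10:1]  -> bits 30:21
            | (((imm >> 11) & 0x1) << 20)    # imm[11]    -> bit 20
            | (((imm >> 12) & 0xFF) << 12))  # imm[19:12] -> bits 19:12

def decode_jal_imm(insn):
    """Hypothetical helper: recover the signed byte offset from a jal word."""
    imm = ((((insn >> 31) & 0x1) << 20)
           | (((insn >> 21) & 0x3FF) << 1)
           | (((insn >> 20) & 0x1) << 11)
           | (((insn >> 12) & 0xFF) << 12))
    # sign-extend from bit 20
    return imm - (1 << 21) if imm & (1 << 20) else imm

print(hex(encode_jal_imm(2048)))             # offset bit 11 lands in instruction bit 20
print(decode_jal_imm(encode_jal_imm(-2)))
```

the range assert is the ±1MB limit from the post: 20 stored bits, scaled by 2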

ok, next. the branch-register instruction (jalr) takes a 12-bit immediate offset. that offset is not scaled at all. the lowest bit of the target address is defined to be ignored, but not the lowest two bits so this can still fail dynamically from mis-alignment. as far as i can tell this immediate exists solely because they wanted to use a two-operand instruction format; i am really blanking on what it would be used for. the spec suggests it could be used to implement fast library calls by doing absolute branches to ±2KB, which is like, yes please let me just give memcpy a small integer absolute address, i am doing research into how easy i can make it to write security exploits
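
for reference, the jalr target computation described above looks roughly like this in python. this is a sketch assuming 64-bit registers, with made-up function names: sign-extend the 12-bit immediate, add it to the register value unscaled, then clear only the lowest bit

```python
MASK64 = (1 << 64) - 1  # assumes RV64; use a 32-bit mask for RV32

def sign_extend_12(imm12):
    """Interpret the low 12 bits of imm12 as a signed value."""
    imm12 &= 0xFFF
    return imm12 - 0x1000 if imm12 & 0x800 else imm12

def jalr_target(rs1_value, imm12):
    """Hypothetical helper: (rs1 + sext(imm)) with bit 0 cleared."""
    return (rs1_value + sign_extend_12(imm12)) & MASK64 & ~1

# bit 0 is dropped but bit 1 is not, so the target can still be misaligned
print(hex(jalr_target(0x1000, 3)))
# with the zero register as the base, only addresses within 2KB of 0 are reachable
print(hex(jalr_target(0, 0x7FF)), hex(jalr_target(0, 0x800)))
```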
