|
arm64 is a really nice isa. in contrast, risc-v is garbage trash for idiots
|
# ¿ May 12, 2017 19:54 |
|
|
# ¿ May 16, 2024 06:47 |
|
ate all the Oreos posted:
itanium had this great idea that like, man, what if the processor could do 6 instructions at once mannn

modern cpus are great at doing a million things at once, that's part of the problem. instruction bundles are like branch delay slots: a short-sighted attempt to work around microarchitecture limitations that engineers were already fixing by the time the hardware came out. vliw lets you trivially perform a couple operations at once, but a modern cpu wants to be doing way more in parallel than that, so it's going to end up scheduling arbitrary operations and resolving data dependencies and speculating all over the place, at which point what exactly are you getting from vliw? meanwhile there will always be places where the compiler is just like, well fuck, i need to do a shit-ton of loads here and i've got pretty limited flexibility to reschedule them, so i guess i'm not filling all these bundles and code size is going to hell
|
# ¿ May 12, 2017 20:22 |
|
The Management posted:
but that's garbage and actively defeats modern microarchitectures.

x86 is a terrible isa, but it's not because it's got a million instructions. it's because it's a variable-length unaligned format with a ton of prefixes and suffixes and special-case operand encodings, many of which vary depending on the current processor mode, so you can't even figure out where the next instruction starts without fully decoding the current one, which is extremely complex to do
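
to make the "where does the next instruction start" point concrete, here's a toy sketch. the variable-width format below is completely made up (it is not x86, which is far messier); the only thing it's meant to show is that with a fixed width you know every instruction start up front, while with a variable width you have to look at each instruction before you can find the next one.

```python
def fixed_width_starts(code: bytes, width: int = 4) -> list:
    """Fixed-width ISA: instruction starts are just 0, width, 2*width, ..."""
    return list(range(0, len(code), width))


def variable_width_starts(code: bytes) -> list:
    """Hypothetical toy format (NOT x86): the low 2 bits of an instruction's
    first byte hold its length minus one, so lengths run from 1 to 4 bytes."""
    starts = []
    pc = 0
    while pc < len(code):
        starts.append(pc)
        # We must inspect this instruction before we know where the next begins.
        pc += (code[pc] & 0x3) + 1
    return starts
```

the fixed-width version can hand all the offsets to parallel decoders immediately; the toy variable-width loop is inherently serial, and real x86 length decoding has to deal with prefixes and mode-dependent operand sizes on top of that.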
|
# ¿ May 12, 2017 20:33 |
|
travelling wave posted:
the whole interview is good if you're into that sorta thing

this is a really good read, thank you
|
# ¿ May 13, 2017 19:25 |
|
Mr.Radar posted:
if you don't mind could you elaborate on why riscv is trash?

the core isa is super-riscy in that "yay we can run a lot of instructions now, which is important because it'll take us twice as many instructions to do anything" sort of way. they made sure they covered all the basic c operations, but anything even closely related, like add-and-test-overflow or add-with-carry, is impossible to do efficiently with the base instructions.

but mostly it's like, just read the instruction specifications and you'll see all sorts of bizarre and wasteful crap. like, the branch-immediate instruction (jal) has a 20-bit immediate operand, but it's stored in a crazy order where 0bTSRQPONMLKJIHGFEDCBA is actually reordered as 0bTJIHGFEDCBAKSRQPONML for, as far as i can tell, no reason at all. instructions are 32-bit so the pc is generally required to be 4-byte-aligned, but the immediate offset is only implicitly multiplied by 2, so the instruction only has ±1MB range instead of ±2MB. instead of burning 1 bit on branch vs. branch-and-link it burns 5 bits so that you can use an arbitrary gpr as the link register (but you won't get return-address prediction unless you use x1), which i can kind of imagine ways to use, but not for anything important enough to justify dropping 4 bits from the range of this instruction. and really it should be 5, because both of these are super-common instructions and it's worth burning a second opcode on them.

ok, next. the branch-register instruction (jalr) takes a 12-bit immediate offset. that offset is not scaled at all. the lowest bit of the target address is defined to be ignored, but not the lowest two bits, so this can still fail dynamically from misalignment. as far as i can tell this immediate exists solely because they wanted to use a two-operand instruction format; i am really blanking on what it would be used for.

the spec suggests it could be used to implement fast library calls by doing absolute branches to within ±2KB of address zero, which is like, yes please let me just give memcpy a small integer absolute address, i am doing research into how easy i can make it to write security exploits
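
if you want to see the jal shuffle concretely, here's a rough python sketch. the function names are made up for illustration, and the field positions are my reading of the base isa encoding (imm[20], imm[10:1], imm[11], imm[19:12] from the top bit down, then rd, then the 0x6F opcode), so check them against the spec before relying on this. it also shows that jalr only clears bit 0 of the target, so bit 1 can still leave you misaligned.

```python
JAL_OPCODE = 0x6F  # major opcode for jal in the base ISA


def encode_jal(rd: int, offset: int) -> int:
    """Pack a jal instruction word; offset is a signed, even byte offset."""
    if offset % 2 != 0:
        raise ValueError("jal offsets must be a multiple of 2")
    if not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("jal offset out of range (roughly +/-1 MiB)")
    imm = offset & 0x1FFFFF  # 21-bit two's complement; bit 0 is always zero
    return (
        ((imm >> 20) & 0x1) << 31      # imm[20]    -> word bit 31
        | ((imm >> 1) & 0x3FF) << 21   # imm[10:1]  -> word bits 30:21
        | ((imm >> 11) & 0x1) << 20    # imm[11]    -> word bit 20
        | ((imm >> 12) & 0xFF) << 12   # imm[19:12] -> word bits 19:12
        | (rd & 0x1F) << 7             # 5 bits spent on the link register
        | JAL_OPCODE
    )


def decode_jal_offset(word: int) -> int:
    """Undo the shuffle and sign-extend to recover the byte offset."""
    imm = (
        ((word >> 31) & 0x1) << 20
        | ((word >> 21) & 0x3FF) << 1
        | ((word >> 20) & 0x1) << 11
        | ((word >> 12) & 0xFF) << 12
    )
    return imm - (1 << 21) if imm & (1 << 20) else imm


def jalr_target(base: int, imm12: int, xlen: int = 64) -> int:
    """jalr target: base plus the unscaled signed 12-bit immediate,
    wrapped to xlen bits, with only bit 0 cleared."""
    if not -2048 <= imm12 <= 2047:
        raise ValueError("jalr immediate must fit in 12 signed bits")
    return ((base + imm12) & ((1 << xlen) - 1)) & ~1
```

for example, `decode_jal_offset(encode_jal(1, 8))` round-trips to 8, and `jalr_target(0x1000, 7)` lands on 0x1006, which is 2-byte aligned but not 4-byte aligned.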
|
# ¿ May 15, 2017 18:20 |