|
Of course, we have emulators, which work by 'mimicking' one program's architecture on another. Would it be possible to create a program that translates the opcodes of one architecture to match another? Has this already been done in an automated way? How would this work?
|
# ? Aug 6, 2015 00:26 |
|
Transmeta Crusoe and a lot of RISC hardware spring to mind. e: Code morphing, apparently.
|
# ? Aug 6, 2015 00:28 |
|
MrMoo posted:Transmeta Crusoe and a lot of RISC hardware springs to mind. Thank you! Wish it was open source, though. I think I could probably figure out how it works given the proper documentation of the architectures.
|
# ? Aug 6, 2015 00:57 |
|
I don't exactly understand: "a program that translates the opcodes to match another architecture" is an emulator. That would be dynamic translation, i.e. translating the program as it runs. If what you're looking for is to turn a program for one architecture into one for another in "one go", i.e. static recompilation, then that is harder. I know of https://github.com/trailofbits/mcsema, which can recompile x86 into LLVM (from which it can be translated by any LLVM compiler backend).
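To make the dynamic-vs-static distinction concrete, here's a toy sketch in Python (the two-opcode "source ISA" and all names are invented for illustration): dynamic translation caches each instruction's translation the first time execution reaches it, while static recompilation translates the whole program up front.

```python
# Toy sketch with an invented two-instruction "source ISA": the difference
# between dynamic translation (translate each instruction as execution
# reaches it) and static recompilation (translate everything up front).

def translate_insn(op, arg):
    # Map one source opcode to a host-level operation (here: a Python closure).
    if op == "LOAD":
        return lambda acc: arg
    if op == "ADD":
        return lambda acc: acc + arg
    raise ValueError(f"unknown opcode {op}")

def run_dynamic(program):
    # "Translate as it runs": each instruction is translated the first time
    # the program counter reaches it, then the cached version is reused.
    cache = {}
    acc = 0
    for pc, (op, arg) in enumerate(program):
        if pc not in cache:
            cache[pc] = translate_insn(op, arg)
        acc = cache[pc](acc)
    return acc

def recompile_static(program):
    # "One go": translate the whole program before running any of it.
    return [translate_insn(op, arg) for op, arg in program]
```

Both paths produce the same behaviour on this toy ISA; the difference is only *when* translation happens, which is exactly where the difficulty of real static recompilation comes from.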
|
# ? Aug 6, 2015 12:33 |
|
A lot of emulators essentially do this on the fly, but it's borderline impossible to get something that works in 100% of situations, and doing it on a compiled program might need a lot of metadata to be possible at all. The biggest problems are that instructions are variable-sized, that a given sequence of bytes can represent multiple valid instruction sequences depending on where in the sequence you start execution, and that jumps can target arbitrary locations in memory. Together these create a ton of problems for cases like jump tables, where the program might be expected to compute a jump offset from the current instruction pointer. There are ways around that, like generating address translation tables, but they're much slower and consume much more memory than if you had compiled for that architecture in the first place. Even then, it only works if you have enough information to determine where each instruction starts, and things get much more complicated if the program ever does nasty things like jumping into the middle of an instruction.
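A minimal illustration of the variable-length decoding problem, using a made-up two-opcode encoding: the same bytes yield different instruction streams depending on the start offset, which is why a static translator can't reliably tell where instructions begin.

```python
# Toy variable-length ISA (invented for illustration): opcode 0x01 takes a
# one-byte operand, opcode 0x02 takes none. The same byte sequence decodes
# to different instruction streams depending on where decoding starts --
# the core reason static disassembly of real machine code is unreliable.

LENGTHS = {0x01: 2, 0x02: 1}

def disassemble(code, start):
    insns, pc = [], start
    while pc < len(code):
        op = code[pc]
        size = LENGTHS.get(op)
        if size is None:
            break  # this byte is not a valid opcode from this offset
        insns.append((pc, code[pc:pc + size]))
        pc += size
    return insns
```

Decoding `01 02 02` from offset 0 yields a two-byte instruction followed by a one-byte one; decoding the same bytes from offset 1 yields two one-byte instructions. Both are "valid", and nothing in the bytes themselves says which one the program actually executes.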
|
# ? Aug 6, 2015 16:17 |
|
Also stuff like https://en.wikipedia.org/wiki/Mac_68k_emulator and https://en.wikipedia.org/wiki/Rosetta_%28software%29. Transitive was a company that did a lot of work in this area back in the day.
|
# ? Aug 6, 2015 16:44 |
|
OneEightHundred posted:A lot of emulators essentially do this on the fly, but it's borderline impossible to get something that works in 100% of situations, and doing it on a compiled program might need a lot of metadata to be possible at all. The biggest problem is that instructions are variable-sized, a given sequence of bytes could represent multiple valid instruction sequences depending on where in the sequence you start execution, and jumps are to arbitrary locations in memory. Together, these create a ton of problems for figuring out what to do in cases like jump tables where the program might be expected to compute a jump offset from the current instruction pointer. Only on x86! Barring stuff like Thumb, most RISC architectures use 32-bit-wide instructions that are 4-byte aligned.
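For contrast, a small sketch of why fixed-width encodings sidestep the ambiguity entirely: with 4-byte-aligned, 32-bit instructions, every instruction boundary is known in advance, so you can just step through the code four bytes at a time.

```python
import struct

# With a fixed-width, aligned encoding (as on most RISC machines), every
# instruction boundary is known up front: step 4 bytes at a time.
def decode_fixed_width(code: bytes):
    assert len(code) % 4 == 0, "RISC code sections are a multiple of 4 bytes"
    return [struct.unpack("<I", code[i:i + 4])[0]
            for i in range(0, len(code), 4)]
```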
|
# ? Aug 6, 2015 16:46 |
|
Complex instruction sets are basically all emulated today anyway. The instruction set the CPU exposes gets translated down into microinstructions for the actual CPU core to execute. It turns out this is overall a good thing, because you can make a microinstruction set that executes really fast when you don't have to worry about things like "being just as performant on next year's CPU model" and "being compact enough that you're not saturating your memory bandwidth just by reading the instructions you're trying to execute".
|
# ? Aug 6, 2015 17:35 |
|
feedmegin posted:Only on x86! Barring stuff like Thumb, most RISC architectures use 32-bit-wide instructions that are 4-byte aligned. Also, how difficult any of this is depends heavily on the functionality needed by the original program. Jumping into the middle of an instruction, or even just rewriting code at runtime, is the kind of thing that seriously increases the complexity of translating the code while not usually being required. Signals and memory behavior are other places that may require serious performance sacrifices to emulate accurately. A lot of programs may require emulating extremely quirky behavior to work properly.
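One way emulators cope with code rewritten at runtime (a hypothetical sketch, not any particular emulator's scheme) is to track which guest addresses have translations and throw those translations away when the guest writes to them:

```python
# Hypothetical sketch of self-modifying-code handling in a binary
# translator: remember which guest addresses are covered by translated
# blocks, and invalidate a block if the guest writes into its bytes.

class TranslationCache:
    def __init__(self):
        self.blocks = {}    # guest start address -> translated block
        self.covered = {}   # guest byte address  -> owning block's start

    def install(self, start, length, translated):
        self.blocks[start] = translated
        for addr in range(start, start + length):
            self.covered[addr] = start

    def on_guest_write(self, addr):
        # Called on every guest memory write (real emulators use page
        # protection or dirty bits rather than per-write hooks).
        start = self.covered.get(addr)
        if start is not None:
            self.blocks.pop(start, None)
            self.covered = {a: s for a, s in self.covered.items()
                            if s != start}
```

The per-write hook is the expensive part, which is why this is one of the "serious performance sacrifices" mentioned above: real implementations trade precision for speed with page-granularity write protection.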
|
# ? Aug 6, 2015 19:19 |
|
I read an interesting article about someone trying to do this with an NES game a few years ago: http://andrewkelley.me/post/jamulator.html.
|
# ? Aug 6, 2015 20:15 |
|
My master's thesis was on translating compiled CUDA to OpenCL, although that's a bit higher level than normal assembly. SPIR (i.e. compiled OpenCL) is heavily based on LLVM, while PTX (i.e. compiled CUDA) is mostly based on shader languages, which means you have things like infinite registers (register allocation is done later, in the drivers; well, PTX does it when translating into binary form, but that's functionally the same thing). It's not actually all that difficult and is done surprisingly often in emulators (PPSSPP comes to mind). The main problem is in handling differences in architecture capabilities (different instruction sets et al.) and things like rounding and register allocation/register types. Apple even has a patent on doing it in hardware, I think from when Macs were switching to x86: link. If you want to look it up, it's sometimes called dynamic recompilation as well, although that's more of an umbrella term that includes things like post-compiler optimisation and auto-vectorisation.

Jabor posted:Complex instruction sets are basically all emulated today anyway. The instruction set the CPU exposes gets translated down into microinstructions for the actual CPU core to execute. The problem is you can't just feed the microinstructions into the processor you want to translate to (assuming you're on a CISC-type system). You have to code around the idiosyncrasies of whatever platform you're translating from and whatever platform you're translating to, even if both internally execute a very similar set of microinstructions. The other thing is you're probably not going to be able to use things like media and vector instructions even if both platforms support them, because the recompiler won't have enough information to translate them unless both instruction sets are very similar. And then you have things like interrupt behaviour, protected execution, jumps etc., which can be really difficult, or even practically impossible, to deal with. That's the actual hard bit, not your run-of-the-mill arithmetic instructions.

e: Basically the same thing as OneEightHundred said, at least in the second part of my post.

Private Speech fucked around with this message at 12:16 on Aug 14, 2015 |
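The "infinite registers" point can be illustrated with a toy linear-scan register allocator (a simplified sketch; real PTX/driver allocators are far more involved): virtual registers with live ranges get mapped onto a small fixed physical set, spilling when it runs out.

```python
# Toy linear-scan register allocator: PTX/SPIR-style code uses unlimited
# virtual registers, and a later stage maps them onto a small fixed set.
# intervals: {vreg: (first_use, last_use)}; returns vreg -> "rN" or "spill".

def linear_scan(intervals, num_phys):
    assignment, active = {}, []   # active: (last_use, vreg, phys_reg)
    free = [f"r{i}" for i in range(num_phys)]
    for vreg, (start, end) in sorted(intervals.items(),
                                     key=lambda kv: kv[1][0]):
        # Expire intervals that ended before this one starts,
        # returning their physical registers to the free pool.
        for item in list(active):
            if item[0] < start:
                active.remove(item)
                free.append(item[2])
        if free:
            phys = free.pop()
            assignment[vreg] = phys
            active.append((end, vreg, phys))
        else:
            assignment[vreg] = "spill"  # no register left: spill to memory
    return assignment
```

When two live ranges don't overlap, the same physical register gets reused; when too many overlap at once, something has to spill, which is exactly the information a translator loses if register allocation already happened in the source binary.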
# ? Aug 14, 2015 10:43 |
|
dlr posted:Of course, we have emulators, that work by 'mimicking' the specified program's architecture to another one. Would it be impossible to create a program that translates the opcodes to match another architecture? Has this already been done before automated? How would this work? In addition to what everyone else has said, there have been tools to do this directly:

The Z-80 and 8088 were "assembly compatible" with the 8080, which I think was similarly compatible with the 8008. The idea being you can just reassemble existing code, because they all support the same mnemonics (and then some), just using different opcodes.

There have also been "assembly translators" that take idiomatic assembly for one system and output assembly, C, etc. for another. There was one for 68K-to-PowerPC conversion on the Mac; I don't recall its name, but WordPerfect was rumored to have been ported to PowerPC this way and still maintained as a 68K assembly app. (Too bad T/Maker didn't do this with WriteNow.)

And OpenGenera from Symbolics was implemented as an emulator for their Ivory CPU in DEC Alpha assembly, which someone was then able to treat as a source language itself, to generate x86_64 code and bring the emulator from OSF/1 on Alpha workstations to 64-bit Linux on PCs.
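The mnemonic-level "assembly translator" idea can be sketched with a tiny table mapping Intel 8080 mnemonic spellings to their Zilog Z80 equivalents (the table here is a small illustrative subset, not a complete translator):

```python
# Sketch of a table-driven mnemonic translator: Intel 8080 assembly
# spellings mapped to Zilog Z80 spellings. Real translators also had to
# handle operand syntax, directives, and macros; this only remaps the
# mnemonic and passes operands through unchanged.

INTEL_TO_ZILOG = {
    "MOV": "LD",    # MOV A,B  -> LD A,B
    "MVI": "LD",    # MVI A,5  -> LD A,5
    "JMP": "JP",    # JMP addr -> JP addr
    "CALL": "CALL",
    "RET": "RET",
}

def translate_line(line):
    mnemonic, _, operands = line.strip().partition(" ")
    new = INTEL_TO_ZILOG.get(mnemonic)
    if new is None:
        raise ValueError(f"unhandled mnemonic: {mnemonic}")
    return f"{new} {operands}".strip()
```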
|
# ? Apr 24, 2016 05:33 |
|
Jabor posted:Complex instruction sets are basically all emulated today anyway. The instruction set the CPU exposes gets translated down into microinstructions for the actual CPU core to execute. This is how many CPUs have worked for a very long time. The innovation of RISC was to not do that, and instead expose far simpler one-cycle instructions and let the compiler handle things like ordering constraints to avoid pipeline stalls, etc. What the major CPUs do these days is a lot more like the kind of JIT binary translation (say) HotSpot does than just translation of macroinstructions to microinstructions.

Trivia: Some systems had writable microstore, such as Lisp Machines, and in some cases (LMI) you could even compile a function to microcode if you were able to meet its space, access, etc. constraints and needed the speed.

More trivia: IBM's PC/370, which was an IBM 370 mainframe on a pair of ISA cards for the original IBM PC (intended primarily as a developer workstation), used a Motorola 68000 with custom microcode to implement the 370 CPU.
|
# ? Apr 24, 2016 05:40 |