  • Locked thread
dlr
Jul 9, 2015

I'm fluid.
Of course, we have emulators, which work by 'mimicking' a program's original architecture on another one. But would it be possible to create a program that translates the opcodes to match another architecture, or is that impossible? Has this ever been done in an automated way? How would it work?

MrMoo
Sep 14, 2000

Transmeta Crusoe and a lot of RISC hardware springs to mind.

e: Code morphing apparently.

dlr
Jul 9, 2015

I'm fluid.

MrMoo posted:

Transmeta Crusoe and a lot of RISC hardware springs to mind.

e: Code morphing apparently.

Thank you! Wish it was open source, though. I think I could probably figure out how it works given the proper documentation of the architectures.

fryzoy
Sep 21, 2005
What.
I don't exactly understand: "a program that translates the opcodes to match another architecture" is an emulator. That would be dynamic translation, i.e. translating the program as it runs. If what you are looking for is to turn a program for one architecture into one for another in "one go", i.e. static recompilation, then that is harder. I know of https://github.com/trailofbits/mcsema, which can recompile x86 into LLVM bitcode (from where it can be retargeted to any LLVM compiler backend).
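
To make the difference concrete, here is a minimal sketch over a made-up two-opcode machine. The opcode values, the "architecture", and the emitted C are all invented for illustration; this is not how mcsema or any real translator is structured:

```c
#include <stdio.h>

/* Made-up guest program: 0x01 = ADD imm8 to accumulator, 0x02 = PRINT,
 * 0x00 = HALT. Computes 5 + 7 and prints the result. */
static const unsigned char guest[] = { 0x01, 5, 0x01, 7, 0x02, 0x00 };

/* Dynamic translation / emulation: decode each opcode as execution
 * reaches it, doing the equivalent work on the host immediately. */
static void run_dynamic(void) {
    int acc = 0;
    size_t pc = 0;
    while (guest[pc] != 0x00) {
        switch (guest[pc]) {
        case 0x01: acc += guest[pc + 1]; pc += 2; break;   /* ADD imm8 */
        case 0x02: printf("%d\n", acc);  pc += 1; break;   /* PRINT    */
        }
    }
}

/* Static recompilation: walk the whole guest program once, ahead of
 * time, and emit equivalent target code (here: a C fragment on stdout)
 * that can be compiled and run later with no guest code left. */
static void recompile_static(void) {
    size_t pc = 0;
    printf("int acc = 0;\n");
    while (guest[pc] != 0x00) {
        switch (guest[pc]) {
        case 0x01: printf("acc += %d;\n", guest[pc + 1]); pc += 2; break;
        case 0x02: printf("printf(\"%%d\\n\", acc);\n");  pc += 1; break;
        }
    }
}

int main(void) {
    run_dynamic();       /* prints 12 right away             */
    recompile_static();  /* prints the translated C fragment */
    return 0;
}
```

A straight-line toy like this recompiles trivially; the hard parts discussed below only show up once the guest has indirect jumps, variable-length instructions, self-modifying code and so on.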

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
A lot of emulators essentially do this on the fly, but it's borderline impossible to get something that works in 100% of situations, and doing it on a compiled program might need a lot of metadata to be possible at all. The biggest problems are that instructions are variable-sized, that a given sequence of bytes could represent multiple valid instruction sequences depending on where in the sequence you start execution, and that jumps can go to arbitrary locations in memory. Together, these create a ton of problems for figuring out what to do in cases like jump tables, where the program might be expected to compute a jump offset from the current instruction pointer.
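
For a concrete picture of that decoding ambiguity, here is a minimal sketch: the same six x86 bytes decode as two different (but both valid) instruction streams depending on where you start. The tiny decoder only knows the three opcodes it needs and is purely illustrative:

```c
#include <stdio.h>
#include <stdint.h>

/* mov eax, 5 ; ret -- or, starting one byte in, add eax, 0xc3000000 */
static const uint8_t code[] = { 0xB8, 0x05, 0x00, 0x00, 0x00, 0xC3 };

static uint32_t imm32_at(size_t i) {   /* x86 immediates are little-endian */
    return (uint32_t)code[i]           | (uint32_t)code[i + 1] << 8 |
           (uint32_t)code[i + 2] << 16 | (uint32_t)code[i + 3] << 24;
}

static void disassemble_from(size_t start) {
    printf("starting at offset %zu:\n", start);
    for (size_t i = start; i < sizeof code; ) {
        switch (code[i]) {
        case 0xB8: printf("  mov eax, 0x%x\n", imm32_at(i + 1)); i += 5; break;
        case 0x05: printf("  add eax, 0x%x\n", imm32_at(i + 1)); i += 5; break;
        case 0xC3: printf("  ret\n");                            i += 1; break;
        default:   printf("  (opcode 0x%02x not handled here)\n", code[i]); return;
        }
    }
}

int main(void) {
    disassemble_from(0);  /* mov eax, 0x5 ; ret  */
    disassemble_from(1);  /* add eax, 0xc3000000 */
    return 0;
}
```

Without knowing which offsets the original program actually jumps to, a static translator can't tell which of those decodings is the "real" one.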

There are ways around that, like generating address translation tables, but they're much slower and consume much more memory than if you had compiled for that architecture in the first place, they only work if you have enough information to determine where each instruction starts, and they get much more complicated if the program ever does nasty things like jumping into the middle of an instruction.
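
The address-translation-table workaround looks roughly like this sketch: one slot per guest code address recording where the translated host code for it lives, consulted whenever the guest does a computed jump. All names and the layout are made up; a real translator would point the entries into a code cache rather than at C functions:

```c
#include <stdio.h>
#include <stdlib.h>

#define GUEST_SIZE 0x1000

/* One entry per guest code address: the host code generated for the
 * instruction that starts there, or NULL if no instruction starts
 * there (or it simply hasn't been translated yet). */
typedef void (*host_block)(void);
static host_block xlat[GUEST_SIZE];

static void block_0x100(void) { puts("running translation of guest 0x100"); }
static void block_0x180(void) { puts("running translation of guest 0x180"); }

/* Emitted in place of a guest indirect jump ("jmp *reg"): the target
 * is only known at run time, so it has to go through the table. */
static void indirect_jump(unsigned guest_target) {
    if (guest_target >= GUEST_SIZE || xlat[guest_target] == NULL) {
        /* A real translator would translate the block on demand here. */
        fprintf(stderr, "no translation for guest address 0x%x\n", guest_target);
        exit(1);
    }
    xlat[guest_target]();
}

int main(void) {
    xlat[0x100] = block_0x100;
    xlat[0x180] = block_0x180;

    unsigned reg = 0x180;   /* e.g. loaded from a guest jump table */
    indirect_jump(reg);
    return 0;
}
```

The table itself is the memory cost being described: it has to cover every possible jump target, and every computed jump in the guest becomes a lookup plus a call instead of a single instruction.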

feedmegin
Jul 30, 2008

Also stuff like

https://en.wikipedia.org/wiki/Mac_68k_emulator
https://en.wikipedia.org/wiki/Rosetta_%28software%29

Transitive was a company that did a lot of work in this area back in the day.

feedmegin
Jul 30, 2008

OneEightHundred posted:

A lot of emulators essentially do this on the fly, but it's borderline impossible to get something that works in 100% of situations, and doing it on a compiled program might need a lot of metadata to be possible at all. The biggest problems are that instructions are variable-sized, that a given sequence of bytes could represent multiple valid instruction sequences depending on where in the sequence you start execution, and that jumps can go to arbitrary locations in memory. Together, these create a ton of problems for figuring out what to do in cases like jump tables, where the program might be expected to compute a jump offset from the current instruction pointer.

There are ways around that, like generating address translation tables, but they're much slower and consume much more memory than if you had compiled for that architecture in the first place, they only work if you have enough information to determine where each instruction starts, and they get much more complicated if the program ever does nasty things like jumping into the middle of an instruction.

Only on x86! Barring stuff like Thumb, most RISC architectures use 32-bit-wide instructions that are 4-byte aligned.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Complex instruction sets are basically all emulated today anyway. The instruction set the CPU exposes gets translated down into microinstructions for the actual CPU core to execute.

It turns out this is overall a good thing, because you can make a microinstruction set that executes really fast when you don't have to worry about things like "being just as performant on next year's CPU model" and "being compact enough that you're not saturating your memory bandwidth just by reading the instructions you're trying to execute".

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

feedmegin posted:

Only on x86! Barring stuff like Thumb, most RISC architectures use 32-bit-wide instructions that are 4-byte aligned.
That solves jumping into the middle of instructions, but it doesn't remove the need for address translation, because an instruction on the source architecture may require multiple instructions on the target architecture.
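
A minimal sketch of why the addresses drift apart even with fixed-width instructions; the guest instruction and its three-instruction host expansion are invented, load/store-style pseudo-assembly:

```c
#include <stdio.h>

int main(void) {
    /* Pretend the guest has a read-modify-write instruction at 0x2000:
     *     add [r1], #4
     * A load/store host needs three instructions for it. */
    const char *host_expansion[] = {
        "ldr  tmp, [r1]",      /* load the memory operand         */
        "add  tmp, tmp, #4",   /* do the arithmetic in a register */
        "str  tmp, [r1]",      /* store the result back           */
    };

    unsigned guest_pc = 0x2000, host_pc = 0x9000;
    for (int i = 0; i < 3; ++i, host_pc += 4)
        printf("guest 0x%x -> host 0x%x  %s\n", guest_pc, host_pc, host_expansion[i]);

    /* The next guest instruction (0x2004) now lives at host 0x900c,
     * not 0x9004, so guest addresses can't be used as host addresses
     * directly and computed jumps still need a translation table. */
    printf("guest 0x%x -> host 0x%x  (next guest instruction)\n",
           guest_pc + 4, host_pc);
    return 0;
}
```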

Also, how difficult any of this is depends heavily on the functionality needed by the original program. Jumping into the middle of an instruction, or even just rewriting code at runtime, is the kind of thing that seriously increases the complexity of translating the code while not usually being required. Signals and memory behavior are other places that may require serious performance sacrifices to emulate accurately. A lot of programs may require emulating extremely quirky behavior to work properly.

robostac
Sep 23, 2009
I read an interesting article about someone trying to do this with a NES game a few years ago - http://andrewkelley.me/post/jamulator.html.

Private Speech
Mar 30, 2011

I HAVE EVEN MORE WORTHLESS BEANIE BABIES IN MY COLLECTION THAN I HAVE WORTHLESS POSTS IN THE BEANIE BABY THREAD YET I STILL HAVE THE TEMERITY TO CRITICIZE OTHERS' COLLECTIONS

IF YOU SEE ME TALKING ABOUT BEANIE BABIES, PLEASE TELL ME TO

EAT. SHIT.


My master's thesis was on translating compiled CUDA to OpenCL, although that's a bit higher level than normal assembly. SPIR (i.e. compiled OpenCL) is heavily based on LLVM, while PTX (i.e. compiled CUDA) is mostly based on shader languages, which means you have things like infinite registers and such (register allocation is done later, in the driver; well, PTX does it when translating into binary form, but that's functionally the same thing).

It's not actually all that difficult and is done surprisingly often in emulators (PPSSPP comes to mind). The main problem is in handling differences in architecture capabilities (different instruction sets and so on) and things like rounding and register allocation / register types. Apple even has a patent on doing it in hardware; I think it was from when they were switching Macs to x86 (link). If you want to look it up, it's sometimes called dynamic recompilation as well, although that's more of an umbrella term that includes things like post-compiler optimisation, auto-vectorisation, etc.
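
On the "infinite virtual registers, allocation happens later" point, here is a minimal sketch of the kind of pass that maps unlimited virtual registers onto a small physical register file. The live intervals and the two-register target are invented; real allocators in drivers and compiler backends are far more sophisticated (linear scan, graph colouring, spill heuristics):

```c
#include <stdio.h>

#define NUM_PHYS 2   /* pretend the target only has r0 and r1 */

/* A virtual register and the range of instruction indices over which
 * its value is live (made-up numbers for a tiny straight-line block). */
typedef struct { const char *vreg; int start, end; } Interval;

int main(void) {
    Interval iv[] = {
        { "%v0", 0, 3 },
        { "%v1", 1, 2 },
        { "%v2", 2, 5 },   /* %v1 is dead by now, its register is free */
        { "%v3", 4, 6 },   /* %v0 is dead by now                       */
        { "%v4", 4, 7 },   /* nothing free: has to be spilled          */
    };
    int n = (int)(sizeof iv / sizeof iv[0]);
    int busy_until[NUM_PHYS] = { 0, 0 };   /* when each phys reg frees up */

    /* Greedy scan in order of interval start (already sorted above). */
    for (int i = 0; i < n; ++i) {
        int assigned = -1;
        for (int r = 0; r < NUM_PHYS; ++r) {
            if (busy_until[r] <= iv[i].start) {   /* previous value is dead */
                assigned = r;
                busy_until[r] = iv[i].end;
                break;
            }
        }
        if (assigned >= 0)
            printf("%s -> r%d\n", iv[i].vreg, assigned);
        else
            printf("%s -> spill to memory\n", iv[i].vreg);
    }
    return 0;
}
```

The driver's PTX-to-binary step and the LLVM backends each do a (much smarter) version of this, which is why the IR can pretend registers are unlimited.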

Jabor posted:

Complex instruction sets are basically all emulated today anyway. The instruction set the CPU exposes gets translated down into microinstructions for the actual CPU core to execute.

It turns out this is overall a good thing, because you can make a microinstruction set that executes really fast when you don't have to worry about things like "being just as performant on next year's CPU model" and "being compact enough that you're not saturating your memory bandwidth just by reading the instructions you're trying to execute".

The problem is you can't just feed the microinstructions into the processor you want to translate to (assuming you're on a CISC-type system). You have to code around the idiosyncrasies of whatever platform you're translating from and whatever platform you're translating to, even if both internally execute a very similar set of microinstructions.

The other thing is you're probably not going to be able to use things like media and vector instructions even if both platforms support them, because the recompiler won't have enough information to translate them unless both instruction sets are very similar. And then you have things like interrupt behaviour, protected execution, jumps etc. which can be really difficult, or even practically impossible to deal with. That's the actual hard bit, not your run-of-the-mill arithmetic instructions.

e: Basically the same thing OneEightHundred said, at least for the second part of my post

Private Speech fucked around with this message at 12:16 on Aug 14, 2015

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

dlr posted:

Of course, we have emulators, which work by 'mimicking' a program's original architecture on another one. But would it be possible to create a program that translates the opcodes to match another architecture, or is that impossible? Has this ever been done in an automated way? How would it work?

In addition to what everyone else has said, there have been tools to do this directly:

The Z-80 and 8088 were "assembly compatible" with the 8080, which I think was similarly compatible with the 8008. The idea being that you can just reassemble existing code, because they all support the same mnemonics (and then some), just with different opcodes.

There have also been "assembly translators" that take idiomatic assembly for one system and output assembly, C, etc. for another. There was one for 68K-to-PowerPC conversion on the Mac (I don't recall its name), but WordPerfect was rumored to have been ported to PowerPC this way and still maintained as a 68K assembly app. (Too bad T/Maker didn't do this with WriteNow.)

And OpenGenera from Symbolics was implemented as an emulator for their Ivory CPU written in DEC Alpha assembly, which someone was later able to treat as a source language in its own right, generating x86_64 code and bringing the emulator from OSF/1 on Alpha workstations to 64-bit Linux on PCs.

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

Jabor posted:

Complex instruction sets are basically all emulated today anyway. The instruction set the CPU exposes gets translated down into microinstructions for the actual CPU core to execute.

It turns out this is overall a good thing, because you can make a microinstruction set that executes really fast when you don't have to worry about things like "being just as performant on next year's CPU model" and "being compact enough that you're not saturating your memory bandwidth just by reading the instructions you're trying to execute".

This is how many CPUs have worked for a very long time. The innovation of RISC was to not do that, and instead expose far simpler one-cycle instructions and let the compiler handle things like ordering constraints to avoid pipeline stalls, etc.

What the major CPUs do these days is a lot more like the kind of JIT binary translation that (say) HotSpot does than a simple translation of macroinstructions to microinstructions.

Trivia: Some systems, such as Lisp Machines, had writable microstore, and in some cases (LMI) you could even compile a function to microcode if you could meet its space, access, and other constraints and needed the speed.

More trivia: IBM's PC/370, an IBM 370 mainframe on a pair of ISA cards for the original IBM PC (intended primarily as a developer workstation), used a Motorola 68000 with custom microcode to implement the 370 CPU.
