  • Locked thread
OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
A lot of emulators essentially do this on the fly, but it's borderline impossible to get something that works in 100% of situations, and doing it on a compiled program might need a lot of metadata to be possible at all. The biggest problems are that instructions are variable-sized, that a given sequence of bytes could represent multiple valid instruction sequences depending on where in the sequence you start execution, and that jumps can go to arbitrary locations in memory. Together, these create a ton of problems for figuring out what to do in cases like jump tables, where the program might be expected to compute a jump offset from the current instruction pointer.
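To make the "same bytes, different instructions" point concrete, here's a minimal sketch using a made-up variable-length instruction set (the opcodes, sizes, and names below are all hypothetical, not x86). It decodes the same byte string from two different start offsets and gets two different, equally valid instruction streams:

```python
# Toy variable-length ISA (hypothetical, for illustration only):
#   0x00          halt          (1 byte)
#   0x01 imm8     load imm8     (2 bytes)
#   0x02          inc           (1 byte)
#   0x03 lo hi    jump addr16   (3 bytes)
SIZES = {0x00: 1, 0x01: 2, 0x02: 1, 0x03: 3}
NAMES = {0x00: "halt", 0x01: "load", 0x02: "inc", 0x03: "jump"}


def decode(code, start):
    """Linearly decode from `start`; return (offset, mnemonic, operand_bytes) tuples."""
    out = []
    pc = start
    while pc < len(code):
        op = code[pc]
        size = SIZES.get(op)
        if size is None or pc + size > len(code):
            # Unknown opcode or truncated operand: stop decoding here.
            out.append((pc, "invalid", bytes(code[pc:pc + 1])))
            break
        out.append((pc, NAMES[op], bytes(code[pc + 1:pc + size])))
        pc += size
        if op == 0x00:  # halt ends the stream
            break
    return out


code = bytes([0x01, 0x02, 0x02, 0x00])

# Starting at offset 0, the byte 0x02 at offset 1 is an operand of "load".
from_0 = [name for _, name, _ in decode(code, 0)]
# Starting at offset 1, that same byte is an "inc" instruction.
from_1 = [name for _, name, _ in decode(code, 1)]

print(from_0)
print(from_1)
```

A static translator can't tell which of the two decodings the program will actually use without knowing every possible jump target.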

There are ways around that, like generating address translation tables, but they're much slower and consume much more memory than if you had compiled for that architecture in the first place. They also only work if you have enough information to determine where each instruction starts, and they get much more complicated if the program ever does nasty things like jumping into the middle of an instruction.
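A rough sketch of what an address translation table could look like, assuming you already know where every source instruction starts and how many bytes each one turned into after translation (the function names and the sizes used here are invented for the example):

```python
def build_translation_table(source_sizes, translated_sizes):
    """Map each source instruction's start address to its translated start address.

    Both lists are per-instruction byte sizes, in program order.
    """
    table = {}
    src_addr = 0
    dst_addr = 0
    for src_size, dst_size in zip(source_sizes, translated_sizes):
        table[src_addr] = dst_addr
        src_addr += src_size
        dst_addr += dst_size
    return table


def translate_jump(table, source_target):
    """Resolve a computed jump target at run time via the table."""
    if source_target not in table:
        # The target is not a known instruction boundary, e.g. a jump
        # into the middle of an instruction. A plain table can't handle it.
        raise ValueError(f"no translation for source address {source_target:#x}")
    return table[source_target]


# Four source instructions of 2, 1, 3 and 1 bytes (hypothetical numbers),
# translated to 8, 4, 12 and 4 bytes respectively.
table = build_translation_table([2, 1, 3, 1], [8, 4, 12, 4])
print(table)
print(translate_jump(table, 3))
```

The table costs one entry per source instruction, and every indirect jump pays for a lookup, which is where the extra memory and slowdown come from. Asking for address 4 (inside the 3-byte instruction at address 3) raises an error, which is the "jumping into the middle of an instruction" case.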

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

feedmegin posted:

Only on x86! Barring stuff like Thumb, most RISC architectures use 32-bit-wide instructions that are 4-byte aligned.
That solves jumping into the middle of instructions, but it doesn't resolve possibly needing to do address translation because an instruction on the source architecture may require multiple instructions on the translated architecture.
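As a small illustration of why fixed-width instructions still need translation, here's a sketch that rewrites a relative branch when each 4-byte source instruction expands to a different number of target bytes. The expansion sizes are made up, and it assumes branch displacements are measured from the branch instruction's own address, which varies by architecture:

```python
SRC_WIDTH = 4  # every source instruction is 4 bytes and 4-byte aligned


def target_offsets(expansions):
    """expansions[i] is the number of target bytes emitted for source instruction i.

    Returns the target start offset of each instruction, plus the end offset.
    """
    offsets = [0]
    for size in expansions:
        offsets.append(offsets[-1] + size)
    return offsets


def rewrite_relative_branch(branch_index, src_displacement, offsets):
    """Convert a source byte displacement into the equivalent target displacement."""
    if src_displacement % SRC_WIDTH != 0:
        raise ValueError("misaligned branch target")
    dest_index = branch_index + src_displacement // SRC_WIDTH
    if not 0 <= dest_index < len(offsets) - 1:
        raise ValueError("branch target outside translated code")
    return offsets[dest_index] - offsets[branch_index]


# Hypothetical: four source instructions that translate to 4, 12, 4 and 8 bytes.
offsets = target_offsets([4, 12, 4, 8])

# A forward branch of +8 bytes from instruction 0 lands on instruction 2.
forward = rewrite_relative_branch(0, 8, offsets)
# A backward branch of -8 bytes from instruction 3 lands on instruction 1.
backward = rewrite_relative_branch(3, -8, offsets)
print(forward, backward)
```

Alignment makes the instruction boundaries trivial to find, but the displacement still has to be recomputed, and a branch whose target is computed at run time still needs some kind of lookup.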

Also, how difficult any of this is depends heavily on the functionality needed by the original program. Jumping into the middle of an instruction, or even just rewriting code at runtime, is the kind of thing that seriously increases the complexity of translating the code while not usually being required. Signals and memory behavior are other places that may require serious performance sacrifices to emulate accurately. A lot of programs may require emulating extremely quirky behavior to work properly.
