Simplify cpu_loop by doing all of the decode in translate.
This fixes a bug in that cpu_loop was not handling the
different layout of the R6 version of break16. This fixes
a bug in that cpu_loop extracted the wrong bits for the
mips16e break16 instruction.