This change adds optimized versions of the core memory functions. They rely on 4-byte alignment, 2-byte alignment, and the SH4's unaligned move instruction to (hopefully) attain good performance in all situations.
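As a rough illustration of the alignment-based dispatch described above, here is a minimal sketch in portable C. It is not the code added by this change. The names `sketch_memcpy` and `sketch_count_mismatches` are hypothetical, the choice of a copy routine as the example is an assumption, and the `may_alias` typedefs are a GCC/Clang extension. Portable C has no way to express the SH4's unaligned move instruction, so the case where it would apply falls back to a plain byte loop here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang extension (assumption): these types may alias any object,
   so the wide accesses below do not break strict aliasing rules. */
typedef uint32_t __attribute__((may_alias)) word_t;
typedef uint16_t __attribute__((may_alias)) half_t;

/* Hypothetical name; a sketch, not the function added by this change. */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    /* The low bits in which the two addresses differ decide the widest
       access that can be aligned on both sides at the same time. */
    uintptr_t diff = (uintptr_t)d ^ (uintptr_t)s;

    if ((diff & 3) == 0) {
        /* Same offset modulo 4: copy bytes up to a 4-byte boundary,
           then copy 4 bytes at a time. */
        while (n > 0 && ((uintptr_t)d & 3) != 0) { *d++ = *s++; n--; }
        for (; n >= 4; n -= 4, d += 4, s += 4)
            *(word_t *)d = *(const word_t *)s;
    } else if ((diff & 1) == 0) {
        /* Same offset modulo 2: at most one leading byte,
           then copy 2 bytes at a time. */
        if (n > 0 && ((uintptr_t)d & 1) != 0) { *d++ = *s++; n--; }
        for (; n >= 2; n -= 2, d += 2, s += 2)
            *(half_t *)d = *(const half_t *)s;
    }
    /* Remaining tail bytes, and the whole copy when the offsets differ
       by an odd amount. That last case is where an unaligned move would
       be used on SH4; this sketch copies byte by byte instead. */
    while (n > 0) { *d++ = *s++; n--; }
    return dst;
}

/* Copies between every pair of offsets 0..3 for lengths 0..40 and
   returns the number of wrong results (0 means every copy was correct). */
int sketch_count_mismatches(void)
{
    unsigned char src[64], dst[64];
    int bad = 0;
    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 7 + 1);
    for (size_t so = 0; so < 4; so++)
        for (size_t dof = 0; dof < 4; dof++)
            for (size_t n = 0; n <= 40; n++) {
                memset(dst, 0xAA, sizeof dst);
                sketch_memcpy(dst + dof, src + so, n);
                if (memcmp(dst + dof, src + so, n) != 0) bad++;
                /* Bytes just outside the copied range must be untouched. */
                if (dst[dof + n] != 0xAA) bad++;
                if (dof > 0 && dst[dof - 1] != 0xAA) bad++;
            }
    return bad;
}
```

The check helper exercises all sixteen combinations of source and destination offset, so each of the three paths (4-byte, 2-byte, and byte fallback) runs at least once. Comparing the XOR of the two addresses, rather than each address separately, lets buffers that share the same misalignment still take a wide path after a short byte prologue.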