add a detailed README

This commit is contained in:
Lephenixnoir 2020-02-16 00:22:05 +01:00
parent 54a79ca4b4
commit b8faddce5b
Signed by: Lephenixnoir
GPG Key ID: 1BBA026E13FC0495
2 changed files with 230 additions and 1 deletions

@ -105,7 +105,6 @@ install: $(TARGETS)
uninstall:
rm -f $(TARGETS:%=$(PREFIX)/%)
rm -rf $(PREFIX)/share/fxos
#
# Cleaning

README.md Normal file

@ -0,0 +1,230 @@
# fxos
fxos is an extended disassembler specifically used to reverse-engineer the OS,
the bootcode, and syscalls. It used to be part of the
[fxSDK](/Lephenixnoir/fxsdk). If you have a use for fxos, then be sure to also
check the [Planète Casio bible](https://bible.planet-casio.com/), which gathers
most of the reverse-engineering knowledge and research of the community.
fxos runs on Linux and should build successfully on macOS. If there are
compatibility issues with your favorite system, let me know.
fxos is not currently complete; it's definitely good enough for many practical
uses, but the more advanced analysis tools are not there yet. Hang on.
## Building
fxos is mainly standalone; to build, you will need the following tools. The
versions indicated are the ones I use, and clearly not the minimum
requirements.
* g++ (9.2.0)
* flex (2.6.4) and bison (3.5)
* make (e.g. 4.2.1)
The only configure option is the install path; it is specified on the
command-line to make. By default the only installed file is the fxos binary,
which goes to `$PREFIX/bin`. The default prefix is `$HOME/.local`.
```sh
% make
% make install
# or, for instance:
% make PREFIX=/usr
% make install PREFIX=/usr
```
## Setting up the library
fxos works with a library of files ranging from OS binaries to assembler
instruction tables to lists of named syscalls. Most of these resources are
public, but some of the reverse-engineering results of the community are kept
private.
A set of base files for a working library can be found [on my section of the
Planète Casio bible](https://bible.planet-casio.com/lephenixnoir/fxos-library/).
You can use your own files, but you probably want the assembler tables anyway.
Next, fxos should be told where to find these files. A small configuration file
should be added at `$HOME/.config/fxos/config` to do this. The configuration
file specifies two types of information:
* Where the library folders are; this is used to resolve relative paths.
* Which folders in the library contain fxos data files.
With the default library, the configuration file should look like this:
```
library: /path/to/fxos-library
load: /path/to/fxos-library/asm
load: /path/to/fxos-library/targets
load: /path/to/fxos-library/symbols
```
This means that fxos data files will be automatically loaded at startup from
the `asm`, `targets` and `symbols` directories. Targets refer to OS files and
RAM dumps by path, and these paths will be interpreted relative to the
`fxos-library` folder. If you create `$PREFIX/share/fxos`, it will also be used
as if mentioned on a `library:` line.
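To make the two kinds of lines concrete, here is a minimal Python sketch that reads a configuration in the format shown above and separates `library:` roots from `load:` folders. It is only an illustration: `parse_config` is a hypothetical helper, not part of fxos, and the real parser may accept syntax (comments, other keys) that this sketch rejects.

```python
def parse_config(text):
    """Split fxos-style config text into library roots and load folders.

    Hypothetical helper for illustration; fxos's own parser may differ.
    """
    libraries, loads = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first colon only, so paths may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        key, value = key.strip(), value.strip()
        if key == "library":
            libraries.append(value)
        elif key == "load":
            loads.append(value)
        else:
            raise ValueError(f"unknown key {key!r}")
    return libraries, loads


example = """\
library: /path/to/fxos-library
load: /path/to/fxos-library/asm
load: /path/to/fxos-library/targets
load: /path/to/fxos-library/symbols
"""
libraries, loads = parse_config(example)
```

With the default configuration, `libraries` holds the single library root and `loads` holds the three data folders in the order they were listed.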
## Working with fxos data files
fxos data files are used to input documentation into fxos. There are currently
three types of data files:
* Assembler decoding tables (`type: assembly`);
* Target descriptions (`type: target`);
* Symbol definitions to name registers and syscalls (`type: symbols`).
They all consist of a short dictionary-like header ended with three dashes, and
a body whose syntax varies depending on the type of file. Here is the data file
`targets/fx@3.10.txt`:
```
type: target
name: fx@3.10
---
ROM: os/fx/3.10/3.10.bin
ROM_P2: os/fx/3.10/3.10.bin
RAM: os/fx/3.10/RAM.bin
RAM_P2: os/fx/3.10/RAM.bin
RS: os/fx/3.10/RS.bin
```
The header indicates the type (needed to select the proper parser to read the
body!) and the name of the target. The concept of target is detailed below.
This file references other files from the `os` folder of the library.
At startup, directories mentioned as `load:` in the configuration file are
traversed recursively and all files there are loaded as data files.
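The header/body layout can be sketched in a few lines of Python. This assumes, as in the example above, that the header is a list of `key: value` lines and that the first line consisting of exactly `---` ends it; `split_data_file` is a hypothetical name and the snippet is not taken from the fxos sources.

```python
def split_data_file(text):
    """Return (header_dict, body_text) for an fxos-style data file.

    Hypothetical helper for illustration only.
    """
    header = {}
    lines = text.splitlines()
    for index, raw in enumerate(lines):
        line = raw.strip()
        if line == "---":
            # Everything after the separator is the type-specific body.
            return header, "\n".join(lines[index + 1:])
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line {line!r}")
        header[key.strip()] = value.strip()
    raise ValueError("missing '---' separator between header and body")


header, body = split_data_file(
    "type: target\nname: fx@3.10\n---\nROM: os/fx/3.10/3.10.bin\n"
)
# header["type"] then tells the caller which body parser to use.
```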
## Targets
A target is the system that you want to study. Usually, it's an OS file, but it
occurs at several places in memory (namely at the start of P1 and P2), and it
can use data in RAM and RS memory. A target keeps all these memory regions
together.
The header of a target must contain:
* `type: target`
* A value for the `name` property, which is used to refer to that target.
The body of a target consists of a list of *bindings*, which are mappings of
files into areas of the virtual memory. The syntax to specify a binding is
`<region>: <file>`, where:
* The region can be a named region such as `ROM` or `RAM_P2`. The name and
definitions of the available memory regions can be found in
[`lib/memory.cpp`](lib/memory.cpp).
* The region can be `<address>(<size>)`, where both address and size are
specified in hexadecimal without prefix. For example, `fd800000(800)` is
equivalent to `RS`.
* The file path must be relative to one of the library directories.
An example is shown above.
The target can then be referred to by name on the command-line. For instance,
general information about version 3.10 of the fx-9860G III OS can be queried by
running `fxos info fx@3.10`.
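As a rough illustration of the binding syntax, the following Python sketch parses one `<region>: <file>` line and distinguishes named regions from the `<address>(<size>)` form. `parse_binding` is a hypothetical helper; it does not check that a named region actually exists, which fxos would do against the definitions in `lib/memory.cpp`.

```python
import re

# <address>(<size>), both in hexadecimal without prefix.
_ADDRESS_REGION = re.compile(r"^([0-9a-fA-F]+)\(([0-9a-fA-F]+)\)$")


def parse_binding(line):
    """Parse '<region>: <file>' into (region, path).

    The region is returned as a name (str) or as an (address, size) tuple.
    Hypothetical helper for illustration only.
    """
    region, sep, path = line.partition(":")
    if not sep:
        raise ValueError(f"expected '<region>: <file>', got {line!r}")
    region, path = region.strip(), path.strip()
    match = _ADDRESS_REGION.match(region)
    if match:
        address = int(match.group(1), 16)
        size = int(match.group(2), 16)
        return (address, size), path
    return region, path


named = parse_binding("ROM_P2: os/fx/3.10/3.10.bin")
explicit = parse_binding("fd800000(800): os/fx/3.10/RS.bin")
```

Here `named` keeps the region name `ROM_P2` as a string, while `explicit` carries the numeric pair `(0xfd800000, 0x800)`.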
## Assembly tables
Assembly tables describe the binary instruction set of the processor. It is
unlikely that they will need to be modified any time soon.
The header of an assembly table consists of:
* `type: assembly`
* Optionally, a name, used to track files in case an opcode conflict occurs
(when two instructions can be instantiated into the same 16-bit opcode).
The body is a list of instructions. Each line consists of:
* The opcode pattern, a 16-character string using `01nmdi`.
* A mnemonic.
* Zero, one or two arguments among a finite set.
Here is an excerpt from the SH-4A extensions table.
```
type: assembly
name: sh-4a-extensions
---
0000nnnn01110011 movco.l r0, @rn
0000mmmm01100011 movli.l @rm, r0
0100mmmm10101001 movua.l @rm, r0
0100mmmm11101001 movua.l @rm+, r0
0000nnnn11000011 movca.l r0, @rn
```
Internally, fxos keeps a table with all 65536 possible opcodes and fills it
with instances of instructions described in assembly tables.
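One way to read an opcode pattern is as a mask/value pair for the fixed `0`/`1` bits plus one bit field per letter. The Python sketch below shows that interpretation; it is not how fxos necessarily implements it (fxos fills a full opcode table instead of matching on demand), `compile_pattern` and `decode` are hypothetical names, and the sketch assumes each letter occupies one contiguous run of bits.

```python
def compile_pattern(pattern):
    """Turn a 16-character pattern over '01nmdi' into (mask, value, fields).

    mask/value cover the fixed bits; fields maps each letter to its bit mask.
    Hypothetical helper for illustration only.
    """
    if len(pattern) != 16:
        raise ValueError("opcode patterns must be 16 characters long")
    mask = value = 0
    fields = {}
    for position, char in enumerate(pattern):
        bit = 15 - position  # leftmost character is the most significant bit
        if char in "01":
            mask |= 1 << bit
            value |= int(char) << bit
        elif char in "nmdi":
            fields[char] = fields.get(char, 0) | (1 << bit)
        else:
            raise ValueError(f"unexpected character {char!r} in pattern")
    return mask, value, fields


def decode(opcode, compiled):
    """Return the field values if opcode matches the pattern, else None."""
    mask, value, fields = compiled
    if (opcode & mask) != value:
        return None
    result = {}
    for name, field_mask in fields.items():
        # Shift the field down by the position of its lowest set bit.
        shift = (field_mask & -field_mask).bit_length() - 1
        result[name] = (opcode & field_mask) >> shift
    return result


movco = compile_pattern("0000nnnn01110011")  # movco.l r0, @rn
```

For example, `decode(0x0473, movco)` yields `{"n": 4}`, i.e. `movco.l r0, @r4`, while an opcode whose fixed bits differ yields `None`.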
## Symbol tables
Symbol tables help keep things symbolic by giving names to objects that arise
during disassembly. Currently they track syscalls and raw addresses (typically
of peripheral modules).
The header of a symbol table consists of:
* `type: symbols`
* Optionally, a name for the table.
The body is a list of symbols described as `<source> <name>`, where:
* The source can be a raw hexadecimal address, for example `ff2f0004`.
* The source can be a syscall number, written in hexadecimal with a leading
percent sign, for example `%03b`.
* The name should be vaguely C-compliant. Dots are allowed.
Here is a mixed example with both syscalls and addresses.
```
type: symbols
name: mixed-example
---
ff000020 TRA
ff000024 EXPEVT
ff000028 INTEVT
ff2f0004 EXPMASK
%42c Bfile_OpenFile_OS
%42d Bfile_CloseFile_OS
%42e Bfile_GetMediaFree_OS
%42f Bfile_GetFileSize_OS
```
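The body syntax is simple enough to sketch a reader for it. The Python snippet below separates syscall entries (leading `%`) from raw addresses; `parse_symbols` is a hypothetical helper written for this README, and it does not validate that names are C-compliant.

```python
def parse_symbols(body):
    """Parse '<source> <name>' lines into (syscalls, addresses) dicts.

    Both dicts map an integer (syscall number or address) to a name.
    Hypothetical helper for illustration only.
    """
    syscalls, addresses = {}, {}
    for raw in body.splitlines():
        parts = raw.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 2:
            raise ValueError(f"expected '<source> <name>', got {raw!r}")
        source, name = parts
        if source.startswith("%"):
            # Syscall number: hexadecimal after the percent sign.
            syscalls[int(source[1:], 16)] = name
        else:
            # Raw hexadecimal address without prefix.
            addresses[int(source, 16)] = name
    return syscalls, addresses


syscalls, addresses = parse_symbols(
    "ff000020 TRA\nff2f0004 EXPMASK\n%42c Bfile_OpenFile_OS\n"
)
```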
## Command-line interface
The command-line interface (currently) has three commands, which are detailed
in the interactive help.
* The `library` command shows the targets and assembly tables found in the
library, with minimal information. There is a lot of room to make it more
versatile.
* The `info` command shows a summary of an OS target. This includes versions,
checksums, and basic syscall autodetection.
* The `disasm` command is the main powerhouse of the tool. It disassembles
functions with smart function end detection, resolves references to jumps,
computes PC-relative loads, and identifies syscalls and peripheral
registers.
Some of the advertised interface is not yet implemented:
* The `analyze` command is conceived as a way to dig deep into a particular
object to understand what it is used for. An example would be: given a 32-bit
value, find all places in the code where it is loaded from memory, and match
these places with the known OS structure to see what kind of code uses it.
* The location specifier `<address>:<len>` is not supported right now, though I
don't know how long I'll last without it.
## Reporting issues and results
Any bug reports, issues and improvement suggestions are welcome. See the
[bug tracker](/Lephenixnoir/fxos/issues).
If you have reverse-engineering results to share, the best place to do so is on
the [Planète Casio bible](https://bible.planet-casio.com). Ping me or
Breizh_craft on the Planète Casio shoutbox to have SSH access set up for
you.