README for the shell interface
parent
2394725074
commit
54721cac93
233
README.md
|
@ -1,16 +1,30 @@
|
|||
# fxos
|
||||
|
||||
fxos is an extended disassembler specifically used to reverse-engineer the OS,
|
||||
the bootcode, and syscalls. It used to be part of the
|
||||
bootcode, and syscalls of CASIO fx and fx-CG series. It used to be part of the
|
||||
[fxSDK](/Lephenixnoir/fxsdk). If you have a use for fxos, then be sure to also
|
||||
check the [Planète Casio bible](https://bible.planet-casio.com/), which gathers
|
||||
most of the reverse-engineering knowledge and research of the community.
|
||||
|
||||
If you're familiar with IDA, Ghidra, or other industry-grade
|
||||
reverse-engineering tools, then fxos won't be able to compete. This is more of
|
||||
a scripting playground with very OS-centric features for me. Some of the things
|
||||
it can do that usual tools might not do directly include:
|
||||
|
||||
* Finding OS-specific data like bootcode/OS headers/footers, dates, versions
|
||||
* Computing and checking checksums
|
||||
* Analyzing syscall tables and consistently identifying syscall table entries
|
||||
* (TODO) Comparing functions across OS versions to find changes
|
||||
|
||||
On the other hand, there is no call graph, cross-reference, or function type
|
||||
analysis (yet). I have plans for a simple abstract interpreter to bridge some
|
||||
of the gap between pure disassembly and decompilation.
|
||||
|
||||
fxos runs on Linux and should build successfully on macOS. If there are
|
||||
compatibility issues with your favorite system, let me know.
|
||||
|
||||
fxos is not currently complete; it's definitely good enough for many practical
|
||||
uses, but the more advanced analysis tools are not there yet. Hang on.
|
||||
**Note**: The [fxdoc repository](/Lephenixnoir/fxdoc) is not up-to-date with
|
||||
this version of fxos (yet).
|
||||
|
||||
## Building
|
||||
|
||||
|
@ -19,7 +33,7 @@ versions indicated are the ones I use, and clearly not the minimum
|
|||
requirements.
|
||||
|
||||
* g++ (9.2.0)
|
||||
* flex (2.6.4) and bison (3.5)
|
||||
* flex (2.6.4)
|
||||
* CMake (3.15) and make (eg. 4.2.1)
|
||||
|
||||
The only real configure option is the install path. CMake's default is
|
||||
|
@ -33,110 +47,78 @@ The only real configure option is the install path. CMake's default is
|
|||
## Setting up the library
|
||||
|
||||
fxos works with a library of files ranging from OS binaries to assembler
|
||||
instruction tables to lists of named syscalls. These resources are
|
||||
public for the most part, but some of the reverse-engineering results of the
|
||||
community are kept private.
|
||||
instruction tables to scripts. The library is formed of one or more folders
|
||||
defined in the `FXOS_PATH` environment variable.
|
||||
|
||||
A set of base files for a working library can be found in the
|
||||
[`base-library` folder](base-library) of this repository, which includes a
|
||||
suitable configuration file (but not the actual OS files because Git would not
|
||||
appreciate it). But unless you want to redo the research by yourself, I suggest
|
||||
using shared community data from the [fxdoc repository](/Lephenixnoir/fxdoc).
|
||||
Folders in the path serve two purposes:
|
||||
* Any `fxosrc` file at the root of a folder is executed at startup.
|
||||
* All paths are interpreted relative to the `FXOS_PATH`.
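
As a rough illustration of the startup rule above, the following Python sketch lists the `fxosrc` scripts that would be picked up. It assumes that `FXOS_PATH` separates folders the way `PATH` does (with `os.pathsep`), which this README does not state; `find_startup_scripts` is a hypothetical helper and not part of fxos.

```python
import os

def find_startup_scripts(fxos_path):
    """Return the fxosrc file found at the root of each folder in fxos_path.

    Assumption: folders are separated by os.pathsep, as in PATH.
    """
    scripts = []
    for folder in fxos_path.split(os.pathsep):
        if not folder:
            continue  # ignore empty entries such as a trailing separator
        candidate = os.path.join(folder, "fxosrc")
        if os.path.isfile(candidate):
            scripts.append(candidate)
    return scripts

if __name__ == "__main__":
    # Print the scripts that would run at startup for the current environment.
    for script in find_startup_scripts(os.environ.get("FXOS_PATH", "")):
        print(script)
```

Folders without an `fxosrc` file are simply skipped, so they can still be used to resolve paths.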
|
||||
|
||||
Next, fxos should be told where to find these files. A small configuration file
|
||||
should be added at `$HOME/.config/fxos/config` to do this. The configuration
|
||||
file specifies two types of information:
|
||||
Unless you want to redo the research by yourself, I suggest using shared
|
||||
community data from the [fxdoc repository](/Lephenixnoir/fxdoc). New folders
|
||||
could be created easily on the same model; read the `fxosrc` script to see how
|
||||
it is structured.
|
||||
|
||||
* Where the library folders are; this is used to resolve relative paths.
|
||||
* Which folders in the library contain fxos data files.
|
||||
**TODO**: fxdoc is still using an older version of fxos.
|
||||
|
||||
With the default library, the configuration file should look like this:
|
||||
## Main concepts
|
||||
|
||||
fxos has a command-line interface *kind of* like rizin. Type `?` to get a list
|
||||
of commands, and any command name followed by `?` to get help on a particular
|
||||
command (eg. `vc?`).
|
||||
|
||||
The dot command `.` is used to run a script, which is a file with a series of
|
||||
fxos commands. This is used at startup to run every `fxosrc` script found in
|
||||
the `FXOS_PATH`.
|
||||
|
||||
**Notations**
|
||||
|
||||
* Identifiers/names are C identifiers but dots (`.`) are allowed.
|
||||
* Usual decimal, hex (`0x`), binary (`0b`) values.
|
||||
* Syscalls are identified with `%<hex>`, such as `%01e`.
|
||||
* `$` is the current position in the selected virtual space.
|
||||
* Commands accept arithmetic but only within parentheses; you can write
|
||||
`e (1+2)` but not `e 1+2`.
|
||||
* Ranges can be specified as `<start>:<length>` or `<start>..<end>`.
|
||||
* Paths should use quotes: `"/os/fx/3.10/3.10.bin"`. Only identifiers/names can
|
||||
be written without quotes in commands.
|
||||
* Commands can be chained with `;`.
|
||||
* Anything from a `#` to end of line is a comment.
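
The notations above can be sketched in a few lines of Python. This is only an illustration of the syntax, not the fxos parser: the function names are made up, and the sketch assumes that the `<end>` of a `<start>..<end>` range is exclusive, which is not specified here.

```python
def parse_int(text):
    """Parse a decimal, hexadecimal (0x) or binary (0b) literal."""
    return int(text, 0)

def parse_syscall(text):
    """Parse a syscall reference such as %01e into its number."""
    if not text.startswith("%"):
        raise ValueError(f"not a syscall reference: {text!r}")
    return int(text[1:], 16)

def parse_range(text):
    """Parse <start>:<length> or <start>..<end> into (start, end).

    Assumption: <end> is exclusive.
    """
    if ".." in text:
        start, end = text.split("..", 1)
        return parse_int(start), parse_int(end)
    start, length = text.split(":", 1)
    start = parse_int(start)
    return start, start + parse_int(length)
```

With these helpers, `0x80000000:0x10` and `0x80000000..0x80000010` describe the same 16 bytes.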
|
||||
|
||||
**Virtual spaces**
|
||||
|
||||
A *virtual space* is an emulation of the calculator's virtual memory. Usually
|
||||
there is one for each OS being studied. Each virtual space has a number of
|
||||
*bindings*, each of which maps a virtual address to a file (usually a dump
|
||||
of the calculator's ROM or RAM). Use `vl` to show the virtual spaces and their
|
||||
bindings. The name of the current virtual space is shown in the prompt along
|
||||
with the current position, for instance:
|
||||
|
||||
```
|
||||
library: /path/to/base-library
|
||||
load: /path/to/base-library/asmtables
|
||||
load: /path/to/base-library/targets
|
||||
load: /path/to/base-library/symbols
|
||||
cg_3.60 @ 0x80000000>
|
||||
```
|
||||
|
||||
This means that fxos data files will be automatically loaded at startup from
|
||||
the `asmtables`, `targets` and `symbols` directories. Targets refer to OS files
|
||||
and RAM dumps by path, and these paths will be interpreted relative to the
|
||||
`base-library` folder.
|
||||
A new empty space can be created with `vc`, and then files can be mapped
|
||||
manually with `vm`. File paths are interpreted relative to `FXOS_PATH` folders
|
||||
even if they start with `/`. Alternatively, a new virtual space can be created
|
||||
and initialized by running a script with the `vct` command.
|
||||
|
||||
## Working with fxos data files
|
||||
Finally, the `vs` command is used to switch between different virtual spaces.
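
To make the notion of bindings more concrete, here is a minimal Python sketch of a virtual space. The names (`VirtualSpace`, `bind`, `read_u16`) are hypothetical and do not mirror the C++ implementation in fxos; the sketch also assumes big-endian 16-bit reads, which is an assumption about the target rather than something stated in this README.

```python
from dataclasses import dataclass

@dataclass
class Binding:
    address: int  # virtual address where the file is mapped
    data: bytes   # file contents, e.g. a ROM or RAM dump

    def contains(self, addr):
        return self.address <= addr < self.address + len(self.data)

class VirtualSpace:
    def __init__(self, name):
        self.name = name
        self.bindings = []
        self.cursor = 0  # current position, written `$` in commands

    def bind(self, address, data):
        """Map a file's contents at a virtual address."""
        self.bindings.append(Binding(address, data))

    def read_u16(self, addr):
        """Read a big-endian 16-bit value, or None if addr is unmapped."""
        for b in self.bindings:
            if b.contains(addr) and b.contains(addr + 1):
                off = addr - b.address
                return int.from_bytes(b.data[off:off + 2], "big")
        return None

    def prompt(self):
        """Format the prompt with the space name and current position."""
        return f"{self.name} @ 0x{self.cursor:08x}> "
```

Reads outside every binding return `None` instead of failing, which is one simple way to represent unmapped memory.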
|
||||
|
||||
fxos data files are used to input documentation into fxos. There are currently
|
||||
three types of data files:
|
||||
**Symbols**
|
||||
|
||||
* Assembler decoding tables (`type: assembly`);
|
||||
* Target descriptions (`type: target`);
|
||||
* Symbol definitions to name registers and syscalls (`type: symbols`).
|
||||
Each virtual space can have symbols defined, which are names associated with
|
||||
either addresses or syscall numbers. `sa` will define a new symbol at an
|
||||
explicit address, `ss` will define a new symbol at a syscall entry (which is
|
||||
kept symbolic, ie. it will work across different OS versions) and `sl` lists
|
||||
all symbols for the current virtual space.
|
||||
|
||||
They all consist of a short dictionary-like header ended with three dashes, and
|
||||
a body whose syntax varies depending on the type of file. Here is the data file
|
||||
`targets/fx@3.10.txt`:
|
||||
## File formats
|
||||
|
||||
```
|
||||
type: target
|
||||
name: fx@3.10
|
||||
---
|
||||
Besides fxos scripts and the actual binary files being used, there is currently
|
||||
only one other type of data file: assembly instruction listings. See
|
||||
`asm/sh3.txt` for an explanation of the syntax; essentially each line has:
|
||||
|
||||
ROM: os/fx/3.10/3.10.bin
|
||||
ROM_P2: os/fx/3.10/3.10.bin
|
||||
|
||||
RAM: os/fx/3.10/RAM.bin
|
||||
RAM_P2: os/fx/3.10/RAM.bin
|
||||
|
||||
RS: os/fx/3.10/RS.bin
|
||||
```
|
||||
|
||||
The header indicates the type (needed to select the proper parser to read the
|
||||
body!) and the name of the target. The concept of target is detailed below.
|
||||
This file references other files from the `os` folder of the library.
|
||||
|
||||
At startup, directories mentioned as `load:` in the configuration file are
|
||||
traversed recursively and all files there are loaded as data files.
|
||||
|
||||
## Targets
|
||||
|
||||
A target is the system that you want to study. Usually, it's an OS file, but it
|
||||
occurs at several places in memory (namely at the start of P1 and P2), and it
|
||||
can use data in RAM and RS memory. A target keeps all these memory regions
|
||||
together.
|
||||
|
||||
The header of a target must contain:
|
||||
* `type: target`
|
||||
* A value for the `name` property, which is used to refer to that target.
|
||||
|
||||
The body of a target consists of a list of *bindings*, which are mappings of
|
||||
files into areas of the virtual memory. The syntax to specify a binding is
|
||||
`<region>: <file>`, where:
|
||||
* The region can be a named region such as `ROM` or `RAM_P2`. The names and
|
||||
definitions of defined memory regions can be found in
|
||||
[`lib/memory.cpp`](lib/memory.cpp).
|
||||
* The region can be `<address>(<size>)`, where both address and size are
|
||||
specified in hexadecimal without prefix. For example, `fd800000(800)` is
|
||||
equivalent to `RS`.
|
||||
* The file path must be relative to one of the library directories.
|
||||
|
||||
An example is shown above.
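
The binding syntax can be illustrated with a short Python sketch. It is not the parser used by fxos; `parse_binding` is a hypothetical name, and the table of named regions only contains `RS`, the one region whose address and size are given in this README (the others are defined in `lib/memory.cpp`).

```python
import re

# Only RS is documented here as fd800000(800); other named regions are
# deliberately left out rather than guessed.
NAMED_REGIONS = {"RS": (0xFD800000, 0x800)}

# <address>(<size>), both in hexadecimal without prefix
_EXPLICIT = re.compile(r"^([0-9a-fA-F]+)\(([0-9a-fA-F]+)\)$")

def parse_binding(line):
    """Parse '<region>: <file>' into (region name or None, address, size, file)."""
    region, sep, path = line.partition(":")
    if not sep:
        raise ValueError(f"not a binding: {line!r}")
    region, path = region.strip(), path.strip()
    match = _EXPLICIT.match(region)
    if match:
        return None, int(match.group(1), 16), int(match.group(2), 16), path
    if region in NAMED_REGIONS:
        address, size = NAMED_REGIONS[region]
        return region, address, size, path
    raise ValueError(f"unknown region: {region!r}")
```

Both `RS: os/fx/3.10/RS.bin` and `fd800000(800): os/fx/3.10/RS.bin` resolve to the same address and size.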
|
||||
|
||||
The target can then be referred to by name on the command-line. For instance,
|
||||
general information about version 3.10 of the fx-9860G III OS can be queried by
|
||||
running `fxos info fx@3.10`.
|
||||
|
||||
## Assembly tables
|
||||
|
||||
Assembly tables describe the binary instruction set of the processor. It is
|
||||
unlikely that they will need to be modified any time soon.
|
||||
|
||||
The header of an assembly table consists of:
|
||||
* `type: assembly`
|
||||
* Optionally, a name, used to track files in case an opcode conflict occurs
|
||||
(when two instructions can be instantiated into the same 16-bit opcode).
|
||||
|
||||
The body is a list of instructions. Each line consists of:
|
||||
* The opcode pattern, a 16-character string using `01nmdi`.
|
||||
* A mnemonic.
|
||||
* Zero, one or two arguments among a finite set.
|
||||
|
@ -155,65 +137,20 @@ name: sh-4a-extensions
|
|||
0000nnnn11000011 movca.l r0, @rn
|
||||
```
|
||||
|
||||
Internally, fxos keeps a table with all 65k opcodes and fills it with instances
|
||||
of instructions described in assembly tables.
|
||||
Internally, fxos keeps a table with all 65536 opcodes and fills it with
|
||||
instances of instructions described in assembly tables.
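
As an illustration of how a pattern can be instantiated into a full opcode table, here is a Python sketch. The helper names are hypothetical, and raising an error on conflicts is a choice made for this example; it does not describe how fxos itself reports them.

```python
def compile_pattern(pattern):
    """Turn a 16-character pattern over '01nmdi' into (mask, value)."""
    if len(pattern) != 16 or set(pattern) - set("01nmdi"):
        raise ValueError(f"invalid pattern: {pattern!r}")
    mask = value = 0
    for ch in pattern:
        mask <<= 1
        value <<= 1
        if ch in "01":       # fixed bit: must match exactly
            mask |= 1
            value |= int(ch)
    return mask, value

def extract_field(pattern, opcode, letter):
    """Collect the opcode bits located under `letter`, most significant first."""
    result = 0
    for i, ch in enumerate(pattern):
        if ch == letter:
            result = (result << 1) | ((opcode >> (15 - i)) & 1)
    return result

def build_table(instructions):
    """Fill a 65536-entry table from a list of (pattern, mnemonic) pairs."""
    table = [None] * 65536
    for pattern, mnemonic in instructions:
        mask, value = compile_pattern(pattern)
        for opcode in range(65536):
            if (opcode & mask) == value:
                if table[opcode] is not None:
                    raise ValueError(f"opcode conflict at {opcode:04x}")
                table[opcode] = (pattern, mnemonic)
    return table
```

For `0000nnnn11000011 movca.l r0, @rn`, the pattern fixes 12 bits and leaves 4 free, so it fills 16 entries of the table, one per value of `n`.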
|
||||
|
||||
## Symbol tables
|
||||
|
||||
Symbol tables help keep things symbolic by giving names to objects that arise
|
||||
during disassembly. Currently they track syscalls and raw addresses (typically
|
||||
of peripheral modules).
|
||||
|
||||
The header of a symbol table consists of:
|
||||
* `type: symbols`
|
||||
* Optionally, a name for the table.
|
||||
|
||||
The body is a list of symbols described as `<source> <name>`, where:
|
||||
* The source can be a raw hexadecimal address, for example `ff2f0004`.
|
||||
* The source can be a syscall number, written in hexadecimal with a leading
|
||||
percent sign, for example `%03b`.
|
||||
* The name should be vaguely C-compliant. Dots are allowed.
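
A line of a symbol table body could be read as follows. This is a small Python sketch under the rules listed above, with a hypothetical `parse_symbol` function; it performs no validation of the name.

```python
def parse_symbol(line):
    """Parse '<source> <name>' into (kind, value, name).

    kind is "syscall" for sources like %03b and "address" for raw
    hexadecimal addresses like ff2f0004.
    """
    source, name = line.split(None, 1)
    name = name.strip()
    if source.startswith("%"):
        return "syscall", int(source[1:], 16), name
    return "address", int(source, 16), name
```

Keeping the syscall number separate from any address is what lets such a symbol stay valid across OS versions.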
|
||||
|
||||
Here is a mixed example with both syscalls and addresses:
|
||||
(TODO) Disassembly listings are intended to be produced and maintained by fxos
|
||||
while still being edited by hand. In order for this to work properly, manual
|
||||
edits should only use `#`-comments, either at the start of a line or with a `#`
|
||||
symbol followed by a space (to distinguish from constants like `#3`):
|
||||
|
||||
```
|
||||
type: symbols
|
||||
name: mixed-example
|
||||
---
|
||||
|
||||
ff000020 TRA
|
||||
ff000024 EXPEVT
|
||||
ff000028 INTEVT
|
||||
ff2f0004 EXPMASK
|
||||
|
||||
%42c Bfile_OpenFile_OS
|
||||
%42d Bfile_CloseFile_OS
|
||||
%42e Bfile_GetMediaFree_OS
|
||||
%42f Bfile_GetFileSize_OS
|
||||
# Set SR.BL = 1 (block interrupt) and SR.IMASK = 0x00*0 (error ?)
|
||||
4143a: 04 02 stc sr,r4 # get SR register.
|
||||
4143c: e5 10 mov #16,r5 # r5 = 0x00000010
|
||||
```
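
The comment convention for listings (a `#` at the start of a line, or a `#` followed by a space) can be checked mechanically. The following Python sketch separates code from comments under that rule; `split_comment` is a hypothetical helper and not an fxos function.

```python
def split_comment(line):
    """Split a listing line into (code, comment).

    A '#' at the very start of the line, or a '#' followed by a space,
    starts a comment; constants such as '#16' are left in the code part.
    """
    if line.startswith("#"):
        return "", line[1:].strip()
    idx = line.find("# ")
    if idx == -1:
        return line.rstrip(), ""
    return line[:idx].rstrip(), line[idx + 2:].strip()
```

On the line `4143c: e5 10 mov #16,r5 # r5 = 0x00000010`, the immediate `#16` stays in the code part and only the trailing text is treated as a comment.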
|
||||
|
||||
## Command-line interface
|
||||
|
||||
The command-line interface (currently) has three commands, which are detailed
|
||||
in the interactive help.
|
||||
|
||||
* The `library` command shows the targets and assembly tables found in the
|
||||
library, with minimal information. There is a lot of room to make it more
|
||||
versatile.
|
||||
* The `info` command shows a summary of an OS target. This includes versions,
|
||||
checksums, and basic syscall autodetection.
|
||||
* The `disasm` command is the main powerhouse of the tool. It disassembles
|
||||
functions with smart function end detection, resolves references to jumps,
|
||||
computes PC-relative loads, and identifies syscalls and peripheral
|
||||
registers.
|
||||
|
||||
Some of the advertised interface is not yet implemented:
|
||||
|
||||
* The `analyze` command is conceived as a way to dig deep into a particular
|
||||
object to understand what it is used for. An example would be: given a 32-bit
|
||||
value, find all places in the code where it is loaded from memory, and match
|
||||
these places with the known OS structure to see what kind of code uses it.
|
||||
|
||||
## Reporting issues and results
|
||||
|
||||
Any bug reports, issues and improvement suggestions are welcome. See the
|
||||
|
|
|
@ -63,6 +63,8 @@ static void disassemble(Session &session, Disassembly &disasm,
|
|||
|
||||
static uint32_t parse_d(Session &session, Parser &parser)
|
||||
{
|
||||
if(!session.current_space)
|
||||
return 0;
|
||||
uint32_t address = session.current_space->cursor;
|
||||
|
||||
if(!parser.at_end())
|
||||
|
|
|
@ -158,7 +158,7 @@ void _dot(Session &s, std::vector<std::string> const &files, bool absolute)
|
|||
|
||||
static std::string read_interactive(Session const &s, bool &leave)
|
||||
{
|
||||
std::string prompt = "(empty)> ";
|
||||
std::string prompt = "no_vspace> ";
|
||||
if(s.current_space) {
|
||||
std::string name = "(none)";
|
||||
|
||||
|
|