Source format
This format is yet to stabilize. If you just want to use the reference for
conversions between FONTCHARACTER and other character sets (which should
be managed by libcasio anyway), check out the latest binary format (BINARYx.md).
YAML has been chosen to store the information, as it's a storage format that a machine and a human can read and write quite easily.
Main file
main.yml is the file containing the main information about the source
reference. It only contains two fields for now:
- version is the version of the source reference (0.1 corresponds to this version);
- source is the link to the FONTCHARACTER reference's source repository, managed through a VCS (Git, in this case).
Sets
A set is a group of characters that appeared at the same time on CASIO calculators, or in an extension (such as an alternative CASIO Basic interpreter or compiler).
sets.yml is the sets file. For each set:
- the description field is the description of the set;
- if the default field is present, this set is the default set to use (generally the most recent set made by CASIO);
- if the leading field is present, it contains the list of leading characters, separated by commas;
- if the parent field is present, the set inherits all of the characters of its parent and, if the child has no leading field, its parent's leading characters.
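The inheritance rule for leading characters can be sketched in Python as follows. This is only an illustration: it assumes the sets have already been loaded into a dictionary keyed by a set identifier, which the format description does not specify, and the set names and values used here are hypothetical.

```python
def resolve_leading(sets, set_id):
    """Return the leading characters of a set, walking up the parent chain.

    `sets` is assumed to map a set identifier to its fields (hypothetical layout).
    """
    seen = set()
    current = set_id
    while current is not None and current not in seen:
        seen.add(current)  # guards against accidental parent cycles
        entry = sets[current]
        if "leading" in entry:
            raw = entry["leading"]
            if isinstance(raw, int):
                # A single code may be loaded as a number rather than a string.
                return [raw]
            return [int(p.strip(), 0) for p in str(raw).split(",") if p.strip()]
        current = entry.get("parent")
    return []


# Hypothetical data, for illustration only.
SETS = {
    "base": {"description": "Base set", "leading": "0x7F, 0xF7"},
    "ext": {"description": "Extension", "parent": "base"},
}
print(resolve_leading(SETS, "ext"))
```

Here the "ext" set has no leading field of its own, so the lookup falls back to the leading characters of "base".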
Categories
categories.yml is the categories file. Each category has an id field, which
is the identification string, an optional prefix field, and an optional sub
list of subcategories, each of which has an id field and a prefix field.
To access the "Mini" subcategory of the "Latin Capital" subcategory in the
"Letter" category, the category field in the character information has to be
Letter/Latin Capital/Mini. The name of the character will then be prefixed by
"Mini Latin Capital Letter " (with spaces between the prefixes and a trailing
space); the subcategory prefix goes first. If there is a suffix, a space and
then the suffix are appended to the character name (for example, Digit).
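As a rough Python sketch of the prefix rule, the function below walks a category path and joins the prefixes with the deepest one first. It assumes the category list is already loaded and that subcategories may nest through their own sub lists; the sample categories and the character are hypothetical, and suffixes are left out.

```python
def category_prefix(categories, path):
    """Build the name prefix for a category path such as 'Letter/Latin Capital'."""
    prefixes = []
    level = categories
    for cat_id in path.split("/"):
        # Raises StopIteration if the path names an unknown (sub)category.
        node = next(c for c in level if c["id"] == cat_id)
        if node.get("prefix"):
            prefixes.append(node["prefix"])
        level = node.get("sub", [])
    # The deepest prefix goes first; every prefix is followed by a space.
    return "".join(p + " " for p in reversed(prefixes))


def full_name(categories, char):
    """Prefix the character name; no category means 'Other', with no prefix."""
    path = char.get("category")
    prefix = category_prefix(categories, path) if path else ""
    return prefix + char["name"]


# Hypothetical data, for illustration only.
CATEGORIES = [
    {"id": "Letter", "prefix": "Letter", "sub": [
        {"id": "Latin Capital", "prefix": "Latin Capital", "sub": [
            {"id": "Mini", "prefix": "Mini"},
        ]},
    ]},
]
print(full_name(CATEGORIES, {"name": "A", "category": "Letter/Latin Capital/Mini"}))
```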
There are some more fields -- see the Embedded CASIO BASIC documentation section.
Characters
There are two systems of characters on CASIO calculators: Simon Lothar calls them the "characters" and the "opcodes". The "characters" are simple characters with a display, while the "opcodes" are defined by a sequence of characters (e.g. "Locate "). The two are described in two different tables on the calculator, but both tables describe the same encoding, which is why this reference treats all "characters" and "opcodes" as characters ("opcodes" are called multi-characters here).
characters.yml is the file containing data about the characters. For each
character, the code field is its FONTCHARACTER code, the name field is
the complete description of the character, the flags field holds the character
flags, and the category field is the category(/subcategory) ID (see the
previous section). If there is no category field, the category is "Other",
with no prefix.
Flags is a list of flag strings. Current flags are:
- nl: the character should be followed by a newline;
- esc: the character's CTF token is escaped with a reverse solidus;
- sep: the character is a Basic separator;
- base: only accessible in BASE programs.
Some characters have an ASCII token representation, mostly for the cat,
newcat, ctf and casemul formats. If the tokens field exists, then
it is a dictionary of the tokens in the different formats.
- If the cat field of the dictionary doesn't exist, its value is deduced recursively from the multi field if it is there, or from the unicode field (if all-ASCII), and prefixed by a reverse solidus '\';
- If the newcat field of the dictionary doesn't exist, it takes its value from the cat field;
- If the ctf field of the dictionary doesn't exist, it takes its value from the cat field if that was not deduced; otherwise, it is deduced the same way as the cat field, but is not prefixed with a reverse solidus '\';
- If the casemul field of the dictionary doesn't exist, it is deduced the same way as the ctf field;
- If the ref field of the dictionary doesn't exist, it takes the (first) value of the ctf field.
There can be multiple tokens for one format; in this case, the value of the format field is a list.
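The fallback chain between token formats can be sketched as below. This is one possible reading of the rules, not a reference implementation: it assumes multi and unicode are already lists of integers, that "deduced recursively" means concatenating the ASCII text of the component characters, that casemul falls back to the resolved ctf value, and that a character without a tokens field is treated as having an empty dictionary. The sample characters and the "FOR" token are hypothetical.

```python
def as_list(value):
    """Accept a single value or a list and always return a list."""
    if value is None:
        return []
    return list(value) if isinstance(value, (list, tuple)) else [value]


def plain_text(char, by_code):
    """Deduce raw ASCII text from `multi` (recursively) or from `unicode`."""
    if "multi" in char:
        parts = [plain_text(by_code[c], by_code) for c in as_list(char["multi"])]
        return None if None in parts else "".join(parts)
    points = as_list(char.get("unicode"))
    if points and all(p < 0x80 for p in points):
        return "".join(chr(p) for p in points)
    return None


def resolve_tokens(char, by_code):
    """Fill in missing token formats following the fallback rules above."""
    given = {k: as_list(v) for k, v in char.get("tokens", {}).items()}
    text = plain_text(char, by_code)
    deduced = [text] if text is not None else []
    out = {}
    out["cat"] = given.get("cat") or ["\\" + t for t in deduced]
    out["newcat"] = given.get("newcat") or out["cat"]
    # An explicit cat token is reused as-is; a deduced one loses its backslash.
    out["ctf"] = given.get("ctf") or given.get("cat") or deduced
    out["casemul"] = given.get("casemul") or out["ctf"]
    out["ref"] = given.get("ref") or out["ctf"][:1]
    return out


# Hypothetical data: four simple characters and one multi-character.
BY_CODE = {c: {"code": c, "unicode": [c]} for c in (0x46, 0x6F, 0x72, 0x20)}
FOR = {"code": 0xF704, "multi": [0x46, 0x6F, 0x72, 0x20]}
print(resolve_tokens(FOR, BY_CODE))
```

Every format is returned as a list, so a format with several tokens and a format with a single one can be handled the same way.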
It is possible to obtain an ASCII/HTML representation of most characters:
- If tokens exist, take the ref token;
- Otherwise, if the multi field is specified, then the representation can be obtained recursively by querying this field's elements;
- Otherwise, no ASCII representation is available.
The id field is an identifier for the character, composed of letters,
numbers and underscores. It can be used for C defines.
If there is no id field, the identifier is the value of the ascii field if it
can be deduced (or of the name field if it can't), with hyphens turned into
underscores and other invalid characters removed (spaces, parentheses, ...).
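A minimal Python sketch of this fallback is shown below. The ascii_repr parameter is a hypothetical stand-in for the deduced ASCII value, since how that value is obtained is outside this snippet, and the sample inputs are made up.

```python
import re


def derive_id(char, ascii_repr=None):
    """Return the explicit id, or derive one from the ASCII value or the name."""
    if "id" in char:
        return char["id"]
    base = ascii_repr if ascii_repr is not None else char["name"]
    base = base.replace("-", "_")  # hyphens become underscores
    # Everything that is not a letter, a digit or an underscore is dropped.
    return re.sub(r"[^A-Za-z0-9_]", "", base)


print(derive_id({"name": "Latin Capital Letter A"}))
```

Note that the result can still start with a digit, so code that emits C defines would probably need its own prefix.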
You have to distinguish multi-character opcodes from simple opcodes.
Multi-character opcodes are characters that are simply a sequence of simple
characters. You can distinguish them from simple opcodes by checking for the
presence of a multi field, which contains the FONTCHARACTER codes of the
characters in the sequence, separated by commas. Be careful: a multi-character
can be composed of only one character, in which case YAML won't interpret the
value as a string, but directly as a number!
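One way to cope with this pitfall is to normalise the loaded value before using it. The helper below is a sketch under the assumption that a YAML loader may hand back an integer (single code), a string (comma-separated codes) or a list; parse_codes is a hypothetical name.

```python
def parse_codes(value):
    """Normalise a loaded `multi` value into a list of integer codes."""
    if isinstance(value, int):
        # A lone code such as 0x46 is loaded as a number, not as a string.
        return [value]
    if isinstance(value, (list, tuple)):
        return [int(v) for v in value]
    # Comma-separated string; base 0 lets int() accept the 0x prefix.
    return [int(part.strip(), 0) for part in str(value).split(",") if part.strip()]


print(parse_codes(0x46))
print(parse_codes("0x46, 0x6F, 0x72, 0x20"))
```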
If the character is simple and has an equivalent Unicode sequence, the Unicode
code points of that sequence, separated by commas, are in the unicode field;
otherwise, the field doesn't exist.
If the character data has a set field, then the character is in that set;
otherwise, it should be considered as part of the default set.
Embedded CASIO BASIC documentation
Some characters have a type field, which means they have a special
meaning in CASIO Basic. There are two types: function and object. There is
an associated syntax, which is either <name>(arg1, arg2) or <name> arg1,arg2;
the first syntax is used when par is true and the second one when it is false.
Note that for the first syntax, the ending parenthesis is not mandatory.
If par is false (or non-existent), then the fix field can be set to infix,
which means the function will be used with either arg1 <name> or
arg1 <name> arg2.
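The three syntax forms can be illustrated with a small Python sketch. The function and the name "Foo" are hypothetical; only the par and fix fields and the shapes of the resulting strings come from the description above.

```python
def render_syntax(name, args, par=False, fix=None):
    """Render the usage string implied by the `par` and `fix` fields."""
    if par:
        return f"{name}({', '.join(args)})"
    if fix == "infix":
        if not 1 <= len(args) <= 2:
            raise ValueError("infix syntax takes one or two arguments")
        # arg1 <name>, optionally followed by arg2
        return " ".join([args[0], name] + list(args[1:]))
    return f"{name} {','.join(args)}".rstrip()


print(render_syntax("Foo", ["arg1", "arg2"], par=True))
print(render_syntax("Foo", ["arg1", "arg2"]))
print(render_syntax("Foo", ["arg1", "arg2"], fix="infix"))
```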
If the function/object should receive arguments, they can be documented using
the args field, and if it accepts optional arguments after these, they can
be documented with the optn field. These fields receive a list of argument
strings. An argument type can be imposed by adding :<code> at the end of the
argument string; for example, here are the For and To entries:
    -
      code: 0xF704
      name: For
      category: Statement
      args: ["to:0xF705"]
      action: ...
      multi: [0x46, 0x6F, 0x72, 0x20]
    -
      code: 0xF705
      name: To
      category: Operator
      args: ["assign:0x0E"]
      optn: ["step:0xF706"]
      action: ...
      multi: [0x20, 0x54, 0x6F, 0x20]
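Splitting an argument string into its name and its optional code could look like the sketch below. It assumes the part after the last colon is always a numeric FONTCHARACTER code, as in the entries above; parse_arg is a hypothetical helper.

```python
def parse_arg(spec):
    """Split 'name:0xCODE' into (name, code); the code is None when absent."""
    name, sep, code = spec.rpartition(":")
    if not sep:
        # No colon: the whole string is the argument name.
        return spec, None
    return name, int(code, 0)  # base 0 accepts the 0x prefix


print(parse_arg("to:0xF705"))
print(parse_arg("step:0xF706"))
```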
If the function is supposed to perform an action, this action can be documented
using the action field. If it is supposed to return something, this can
be documented using the return field.
Fonts
fonts.yml is the file containing the font information. For each font,
id is the ID string, name is the complete name, author is the complete
author name, and width and height are the dimensions of each character in
the font.
For each font, there is a corresponding folder, named with the font ID.
This folder contains the character images, organized by leading multi-byte
character: if there is none, the file 0xXX.pbm is used; otherwise, the file
0xLLXX.pbm is used, where 0xLL is the leading character.
If the file doesn't exist, the character is to be considered as blank.
Each existing file is a set of 256 tiles of width * height pixels each. Each
row contains the tiles from 0xR0 to 0xRF, where 0xR is the row number
(0x0 to 0xF).
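The lookup of a character's tile can be sketched as follows. Several assumptions go beyond the text: that XX in the file names is literal, that a zero high byte means there is no leading character, and that the tiles are packed in a 16 by 16 grid without padding. The 6 by 8 font size is hypothetical.

```python
def tile_location(code, width, height):
    """Return (file name, x, y) of the top-left pixel of a character's tile."""
    lead, low = code >> 8, code & 0xFF
    # Assumption: "XX" is literal and only the leading byte varies.
    filename = "0xXX.pbm" if lead == 0 else f"0x{lead:02X}XX.pbm"
    row, col = low >> 4, low & 0x0F  # row 0xR holds tiles 0xR0 to 0xRF
    return filename, col * width, row * height


print(tile_location(0x41, 6, 8))
print(tile_location(0x7F50, 6, 8))
```

The file itself would then be looked up in the folder named after the font ID, and a missing file means the character is blank.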