Source format

This format is yet to stabilize. If you just want to use the reference for conversions between FONTCHARACTER and other character sets (which should be managed by libcasio anyway), check out the latest binary format (BINARYx.md).

YAML has been chosen to store the information, as it's a storage format that a machine and a human can read and write quite easily.

Main file

main.yml is the file containing the main information about the source reference. It only contains two fields for now:

  • version is the version of the source reference (0.1 corresponds to this version);
  • source is the link to the FONTCHARACTER reference's source repository, managed through a VCS (Git, for that matter).

Sets

A set is a group of characters that appeared at the same time on CASIO calculators, or in an extension (such as alternative CASIO Basic interpreters/compilers).

sets.yml is the sets file. For each set:

  • the description field is the description of the set;
  • if the default field is there, then it is the default set to use (generally the most recent set made by CASIO);
  • if the leading field is there, the list of leading characters is in it, separated by commas;
  • if the parent field is there, then the set inherits all of the characters of its parent and, if the child has no leading field, its parent's leading characters as well.
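
The inheritance rule for leading characters can be sketched as follows. This is only a minimal sketch, assuming sets.yml has already been parsed into a dictionary keyed by set name, that parent names a single set, and that leading is a comma-separated string; the function name resolve_leading and the sample set names are hypothetical.

```python
def resolve_leading(sets, name):
    """Return the leading character codes of a set, inheriting from its parent."""
    entry = sets[name]
    if "leading" in entry:
        # The set's own leading list takes precedence over the parent's.
        return [int(code.strip(), 0) for code in str(entry["leading"]).split(",")]
    if "parent" in entry:
        # No leading field: fall back to the parent's leading characters.
        return resolve_leading(sets, entry["parent"])
    return []


# Hypothetical data, shaped like a parsed sets.yml (names are made up).
sets = {
    "base": {"description": "Original set", "leading": "0x7F, 0xF7"},
    "extended": {"description": "Later set", "parent": "base", "default": True},
}
print(resolve_leading(sets, "extended"))
```

Here the extended set has no leading field of its own, so the lookup walks up to base.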

Categories

categories.yml is the categories file. Each category has an id field, which is the identification string, an optional prefix field, and an optional sub list of subcategories, each with its own id and prefix fields. Subcategories are referenced by joining the IDs with slashes: for example, to reach "Mini" under the subcategory "Latin Capital" of the category "Letter", the category field in the character information has to be Letter/Latin Capital/Mini. The name of the character will then be prefixed by Mini Latin Capital Letter (with a space between prefixes and a trailing space); the subcategory prefix goes first. If there is a suffix, a space followed by the suffix is appended to the character name, for example, Digit.
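
As an illustration of how the prefix is assembled, here is a hedged sketch. It assumes categories.yml has been parsed into a list of dictionaries, that sub lists can themselves contain a sub list (which the path Letter/Latin Capital/Mini suggests, but which is an assumption), and that each prefix equals its id in the sample data; the function name name_prefix is hypothetical.

```python
def name_prefix(categories, path):
    """Build the name prefix for a category path such as 'Letter/Latin Capital/Mini'."""
    prefixes = []
    level = categories
    for part in path.split("/"):
        # Find the entry with this id at the current nesting level.
        entry = next(item for item in level if item["id"] == part)
        if "prefix" in entry:
            prefixes.append(entry["prefix"])
        level = entry.get("sub", [])
    # Deeper prefixes go first; each prefix is followed by a space.
    return "".join(prefix + " " for prefix in reversed(prefixes))


# Hypothetical data, shaped like a parsed categories.yml.
categories = [
    {"id": "Letter", "prefix": "Letter", "sub": [
        {"id": "Latin Capital", "prefix": "Latin Capital", "sub": [
            {"id": "Mini", "prefix": "Mini"},
        ]},
    ]},
]
print(name_prefix(categories, "Letter/Latin Capital/Mini") + "A")
```

With this data, a character named A in Letter/Latin Capital/Mini gets the full name Mini Latin Capital Letter A.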

There are some more fields -- see the Embedded CASIO BASIC documentation section.

Characters

There are two systems of characters on CASIO calculators: Simon Lothar calls them the "characters" and the "opcodes". The "characters" are simple characters with a display, whereas the "opcodes" are defined by a sequence of characters (e.g. "Locate "). The two are described in two different tables on the calculator, but both tables describe the same encoding, which is why this reference treats all "characters" and "opcodes" as characters ("opcodes" are called multi-characters here).

characters.yml is the file containing data about the characters. For each character, the code field is its FONTCHARACTER code, the name field is the complete description of the character, the flags field lists the character flags, and the category field is the category (or category/subcategory) ID (see the Categories section). If there is no category field, the category is "Other", with no prefix.

Flags is a list of flag strings. Current flags are:

  • nl: the character should be followed by a newline;
  • esc: the character's CTF token is escaped with a reverse solidus;
  • sep: the character is a Basic separator;
  • base: only accessible in BASE programs.

Some characters have an ASCII token representation, mostly for the cat, newcat, ctf and casemul formats. If the tokens field exists, then it is a dictionary of the tokens in the different formats.

  • If the cat field of the dictionary doesn't exist, its value is deduced recursively using the multi field if it is there, or otherwise from the unicode field (if all-ASCII), and prefixed by a reverse solidus '\';
  • If the newcat field of the dictionary doesn't exist, it takes its value from the cat field;
  • If the ctf field of the dictionary doesn't exist, it takes its value from the cat field if that one was given explicitly; otherwise, it is deduced the same way as the cat field, but without the reverse solidus '\' prefix;
  • If the casemul field of the dictionary doesn't exist, it is deduced the same way as the ctf field;
  • If the ref field of the dictionary doesn't exist, it takes the (first) value of the ctf field.
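
The fallback rules above could be sketched like this. It is a rough sketch under several assumptions: characters are already parsed into dictionaries stored in a table keyed by code, multi and unicode are lists of integers, and "deduced recursively" means concatenating the text of each element of multi. The helper names and the casemul value in the sample are hypothetical.

```python
def raw_text(char, table):
    """Deduce plain ASCII text from multi (recursively) or from unicode, else None."""
    if "multi" in char:
        parts = [raw_text(table[code], table) for code in char["multi"]]
        return None if None in parts else "".join(parts)
    codes = char.get("unicode", [])
    if codes and all(code < 0x80 for code in codes):
        return "".join(chr(code) for code in codes)
    return None


def first(value):
    """Return the first token when a format holds a list of tokens."""
    return value[0] if isinstance(value, list) else value


def resolve_tokens(char, table):
    """Fill in the missing token fields following the fallback rules."""
    tokens = dict(char.get("tokens", {}))
    raw = raw_text(char, table)
    cat_given = "cat" in tokens
    if not cat_given and raw is not None:
        tokens["cat"] = "\\" + raw
    if "newcat" not in tokens and "cat" in tokens:
        tokens["newcat"] = tokens["cat"]
    # ctf and casemul reuse an explicit cat token, else the unprefixed text.
    unprefixed = tokens["cat"] if cat_given else raw
    for fmt in ("ctf", "casemul"):
        if fmt not in tokens and unprefixed is not None:
            tokens[fmt] = unprefixed
    if "ref" not in tokens and "ctf" in tokens:
        tokens["ref"] = first(tokens["ctf"])
    return tokens


# Simple characters whose unicode field is a single ASCII code point.
table = {code: {"code": code, "unicode": [code]} for code in (0x46, 0x6F, 0x72, 0x20)}
# The casemul token below is made up for the example.
for_char = {"code": 0xF704, "name": "For", "multi": [0x46, 0x6F, 0x72, 0x20],
            "tokens": {"casemul": "For"}}
resolved = resolve_tokens(for_char, table)
print(resolved)
```

In this sample the cat token is deduced from multi, so ctf is deduced without the reverse solidus, while the explicit casemul token is left untouched.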

There can be multiple tokens for one format; in this case, the value of the format field is a list.

It is possible to obtain an ASCII/HTML representation of most characters:

  • If tokens exist, take the ref token;
  • Otherwise, if the multi field is specified, then the representation can be obtained recursively by querying this field's elements;
  • Otherwise, no ASCII representation is available.
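
A possible implementation of this lookup is sketched below. It assumes the characters are stored in a table keyed by code, that any tokens dictionary already carries a resolved ref token, and that the representations of the elements of multi are simply concatenated; the function name and the sample codes are hypothetical.

```python
def ascii_repr(code, table):
    """Return an ASCII representation of a character, or None if unavailable."""
    char = table[code]
    if "tokens" in char:
        # Assumes the ref token was already resolved (explicit or deduced).
        return char["tokens"]["ref"]
    if "multi" in char:
        # Query each element of the sequence recursively.
        parts = [ascii_repr(element, table) for element in char["multi"]]
        return None if None in parts else "".join(parts)
    return None


# Hypothetical table: two simple characters, one multi-character, one bare entry.
table = {
    0x41: {"tokens": {"ref": "A"}},
    0x42: {"tokens": {"ref": "B"}},
    0x1000: {"multi": [0x41, 0x42]},
    0x2000: {"name": "Character without tokens"},
}
print(ascii_repr(0x1000, table))
print(ascii_repr(0x2000, table))
```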

The id field is an identifier for the character, composed of letters, numbers and underscores. It can be used for C defines. If there is no id field, it is the value in the ascii field if it can be deduced (or the name field if it can't), with hyphens turned into underscores, and other non-valid characters removed (spaces, parentheses, ...).
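
The fallback for a missing id field amounts to a small string clean-up, sketched here. The function name derive_id and the sample inputs are hypothetical, and the sketch assumes "letters" means ASCII letters.

```python
import re


def derive_id(text):
    """Derive an identifier from an ASCII representation or a character name."""
    # Hyphens become underscores; anything else that is not a letter,
    # digit or underscore (spaces, parentheses, ...) is dropped.
    return re.sub(r"[^A-Za-z0-9_]", "", text.replace("-", "_"))


print(derive_id("Mini Latin Capital Letter A"))
print(derive_id("x-bar (mean)"))
```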

Multi-characters (multi-character opcodes) are characters that are simply a sequence of simple characters. You can distinguish them from simple characters by checking for the presence of a multi field, which holds the FONTCHARACTER codes of the characters in the sequence, separated by commas. Be careful: a multi-character can consist of a single character, in which case YAML won't interpret the field as a string, but directly as a number!
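
Because the parsed multi field can therefore come in several shapes, a reader may want to normalize it first. The sketch below is one way to do so, assuming the YAML parser yields an integer for a lone code, a string for a comma-separated list, and a list for a YAML sequence; the function name normalize_multi is hypothetical.

```python
def normalize_multi(value):
    """Turn a parsed multi field into a list of integer FONTCHARACTER codes."""
    if isinstance(value, int):
        # A lone code such as 0x46 is parsed by YAML as a number.
        return [value]
    if isinstance(value, str):
        # Several codes such as "0x46, 0x6F" stay a comma-separated string.
        return [int(part.strip(), 0) for part in value.split(",")]
    # A YAML sequence such as [0x46, 0x6F] is already a list.
    return [int(item, 0) if isinstance(item, str) else item for item in value]


print(normalize_multi(0x46))
print(normalize_multi("0x46, 0x6F, 0x72, 0x20"))
print(normalize_multi([0x20, 0x54, 0x6F, 0x20]))
```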

If the character is simple and has an equivalent Unicode sequence, the unicode field holds the Unicode code points of that sequence, separated by commas; otherwise, the field doesn't exist.

If the character data has a set field, then the character is in a set; otherwise, it should be considered as part of the default set.

Embedded CASIO BASIC documentation

Some characters have a type field, which means they have a special meaning in CASIO Basic. There are two types: function and object. Each has an associated syntax, which is either <name>(arg1, arg2) when par is true, or <name> arg1,arg2 when it is false. Note that for the first syntax, the closing parenthesis is not mandatory.

If par is false (or non-existent), then the fix field can be set to infix, which means the function will be used with either arg1 <name> or arg1 <name> arg2.
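
The three syntaxes can be summarized with a small formatter. This is only an illustrative sketch: the function name render_syntax and its parameters mirror the par and fix fields but are otherwise hypothetical.

```python
def render_syntax(name, args, par=False, fix=None):
    """Render the documented call syntax of a function or object."""
    if par:
        # Parenthesised form: <name>(arg1, arg2)
        return f"{name}({', '.join(args)})"
    if fix == "infix":
        # Infix form: arg1 <name> or arg1 <name> arg2
        return " ".join([args[0], name] + list(args[1:2]))
    # Plain form: <name> arg1,arg2
    return f"{name} {','.join(args)}"


print(render_syntax("<name>", ["arg1", "arg2"], par=True))
print(render_syntax("<name>", ["arg1", "arg2"]))
print(render_syntax("<name>", ["arg1", "arg2"], fix="infix"))
```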

If the function/object takes arguments, they can be documented using the args field; optional arguments that follow them can be documented with the optn field. These fields receive a list of argument strings. An argument type can be imposed by adding :<code> at the end of the argument string; for example, here are the For and To entries:

-
 code: 0xF704
 name: For
 category: Statement
 args: ["to:0xF705"]
 action: ...
 multi: [0x46, 0x6F, 0x72, 0x20]
-
 code: 0xF705
 name: To
 category: Operator
 args: ["assign:0x0E"]
 optn: ["step:0xF706"]
 action: ...
 multi: [0x20, 0x54, 0x6F, 0x20]
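
Argument strings such as those in the entries above can be split into a name and an optional imposed code. The following is a minimal sketch; the function name parse_arg is hypothetical, and it assumes the code is always written after the last colon.

```python
def parse_arg(spec):
    """Split an argument string such as 'to:0xF705' into (name, code)."""
    name, sep, code = spec.rpartition(":")
    if not sep:
        # No ':<code>' suffix: the argument type is not imposed.
        return spec, None
    return name, int(code, 0)


print(parse_arg("to:0xF705"))
print(parse_arg("step:0xF706"))
```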

If the function is supposed to perform an action, this action can be documented using the action field. If it is supposed to return something, this can be documented using the return field.

Fonts

fonts.yml is the file containing the fonts information. For each font, id is the ID string, name is the complete name, author is the complete author name, width and height are the dimensions of each character in the font.

For each font, there is a corresponding folder, named after the font ID. This folder contains the character images, organized by leading character: if the character has none, the file 0xXX.pbm will be chosen; otherwise, the file 0xLLXX.pbm will be chosen, where 0xLL is the leading character. If the file doesn't exist, the character is to be considered blank.

Each existing file is a set of 256 tiles of width * height pixels each, arranged in 16 rows of 16 tiles. Row 0xR (0x0 to 0xF) holds the tiles from 0xR0 to 0xRF.
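
Locating a character's tile then comes down to some bit arithmetic, sketched below. Two assumptions are made here and should be checked against the actual font folders: that XX is a literal placeholder in the file name (since each file holds all 256 tiles sharing one leading character), and that tiles are packed without any spacing. The function name tile_location is hypothetical.

```python
def tile_location(code, width, height):
    """Return (file name, x, y) of a character's tile inside its font image."""
    lead, low = code >> 8, code & 0xFF
    # Assumption: 'XX' is a literal placeholder in the file name, since each
    # file holds all 256 tiles that share one leading character.
    filename = f"0x{lead:02X}XX.pbm" if lead else "0xXX.pbm"
    row, column = low >> 4, low & 0x0F
    # Tiles are laid out on a 16 x 16 grid of width * height pixels each.
    return filename, column * width, row * height


print(tile_location(0x41, 5, 7))
print(tile_location(0x7F50, 5, 7))
```

The returned x and y are the pixel coordinates of the tile's top-left corner inside the image.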