Planète Casio's textout() BBcode markup language translator

Warning

If you are accessing this repository from <https://git.planet-casio.com>, keep in mind that it is only a mirror and that the real repository is located at <https://forge.touhey.fr/pc/textout.git> for now.

BBcode was invented in the 90s/2000s for bulletin board systems. It was implemented on Planète Casio during its first years (although some research remains to be done on how that choice was made…).

On Planète Casio, which is coded in PHP at the time I'm writing this, we have our own custom version of BBcode, which we pass through an internal utility named textout().

I, Thomas “Cakeisalie5” Touhey, rewrote it recently, and it works pretty well while being secure. However, as the next version of Planète Casio (the “v5”) will be written from scratch, I figured I could rewrite the textout() utility in Python, make the language parsing more practical, and add features that exist in the original BBcode markup language.

As this is a rewrite, vulnerabilities and bugs will not be shared between this project and the online version of the transcoder.

Usage

To use this module, import it and call one of the to<language>() functions:

import textoutpc

text = "Hello, [i]beautiful [b]world[/i]!"
print(textoutpc.tohtml(text))
print("---")
print(textoutpc.tolightscript(text))

The supported output types are HTML, through tohtml(), and lightscript, through tolightscript().

Tweaks

The tohtml() and tolightscript() functions accept additional keyword arguments, called tweaks, which tags can read to adapt their behaviour. Tweak names are case-insensitive and non-alphanumeric characters are ignored: for example, label_prefix, LABELPREFIX and __LaBeL___PRE_FIX__ are all equivalent.
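The name matching described above can be sketched as follows. This is only an illustration, not the module's actual code: the normalize_tweaks() helper is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits.

```python
import re


def normalize_tweaks(**tweaks):
    """Hypothetical helper: map each tweak name to a canonical form.

    Names are lowercased and every character that is not an ASCII
    letter or digit is dropped (an assumption of this sketch).
    """
    return {
        re.sub(r"[^a-z0-9]", "", name.lower()): value
        for name, value in tweaks.items()
    }


# The three spellings from the text collapse to the same key.
print(normalize_tweaks(label_prefix="msg45529-"))
print(normalize_tweaks(LABELPREFIX="msg45529-"))
print(normalize_tweaks(__LaBeL___PRE_FIX__="msg45529-"))
```

With such a scheme, a tag only ever has to look up the canonical name, whatever spelling the caller used.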

The following tweaks are read by the translator and built-in tags:

  • label_prefix (HTML): prefix to be used by the [label] and [target] tags, e.g. msg45529-. Defaults to "" for PCv42 compatibility;
  • obsolete_tags (HTML): use obsolete HTML tags, e.g. <b>, <i>, <center>, and others, for compatibility with old browsers (e.g. lynx). Defaults to True.

An example call would be:

print(textoutpc.tohtml("Hello, [i]beautiful[/i]!", obsolete__TAGS=False))
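To give an idea of how a tag could react to the obsolete_tags tweak, here is a minimal sketch. The render_bold() function and the exact markup it produces are assumptions made for illustration; the real output of textoutpc may differ.

```python
import html


def render_bold(content, obsolete_tags=True):
    """Hypothetical renderer for a [b] tag.

    The content is escaped first, then wrapped either in the obsolete
    <b> element or in a styled <span>, depending on the tweak.
    """
    safe = html.escape(content)
    if obsolete_tags:
        return f"<b>{safe}</b>"
    return f'<span style="font-weight: bold">{safe}</span>'


print(render_bold("world"))
print(render_bold("world", obsolete_tags=False))
```

Keeping the decision inside the tag means the translator only has to pass the tweaks along, without knowing which tags care about them.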

What is left to do

  • Correct the translator until all the tests pass;
  • Manage blocks superseding each other;
  • Implement BBcode lists using [*], [**], …;
  • Manage lightscript (or even markdown?) as output languages;
  • Check where the errors are to display them to the user:
    • Count character offset, line number and column number in the lexer;
    • Produce readable exceptions;
    • Make a clean interface to transmit them;
  • Check why exceptions raised on raw tags end up escaping the content, although they shouldn't…?
  • Look for security flaws (we really don't want stored XSS flaws!).
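The error reporting items above (counting the character offset, line number and column number, then producing readable exceptions) could be approached along these lines. This is a sketch under assumptions: the Position, iter_positions() and MarkupError names are hypothetical and are not part of the current lexer.

```python
from typing import Iterator, NamedTuple, Tuple


class Position(NamedTuple):
    offset: int  # 0-based character offset in the whole text
    line: int    # 1-based line number
    column: int  # 1-based column number


def iter_positions(text: str) -> Iterator[Tuple[str, Position]]:
    """Yield each character along with the position where it starts."""
    line, column = 1, 1
    for offset, char in enumerate(text):
        yield char, Position(offset, line, column)
        if char == "\n":
            line += 1
            column = 1
        else:
            column += 1


class MarkupError(Exception):
    """Hypothetical exception carrying a readable position."""

    def __init__(self, message: str, position: Position):
        super().__init__(
            f"{message} (line {position.line}, "
            f"column {position.column}, offset {position.offset})"
        )
        self.position = position


# Locate the first opening bracket of a two-line message.
for char, pos in iter_positions("Hello,\n[i]world"):
    if char == "[":
        print(MarkupError("unclosed tag", pos))
        break
```

Storing the position on the exception, in addition to formatting it in the message, would let a caller highlight the faulty spot in the user's input.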