Commit Graph

40 Commits

Author SHA1 Message Date
Damien George f2040bfc7e py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing.  They are also
smaller on disk.

But the real benefit of .mpy files comes when they are frozen into the
firmware.  This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device.  These C data structures can be executed in-place, ie directly from
ROM.  This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).

The downside of frozen code is that it requires recompiling and reflashing
the entire firmware.  This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).

This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware.  The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place.  If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.

With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).

The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded.  Instead only a small qstr table needs to be built (and put in RAM)
at import time.  This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory.  Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).

In more detail, in the VM what used to be (schematically):

    qst = DECODE_QSTR_VALUE;

is now (schematically):

    idx = DECODE_QSTR_INDEX;
    qst = qstr_table[idx];

That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values.  Only qstr_table needs to be linked
when the .mpy is loaded.

Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.

The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before

The qstr indirection in the bytecode has only a small impact on VM
performance.  For stm32 on PYBv1.0 the performance change of this commit
is:

diff of scores (higher is better)
N=100 M=100             baseline -> this-commit  diff      diff% (error%)
bm_chaos.py               371.07 ->  357.39 :  -13.68 =  -3.687% (+/-0.02%)
bm_fannkuch.py             78.72 ->   77.49 :   -1.23 =  -1.563% (+/-0.01%)
bm_fft.py                2591.73 -> 2539.28 :  -52.45 =  -2.024% (+/-0.00%)
bm_float.py              6034.93 -> 5908.30 : -126.63 =  -2.098% (+/-0.01%)
bm_hexiom.py               48.96 ->   47.93 :   -1.03 =  -2.104% (+/-0.00%)
bm_nqueens.py            4510.63 -> 4459.94 :  -50.69 =  -1.124% (+/-0.00%)
bm_pidigits.py            650.28 ->  644.96 :   -5.32 =  -0.818% (+/-0.23%)
core_import_mpy_multi.py  564.77 ->  581.49 :  +16.72 =  +2.960% (+/-0.01%)
core_import_mpy_single.py  68.67 ->   67.16 :   -1.51 =  -2.199% (+/-0.01%)
core_qstr.py               64.16 ->   64.12 :   -0.04 =  -0.062% (+/-0.00%)
core_yield_from.py        362.58 ->  354.50 :   -8.08 =  -2.228% (+/-0.00%)
misc_aes.py               429.69 ->  405.59 :  -24.10 =  -5.609% (+/-0.01%)
misc_mandel.py           3485.13 -> 3416.51 :  -68.62 =  -1.969% (+/-0.00%)
misc_pystone.py          2496.53 -> 2405.56 :  -90.97 =  -3.644% (+/-0.01%)
misc_raytrace.py          381.47 ->  374.01 :   -7.46 =  -1.956% (+/-0.01%)
viper_call0.py            576.73 ->  572.49 :   -4.24 =  -0.735% (+/-0.04%)
viper_call1a.py           550.37 ->  546.21 :   -4.16 =  -0.756% (+/-0.09%)
viper_call1b.py           438.23 ->  435.68 :   -2.55 =  -0.582% (+/-0.06%)
viper_call1c.py           442.84 ->  440.04 :   -2.80 =  -0.632% (+/-0.08%)
viper_call2a.py           536.31 ->  532.35 :   -3.96 =  -0.738% (+/-0.06%)
viper_call2b.py           382.34 ->  377.07 :   -5.27 =  -1.378% (+/-0.03%)

And for unix on x64:

diff of scores (higher is better)
N=2000 M=2000        baseline -> this-commit     diff      diff% (error%)
bm_chaos.py          13594.20 ->  13073.84 :  -520.36 =  -3.828% (+/-5.44%)
bm_fannkuch.py          60.63 ->     59.58 :    -1.05 =  -1.732% (+/-3.01%)
bm_fft.py           112009.15 -> 111603.32 :  -405.83 =  -0.362% (+/-4.03%)
bm_float.py         246202.55 -> 247923.81 : +1721.26 =  +0.699% (+/-2.79%)
bm_hexiom.py           615.65 ->    617.21 :    +1.56 =  +0.253% (+/-1.64%)
bm_nqueens.py       215807.95 -> 215600.96 :  -206.99 =  -0.096% (+/-3.52%)
bm_pidigits.py        8246.74 ->   8422.82 :  +176.08 =  +2.135% (+/-3.64%)
misc_aes.py          16133.00 ->  16452.74 :  +319.74 =  +1.982% (+/-1.50%)
misc_mandel.py      128146.69 -> 130796.43 : +2649.74 =  +2.068% (+/-3.18%)
misc_pystone.py      83811.49 ->  83124.85 :  -686.64 =  -0.819% (+/-1.03%)
misc_raytrace.py     21688.02 ->  21385.10 :  -302.92 =  -1.397% (+/-3.20%)

The code size change is (firmware with a lot of frozen code benefits the
most):

       bare-arm:  +396 +0.697%
    minimal x86: +1595 +0.979% [incl +32(data)]
       unix x64: +2408 +0.470% [incl +800(data)]
    unix nanbox: +1396 +0.309% [incl -96(data)]
          stm32: -1256 -0.318% PYBV10
         cc3200:  +288 +0.157%
        esp8266:  -260 -0.037% GENERIC
          esp32:  -216 -0.014% GENERIC[incl -1072(data)]
            nrf:  +116 +0.067% pca10040
            rp2:  -664 -0.135% PICO
           samd:  +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS

As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.

In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place.  Performance is not impacted too much.  Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM.  This will
essentially be able to replace frozen code for most applications.

Signed-off-by: Damien George <damien@micropython.org>
2022-02-24 18:08:43 +11:00
Emil Renner Berthing 6324c3e054 py/scope: Name and use id_kind_type_t.
The function scope_find_or_add_id used to take a scope_kind_t enum and
save it in an uint8_t. Saving an enum in a uint8_t is fine, but
everywhere this function is called it is not actually given a
scope_kind_t but an anonymous enum instead. Let's give this enum a name
and use that as the argument type.

This doesn't change the generated code, but is a C type mismatch that
unfortunately doesn't show up unless you enable -Wenum-conversion.
2020-10-22 11:40:56 +02:00
Romain Goyet bd63c26dd5 py/scope: Add assert to check that low numbered qstrs do fit in uint8_t. 2020-04-13 22:27:27 +10:00
Damien George 69661f3343 all: Reformat C and Python source code with tools/codeformat.py.
This is run with uncrustify 0.70.1, and black 19.10b0.
2020-02-28 10:33:03 +11:00
Damien George e328a5d469 py/scope: Optimise scope_find_or_add_id to not need "added" arg.
Taking the address of a local variable is mildly expensive, in code size
and stack usage.  So optimise scope_find_or_add_id() to not need to take a
pointer to the "added" variable, and instead take the kind to use for newly
added identifiers.
2018-10-28 00:38:18 +11:00
Damien George 9201f46cc8 py/compile: Fix case of eager implicit conversion of local to nonlocal.
This ensures that implicit variables are only converted to implicit
closed-over variables (nonlocals) at the very end of the function scope.
If variables are closed-over when first used (read from, as was done prior
to this commit) then this can be incorrect because the variable may be
assigned to later on in the function which means they are just a plain
local, not closed over.

Fixes issue #4272.
2018-10-28 00:33:08 +11:00
Alexander Steffen 55f33240f3 all: Use the name MicroPython consistently in comments
There were several different spellings of MicroPython present in comments,
when there should be only one.
2017-07-31 18:35:40 +10:00
Damien George 0d10517a45 py/scope: Factor common code to find locals and close over them.
Saves 50-100 bytes of code.
2016-09-30 13:53:00 +10:00
Damien George 3dea8c9e92 py/scope: Use lookup-table to determine a scope's simple name.
Generates slightly smaller and more efficient code.
2016-09-30 12:34:05 +10:00
Damien George dd5353a405 py: Add MICROPY_ENABLE_COMPILER and MICROPY_PY_BUILTINS_EVAL_EXEC opts.
MICROPY_ENABLE_COMPILER can be used to enable/disable the entire compiler,
which is useful when only loading of pre-compiled bytecode is supported.
It is enabled by default.

MICROPY_PY_BUILTINS_EVAL_EXEC controls support of eval and exec builtin
functions.  By default they are only included if MICROPY_ENABLE_COMPILER
is enabled.

Disabling both options saves about 40k of code size on 32-bit x86.
2015-12-18 12:35:44 +00:00
Damien George 65dc960e3b unix-cpy: Remove unix-cpy. It's no longer needed.
unix-cpy was originally written to get semantic equivalent with CPython
without writing functional tests.  When writing the initial
implementation of uPy it was a long way between lexer and functional
tests, so the half-way test was to make sure that the bytecode was
correct.  The idea was that if the uPy bytecode matched CPython 1-1 then
uPy would be proper Python if the bytecodes acted correctly.  And having
matching bytecode meant that it was less likely to miss some deep
subtlety in the Python semantics that would require an architectural
change later on.

But that is all history and it no longer makes sense to retain the
ability to output CPython bytecode, because:

1. It outputs CPython 3.3 compatible bytecode.  CPython's bytecode
changes from version to version, and seems to have changed quite a bit
in 3.5.  There's no point in changing the bytecode output to match
CPython anymore.

2. uPy and CPy do different optimisations to the bytecode which makes it
harder to match.

3. The bytecode tests are not run.  They were never part of Travis and
are not run locally anymore.

4. The EMIT_CPYTHON option needs a lot of extra source code which adds
heaps of noise, especially in compile.c.

5. Now that there is an extensive test suite (which tests functionality)
there is no need to match the bytecode.  Some very subtle behaviour is
tested with the test suite and passing these tests is a much better
way to stay Python-language compliant, rather than trying to match
CPy bytecode.
2015-08-17 12:51:26 +01:00
Damien George 51dfcb4bb7 py: Move to guarded includes, everywhere in py/ core.
Addresses issue #1022.
2015-01-01 20:32:09 +00:00
Damien George 584ba6762f py: Move global/nonlocal decl code to compiler for proper SyntaxError.
This patch gives proper SyntaxError exceptions for bad global/nonlocal
declarations.  It also reduces code size: 304 bytes on unix x64, 132
bytes on stmhal.
2014-12-21 17:26:45 +00:00
Damien George 7ff996c237 py: Convert [u]int to mp_[u]int_t in emit.h and associated .c files.
Towards resolving issue #50.
2014-09-08 23:05:16 +01:00
Damien George 4abff7500f py: Change uint to mp_uint_t in runtime.h, stackctrl.h, binary.h.
Part of code cleanup, working towards resolving issue #50.
2014-08-30 14:59:21 +01:00
Damien George 6be0b0a8ec py: Clean up and simplify functions in scope; add STATIC in compiler.
Some small code clean-ups that result in about 80 bytes ROM saving for
stmhal.
2014-08-15 14:30:52 +01:00
Paul Sokolovsky 59c675a64c py: Include mpconfig.h before all other includes.
It defines types used by all other headers.

Fixes #691.
2014-06-21 22:43:22 +03:00
Damien George 58ebde4664 Tidy up some configuration options.
MP_ALLOC_* -> MICROPY_ALLOC_*
MICROPY_PATH_MAX -> MICROPY_ALLOC_PATH_MAX
MICROPY_ENABLE_REPL_HELPERS -> MICROPY_HELPER_REPL
MICROPY_ENABLE_LEXER_UNIX -> MICROPY_HELPER_LEXER_UNIX
MICROPY_EXTRA_* -> MICROPY_PORT_*

See issue #35.
2014-05-21 20:32:59 +01:00
Damien George 66e18f04d8 py: Turn down amount of RAM parser and compiler use.
There are 2 locations in parser, and 1 in compiler, where memory
allocation is not precise.  In the parser it's the rule stack and result
stack, in the compiler it's the array for the identifiers in the current
scope.  All other mallocs are exact (ie they don't allocate more than is
needed).

This patch adds tuning options (MP_ALLOC_*) to mpconfig.h for these 3
inexact allocations.

The inexact allocations in the parser should actually be close to
logarithmic: you need an exponentially larger script (absent pathological
cases) to use up more room on the rule and result stacks.  As such, the
default allocation policy for these is now to start with a modest sized
stack, but grow only in small increments.

For the identifier arrays in the compiler, these now start out quite
small (4 entries, since most functions don't have that many ids), and
grow incrementally by 6 (since if you have more ids than 4, you probably
have quite a few more, but it wouldn't be exponentially more).

Partially addresses issue #560.
2014-05-05 13:19:03 +01:00
Damien George 04b9147e15 Add license header to (almost) all files.
Blanket wide to all .c and .h files.  Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.

Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
2014-05-03 23:27:38 +01:00
Damien George 2827d62e8b py: Implement keyword-only args.
Implements 'def f(*, a)' and 'def f(*a, b)', but not default
keyword-only args, eg 'def f(*, a=1)'.

Partially addresses issue #524.
2014-04-27 15:50:52 +01:00
Damien George df8127a17e py: Remove unique_codes from emitglue.c. Replace with pointers.
Attempt to address issue #386.  unique_code_id's have been removed and
replaced with a pointer to the "raw code" information.  This pointer is
stored in the actual byte code (aligned, so the GC can trace it), so
that raw code (ie byte code, native code and inline assembler) is kept
only for as long as it is needed.  In memory it's now like a tree: the
outer module's byte code points directly to its children's raw code.  So
when the outer code gets freed, if there are no remaining functions that
need the raw code, then the children's code gets freed as well.

This is pretty much like CPython does it, except that CPython stores
indexes in the byte code rather than machine pointers.  These indices
index the per-function constant table in order to find the relevant
code.
2014-04-13 11:04:33 +01:00
Damien George 11d8cd54c9 py, compiler: Turn id_info_t.param into a set of flags.
So we can add more flags.
2014-04-09 14:42:51 +01:00
Damien George 78035b995f py, compiler: Clean up and compress scope/compile structures.
Convert int types to uint where sensible, and then to uint8_t or
uint16_t where possible to reduce RAM usage.
2014-04-09 12:27:39 +01:00
xbe efe3422394 py: Clean up includes.
Remove unnecessary includes. Add includes that improve portability.
2014-03-17 02:43:40 -07:00
Damien George 8725f8f7de py: Pass all scope flags through to runtime. 2014-02-15 19:33:11 +00:00
Paul Sokolovsky ab5d08280b Allow qstr's with non-ident chars, construct good identifier for them.
Also, add qstr's for string appearing in unix REPL loop, gross effect
being less allocations for each command run.
2014-01-24 02:34:22 +02:00
Paul Sokolovsky fd31358505 mp_compile(): Properly free module_scope and all nested scopes. 2014-01-23 23:16:18 +02:00
Damien George 55baff4c9b Revamp qstrs: they now include length and hash.
Can now have null bytes in strings.  Can define ROM qstrs per port using
qstrdefsport.h
2014-01-21 21:40:13 +00:00
Damien George cbd2f7482c py: Add module/function/class name to exceptions.
Exceptions know source file, line and block name.

Also tidy up some debug printing functions and provide a global
flag to enable/disable them.
2014-01-19 11:48:48 +00:00
Damien George 6baf76e28b py: make closures work. 2013-12-30 22:32:17 +00:00
Damien 732407f1bf Change memory allocation API to require size for free and realloc. 2013-12-29 19:33:23 +00:00
Damien d99b05282d Change object representation from 1 big union to individual structs.
A big change.  Micro Python objects are allocated as individual structs
with the first element being a pointer to the type information (which
is itself an object).  This scheme follows CPython.  Much more flexible,
not necessarily slower, uses same heap memory, and can allocate objects
statically.

Also change name prefix, from py_ to mp_ (mp for Micro Python).
2013-12-21 18:17:45 +00:00
Damien 9ecbcfff99 py: work towards working closures. 2013-12-11 00:41:43 +00:00
Damien 27fb45eb1c Add local_num skeleton framework to deref/closure emit calls. 2013-10-20 15:07:49 +01:00
Damien c025ebb2dc Separate out mpy core and unix version. 2013-10-12 14:30:21 +01:00
Damien 6cdd3af601 Implement built-in decorators to select emit type. 2013-10-05 18:08:26 +01:00
Damien b05d707b23 Further factorise PASS_1 out of specific emit code. 2013-10-05 13:37:10 +01:00
Damien 415eb6f850 Restructure emit so it goes through a method table. 2013-10-05 12:19:06 +01:00
Damien 429d71943d Initial commit. 2013-10-04 19:53:11 +01:00