PythonExtra

Commit Graph

Author	SHA1	Message	Date
Matthias Urlichs	3fb1bb131f	py/vm: Don't emit warning when using "raise ... from None". "Raise SomeException() from None" is a common Python idiom to suppress chained exceptions and thus shouldn't trigger a warning on a version of Python that doesn't support them in the first place.	2023-10-09 09:46:02 +11:00
Damien George	d54208a2ff	py/scheduler: Implement VM abort flag and mp_sched_vm_abort(). This is intended to be used by the very outer caller of the VM/runtime. It allows setting a top-level NLR handler that can be jumped to directly, in order to forcefully abort the VM/runtime. Enable using: #define MICROPY_ENABLE_VM_ABORT (1) Set up the handler at the top level using: nlr_buf_t nlr; nlr.ret_val = NULL; if (nlr_push(&nlr) == 0) { nlr_set_abort(&nlr); // call into the VM/runtime ... nlr_pop(); } else { if (nlr.ret_val == NULL) { // handle abort ... } else { // handle other exception that propagated to the top level ... } } nlr_set_abort(NULL); Schedule an abort, eg from an interrupt handler, using: mp_sched_vm_abort(); Signed-off-by: Damien George <damien@micropython.org>	2023-03-21 18:08:57 +11:00
Damien George	b878fc042f	py/vm: Consistently indent #if guards to match the code they surround. Signed-off-by: Damien George <damien@micropython.org>	2022-07-12 22:48:55 +10:00
Damien George	893a5c8341	py/vm: In YIELD_FROM opcode, expand helper macros and remove them. The GENERATOR_EXIT_IF_NEEDED macro is only used once and it's easier to read and understand the code if this macro body is written in the code. Then the comment just before it makes more sense. Signed-off-by: Damien George <damien@micropython.org>	2022-07-12 22:48:07 +10:00
Damien George	d84220b8c6	py/vm: Remove check for ip being NULL when handling StopIteration. This check for code_state->ip being NULL was added in `a7c02c4538` with a commit message that "When generator raises exception, it is automatically terminated (by setting its code_state.ip to 0)". It was also added without any tests to test for this particular case. (The commit did mention that CPython's test_pep380.py triggered a bug, but upon re-running this test it did not show any need for this NULL check of code_state->ip.) It is true that generators that have completed (either by running to their end or raising an exception) set "code_state.ip = 0". But there is an explicit check at the start of mp_obj_gen_resume() to return immediately for any attempt to resume an already-stopped generator. So the VM can never execute a generator with NULL ip (and this was true at the time of the above-referenced commit). Furthermore, the other parts of the VM just before and after this piece of code do require (or at least assume) code_state->ip is non-NULL. Signed-off-by: Damien George <damien@micropython.org>	2022-07-12 18:17:44 +10:00
Jim Mussared	158f1794e8	py/vm: Document internal SELECTIVE_EXC_IP option. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>	2022-07-12 16:13:14 +10:00
Jim Mussared	8db99f11a7	py/scheduler: De-inline and fix race with pending exception / scheduler. The optimisation that allows a single check in the VM for either a pending exception or non-empty scheduler queue doesn't work when threading is enabled, as one thread can clear the sched_state if it has no pending exception, meaning the thread with the pending exception will never see it. This removes that optimisation for threaded builds. Also fixes a race in non-scheduler builds where get-and-clear of the pending exception is not protected by the atomic section. Also removes the bulk of the inlining of pending exceptions and scheduler handling from the VM. This just costs code size and complexity at no performance benefit. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>	2022-07-12 15:54:33 +10:00
Damien George	0db046b67b	py/vm: Change comparison for finally handler search from > to >=. The search in these cases should include all finally handlers that are after the current ip. If a handler starts at exactly ip then it is considered "after" the ip. This can happen when END_FINALLY is followed immediately by a finally handler (from a different finally). Consider the function: def f(): try: return 0 finally: print(1) The current bytecode emitter generates the following code: 00 SETUP_FINALLY 5 02 LOAD_CONST_SMALL_INT 0 03 RETURN_VALUE 04 LOAD_CONST_NONE ** 05 LOAD_GLOBAL print 07 LOAD_CONST_SMALL_INT 1 08 CALL_FUNCTION n=1 nkw=0 10 POP_TOP 11 END_FINALLY 12 LOAD_CONST_NONE 13 RETURN_VALUE The LOAD_CONST_NONE marked with ** is dead code because it follows a RETURN_VALUE, and nothing jumps to this LOAD_CONST_NONE. If the emitter could remove this this dead code it would produce: 00 SETUP_FINALLY 4 02 LOAD_CONST_SMALL_INT 0 03 RETURN_VALUE 04 LOAD_GLOBAL print 06 LOAD_CONST_SMALL_INT 1 07 CALL_FUNCTION n=1 nkw=0 09 POP_TOP 10 END_FINALLY 11 LOAD_CONST_NONE 12 RETURN_VALUE In this case the finally block (which starts at offset 4) immediately follows the RETURN_VALUE. When RETURN_VALUE executes ip will point to offset 4 in the bytecode (because the dispatch of the opcode does ip++) and so the finally handler will only be found if a >= comparison is used. It's a similar story for break/continue: while True: try: break finally: print(1) Although technically in this case the > comparison still works because the extra byte from the UNWIND_JUMP (encoding the number of exception handlers to unwind) doesn't have a ip++ (just a *ip) so ip remains pointing within the UNWIND_JUMP opcode, and not at the start of the following finally handler. Nevertheless, the change is made to use >= for consistency with the RETURN_VALUE change. Signed-off-by: Damien George <damien@micropython.org>	2022-06-20 22:28:18 +10:00
David Lechner	783b1a868f	py/runtime: Allow multiple *args in a function call. This is a partial implementation of PEP 448 to allow unpacking multiple star args in a function or method call. This is implemented by changing the emitted bytecodes so that both positional args and star args are stored as positional args. A bitmap is added to indicate if an argument at a given position is a positional argument or a star arg. In the generated code, this new bitmap takes the place of the old star arg. It is stored as a small int, so this means only the first N arguments can be star args where N is the number of bits in a small int. The runtime is modified to interpret this new bytecode format while still trying to perform as few memory reallocations as possible. Signed-off-by: David Lechner <david@pybricks.com>	2022-03-31 16:59:30 +11:00
David Lechner	1e99d29f36	py/runtime: Allow multiple args in a function call. This is a partial implementation of PEP 448 to allow multiple unpackings when calling a function or method. The compiler is modified to encode the argument as a None: obj key-value pair (similar to how regular keyword arguments are encoded as str: obj pairs). The extra object that was pushed on the stack to hold a single ** unpacking object is no longer used and is removed. The runtime is modified to decode this new format. Signed-off-by: David Lechner <david@pybricks.com>	2022-03-31 16:54:00 +11:00
Damien George	bb70874111	py/vm: Prevent array bound warning when using -MP_OBJ_ITER_BUF_NSLOTS. This warning can happen on clang 13.0.1 building mpy-cross: ../py/vm.c:748:25: error: array index -3 refers past the last possible element for an array in 64-bit address space containing 64-bit (8-byte) elements (max possible 2305843009213693952 elements) [-Werror,-Warray-bounds] sp[-MP_OBJ_ITER_BUF_NSLOTS + 1] = MP_OBJ_NULL; ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~ Using pointer access instead of array access works around this warning. Fixes issue #8467. Signed-off-by: Damien George <damien@micropython.org>	2022-03-31 10:59:55 +11:00
Damien George	6d11c69983	py: Change jump-if-x-or-pop opcodes to have unsigned offset argument. These jumps are always forwards, and it's more efficient in the VM to decode an unsigned argument. These opcodes are already optimised versions of the sequence "dup-top pop-jump-if-x pop" so it doesn't hurt generality to optimise them further. Signed-off-by: Damien George <damien@micropython.org>	2022-03-28 15:43:09 +11:00
Damien George	538c3c0a55	py: Change jump opcodes to emit 1-byte jump offset when possible. This commit introduces changes: - All jump opcodes are changed to have variable length arguments, of either 1 or 2 bytes (previously they were fixed at 2 bytes). In most cases only 1 byte is needed to encode the short jump offset, saving bytecode size. - The bytecode emitter now selects 1 byte jump arguments when the jump offset is guaranteed to fit in 1 byte. This is achieved by checking if the code size changed during the last pass and, if it did (if it shrank), then requesting that the compiler make another pass to get the correct offsets of the now-smaller code. This can continue multiple times until the code stabilises. The code can only ever shrink so this iteration is guaranteed to complete. In most cases no extra passes are needed, the original 4 passes are enough to get it right by the 4th pass (because the 2nd pass computes roughly the correct labels and the 3rd pass computes the correct size for the jump argument). This change to the jump opcode encoding reduces .mpy files and RAM usage (when bytecode is in RAM) by about 2% on average. The performance of the VM is not impacted, at least within measurment of the performance benchmark suite. Code size is reduced for builds that include a decent amount of frozen bytecode. ARM Cortex-M builds without any frozen code increase by about 350 bytes. Signed-off-by: Damien George <damien@micropython.org>	2022-03-28 15:41:38 +11:00
Damien George	1692cad673	py/showbc: Remove global variables and make DECODE_PTR work correctly. The bytecode state variables mp_showbc_code_start and mp_showbc_constants have been removed and made local variables passed into the various functions. As part of this, the DECODE_PTR macro is fixed so it extracts the relevant pointer from the child_table (a regression introduced in `f2040bfc7e`). Signed-off-by: Damien George <damien@micropython.org>	2022-03-16 11:59:46 +11:00
Damien George	f2040bfc7e	py: Rework bytecode and .mpy file format to be mostly static data. Background: .mpy files are precompiled .py files, built using mpy-cross, that contain compiled bytecode functions (and can also contain machine code). The benefit of using an .mpy file over a .py file is that they are faster to import and take less memory when importing. They are also smaller on disk. But the real benefit of .mpy files comes when they are frozen into the firmware. This is done by loading the .mpy file during compilation of the firmware and turning it into a set of big C data structures (the job of mpy-tool.py), which are then compiled and downloaded into the ROM of a device. These C data structures can be executed in-place, ie directly from ROM. This makes importing even faster because there is very little to do, and also means such frozen modules take up much less RAM (because their bytecode stays in ROM). The downside of frozen code is that it requires recompiling and reflashing the entire firmware. This can be a big barrier to entry, slows down development time, and makes it harder to do OTA updates of frozen code (because the whole firmware must be updated). This commit attempts to solve this problem by providing a solution that sits between loading .mpy files into RAM and freezing them into the firmware. The .mpy file format has been reworked so that it consists of data and bytecode which is mostly static and ready to run in-place. If these new .mpy files are located in flash/ROM which is memory addressable, the .mpy file can be executed (mostly) in-place. With this approach there is still a small amount of unpacking and linking of the .mpy file that needs to be done when it's imported, but it's still much better than loading an .mpy from disk into RAM (although not as good as freezing .mpy files into the firmware). The main trick to make static .mpy files is to adjust the bytecode so any qstrs that it references now go through a lookup table to convert from local qstr number in the module to global qstr number in the firmware. That means the bytecode does not need linking/rewriting of qstrs when it's loaded. Instead only a small qstr table needs to be built (and put in RAM) at import time. This means the bytecode itself is static/constant and can be used directly if it's in addressable memory. Also the qstr string data in the .mpy file, and some constant object data, can be used directly. Note that the qstr table is global to the module (ie not per function). In more detail, in the VM what used to be (schematically): qst = DECODE_QSTR_VALUE; is now (schematically): idx = DECODE_QSTR_INDEX; qst = qstr_table[idx]; That allows the bytecode to be fixed at compile time and not need relinking/rewriting of the qstr values. Only qstr_table needs to be linked when the .mpy is loaded. Incidentally, this helps to reduce the size of bytecode because what used to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices. If the module uses the same qstr more than two times then the bytecode is smaller than before. The following changes are measured for this commit compared to the previous (the baseline): - average 7%-9% reduction in size of .mpy files - frozen code size is reduced by about 5%-7% - importing .py files uses about 5% less RAM in total - importing .mpy files uses about 4% less RAM in total - importing .py and .mpy files takes about the same time as before The qstr indirection in the bytecode has only a small impact on VM performance. For stm32 on PYBv1.0 the performance change of this commit is: diff of scores (higher is better) N=100 M=100 baseline -> this-commit diff diff% (error%) bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%) bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%) bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%) bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%) bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%) bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%) bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%) core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%) core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%) core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%) core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%) misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%) misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%) misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%) misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%) viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%) viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%) viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%) viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%) viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%) viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%) And for unix on x64: diff of scores (higher is better) N=2000 M=2000 baseline -> this-commit diff diff% (error%) bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%) bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%) bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%) bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%) bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%) bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%) bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%) misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%) misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%) misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%) misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%) The code size change is (firmware with a lot of frozen code benefits the most): bare-arm: +396 +0.697% minimal x86: +1595 +0.979% [incl +32(data)] unix x64: +2408 +0.470% [incl +800(data)] unix nanbox: +1396 +0.309% [incl -96(data)] stm32: -1256 -0.318% PYBV10 cc3200: +288 +0.157% esp8266: -260 -0.037% GENERIC esp32: -216 -0.014% GENERIC[incl -1072(data)] nrf: +116 +0.067% pca10040 rp2: -664 -0.135% PICO samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS As part of this change the .mpy file format version is bumped to version 6. And mpy-tool.py has been improved to provide a good visualisation of the contents of .mpy files. In summary: this commit changes the bytecode to use qstr indirection, and reworks the .mpy file format to be simpler and allow .mpy files to be executed in-place. Performance is not impacted too much. Eventually it will be possible to store such .mpy files in a linear, read-only, memory- mappable filesystem so they can be executed from flash/ROM. This will essentially be able to replace frozen code for most applications. Signed-off-by: Damien George <damien@micropython.org>	2022-02-24 18:08:43 +11:00
Damien George	8412568e7b	py: Add wrapper macros so hot VM functions can go in fast code location. For example, on esp32 they can go in iRAM to improve performance. Signed-off-by: Damien George <damien@micropython.org>	2021-10-15 23:31:19 +11:00
Jim Mussared	b326edf68c	all: Remove MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE. This commit removes all parts of code associated with the existing MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE optimisation option, including the -mcache-lookup-bc option to mpy-cross. This feature originally provided a significant performance boost for Unix, but wasn't able to be enabled for MCU targets (due to frozen bytecode), and added significant extra complexity to generating and distributing .mpy files. The equivalent performance gain is now provided by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE (which has been enabled on the unix port in the previous commit). It's hard to provide precise performance numbers, but tests have been run on a wide variety of architectures (x86-64, ARM Cortex, Aarch64, RISC-V, xtensa) and they all generally agree on the qualitative improvements seen by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE. For example, on a "quiet" Linux x64 environment (i3-5010U @ 2.10GHz) the change from CACHE_MAP_LOOKUP_IN_BYTECODE, to LOAD_ATTR_FAST_PATH combined with MAP_LOOKUP_CACHE is: diff of scores (higher is better) N=2000 M=2000 bccache -> attrmapcache diff diff% (error%) bm_chaos.py 13742.56 -> 13905.67 : +163.11 = +1.187% (+/-3.75%) bm_fannkuch.py 60.13 -> 61.34 : +1.21 = +2.012% (+/-2.11%) bm_fft.py 113083.20 -> 114793.68 : +1710.48 = +1.513% (+/-1.57%) bm_float.py 256552.80 -> 243908.29 : -12644.51 = -4.929% (+/-1.90%) bm_hexiom.py 521.93 -> 625.41 : +103.48 = +19.826% (+/-0.40%) bm_nqueens.py 197544.25 -> 217713.12 : +20168.87 = +10.210% (+/-3.01%) bm_pidigits.py 8072.98 -> 8198.75 : +125.77 = +1.558% (+/-3.22%) misc_aes.py 17283.45 -> 16480.52 : -802.93 = -4.646% (+/-0.82%) misc_mandel.py 99083.99 -> 128939.84 : +29855.85 = +30.132% (+/-5.88%) misc_pystone.py 83860.10 -> 82592.56 : -1267.54 = -1.511% (+/-2.27%) misc_raytrace.py 21490.40 -> 22227.23 : +736.83 = +3.429% (+/-1.88%) This shows that the new optimisations are at least as good as the existing inline-bytecode-caching, and are sometimes much better (because the new ones apply caching to a wider variety of map lookups). The new optimisations can also benefit code generated by the native emitter, because they apply to the runtime rather than the generated code. The improvement for the native emitter when LOAD_ATTR_FAST_PATH and MAP_LOOKUP_CACHE are enabled is (same Linux environment as above): diff of scores (higher is better) N=2000 M=2000 native -> nat-attrmapcache diff diff% (error%) bm_chaos.py 14130.62 -> 15464.68 : +1334.06 = +9.441% (+/-7.11%) bm_fannkuch.py 74.96 -> 76.16 : +1.20 = +1.601% (+/-1.80%) bm_fft.py 166682.99 -> 168221.86 : +1538.87 = +0.923% (+/-4.20%) bm_float.py 233415.23 -> 265524.90 : +32109.67 = +13.756% (+/-2.57%) bm_hexiom.py 628.59 -> 734.17 : +105.58 = +16.796% (+/-1.39%) bm_nqueens.py 225418.44 -> 232926.45 : +7508.01 = +3.331% (+/-3.10%) bm_pidigits.py 6322.00 -> 6379.52 : +57.52 = +0.910% (+/-5.62%) misc_aes.py 20670.10 -> 27223.18 : +6553.08 = +31.703% (+/-1.56%) misc_mandel.py 138221.11 -> 152014.01 : +13792.90 = +9.979% (+/-2.46%) misc_pystone.py 85032.14 -> 105681.44 : +20649.30 = +24.284% (+/-2.25%) misc_raytrace.py 19800.01 -> 23350.73 : +3550.72 = +17.933% (+/-2.79%) In summary, compared to MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE, the new MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE options: - are simpler; - take less code size; - are faster (generally); - work with code generated by the native emitter; - can be used on embedded targets with a small and constant RAM overhead; - allow the same .mpy bytecode to run on all targets. See #7680 for further discussion. And see also #7653 for a discussion about simplifying mpy-cross options. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>	2021-09-16 16:04:03 +10:00
Jim Mussared	7b89ad8dbf	py/vm: Add a fast path for LOAD_ATTR on instance types. When the LOAD_ATTR opcode is executed there are quite a few different cases that have to be handled, but the common case is accessing a member on an instance type. Typically, built-in types provide methods which is why this is common. Fortunately, for this specific case, if the member is found in the member map then there's no further processing. This optimisation does a relatively cheap check (type is instance) and then forwards directly to the member map lookup, falling back to the regular path if necessary. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>	2021-09-16 16:02:15 +10:00
Damien George	b8255dd2e0	py/vm: Simplify handling of MP_OBJ_STOP_ITERATION in yield-from opcode. Signed-off-by: Damien George <damien@micropython.org>	2021-07-15 00:12:41 +10:00
Jeff Epler	413f34cd8f	all: Fix signed shifts and NULL access errors from -fsanitize=undefined. Fixes the following (the line numbers match commit `0e87459e2b`): ../../extmod/crypto-algorithms/sha256.c:49:19: runtime error: left shif... ../../extmod/moduasyncio.c:106:35: runtime error: member access within ... ../../py/binary.c:210:13: runtime error: left shift of negative value -... ../../py/mpz.c:744:16: runtime error: negation of -9223372036854775808 ... ../../py/objint.c:109:22: runtime error: left shift of 1 by 31 places c... ../../py/objint_mpz.c:374:9: runtime error: left shift of 4611686018427... ../../py/objint_mpz.c:374:9: runtime error: left shift of negative valu... ../../py/parsenum.c:106:14: runtime error: left shift of 46116860184273... ../../py/runtime.c:395:33: runtime error: left shift of negative value ... ../../py/showbc.c:177:28: runtime error: left shift of negative value -... ../../py/vm.c:321:36: runtime error: left shift of negative value -1``` Testing was done on an amd64 Debian Buster system using gcc-8.3 and these settings: CFLAGS += -g3 -Og -fsanitize=undefined LDFLAGS += -fsanitize=undefined The introduced TASK_PAIRHEAP macro's conditional (x ? &x->i : NULL) assembles (under amd64 gcc 8.3 -Os) to the same as &x->i, since i is the initial field of the struct. However, for the purposes of undefined behavior analysis the conditional is needed. Signed-off-by: Jeff Epler <jepler@gmail.com>	2021-06-24 23:01:04 +10:00
David Lechner	ca920f7218	py/mpstate: Make exceptions thread-local. This moves mp_pending_exception from mp_state_vm_t to mp_state_thread_t. This allows exceptions to be scheduled on a specific thread. Signed-off-by: David Lechner <david@pybricks.com>	2021-06-19 09:43:44 +10:00
Damien George	42cf77f48b	py/vm: For tracing use mp_printf, and print state when thread enabled. mp_printf should be used to print the prefix because it's also used in mp_bytecode_print2 (otherwise, depending on the system, different output streams may be used). Also print the current thread state when threading is enabled to easily see which thread executes what opcode. Signed-off-by: Damien George <damien@micropython.org>	2021-03-17 12:13:53 +11:00
Damien George	85f2b239d8	py/showbc: Pass in an mp_print_t struct to all bytecode-print functions. So the output can be redirected if needed. Signed-off-by: Damien George <damien@micropython.org>	2020-09-11 17:22:28 +10:00
Jim Mussared	243805d776	py/scheduler: Fix race in checking scheduler pending state. Because the atomic section starts after checking whether the scheduler state is pending, it's possible it can become a different state by the time the atomic section starts. This is especially likely on ports where MICROPY_BEGIN_ATOMIC_SECTION is implemented with a mutex (i.e. it might block), but the race exists regardless, i.e. if a context switch occurs between those two lines.	2020-04-13 21:55:47 +10:00
Jim Mussared	def76fe4d9	all: Use MP_ERROR_TEXT for all error messages.	2020-04-05 15:02:06 +10:00
Damien George	3f39d18c2b	all: Add FORMAT-OFF in various places. This string is recognised by uncrustify, to disable formatting in the region marked by these comments. This is necessary in the qstrdef*.h files to prevent modification of the strings within the Q(...). In other places it is used to prevent excessive reformatting that would make the code less readable.	2020-02-28 10:31:07 +11:00
Damien George	95473980ef	py/vm: Fix comment to refer to MP_BC_RAISE_OBJ instead of RAISE_VARARGS.	2019-12-20 14:57:06 +11:00
Damien George	809d89c794	py/runtime: Fix PEP479 behaviour throwing StopIteration into yield from. Commit `3f6ffe059f` implemented PEP479 but did not catch the case fixed in this commit. Found by coverage analysis, that the VM had uncovered code.	2019-10-04 23:27:00 +10:00
Damien George	82c494a97e	py/vm: Fix handling of unwind jump out of active finally. Prior to this commit, when unwinding through an active finally the stack was not being correctly popped/folded, which resulting in the VM crashing for complicated unwinding of nested finallys. This should be fixed with this commit, and more tests for return/break/ continue within a finally have been added to exercise this.	2019-10-04 23:01:29 +10:00
Damien George	c8c0fd4ca3	py: Rework and compress second part of bytecode prelude. This patch compresses the second part of the bytecode prelude which contains the source file name, function name, source-line-number mapping and cell closure information. This part of the prelude now begins with a single varible length unsigned integer which encodes 2 numbers, being the byte-size of the following 2 sections in the header: the "source info section" and the "closure section". After decoding this variable unsigned integer it's possible to skip over one or both of these sections very easily. This scheme saves about 2 bytes for most functions compared to the original format: one in the case that there are no closure cells, and one because padding was eliminated.	2019-10-01 12:26:22 +10:00
Damien George	b5ebfadbd6	py: Compress first part of bytecode prelude. The start of the bytecode prelude contains 6 numbers telling the amount of stack needed for the Python values and exceptions, and the signature of the function. Prior to this patch these numbers were all encoded one after the other (2x variable unsigned integers, then 4x bytes), but using so many bytes is unnecessary. An entropy analysis of around 150,000 bytecode functions from the CPython standard library showed that the optimal Shannon coding would need about 7.1 bits on average to encode these 6 numbers, compared to the existing 48 bits. This patch attempts to get close to this optimal value by packing the 6 numbers into a single, varible-length unsigned integer via bit-wise interleaving. The interleaving scheme is chosen to minimise the average number of bytes needed, and at the same time keep the scheme simple enough so it can be implemented without too much overhead in code size or speed. The scheme requires about 10.5 bits on average to store the 6 numbers. As a result most functions which originally took 6 bytes to encode these 6 numbers now need only 1 byte (in 80% of cases).	2019-10-01 12:26:22 +10:00
Damien George	81d04a0200	py: Add n_state to mp_code_state_t struct. This value is used often enough that it is better to cache it instead of decode it each time.	2019-10-01 12:26:22 +10:00
Damien George	4c5e1a0368	py/bc: Change mp_code_state_t.exc_sp to exc_sp_idx. Change from a pointer to an index, to make space in mp_code_state_t.	2019-10-01 12:26:22 +10:00
Damien George	02db91a7a3	py: Split RAISE_VARARGS opcode into 3 separate ones. From the beginning of this project the RAISE_VARARGS opcode was named and implemented following CPython, where it has an argument (to the opcode) counting how many args the raise takes: raise # 0 args (re-raise previous exception) raise exc # 1 arg raise exc from exc2 # 2 args (chained raise) In the bytecode this operation therefore takes 2 bytes, one for RAISE_VARARGS and one for the number of args. This patch splits this opcode into 3, where each is now a single byte. This reduces bytecode size by 1 byte for each use of raise. Every byte counts! It also has the benefit of reducing code size (on all ports except nanbox).	2019-09-26 15:39:50 +10:00
Damien George	870e900d02	py: Introduce and use constants for multi-opcode sizes.	2019-09-26 15:27:11 +10:00
Damien George	ea060a42e9	py/vm: Factor cached map lookup code to inline function. To reduce code duplication and allow to more easily modify this function.	2019-09-10 11:23:52 +10:00
Milan Rossa	310b3d1b81	py: Integrate sys.settrace feature into the VM and runtime. This commit adds support for sys.settrace, allowing to install Python handlers to trace execution of Python code. The interface follows CPython as closely as possible. The feature is disabled by default and can be enabled via MICROPY_PY_SYS_SETTRACE.	2019-08-30 16:44:12 +10:00
Damien George	dbf35d3da3	py/bc: Factor out code to get bytecode line number info into new func.	2019-08-30 16:43:46 +10:00
Damien George	08c1fe5569	py/vm: Don't add traceback info for exceptions that are re-raised. With this patch exceptions that are re-raised have improved tracebacks (less confusing, match CPython), and it makes re-raise slightly more efficient (in time and RAM) because they no longer need to add a traceback. Also general VM performance is not measurably affected. Partially fixes issue #2928.	2019-08-28 12:31:53 +10:00
Damien George	16f6169c88	py/vm: Don't add traceback info for exc's propagated through a finally. With this patch exception tracebacks that go through a finally are improved (less confusing, match CPython), and it makes finally's slightly more efficient (in time and RAM) because they no longer need to add a traceback. Partially fixes issue #2928.	2019-08-28 12:31:49 +10:00
Damien George	2fca0d7f18	py/vm: Shorten error message for not-implemented opcode. It's really an opcode that's not implemented, so use "opcode" instead of "byte code". And remove the redundant "not implemented" text because that is already implied by the exception type. There's no need to have a long error message for an exception that is almost never encountered. Saves about 20 bytes of code size on most ports.	2019-08-22 16:13:05 +10:00
Damien George	ab26553759	py/vm: Remove obsolete comments about matching of exception opcodes. These are incorrect since `5a2599d962`	2019-05-27 11:58:32 +10:00
Damien George	5a2599d962	py: Replace POP_BLOCK and POP_EXCEPT opcodes with POP_EXCEPT_JUMP. POP_BLOCK and POP_EXCEPT are now the same, and are always followed by a JUMP. So this optimisation reduces code size, and RAM usage of bytecode by two bytes for each try-except handler.	2019-03-05 16:09:58 +11:00
Damien George	6f9e3ff719	py/vm: Remove currently_in_except_block variable. After the previous commit it is no longer needed.	2019-03-05 16:09:41 +11:00
Damien George	e1fb03f3e2	py: Fix VM crash with unwinding jump out of a finally block. This patch fixes a bug in the VM when breaking within a try-finally. The bug has to do with executing a break within the finally block of a try-finally statement. For example: def f(): for x in (1,): print('a', x) try: raise Exception finally: print(1) break print('b', x) f() Currently in uPy the above code will print: a 1 1 1 segmentation fault (core dumped) micropython Not only is there a seg fault, but the "1" in the finally block is printed twice. This is because when the VM executes a finally block it doesn't really know if that block was executed due to a fall-through of the try (no exception raised), or because an exception is active. In particular, for nested finallys the VM has no idea which of the nested ones have active exceptions and which are just fall-throughs. So when a break (or continue) is executed it tries to unwind all of the finallys, when in fact only some may be active. It's questionable whether break (or return or continue) should be allowed within a finally block, because they implicitly swallow any active exception, but nevertheless it's allowed by CPython (although almost never used in the standard library). And uPy should at least not crash in such a case. The solution here relies on the fact that exception and finally handlers always appear in the bytecode after the try body. Note: there was a similar bug with a return in a finally block, but that was previously fixed in `b735208403`	2019-03-05 16:05:05 +11:00
Damien George	eee1e8841a	py: Downcase all MP_OBJ_IS_xxx macros to make a more consistent C API. These macros could in principle be (inline) functions so it makes sense to have them lower case, to match the other C API functions. The remaining macros that are upper case are: - MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR - MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE - MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE - MP_OBJ_FUN_MAKE_SIG - MP_DECLARE_CONST_xxx - MP_DEFINE_CONST_xxx These must remain macros because they are used when defining const data (at least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have MP_OBJ_SMALL_INT_VALUE also a macro). For those macros that have been made lower case, compatibility macros are provided for the old names so that users do not need to change their code immediately.	2019-02-12 14:54:51 +11:00
Paul Sokolovsky	8fea833e3f	py: Update my copyright info on some files. Based on git history.	2019-02-06 00:19:00 +11:00
Paul Sokolovsky	2f5d113fad	py/warning: Support categories for warnings. Python defines warnings as belonging to categories, where category is a warning type (descending from exception type). This is useful, as e.g. allows to disable warnings selectively and provide user-defined warning types. So, implement this in MicroPython, except that categories are represented just with strings. However, enough hooks are left to implement categories differently per-port (e.g. as types), without need to patch each and every usage.	2019-01-31 16:48:30 +11:00
Damien George	afecc124e6	py: Fix location of VM returned exception in invalid opcode and comments The location for a returned exception was changed to state[0] in `d95947b48a`	2019-01-04 17:22:40 +11:00
Damien George	d95947b48a	py/vm: When VM raises exception put exc obj at beginning of func state. Instead of at end of state, n_state - 1. It was originally (way back in v1.0) put at the end of the state because the VM didn't have a pointer to the start. But now that the VM takes a mp_code_state_t pointer it does have a pointer to the start of the state so can put the exception object there. This commit saves about 30 bytes of code on all architectures, and, more importantly, reduces C-stack usage by a couple of words (8 bytes on Thumb2 and 16 bytes on x86-64) for every (non-generator) call of a bytecode function because fun_bc_call no longer needs to remember the n_state variable.	2018-09-29 23:25:08 +10:00

1 2 3 4 5 ...

312 Commits