micropython

Commit graph

Author	SHA1	Message	Date
Jim Mussared	b326edf68c	all: Remove MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE. This commit removes all parts of code associated with the existing MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE optimisation option, including the -mcache-lookup-bc option to mpy-cross. This feature originally provided a significant performance boost for Unix, but wasn't able to be enabled for MCU targets (due to frozen bytecode), and added significant extra complexity to generating and distributing .mpy files. The equivalent performance gain is now provided by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE (which has been enabled on the unix port in the previous commit). It's hard to provide precise performance numbers, but tests have been run on a wide variety of architectures (x86-64, ARM Cortex, Aarch64, RISC-V, xtensa) and they all generally agree on the qualitative improvements seen by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE. For example, on a "quiet" Linux x64 environment (i3-5010U @ 2.10GHz) the change from CACHE_MAP_LOOKUP_IN_BYTECODE, to LOAD_ATTR_FAST_PATH combined with MAP_LOOKUP_CACHE is: diff of scores (higher is better) N=2000 M=2000 bccache -> attrmapcache diff diff% (error%) bm_chaos.py 13742.56 -> 13905.67 : +163.11 = +1.187% (+/-3.75%) bm_fannkuch.py 60.13 -> 61.34 : +1.21 = +2.012% (+/-2.11%) bm_fft.py 113083.20 -> 114793.68 : +1710.48 = +1.513% (+/-1.57%) bm_float.py 256552.80 -> 243908.29 : -12644.51 = -4.929% (+/-1.90%) bm_hexiom.py 521.93 -> 625.41 : +103.48 = +19.826% (+/-0.40%) bm_nqueens.py 197544.25 -> 217713.12 : +20168.87 = +10.210% (+/-3.01%) bm_pidigits.py 8072.98 -> 8198.75 : +125.77 = +1.558% (+/-3.22%) misc_aes.py 17283.45 -> 16480.52 : -802.93 = -4.646% (+/-0.82%) misc_mandel.py 99083.99 -> 128939.84 : +29855.85 = +30.132% (+/-5.88%) misc_pystone.py 83860.10 -> 82592.56 : -1267.54 = -1.511% (+/-2.27%) misc_raytrace.py 21490.40 -> 22227.23 : +736.83 = +3.429% (+/-1.88%) This shows that the new optimisations are at least as good as the existing inline-bytecode-caching, and are sometimes much better (because the new ones apply caching to a wider variety of map lookups). The new optimisations can also benefit code generated by the native emitter, because they apply to the runtime rather than the generated code. The improvement for the native emitter when LOAD_ATTR_FAST_PATH and MAP_LOOKUP_CACHE are enabled is (same Linux environment as above): diff of scores (higher is better) N=2000 M=2000 native -> nat-attrmapcache diff diff% (error%) bm_chaos.py 14130.62 -> 15464.68 : +1334.06 = +9.441% (+/-7.11%) bm_fannkuch.py 74.96 -> 76.16 : +1.20 = +1.601% (+/-1.80%) bm_fft.py 166682.99 -> 168221.86 : +1538.87 = +0.923% (+/-4.20%) bm_float.py 233415.23 -> 265524.90 : +32109.67 = +13.756% (+/-2.57%) bm_hexiom.py 628.59 -> 734.17 : +105.58 = +16.796% (+/-1.39%) bm_nqueens.py 225418.44 -> 232926.45 : +7508.01 = +3.331% (+/-3.10%) bm_pidigits.py 6322.00 -> 6379.52 : +57.52 = +0.910% (+/-5.62%) misc_aes.py 20670.10 -> 27223.18 : +6553.08 = +31.703% (+/-1.56%) misc_mandel.py 138221.11 -> 152014.01 : +13792.90 = +9.979% (+/-2.46%) misc_pystone.py 85032.14 -> 105681.44 : +20649.30 = +24.284% (+/-2.25%) misc_raytrace.py 19800.01 -> 23350.73 : +3550.72 = +17.933% (+/-2.79%) In summary, compared to MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE, the new MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE options: - are simpler; - take less code size; - are faster (generally); - work with code generated by the native emitter; - can be used on embedded targets with a small and constant RAM overhead; - allow the same .mpy bytecode to run on all targets. See #7680 for further discussion. And see also #7653 for a discussion about simplifying mpy-cross options. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>	2021-09-16 16:04:03 +10:00
David Lechner	3dc324d3f1	tests: Format all Python code with black, except tests in basics subdir. This adds the Python files in the tests/ directory to be formatted with ./tools/codeformat.py. The basics/ subdirectory is excluded for now so we aren't changing too much at once. In a few places `# fmt: off`/`# fmt: on` was used where the code had special formatting for readability or where the test was actually testing the specific formatting.	2020-03-30 13:21:58 +11:00
Petr Viktorin	25a9bccdee	py/compile: Disallow 'import *' outside module level. This check follows CPython's behaviour, because 'import *' always populates the globals with the imported names, not locals. Since it's safe to do this (doesn't lead to a crash or undefined behaviour) the check is only enabled for MICROPY_CPYTHON_COMPAT. Fixes issue #5121.	2019-10-04 16:46:47 +10:00
Damien George	c8c0fd4ca3	py: Rework and compress second part of bytecode prelude. This patch compresses the second part of the bytecode prelude which contains the source file name, function name, source-line-number mapping and cell closure information. This part of the prelude now begins with a single variable-length unsigned integer which encodes 2 numbers, being the byte-size of the following 2 sections in the header: the "source info section" and the "closure section". After decoding this variable-length unsigned integer it's possible to skip over one or both of these sections very easily. This scheme saves about 2 bytes for most functions compared to the original format: one in the case that there are no closure cells, and one because padding was eliminated.	2019-10-01 12:26:22 +10:00
Damien George	02db91a7a3	py: Split RAISE_VARARGS opcode into 3 separate ones. From the beginning of this project the RAISE_VARARGS opcode was named and implemented following CPython, where it has an argument (to the opcode) counting how many args the raise takes: raise # 0 args (re-raise previous exception) raise exc # 1 arg raise exc from exc2 # 2 args (chained raise) In the bytecode this operation therefore takes 2 bytes, one for RAISE_VARARGS and one for the number of args. This patch splits this opcode into 3, where each is now a single byte. This reduces bytecode size by 1 byte for each use of raise. Every byte counts! It also has the benefit of reducing code size (on all ports except nanbox).	2019-09-26 15:39:50 +10:00
Damien George	67fdfebe64	tests: Update tests for changes to opcode ordering.	2019-09-26 15:27:11 +10:00
Damien George	2069c563f9	py: Add support for matmul operator @ as per PEP 465. To make progress towards MicroPython supporting Python 3.5, adding the matmul operator is important because it's a really "low level" part of the language, being a new token and modifications to the grammar. It doesn't make sense to make it configurable because 1) it would make the grammar and lexer complicated/messy; 2) no other operators are configurable; 3) it's not a feature that can be "dynamically plugged in" via an import. And matmul can be useful as a general purpose user-defined operator, it doesn't have to be just for numpy use. Based on work done by Jim Mussared.	2019-09-26 15:12:39 +10:00
Milan Rossa	498e35219e	tests: Add tests for sys.settrace feature.	2019-08-30 16:48:22 +10:00
Milan Rossa	ae6fe8b43c	py/compile: Improve the line numbering precision for comprehensions. The line number for comprehensions is now always reported as the correct global location in the script, instead of just "line 1".	2019-08-19 23:50:30 +10:00
Damien George	5a2599d962	py: Replace POP_BLOCK and POP_EXCEPT opcodes with POP_EXCEPT_JUMP. POP_BLOCK and POP_EXCEPT are now the same, and are always followed by a JUMP. So this optimisation reduces code size, and RAM usage of bytecode by two bytes for each try-except handler.	2019-03-05 16:09:58 +11:00
Damien George	e1fb03f3e2	py: Fix VM crash with unwinding jump out of a finally block. This patch fixes a bug in the VM when breaking within a try-finally. The bug has to do with executing a break within the finally block of a try-finally statement. For example: def f(): for x in (1,): print('a', x) try: raise Exception finally: print(1) break print('b', x) f() Currently in uPy the above code will print: a 1 1 1 segmentation fault (core dumped) micropython Not only is there a seg fault, but the "1" in the finally block is printed twice. This is because when the VM executes a finally block it doesn't really know if that block was executed due to a fall-through of the try (no exception raised), or because an exception is active. In particular, for nested finallys the VM has no idea which of the nested ones have active exceptions and which are just fall-throughs. So when a break (or continue) is executed it tries to unwind all of the finallys, when in fact only some may be active. It's questionable whether break (or return or continue) should be allowed within a finally block, because they implicitly swallow any active exception, but nevertheless it's allowed by CPython (although almost never used in the standard library). And uPy should at least not crash in such a case. The solution here relies on the fact that exception and finally handlers always appear in the bytecode after the try body. Note: there was a similar bug with a return in a finally block, but that was previously fixed in `b735208403`	2019-03-05 16:05:05 +11:00
Damien George	0864a6957f	py: Clean up unary and binary enum list to keep groups together. 2 non-bytecode binary ops (NOT_IN and IS_NOT) are moved out of the bytecode group, so this change will change the bytecode format.	2017-10-05 10:49:44 +11:00
Paul Sokolovsky	9d836fedbd	py: Clarify which mp_unary_op_t's may appear in the bytecode. Not all can, so we don't need to reserve bytecodes for them, and can use free slots for something else later.	2017-09-25 16:35:19 -07:00
Paul Sokolovsky	b8ee7ab5b9	py/runtime0.h: Put inplace arith ops in front of normal operations. This allows reverse ops to be placed immediately after normal ops, so they can be tested as one range (an optimization for the introduction of reverse ops in the next patch).	2017-09-08 00:10:10 +03:00
Paul Sokolovsky	50b9329eba	py/runtime0.h: Move MP_BINARY_OP_DIVMOD to the end of mp_binary_op_t. It starts a dichotomy of mp_binary_op_t values which can't appear in the bytecode. Another reason to move it is to keep the values of OP_* and OP_INPLACE_* nicely adjacent. This will also be needed for OP_REVERSE_*, soon to be introduced.	2017-09-07 11:26:42 +03:00
Paul Sokolovsky	d4d1c45a55	py/runtime0.h: Move relational ops to the beginning of mp_binary_op_t. This allows arithmetic operations to be encoded more efficiently, in preparation for the introduction of __rOP__ method support.	2017-09-07 10:55:43 +03:00
Damien George	30badd1ce1	tests: Add tests for calling super and loading a method directly.	2017-04-22 23:39:38 +10:00
Damien George	86b3db9cd0	tests/cmdline/cmd_showbc: Update to work with recent changes.	2017-02-16 18:38:07 +11:00
Damien George	861b001783	tests/cmdline: Update tests to pass with latest changes to bytecode.	2017-02-16 18:38:07 +11:00
Damien George	f4df3aaa72	py: Allow bytecode/native to put iter_buf on stack for simple for loops. So that the "for x in it: ..." statement can now work without using the heap (so long as the iterator argument fits in an iter_buf structure).	2017-02-16 18:38:06 +11:00
Damien George	453c2e8f55	tests/cmdline: Improve coverage test for printing bytecode.	2016-10-17 11:23:37 +11:00
stijn	7f19b1c3eb	tests: Fix expected output of verbose cmdline test. The output might contain more than one line ending in 5b so properly skip everything until the next known point. This fixes test failures in appveyor debug builds.	2016-10-05 12:58:50 +02:00
Damien George	f65e4f0b8f	tests/cmdline/cmd_showbc: Fix test now that 1 value is stored on stack. This corresponds to the change in the way exception values are stored on the Python value stack.	2016-09-27 13:22:06 +10:00
Damien George	bb954d80a4	tests: Get cmdline verbose tests running again. The showbc function now no longer uses the system printf so works correctly.	2016-09-20 11:33:19 +10:00
Damien George	59fba2d6ea	py: Remove mp_load_const_bytes and instead load precreated bytes object. Previous to this patch each time a bytes object was referenced a new instance (with the same data) was created. With this patch a single bytes object is created in the compiler and is loaded directly at execute time as a true constant (similar to loading bignum and float objects). This saves on allocating RAM and means that bytes objects can now be used when the memory manager is locked (eg in interrupts). The MP_BC_LOAD_CONST_BYTES bytecode was removed as part of this. Generated bytecode is slightly larger due to storing a pointer to the bytes object instead of the qstr identifier. Code size is reduced by about 60 bytes on Thumb2 architectures.	2015-06-25 14:42:13 +00:00
Damien George	c5029bcbf3	py: Add MP_BINARY_OP_DIVMOD to simplify and consolidate divmod builtin.	2015-06-13 23:36:30 +01:00
Damien George	c2a4e4effc	py: Convert hash API to use MP_UNARY_OP_HASH instead of ad-hoc function. Hashing is now done using mp_unary_op function with MP_UNARY_OP_HASH as the operator argument. Hashing for int, str and bytes still goes via the fast path in mp_unary_op since they are the most common objects which need to be hashed. This led to quite a bit of code cleanup, and should be more efficient if anything. It saves 176 bytes code space on Thumb2, and 360 bytes on x86. The only loss is that the error message "unhashable type" is now the more generic "unsupported type for __hash__".	2015-05-12 22:46:02 +01:00
Damien George	9a42eb541e	py: Fix naming of function arguments when function is a closure. Addresses issue #1226.	2015-05-06 13:55:33 +01:00
Damien George	367d4d1098	tests: Fix cmd_showbc now that LOAD_CONST_ELLIPSIS bytecode is gone.	2015-05-05 23:58:52 +01:00
Damien George	8c1d23a0e2	py: Modify bytecode "with" behaviour so it doesn't use any heap. Before this patch a "with" block needed to create a bound method object on the heap for the __exit__ call. Now it doesn't because we use load_method instead of load_attr, and save the method+self on the stack.	2015-04-24 01:52:28 +01:00
Damien George	c9aa1883ed	py: Simplify bytecode prelude when encoding closed over variables.	2015-04-07 00:08:17 +01:00
Damien George	1004535237	tests: Make cmdline tests more stable by using regex for matching.	2015-03-20 17:25:25 +00:00
Damien George	0683c1ceef	tests: Don't try to verify amount of memory used in cmd_showbc test.	2015-03-14 17:38:41 +00:00
Damien George	703c009681	tests: Add cmdline test to test showbc code.	2015-03-14 14:06:20 +00:00

34 commits (b326edf68c5edb648fac4dc2a3403ee33510e179)
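
The message for e1fb03f3e2 gives its reproducer flattened onto a single line. Laid out as Python it might look like the sketch below. This is an adaptation, not the author's exact test: the events are collected in a list and returned (an assumption made here so the result is easy to check) rather than printed, and the name `events` is hypothetical.

```python
def f():
    # Adapted from the e1fb03f3e2 commit message: events are collected in a
    # list (an assumption made here) instead of being printed.
    events = []
    for x in (1,):
        events.append("a {}".format(x))
        try:
            raise Exception
        finally:
            events.append("1")
            break  # leaving the loop here discards the active exception
        events.append("b {}".format(x))  # never reached: the finally block always breaks
    return events


print(f())
```

Under CPython semantics, which the commit says uPy should follow without crashing, the `break` swallows the exception, the finally body runs once, and the "b" step is skipped.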