diff --git a/docs/develop/compiler.rst b/docs/develop/compiler.rst new file mode 100644 index 0000000000..2007657490 --- /dev/null +++ b/docs/develop/compiler.rst @@ -0,0 +1,317 @@ +.. _compiler: + +The Compiler +============ + +The compilation process in MicroPython involves the following steps: + +* The lexer converts the stream of text that makes up a MicroPython program into tokens. +* The parser then converts the tokens into an abstract syntax (parse tree). +* Then bytecode or native code is emitted based on the parse tree. + +For purposes of this discussion we are going to add a simple language feature ``add1`` +that can be use in Python as: + +.. code-block:: bash + + >>> add1 3 + 4 + >>> + +The ``add1`` statement takes an integer as argument and adds ``1`` to it. + +Adding a grammar rule +---------------------- + +MicroPython's grammar is based on the `CPython grammar `_ +and is defined in `py/grammar.h `_. +This grammar is what is used to parse MicroPython source files. + +There are two macros you need to know to define a grammar rule: ``DEF_RULE`` and ``DEF_RULE_NC``. +``DEF_RULE`` allows you to define a rule with an associated compile function, +while ``DEF_RULE_NC`` has no compile (NC) function for it. + +A simple grammar definition with a compile function for our new ``add1`` statement +looks like the following: + +.. code-block:: c + + DEF_RULE(add1_stmt, c(add1_stmt), and(2), tok(KW_ADD1), rule(testlist)) + +The second argument ``c(add1_stmt)`` is the corresponding compile function that should be implemented +in ``py/compile.c`` to turn this rule into executable code. + +The third required argument can be ``or`` or ``and``. This specifies the number of nodes associated +with a statement. For example, in this case, our ``add1`` statement is similar to ADD1 in assembly +language. It takes one numeric argument. Therefore, the ``add1_stmt`` has two nodes associated with it. +One node is for the statement itself, i.e the literal ``add1`` corresponding to ``KW_ADD1``, +and the other for its argument, a ``testlist`` rule which is the top-level expression rule. + +.. note:: + The ``add1`` rule here is just an example and not part of the standard + MicroPython grammar. + +The fourth argument in this example is the token associated with the rule, ``KW_ADD1``. This token should be +defined in the lexer by editing ``py/lexer.h``. + +Defining the same rule without a compile function is achieved by using the ``DEF_RULE_NC`` macro +and omitting the compile function argument: + +.. code-block:: c + + DEF_RULE_NC(add1_stmt, and(2), tok(KW_ADD1), rule(testlist)) + +The remaining arguments take on the same meaning. A rule without a compile function must +be handled explicitly by all rules that may have this rule as a node. Such NC-rules are usually +used to express sub-parts of a complicated grammar structure that cannot be expressed in a +single rule. + +.. note:: + The macros ``DEF_RULE`` and ``DEF_RULE_NC`` take other arguments. For an in-depth understanding of + supported parameters, see `py/grammar.h `_. + +Adding a lexical token +---------------------- + +Every rule defined in the grammar should have a token associated with it that is defined in ``py/lexer.h``. +Add this token by editing the ``_mp_token_kind_t`` enum: + +.. code-block:: c + :emphasize-lines: 12 + + typedef enum _mp_token_kind_t { + ... + MP_TOKEN_KW_OR, + MP_TOKEN_KW_PASS, + MP_TOKEN_KW_RAISE, + MP_TOKEN_KW_RETURN, + MP_TOKEN_KW_TRY, + MP_TOKEN_KW_WHILE, + MP_TOKEN_KW_WITH, + MP_TOKEN_KW_YIELD, + MP_TOKEN_KW_ADD1, + ... 
+ } mp_token_kind_t; + +Then also edit ``py/lexer.c`` to add the new keyword literal text: + +.. code-block:: c + :emphasize-lines: 12 + + STATIC const char *const tok_kw[] = { + ... + "or", + "pass", + "raise", + "return", + "try", + "while", + "with", + "yield", + "add1", + ... + }; + +Notice the keyword is named depending on what you want it to be. For consistency, maintain the +naming standard accordingly. + +.. note:: + The order of these keywords in ``py/lexer.c`` must match the order of tokens in the enum + defined in ``py/lexer.h``. + +Parsing +------- + +In the parsing stage the parser takes the tokens produced by the lexer and converts them to an abstract syntax tree (AST) or +*parse tree*. The implementation for the parser is defined in `py/parse.c `_. + +The parser also maintains a table of constants for use in different aspects of parsing, similar to what a +`symbol table `_ +does. + +Several optimizations like `constant folding `_ +on integers for most operations e.g. logical, binary, unary, etc, and optimizing enhancements on parenthesis +around expressions are performed during this phase, along with some optimizations on strings. + +It's worth noting that *docstrings* are discarded and not accessible to the compiler. +Even optimizations like `string interning `_ are +not applied to *docstrings*. + +Compiler passes +--------------- + +Like many compilers, MicroPython compiles all code to MicroPython bytecode or native code. The functionality +that achieves this is implemented in `py/compile.c `_. +The most relevant method you should know about is this: + +.. code-block:: c + + mp_obj_t mp_compile(mp_parse_tree_t *parse_tree, qstr source_file, bool is_repl) { + // Compile the input parse_tree to a raw-code structure. + mp_raw_code_t *rc = mp_compile_to_raw_code(parse_tree, source_file, is_repl); + // Create and return a function object that executes the outer module. + return mp_make_function_from_raw_code(rc, MP_OBJ_NULL, MP_OBJ_NULL); + } + +The compiler compiles the code in four passes: scope, stack size, code size and emit. +Each pass runs the same C code over the same AST data structure, with different things +being computed each time based on the results of the previous pass. + +First pass +~~~~~~~~~~ + +In the first pass, the compiler learns about the known identifiers (variables) and +their scope, being global, local, closed over, etc. In the same pass the emitter +(bytecode or native code) also computes the number of labels needed for the emitted +code. + +.. code-block:: c + + // Compile pass 1. + comp->emit = emit_bc; + comp->emit_method_table = &emit_bc_method_table; + + uint max_num_labels = 0; + for (scope_t *s = comp->scope_head; s != NULL && comp->compile_error == MP_OBJ_NULL; s = s->next) { + if (s->emit_options == MP_EMIT_OPT_ASM) { + compile_scope_inline_asm(comp, s, MP_PASS_SCOPE); + } else { + compile_scope(comp, s, MP_PASS_SCOPE); + + // Check if any implicitly declared variables should be closed over. + for (size_t i = 0; i < s->id_info_len; ++i) { + id_info_t *id = &s->id_info[i]; + if (id->kind == ID_INFO_KIND_GLOBAL_IMPLICIT) { + scope_check_to_close_over(s, id); + } + } + } + ... + } + +Second and third passes +~~~~~~~~~~~~~~~~~~~~~~~ + +The second and third passes involve computing the Python stack size and code size +for the bytecode or native code. After the third pass the code size cannot change, +otherwise jump labels will be incorrect. + +.. 
code-block:: c + + for (scope_t *s = comp->scope_head; s != NULL && comp->compile_error == MP_OBJ_NULL; s = s->next) { + ... + + // Pass 2: Compute the Python stack size. + compile_scope(comp, s, MP_PASS_STACK_SIZE); + + // Pass 3: Compute the code size. + if (comp->compile_error == MP_OBJ_NULL) { + compile_scope(comp, s, MP_PASS_CODE_SIZE); + } + + ... + } + +Just before pass two there is a selection for the type of code to be emitted, which can +either be native or bytecode. + +.. code-block:: c + + // Choose the emitter type. + switch (s->emit_options) { + case MP_EMIT_OPT_NATIVE_PYTHON: + case MP_EMIT_OPT_VIPER: + if (emit_native == NULL) { + emit_native = NATIVE_EMITTER(new)(&comp->compile_error, &comp->next_label, max_num_labels); + } + comp->emit_method_table = NATIVE_EMITTER_TABLE; + comp->emit = emit_native; + break; + + default: + comp->emit = emit_bc; + comp->emit_method_table = &emit_bc_method_table; + break; + } + +The bytecode option is the default but something unique to note for the native +code option is that there is another option via ``VIPER``. See the +:ref:`Emitting native code ` section for more details on +viper annotations. + +There is also support for *inline assembly code*, where assembly instructions are +written as Python function calls but are emitted directly as the corresponding +machine code. This assembler has only three passes (scope, code size, emit) +and uses a different implementation, not the ``compile_scope`` function. +See the `inline assembler tutorial `_ +for more details. + +Fourth pass +~~~~~~~~~~~ + +The fourth pass emits the final code that can be executed, either bytecode in +the virtual machine, or native code directly by the CPU. + +.. code-block:: c + + for (scope_t *s = comp->scope_head; s != NULL && comp->compile_error == MP_OBJ_NULL; s = s->next) { + ... + + // Pass 4: Emit the compiled bytecode or native code. + if (comp->compile_error == MP_OBJ_NULL) { + compile_scope(comp, s, MP_PASS_EMIT); + } + } + +Emitting bytecode +----------------- + +Statements in Python code usually correspond to emitted bytecode, for example ``a + b`` +generates "push a" then "push b" then "binary op add". Some statements do not emit +anything but instead affect other things like the scope of variables, for example +``global a``. + +The implementation of a function that emits bytecode looks similar to this: + +.. code-block:: c + + void mp_emit_bc_unary_op(emit_t *emit, mp_unary_op_t op) { + emit_write_bytecode_byte(emit, 0, MP_BC_UNARY_OP_MULTI + op); + } + +We use the unary operator expressions for an example here but the implementation +details are similar for other statements/expressions. The method ``emit_write_bytecode_byte()`` +is a wrapper around the main function ``emit_get_cur_to_write_bytecode()`` that all +functions must call to emit bytecode. + +.. _emitting_native_code: + +Emitting native code +--------------------- + +Similar to how bytecode is generated, there should be a corresponding function in ``py/emitnative.c`` for each +code statement: + +.. 
code-block:: c + + STATIC void emit_native_unary_op(emit_t *emit, mp_unary_op_t op) { + vtype_kind_t vtype; + emit_pre_pop_reg(emit, &vtype, REG_ARG_2); + if (vtype == VTYPE_PYOBJ) { + emit_call_with_imm_arg(emit, MP_F_UNARY_OP, op, REG_ARG_1); + emit_post_push_reg(emit, VTYPE_PYOBJ, REG_RET); + } else { + adjust_stack(emit, 1); + EMIT_NATIVE_VIPER_TYPE_ERROR(emit, + MP_ERROR_TEXT("unary op %q not implemented"), mp_unary_op_method_name[op]); + } + } + +The difference here is that we have to handle *viper typing*. Viper annotations allow +us to handle more than one type of variable. By default all variables are Python objects, +but with viper a variable can also be declared as a machine-typed variable like a native +integer or pointer. Viper can be thought of as a superset of Python, where normal Python +objects are handled as usual, while native machine variables are handled in an optimised +way by using direct machine instructions for the operations. Viper typing may break +Python equivalence because, for example, integers become native integers and can overflow +(unlike Python integers which extend automatically to arbitrary precision). diff --git a/docs/develop/extendingmicropython.rst b/docs/develop/extendingmicropython.rst new file mode 100644 index 0000000000..7fb1ae47a0 --- /dev/null +++ b/docs/develop/extendingmicropython.rst @@ -0,0 +1,19 @@ +.. _extendingmicropython: + +Extending MicroPython in C +========================== + +This chapter describes options for implementing additional functionality in C, but from code +written outside of the main MicroPython repository. The first approach is useful for building +your own custom firmware with some project-specific additional modules or functions that can +be accessed from Python. The second approach is for building modules that can be loaded at runtime. + +Please see the :ref:`library section ` for more information on building core modules that +live in the main MicroPython repository. + +.. toctree:: + :maxdepth: 3 + + cmodules.rst + natmod.rst + \ No newline at end of file diff --git a/docs/develop/gettingstarted.rst b/docs/develop/gettingstarted.rst new file mode 100644 index 0000000000..3dd00a579a --- /dev/null +++ b/docs/develop/gettingstarted.rst @@ -0,0 +1,324 @@ +.. _gettingstarted: + +Getting Started +=============== + +This guide covers a step-by-step process on setting up version control, obtaining and building +a copy of the source code for a port, building the documentation, running tests, and a description of the +directory structure of the MicroPython code base. + +Source control with git +----------------------- + +MicroPython is hosted on `GitHub `_ and uses +`Git `_ for source control. The workflow is such that +code is pulled and pushed to and from the main repository. Install the respective version +of Git for your operating system to follow through the rest of the steps. + +.. note:: + For a reference on the installation instructions, please refer to + the `Git installation instructions `_. + Learn about the basic git commands in this `Git Handbook `_ + or any other sources on the internet. + +Get the code +------------ + +It is recommended that you maintain a fork of the MicroPython repository for your development purposes. +The process of obtaining the source code includes the following: + +#. Fork the repository https://github.com/micropython/micropython +#. You will now have a fork at /micropython>. +#. Clone the forked repository using the following command: + +.. 
code-block:: bash + + $ git clone https://github.com//micropython + +Then, `configure the remote repositories `_ to be able to +collaborate on the MicroPython project. + +Configure remote upstream: + +.. code-block:: bash + + $ cd micropython + $ git remote add upstream https://github.com/micropython/micropython + +It is common to configure ``upstream`` and ``origin`` on a forked repository +to assist with sharing code changes. You can maintain your own mapping but +it is recommended that ``origin`` maps to your fork and ``upstream`` to the main +MicroPython repository. + +After the above configuration, your setup should be similar to this: + +.. code-block:: bash + + $ git remote -v + origin https://github.com//micropython (fetch) + origin https://github.com//micropython (push) + upstream https://github.com/micropython/micropython (fetch) + upstream https://github.com/micropython/micropython (push) + +You should now have a copy of the source code. By default, you are pointing +to the master branch. To prepare for further development, it is recommended +to work on a development branch. + +.. code-block:: bash + + $ git checkout -b dev-branch + +You can give it any name. You will have to compile MicroPython whenever you change +to a different branch. + +Compile and build the code +-------------------------- + +When compiling MicroPython, you compile a specific :term:`port`, usually +targeting a specific :ref:`board `. Start by installing the required dependencies. +Then build the MicroPython cross-compiler before you can successfully compile and build. +This applies specifically when using Linux to compile. +The Windows instructions are provided in a later section. + +.. _required_dependencies: + +Required dependencies +~~~~~~~~~~~~~~~~~~~~~ + +Install the required dependencies for Linux: + +.. code-block:: bash + + $ sudo apt-get install build-essential libffi-dev git pkg-config + +For the stm32 port, the ARM cross-compiler is required: + +.. code-block:: bash + + $ sudo apt-get install arm-none-eabi-gcc arm-none-eabi-binutils arm-none-eabi-newlib + +See the `ARM GCC +toolchain `_ +for the latest details. + +Python is also required. Python 2 is supported for now, but we recommend using Python 3. +Check that you have Python available on your system: + +.. code-block:: bash + + $ python3 + Python 3.5.0 (default, Jul 17 2020, 14:04:10) + [GCC 5.4.0 20160609] on linux + Type "help", "copyright", "credits" or "license" for more information. + >>> + +All supported ports have different dependency requirements, see their respective +`readme files `_. + +Building the MicroPython cross-compiler +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Almost all ports require building ``mpy-cross`` first to perform pre-compilation +of Python code that will be included in the port firmware: + +.. code-block:: bash + + $ cd mpy-cross + $ make + +.. note:: + Note that, ``mpy-cross`` must be built for the host architecture + and not the target architecture. + +If it built successfully, you should see a message similar to this: + +.. code-block:: bash + + LINK mpy-cross + text data bss dec hex filename + 279328 776 880 280984 44998 mpy-cross + +.. note:: + + Use ``make -C mpy-cross`` to build the cross-compiler in one statement + without moving to the ``mpy-cross`` directory otherwise, you will need + to do ``cd ..`` for the next steps. 
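Once built, the cross-compiler can be used straight away to pre-compile a Python file into a ``.mpy`` file. The file name below is just an illustration; any Python source file will do:

.. code-block:: bash

    $ echo "print('hello from a .mpy file')" > example.py
    $ ./mpy-cross/mpy-cross example.py   # run from the repository root
    $ ls example.mpy
    example.mpy

The resulting ``example.mpy`` can then be copied to a device's filesystem (or frozen into firmware) and imported like a normal module.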
Building the Unix port of MicroPython
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Unix port is a version of MicroPython that runs on Linux, macOS, and other Unix-like
operating systems. It's extremely useful for developing MicroPython as it avoids having to
deploy your code to a device to test it. In many ways, it works a lot like CPython's python binary.

To build the Unix port, make sure all the Linux dependencies listed in the
:ref:`required_dependencies` section are installed. Also, make sure you have a working
environment for ``gcc`` and ``GNU make``. Ubuntu 20.04 is used for the example below,
but other Unix-like systems should work with little modification:

.. code-block:: bash

    $ gcc --version
    gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0
    Copyright (C) 2019 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Then build:

.. code-block:: bash

    $ cd ports/unix
    $ make submodules
    $ make

If MicroPython built correctly, you should see the following:

.. code-block:: bash

    LINK micropython
       text    data     bss     dec     hex filename
     412033    5680    2496  420209   66971 micropython

Now run it:

.. code-block:: bash

    $ ./micropython
    MicroPython v1.13-38-gc67012d-dirty on 2020-09-13; linux version
    Use Ctrl-D to exit, Ctrl-E for paste mode
    >>> print("hello world")
    hello world
    >>>

Building the Windows port
~~~~~~~~~~~~~~~~~~~~~~~~~

The Windows port includes a Visual Studio project file ``micropython.vcxproj`` that you can use to
build ``micropython.exe``. It can be opened in Visual Studio or built from the command line using
``msbuild``. Alternatively, it can be built using MinGW, either in Windows with Cygwin, or on Linux.
See the `windows port documentation `_ for more information.

Building the STM32 port
~~~~~~~~~~~~~~~~~~~~~~~

Like the Unix port, you need to install some required dependencies
as detailed in the :ref:`required_dependencies` section, then build:

.. code-block:: bash

    $ cd ports/stm32
    $ make submodules
    $ make

Please refer to the `stm32 documentation `_
for more details on flashing the firmware.

.. note::
    See the :ref:`required_dependencies` to make sure that all dependencies are installed for this port.
    The ARM cross-compiler is needed: ``arm-none-eabi-gcc`` should be in the ``$PATH``, or specified
    manually via ``CROSS_COMPILE``, either by setting the environment variable or on the ``make``
    command line.

You can also specify which board to use:

.. code-block:: bash

    $ cd ports/stm32
    $ make submodules
    $ make BOARD=

See `ports/stm32/boards `_
for the available boards, e.g. "PYBV11" or "NUCLEO_WB55".

Building the documentation
--------------------------

MicroPython documentation is created using ``Sphinx``. If you have already
installed Python, then install ``Sphinx`` using ``pip``. It is recommended
that you use a virtual environment:

.. code-block:: bash

    $ python3 -m venv env
    $ source env/bin/activate
    $ pip install sphinx

Navigate to the ``docs`` directory:

.. code-block:: bash

    $ cd docs

Build the docs:

.. code-block:: bash

    $ make html

Open ``docs/build/html/index.html`` in your browser to view the docs locally. Refer to the
documentation on `importing your documentation
`_ to use Read the Docs.
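If you prefer to browse the built documentation over HTTP rather than via ``file://`` URLs, one simple option is Python's built-in web server (the port number here is arbitrary):

.. code-block:: bash

    $ cd docs/build/html
    $ python3 -m http.server 8000

Then open http://localhost:8000 in your browser.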
+ +Running the tests +----------------- + +To run all tests in the test suite on the Unix port use: + +.. code-block:: bash + + $ cd ports/unix + $ make test + +To run a selection of tests on a board/device connected over USB use: + +.. code-block:: bash + + $ cd tests + $ ./run-tests --target minimal --device /dev/ttyACM0 + +See also :ref:`writingtests`. + +Folder structure +---------------- + +There are a couple of directories to take note of in terms of where certain implementation details +are. The following is a break down of the top-level folders in the source code. + +py + + Contains the compiler, runtime, and core library implementation. + +mpy-cross + + Has the MicroPython cross-compiler which pre-compiles the Python scripts to bytecode. + +ports + + Code for all the versions of MicroPython for the supported ports. + +lib + + Low-level C libraries used by any port which are mostly 3rd-party libraries. + +drivers + + Has drivers for specific hardware and intended to work across multiple ports. + +extmod + + Contains a C implementation of more non-core modules. + +docs + + Has the standard documentation found at https://docs.micropython.org/. + +tests + + An implementation of the test suite. + +tools + + Contains helper tools including the ``upip`` and the ``pyboard.py`` module. + +examples + + Example code for building MicroPython as a library as well as native modules. diff --git a/docs/develop/img/bitmap.png b/docs/develop/img/bitmap.png new file mode 100644 index 0000000000..87de81d769 Binary files /dev/null and b/docs/develop/img/bitmap.png differ diff --git a/docs/develop/img/collision.png b/docs/develop/img/collision.png new file mode 100644 index 0000000000..a67ddd6137 Binary files /dev/null and b/docs/develop/img/collision.png differ diff --git a/docs/develop/img/linprob.png b/docs/develop/img/linprob.png new file mode 100644 index 0000000000..c288189084 Binary files /dev/null and b/docs/develop/img/linprob.png differ diff --git a/docs/develop/index.rst b/docs/develop/index.rst index f1fd0692ec..7a6a6be67c 100644 --- a/docs/develop/index.rst +++ b/docs/develop/index.rst @@ -1,14 +1,27 @@ -Developing and building MicroPython -=================================== +MicroPython Internals +===================== -This chapter describes some options for extending MicroPython in C. Note -that it doesn't aim to be a complete guide for developing with MicroPython. -See the `getting started guide -`_ for further information. +This chapter covers a tour of MicroPython from the perspective of a developer, contributing +to MicroPython. It acts as a comprehensive resource on the implementation details of MicroPython +for both novice and expert contributors. + +Development around MicroPython usually involves modifying the core runtime, porting or +maintaining a new library. This guide describes at great depth, the implementation +details of MicroPython including a getting started guide, compiler internals, porting +MicroPython to a new platform and implementing a core MicroPython library. .. toctree:: - :maxdepth: 1 + :maxdepth: 3 - cmodules.rst + gettingstarted.rst + writingtests.rst + compiler.rst + memorymgt.rst + library.rst + optimizations.rst qstr.rst - natmod.rst + maps.rst + publiccapi.rst + extendingmicropython.rst + porting.rst + \ No newline at end of file diff --git a/docs/develop/library.rst b/docs/develop/library.rst new file mode 100644 index 0000000000..bebddcc8a3 --- /dev/null +++ b/docs/develop/library.rst @@ -0,0 +1,86 @@ +.. 
_internals_library: + +Implementing a Module +===================== + +This chapter details how to implement a core module in MicroPython. +MicroPython modules can be one of the following: + +- Built-in module: A general module that is be part of the MicroPython repository. +- User module: A module that is useful for your specific project that you maintain + in your own repository or private codebase. +- Dynamic module: A module that can be deployed and imported at runtime to your device. + +A module in MicroPython can be implemented in one of the following locations: + +- py/: A core library that mirrors core CPython functionality. +- extmod/: A CPython or MicroPython-specific module that is shared across multiple ports. +- ports//: A port-specific module. + +.. note:: + This chapter describes modules implemented in ``py/`` or core modules. + See :ref:`extendingmicropython` for details on implementing an external module. + For details on port-specific modules, see :ref:`porting_to_a_board`. + +Implementing a core module +-------------------------- + +Like CPython, MicroPython has core builtin modules that can be accessed through import statements. +An example is the ``gc`` module discussed in :ref:`memorymanagement`. + +.. code-block:: bash + + >>> import gc + >>> gc.enable() + >>> + +MicroPython has several other builtin standard/core modules like ``io``, ``uarray`` etc. +Adding a new core module involves several modifications. + +First, create the ``C`` file in the ``py/`` directory. In this example we are adding a +hypothetical new module ``subsystem`` in the file ``modsubsystem.c``: + +.. code-block:: c + + #include "py/builtin.h" + #include "py/runtime.h" + + #if MICROPY_PY_SUBSYSTEM + + // info() + STATIC mp_obj_t py_subsystem_info(void) { + return MP_OBJ_NEW_SMALL_INT(42); + } + MP_DEFINE_CONST_FUN_OBJ_0(subsystem_info_obj, py_subsystem_info); + + STATIC const mp_rom_map_elem_t mp_module_subsystem_globals_table[] = { + { MP_ROM_QSTR(MP_QSTR___name__), MP_ROM_QSTR(MP_QSTR_subsystem) }, + { MP_ROM_QSTR(MP_QSTR_info), MP_ROM_PTR(&subsystem_info_obj) }, + }; + STATIC MP_DEFINE_CONST_DICT(mp_module_subsystem_globals, mp_module_subsystem_globals_table); + + const mp_obj_module_t mp_module_subsystem = { + .base = { &mp_type_module }, + .globals = (mp_obj_dict_t *)&mp_module_subsystem_globals, + }; + + MP_REGISTER_MODULE(MP_QSTR_subsystem, mp_module_subsystem, MICROPY_PY_SUBSYSTEM); + + #endif + +The implementation includes a definition of all functions related to the module and adds the +functions to the module's global table in ``mp_module_subsystem_globals_table``. It also +creates the module object with ``mp_module_subsystem``. The module is then registered with +the wider system via the ``MP_REGISTER_MODULE`` macro. + +After building and running the modified MicroPython, the module should now be importable: + +.. code-block:: bash + + >>> import subsystem + >>> subsystem.info() + 42 + >>> + +Our ``info()`` function currently returns just a single number but can be extended +to do anything. Similarly, more functions can be added to this new module. diff --git a/docs/develop/maps.rst b/docs/develop/maps.rst new file mode 100644 index 0000000000..8f899fa1d3 --- /dev/null +++ b/docs/develop/maps.rst @@ -0,0 +1,63 @@ +.. _maps: + +Maps and Dictionaries +===================== + +MicroPython dictionaries and maps use techniques called open addressing and linear probing. +This chapter details both of these methods. 
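As a rough standalone sketch of the idea (illustrative code only, not MicroPython's actual implementation, which lives in ``py/map.c``), a lookup starts at the slot given by the hash modulo the table size and steps forward one slot at a time:

.. code-block:: c

    #include <stddef.h>

    #define SLOT_EMPTY (-1)

    // Illustrative linear-probing lookup over a table of integer keys.
    // Returns the slot index of `key`, or -1 if the key is not present.
    static int probe_lookup(const int *table, size_t alloc, int key, size_t hash) {
        size_t start = hash % alloc;   // initial slot chosen by the hash
        size_t pos = start;
        do {
            if (table[pos] == key) {
                return (int)pos;       // found the key
            }
            if (table[pos] == SLOT_EMPTY) {
                return -1;             // hit a free slot: the key is absent
            }
            pos = (pos + 1) % alloc;   // linear probing: fixed interval of 1
        } while (pos != start);
        return -1;                     // wrapped around: table full, key absent
    }

The real implementation additionally has to handle insertion, deletion and rehashing when the table fills up, as the extract later in this chapter shows.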
Open addressing
---------------

`Open addressing `_ is used to resolve collisions.
Collisions are common and happen when two items hash to the same slot or location.
For example, given a hash table set up like this:

.. image:: img/collision.png

If there is a request to fill slot ``0`` with ``70``, since slot ``0`` is not empty, open addressing
finds the next available slot in the dictionary to service this request. This sequential search for an
alternate location is called *probing*. There are several probing sequences, but MicroPython uses
linear probing, which is described in the next section.

Linear probing
--------------

Linear probing is one of the methods for finding an available slot in a dictionary. In MicroPython,
it is used together with open addressing. Unlike other probing algorithms, linear probing uses a fixed
interval of ``1`` between probes. The request described above will therefore be serviced by placing the
item in the next free slot, which is slot ``4`` in our example:

.. image:: img/linprob.png

The same methods, i.e. open addressing and linear probing, are used to search for an item in a dictionary.
Assume we want to search for the data item ``33``. Its computed hash value is ``2``, and looking at
slot ``2`` reveals ``33``, so we return ``True``. Searching for ``70`` is quite different, as there was a
collision at the time of insertion: its hash value is ``0``, but slot ``0`` currently holds ``44``.
Instead of simply returning ``False``, we perform a sequential search starting at slot ``1`` until the
item ``70`` is found or we encounter a free slot. This is the general way of performing look-ups in hashes:

.. code-block:: c

    // not yet found, keep searching in this table
    pos = (pos + 1) % set->alloc;

    if (pos == start_pos) {
        // search got back to starting position, so index is not in table
        if (lookup_kind & MP_MAP_LOOKUP_ADD_IF_NOT_FOUND) {
            if (avail_slot != NULL) {
                // there was an available slot, so use that
                set->used++;
                *avail_slot = index;
                return index;
            } else {
                // not enough room in table, rehash it
                mp_set_rehash(set);
                // restart the search for the new element
                start_pos = pos = hash % set->alloc;
            }
        }
    } else {
        return MP_OBJ_NULL;
    }

diff --git a/docs/develop/memorymgt.rst b/docs/develop/memorymgt.rst
new file mode 100644
index 0000000000..5b1690cc82
--- /dev/null
+++ b/docs/develop/memorymgt.rst
@@ -0,0 +1,141 @@
.. _memorymanagement:

Memory Management
=================

Unlike programming languages such as C/C++, MicroPython hides memory management
details from the developer by supporting automatic memory management.
Automatic memory management is a technique used by operating systems or applications to automatically
manage the allocation and deallocation of memory. This eliminates challenges such as forgetting to
free the memory allocated to an object, and avoids the critical issue of using memory
that has already been released. Automatic memory management takes many forms, one of them being
garbage collection (GC).

The garbage collector usually has two responsibilities:

#. Allocate new objects in available memory.
#. Free unused memory.

There are many GC algorithms but MicroPython uses the
`Mark and Sweep `_
policy for managing memory.
This algorithm has a mark phase that traverses the heap marking all +live objects while the sweep phase goes through the heap reclaiming all unmarked objects. + +Garbage collection functionality in MicroPython is available through the ``gc`` built-in +module: + +.. code-block:: bash + + >>> x = 5 + >>> x + 5 + >>> import gc + >>> gc.enable() + >>> gc.mem_alloc() + 1312 + >>> gc.mem_free() + 2071392 + >>> gc.collect() + 19 + >>> gc.disable() + >>> + +Even when ``gc.disable()`` is invoked, collection can be triggered with ``gc.collect()``. + +The object model +---------------- + +All MicroPython objects are referred to by the ``mp_obj_t`` data type. +This is usually word-sized (i.e. the same size as a pointer on the target architecture), +and can be typically 32-bit (STM32, nRF, ESP32, Unix x86) or 64-bit (Unix x64). +It can also be greater than a word-size for certain object representations, for +example ``OBJ_REPR_D`` has a 64-bit sized ``mp_obj_t`` on a 32-bit architecture. + +An ``mp_obj_t`` represents a MicroPython object, for example an integer, float, type, dict or +class instance. Some objects, like booleans and small integers, have their value stored directly +in the ``mp_obj_t`` value and do not require additional memory. Other objects have their value +store elsewhere in memory (for example on the garbage-collected heap) and their ``mp_obj_t`` contains +a pointer to that memory. A portion of ``mp_obj_t`` is the tag which tells what type of object it is. + +See ``py/mpconfig.h`` for the specific details of the available representations. + +**Pointer tagging** + +Because pointers are word-aligned, when they are stored in an ``mp_obj_t`` the +lower bits of this object handle will be zero. For example on a 32-bit architecture +the lower 2 bits will be zero: + +``********|********|********|******00`` + +These bits are reserved for purposes of storing a tag. The tag stores extra information as +opposed to introducing a new field to store that information in the object, which may be +inefficient. In MicroPython the tag tells if we are dealing with a small integer, interned +(small) string or a concrete object, and different semantics apply to each of these. + +For small integers the mapping is this: + +``********|********|********|*******1`` + +Where the asterisks hold the actual integer value. For an interned string or an immediate +object (e.g. ``True``) the layout of the ``mp_obj_t`` value is, respectively: + +``********|********|********|*****010`` + +``********|********|********|*****110`` + +While a concrete object that is none of the above takes the form: + +``********|********|********|******00`` + +The stars here correspond to the address of the concrete object in memory. + +Allocation of objects +---------------------- + +The value of a small integer is stored directly in the ``mp_obj_t`` and will be +allocated in-place, not on the heap or elsewhere. As such, creation of small +integers does not affect the heap. Similarly for interned strings that already have +their textual data stored elsewhere, and immediate values like ``None``, ``False`` +and ``True``. + +Everything else which is a concrete object is allocated on the heap and its object structure is such that +a field is reserved in the object header to store the type of the object. + +.. 
code-block:: bash

    +++++++++++
    +         +
    +  type   +   object header
    +         +
    +++++++++++
    +         +
    +         +   object items
    +         +
    +         +
    +++++++++++

The heap's smallest unit of allocation is a block, which is four machine words in
size (16 bytes on a 32-bit machine, 32 bytes on a 64-bit machine).
Another structure also allocated on the heap tracks the allocation of
objects in each block. This structure is called a *bitmap*.

.. image:: img/bitmap.png

The bitmap tracks whether a block is "free" or "in use", using two bits per block to record this state.

The mark-sweep garbage collector manages the objects allocated on the heap, and also
utilises the bitmap to mark objects that are still in use.
See `py/gc.c `_
for the full implementation of these details.

**Allocation: heap layout**

The heap is arranged such that it consists of blocks in pools. A block
can have different properties:

- *ATB (allocation table byte):* if set, then the block is a normal block
- *FREE:* free block
- *HEAD:* head of a chain of blocks
- *TAIL:* in the tail of a chain of blocks
- *MARK:* marked head block
- *FTB (finaliser table byte):* if set, then the block has a finaliser

diff --git a/docs/develop/optimizations.rst b/docs/develop/optimizations.rst
new file mode 100644
index 0000000000..d972cde666
--- /dev/null
+++ b/docs/develop/optimizations.rst
@@ -0,0 +1,72 @@
.. _optimizations:

Optimizations
=============

MicroPython uses several optimizations to save RAM and to ensure the efficient
execution of programs. This chapter discusses some of these optimizations.

.. note::
    :ref:`qstr` and :ref:`maps` describe other optimizations applied to strings and
    dictionaries.

Frozen bytecode
---------------

When MicroPython loads Python code from the filesystem, it first has to parse the file into
a temporary in-memory representation, and then generate bytecode for execution, both of which
are stored in the heap (in RAM). This can lead to significant amounts of memory being used.
The MicroPython cross compiler can be used to generate
a ``.mpy`` file, containing the pre-compiled bytecode for a Python module. This will still
be loaded into RAM, but it avoids the additional overhead of the parsing stage.

As a further optimisation, the pre-compiled bytecode from a ``.mpy`` file can be "frozen"
into the firmware image as part of the main firmware compilation process, which means that
the bytecode will be executed from ROM. This can lead to a significant memory saving, and
reduce heap fragmentation.

Variables
---------

MicroPython processes local and global variables differently. Global variables
are stored and looked up in a global dictionary that is allocated on the heap
(note that each module has its own separate dict, and hence its own namespace).
Local variables, on the other hand, are stored on the Python value stack, which may
live on the C stack or on the heap. They are accessed directly by their offset
within the Python stack, which is more efficient than a global lookup in a dict.

The length of global variable names also affects how much RAM is used, as identifiers
are stored in RAM. The shorter the identifier, the less memory is used.

The other aspect is that ``const`` variables whose names start with an underscore are treated as
proper constants and are not allocated or added to a dictionary, hence saving some memory.
These variables are declared using ``const()`` from the ``micropython`` module. Therefore:

..
code-block:: python + + from micropython import const + + X = const(1) + _Y = const(2) + foo(X, _Y) + +Compiles to: + +.. code-block:: python + + X = 1 + foo(1, 2) + +Allocation of memory +-------------------- + +Most of the common MicroPython constructs are not allocated on the heap. +However the following are: + +- Dynamic data structures like lists, mappings, etc; +- Functions, classes and object instances; +- imports; and +- First-time assignment of global variables (to create the slot in the global dict). + +For a detailed discussion on a more user-centric perspective on optimization, +see `Maximising MicroPython speed `_ diff --git a/docs/develop/porting.rst b/docs/develop/porting.rst new file mode 100644 index 0000000000..59dd570008 --- /dev/null +++ b/docs/develop/porting.rst @@ -0,0 +1,310 @@ +.. _porting_to_a_board: + +Porting MicroPython +=================== + +The MicroPython project contains several ports to different microcontroller families and +architectures. The project repository has a `ports `_ +directory containing a subdirectory for each supported port. + +A port will typically contain definitions for multiple "boards", each of which is a specific piece of +hardware that that port can run on, e.g. a development kit or device. + +The `minimal port `_ is +available as a simplified reference implementation of a MicroPython port. It can run on both the +host system and an STM32F4xx MCU. + +In general, starting a port requires: + +- Setting up the toolchain (configuring Makefiles, etc). +- Implementing boot configuration and CPU initialization. +- Initialising basic drivers required for development and debugging (e.g. GPIO, UART). +- Performing the board-specific configurations. +- Implementing the port-specific modules. + +Minimal MicroPython firmware +---------------------------- + +The best way to start porting MicroPython to a new board is by integrating a minimal +MicroPython interpreter. For this walkthrough, create a subdirectory for the new +port in the ``ports`` directory: + +.. code-block:: bash + + $ cd ports + $ mkdir example_port + +The basic MicroPython firmware is implemented in the main port file, e.g ``main.c``: + +.. code-block:: c + + #include "py/compile.h" + #include "py/gc.h" + #include "py/mperrno.h" + #include "py/stackctrl.h" + #include "lib/utils/gchelper.h" + #include "lib/utils/pyexec.h" + + // Allocate memory for the MicroPython GC heap. + static char heap[4096]; + + int main(int argc, char **argv) { + // Initialise the MicroPython runtime. + mp_stack_ctrl_init(); + gc_init(heap, heap + sizeof(heap)); + mp_init(); + mp_obj_list_init(MP_OBJ_TO_PTR(mp_sys_path), 0); + mp_obj_list_init(MP_OBJ_TO_PTR(mp_sys_argv), 0); + + // Start a normal REPL; will exit when ctrl-D is entered on a blank line. + pyexec_friendly_repl(); + + // Deinitialise the runtime. + gc_sweep_all(); + mp_deinit(); + return 0; + } + + // Handle uncaught exceptions (should never be reached in a correct C implementation). + void nlr_jump_fail(void *val) { + for (;;) { + } + } + + // Do a garbage collection cycle. + void gc_collect(void) { + gc_collect_start(); + gc_helper_collect_regs_and_stack(); + gc_collect_end(); + } + + // There is no filesystem so stat'ing returns nothing. + mp_import_stat_t mp_import_stat(const char *path) { + return MP_IMPORT_STAT_NO_EXIST; + } + + // There is no filesystem so opening a file raises an exception. 
+ mp_lexer_t *mp_lexer_new_from_file(const char *filename) { + mp_raise_OSError(MP_ENOENT); + } + +We also need a Makefile at this point for the port: + +.. code-block:: Makefile + + # Include the core environment definitions; this will set $(TOP). + include ../../py/mkenv.mk + + # Include py core make definitions. + include $(TOP)/py/py.mk + + # Set CFLAGS and libraries. + CFLAGS = -I. -I$(BUILD) -I$(TOP) + LIBS = -lm + + # Define the required source files. + SRC_C = \ + main.c \ + mphalport.c \ + lib/mp-readline/readline.c \ + lib/utils/gchelper_generic.c \ + lib/utils/pyexec.c \ + lib/utils/stdout_helpers.c \ + + # Define the required object files. + OBJ = $(PY_CORE_O) $(addprefix $(BUILD)/, $(SRC_C:.c=.o)) + + # Define the top-level target, the main firmware. + all: $(BUILD)/firmware.elf + + # Define how to build the firmware. + $(BUILD)/firmware.elf: $(OBJ) + $(ECHO) "LINK $@" + $(Q)$(CC) $(LDFLAGS) -o $@ $^ $(LIBS) + $(Q)$(SIZE) $@ + + # Include remaining core make rules. + include $(TOP)/py/mkrules.mk + +Remember to use proper tabs to indent the Makefile. + +MicroPython Configurations +-------------------------- + +After integrating the minimal code above, the next step is to create the MicroPython +configuration files for the port. The compile-time configurations are specified in +``mpconfigport.h`` and additional hardware-abstraction functions, such as time keeping, +in ``mphalport.h``. + +The following is an example of an ``mpconfigport.h`` file: + +.. code-block:: c + + #include + + // Python internal features. + #define MICROPY_ENABLE_GC (1) + #define MICROPY_HELPER_REPL (1) + #define MICROPY_ERROR_REPORTING (MICROPY_ERROR_REPORTING_TERSE) + #define MICROPY_FLOAT_IMPL (MICROPY_FLOAT_IMPL_FLOAT) + + // Fine control over Python builtins, classes, modules, etc. + #define MICROPY_PY_ASYNC_AWAIT (0) + #define MICROPY_PY_BUILTINS_SET (0) + #define MICROPY_PY_ATTRTUPLE (0) + #define MICROPY_PY_COLLECTIONS (0) + #define MICROPY_PY_MATH (0) + #define MICROPY_PY_IO (0) + #define MICROPY_PY_STRUCT (0) + + // Type definitions for the specific machine. + + typedef intptr_t mp_int_t; // must be pointer size + typedef uintptr_t mp_uint_t; // must be pointer size + typedef long mp_off_t; + + // We need to provide a declaration/definition of alloca(). + #include + + // Define the port's name and hardware. + #define MICROPY_HW_BOARD_NAME "example-board" + #define MICROPY_HW_MCU_NAME "unknown-cpu" + + #define MP_STATE_PORT MP_STATE_VM + + #define MICROPY_PORT_ROOT_POINTERS \ + const char *readline_hist[8]; + +This configuration file contains machine-specific configurations including aspects like if different +MicroPython features should be enabled e.g. ``#define MICROPY_ENABLE_GC (1)``. Making this Setting +``(0)`` disables the feature. + +Other configurations include type definitions, root pointers, board name, microcontroller name +etc. + +Similarly, an minimal example ``mphalport.h`` file looks like this: + +.. code-block:: c + + static inline void mp_hal_set_interrupt_char(char c) {} + +Support for standard input/output +--------------------------------- + +MicroPython requires at least a way to output characters, and to have a REPL it also +requires a way to input characters. Functions for this can be implemented in the file +``mphalport.c``, for example: + +.. code-block:: c + + #include + #include "py/mpconfig.h" + + // Receive single character, blocking until one is available. 
    int mp_hal_stdin_rx_chr(void) {
        unsigned char c = 0;
        int r = read(STDIN_FILENO, &c, 1);
        (void)r;
        return c;
    }

    // Send the string of given length.
    void mp_hal_stdout_tx_strn(const char *str, mp_uint_t len) {
        int r = write(STDOUT_FILENO, str, len);
        (void)r;
    }

These input and output functions have to be modified depending on the
specific board API. This example uses the standard input/output stream.

Building and running
--------------------

At this stage the directory of the new port should contain::

    ports/example_port/
    ├── main.c
    ├── Makefile
    ├── mpconfigport.h
    ├── mphalport.c
    └── mphalport.h

The port can now be built by running ``make`` (or otherwise, depending on your system).

If you are using the default compiler settings in the Makefile given above then this
will create an executable called ``build/firmware.elf`` which can be executed directly.
To get a functional REPL you may need to first configure the terminal to raw mode:

.. code-block:: bash

    $ stty raw opost -echo
    $ ./build/firmware.elf

That should give a MicroPython REPL. You can then run commands like:

.. code-block:: bash

    MicroPython v1.13 on 2021-01-01; example-board with unknown-cpu
    >>> import usys
    >>> usys.implementation
    ('micropython', (1, 13, 0))
    >>>

Use Ctrl-D to exit, and then run ``reset`` to reset the terminal.

Adding a module to the port
---------------------------

To add a custom module like ``myport``, first add the module definition in a file
``modmyport.c``:

.. code-block:: c

    #include "py/runtime.h"

    STATIC mp_obj_t myport_info(void) {
        mp_printf(&mp_plat_print, "info about my port\n");
        return mp_const_none;
    }
    STATIC MP_DEFINE_CONST_FUN_OBJ_0(myport_info_obj, myport_info);

    STATIC const mp_rom_map_elem_t myport_module_globals_table[] = {
        { MP_OBJ_NEW_QSTR(MP_QSTR___name__), MP_OBJ_NEW_QSTR(MP_QSTR_myport) },
        { MP_ROM_QSTR(MP_QSTR_info), MP_ROM_PTR(&myport_info_obj) },
    };
    STATIC MP_DEFINE_CONST_DICT(myport_module_globals, myport_module_globals_table);

    const mp_obj_module_t myport_module = {
        .base = { &mp_type_module },
        .globals = (mp_obj_dict_t *)&myport_module_globals,
    };

    MP_REGISTER_MODULE(MP_QSTR_myport, myport_module, 1);

Note: the "1" as the third argument in ``MP_REGISTER_MODULE`` enables this new module
unconditionally. To allow it to be conditionally enabled, replace the "1" by
``MICROPY_PY_MYPORT`` and then add ``#define MICROPY_PY_MYPORT (1)`` in ``mpconfigport.h``
accordingly.

You will also need to edit the Makefile to add ``modmyport.c`` to the ``SRC_C`` list, and
a new line adding the same file to ``SRC_QSTR`` (so qstrs are searched for in this new file),
like this:

.. code-block:: Makefile

    SRC_C = \
        main.c \
        modmyport.c \
        mphalport.c \
        ...

    SRC_QSTR += modmyport.c

If all went correctly then, after rebuilding, you should be able to import the new module:

.. code-block:: bash

    >>> import myport
    >>> myport.info()
    info about my port
    >>>

diff --git a/docs/develop/publiccapi.rst b/docs/develop/publiccapi.rst
new file mode 100644
index 0000000000..132c7b136b
--- /dev/null
+++ b/docs/develop/publiccapi.rst
@@ -0,0 +1,25 @@
.. _publiccapi:

The public C API
================

The public C API comprises functions defined in all C header files in the ``py/``
directory. Most of the important core runtime C APIs are exposed in ``runtime.h`` and
``obj.h``.

The following is an example of public API functions from ``obj.h``:
.. code-block:: c

    mp_obj_t mp_obj_new_list(size_t n, mp_obj_t *items);
    mp_obj_t mp_obj_list_append(mp_obj_t self_in, mp_obj_t arg);
    mp_obj_t mp_obj_list_remove(mp_obj_t self_in, mp_obj_t value);
    void mp_obj_list_get(mp_obj_t self_in, size_t *len, mp_obj_t **items);

At its core, any function or macro in a header file under ``py/`` forms part of the public
API and can be used to access very low-level details of MicroPython. Static inline functions
in header files are fine too; such functions will be inlined in the code when used.

Header files in the ``ports`` directory only expose functionality that is specific to a given port.

diff --git a/docs/develop/qstr.rst b/docs/develop/qstr.rst
index 3550a8bd42..cd1fc47862 100644
--- a/docs/develop/qstr.rst
+++ b/docs/develop/qstr.rst
@@ -1,3 +1,5 @@
+.. _qstr:
+
 MicroPython string interning
 ============================
 
@@ -57,6 +59,7 @@ Processing happens in the following stages:
    information. Note that this step only uses files that have changed, which
    means that ``qstr.i.last`` will only contain data from files that have
    changed since the last compile.
+
 2. ``qstr.split`` is an empty file created after running ``makeqstrdefs.py split``
    on qstr.i.last. It's just used as a dependency to indicate that the step ran.
    This script outputs one file per input C file, ``genhdr/qstr/...file.c.qstr``,
@@ -71,8 +74,8 @@
    data is written to another file (``qstrdefs.collected.h.hash``) which allows
    it to track changes across builds.
 
-4. ``qstrdefs.preprocessed.h`` adds in the QSTRs from qstrdefs*. It
-   concatenates ``qstrdefs.collected.h`` with ``qstrdefs*.h``, then it transforms
+4. Generate an enumeration, each entry of which maps a ``MP_QSTR_Foo`` to its corresponding index.
+   It concatenates ``qstrdefs.collected.h`` with ``qstrdefs*.h``, then it transforms
    each line from ``Q(Foo)`` to ``"Q(Foo)"`` so they pass through the preprocessor
    unchanged. Then the preprocessor is used to deal with any conditional
    compilation in ``qstrdefs*.h``. Then the transformation is undone back to

diff --git a/docs/develop/writingtests.rst b/docs/develop/writingtests.rst
new file mode 100644
index 0000000000..4bdf4dd7a6
--- /dev/null
+++ b/docs/develop/writingtests.rst
@@ -0,0 +1,70 @@
.. _writingtests:

Writing tests
=============

Tests in MicroPython are located in the ``tests/`` directory. The following is a listing of
key directories and the ``run-tests`` runner script:

.. code-block:: bash

    .
    ├── basics
    ├── extmod
    ├── float
    ├── micropython
    ├── run-tests
    ...

There are subfolders maintained to categorize the tests. Add a test by creating a new file in one of the
existing folders or in a new folder. It's also possible to make custom tests outside this tests folder,
which would be recommended for a custom port.

For example, add the following code in a file ``print.py`` in the ``tests/unix/`` subdirectory:

.. code-block:: python

    def print_one():
        print(1)

    print_one()

If you run your tests, this test should appear in the test output:

.. code-block:: bash

    $ cd ports/unix
    $ make tests
    skip  unix/extra_coverage.py
    pass  unix/ffi_callback.py
    pass  unix/ffi_float.py
    pass  unix/ffi_float2.py
    pass  unix/print.py
    pass  unix/time.py
    pass  unix/time2.py

Tests are run by comparing the output from the test target against the output from CPython,
so any test should use ``print`` statements to indicate its results.

For tests that can't be compared to CPython (i.e.
MicroPython-specific functionality), you can provide a ``.py.exp`` file which will be used as the
expected output for the comparison.

The other way to run tests, which is useful when running on targets other than the Unix port, is:

.. code-block:: bash

    $ cd tests
    $ ./run-tests

Then to run on a board:

.. code-block:: bash

    $ ./run-tests --target minimal --device /dev/ttyACM0

And to run only a certain set of tests (e.g. a directory):

.. code-block:: bash

    $ ./run-tests -d basics
    $ ./run-tests float/builtin*.py
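As a concrete illustration of the ``.py.exp`` mechanism described above (the file name here is hypothetical), a MicroPython-specific test and its expected-output file could look like this:

.. code-block:: python

    # tests/micropython/const_intro.py (illustrative name)
    from micropython import const

    _X = const(3)
    print(_X + 1)

The matching ``tests/micropython/const_intro.py.exp`` would then contain the single line ``4``, and ``run-tests`` compares the test's output against that file instead of against CPython.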