Commit Graph

49 Commits

Author SHA1 Message Date
Jim Mussared c44b3927b8 py/objstr: Add a helper to set mp_obj_str_t data.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-10-11 17:50:19 +11:00
Jim Mussared 6c3d8d38bf py/objstr: Always validate utf-8 for mp_obj_new_str.
All uses of this are either tiny strings or not-known-to-be-safe.

Update comments for mp_obj_new_str_copy and mp_obj_new_str_of_type.

Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-08-26 16:45:46 +10:00
Damien George 945f377b43 py/objstr: Remove str function object declarations from header file.
Since f7f56d4285 consolidated all uses of
these to a single locals dict, they no longer need to be made public.

Signed-off-by: Damien George <damien@micropython.org>
2022-08-12 16:38:22 +10:00
Jim Mussared 28aaab9590 py/objstr: Add hex/fromhex to bytes/memoryview/bytearray.
These were added in Python 3.5.

Enabled via MICROPY_PY_BUILTINS_BYTES_HEX, and enabled by default for all
ports that currently have ubinascii.

Rework ubinascii to use the implementation of these methods.

Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-08-12 12:44:30 +10:00
Andrew Leech f7f56d4285 py/objstr: Consolidate methods for str/bytes/bytearray/array.
This commit adds the bytes methods to bytearray, matching CPython.  The
existing implementations of these methods for str/bytes are reused for
bytearray with minor updates to match CPython return types.

For details on the CPython behaviour see
https://docs.python.org/3/library/stdtypes.html#bytes-and-bytearray-operations

The work to merge locals tables for str/bytes/bytearray/array was done by
@jimmo.  Because of this merging of locals the change in code size for this
commit is mostly negative:

       bare-arm:    +0 +0.000%
    minimal x86:   +29 +0.018%
       unix x64:  -792 -0.128% standard[incl -448(data)]
    unix nanbox:  -436 -0.078% nanbox[incl -448(data)]
          stm32:   -40 -0.010% PYBV10
         cc3200:   -32 -0.017%
        esp8266:   -28 -0.004% GENERIC
          esp32:   -72 -0.005% GENERIC[incl -200(data)]
         mimxrt:   -40 -0.011% TEENSY40
     renesas-ra:   -40 -0.006% RA6M2_EK
            nrf:   -16 -0.009% pca10040
            rp2:   -64 -0.013% PICO
           samd:  +148 +0.105% ADAFRUIT_ITSYBITSY_M4_EXPRESS
2022-08-11 23:18:02 +10:00
Damien George 82b3500724 py/qstr: Change qstr hash type from mp_uint_t to size_t.
The hash is either 8 or 16 bits (depending on MICROPY_QSTR_BYTES_IN_HASH)
so will fit in a size_t.

This saves 268 bytes on the unix nanbox build.  Non-nanbox configurations
are unchanged because mp_uint_t is the same size as size_t.

Signed-off-by: Damien George <damien@micropython.org>
2022-08-11 23:18:02 +10:00
Damien George b5986784e4 py/objstr: Reformat str access macros to make them readable.
Signed-off-by: Damien George <damien@micropython.org>
2022-08-10 14:31:06 +10:00
Damien George 69661f3343 all: Reformat C and Python source code with tools/codeformat.py.
This is run with uncrustify 0.70.1, and black 19.10b0.
2020-02-28 10:33:03 +11:00
Damien George 4c0176d13f py/objstr: Don't use inline GET_STR_DATA_LEN for object-repr D.
Changing to use the helper function mp_obj_str_get_data_no_check() reduces
code size of nan-boxing builds by about 1000 bytes.
2019-12-27 23:15:52 +11:00
stijn fb54736bdb py/objarray: Add decode method to bytearray.
Reuse the implementation for bytes since it works the same way regardless
of the underlying type.  This method gets added for CPython compatibility
of bytearray, but to keep the code simple and small array.array now also
has a working decode method, which is non-standard but doesn't hurt.
2019-05-21 14:24:04 +10:00
Damien George eee1e8841a py: Downcase all MP_OBJ_IS_xxx macros to make a more consistent C API.
These macros could in principle be (inline) functions so it makes sense to
have them lower case, to match the other C API functions.

The remaining macros that are upper case are:
- MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR
- MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE
- MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE
- MP_OBJ_FUN_MAKE_SIG
- MP_DECLARE_CONST_xxx
- MP_DEFINE_CONST_xxx

These must remain macros because they are used when defining const data (at
least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have
MP_OBJ_SMALL_INT_VALUE also a macro).

For those macros that have been made lower case, compatibility macros are
provided for the old names so that users do not need to change their code
immediately.
2019-02-12 14:54:51 +11:00
Damien George 1f1d5194d7 py/objstr: Make mp_obj_new_str_of_type check for existing interned qstr.
The function mp_obj_new_str_of_type is a general str object constructor
used in many places in the code to create either a str or bytes object.
When creating a str it should first check if the string data already exists
as an interned qstr, and if so then return the qstr object.  This patch
makes the function have such behaviour, which helps to reduce heap usage by
reusing existing interned data where possible.

The old behaviour of mp_obj_new_str_of_type (which didn't check for
existing interned data) is made available through the function
mp_obj_new_str_copy, but should only be used in very special cases.

One consequence of this patch is that the following expression is now True:

    'abc' is ' abc '.split()[0]
2017-11-16 13:53:04 +11:00
Damien George 58321dd985 all: Convert mp_uint_t to mp_unary_op_t/mp_binary_op_t where appropriate
The unary-op/binary-op enums are already defined, and there are no
arithmetic tricks used with these types, so it makes sense to use the
correct enum type for arguments that take these values.  It also reduces
code size quite a bit for nan-boxing builds.
2017-08-29 13:16:30 +10:00
Alexander Steffen 55f33240f3 all: Use the name MicroPython consistently in comments
There were several different spellings of MicroPython present in comments,
when there should be only one.
2017-07-31 18:35:40 +10:00
Alexander Steffen 299bc62586 all: Unify header guard usage.
The code conventions suggest using header guards, but do not define how
those should look like and instead point to existing files. However, not
all existing files follow the same scheme, sometimes omitting header guards
altogether, sometimes using non-standard names, making it easy to
accidentally pick a "wrong" example.

This commit ensures that all header files of the MicroPython project (that
were not simply copied from somewhere else) follow the same pattern, that
was already present in the majority of files, especially in the py folder.

The rules are as follows.

Naming convention:
* start with the words MICROPY_INCLUDED
* contain the full path to the file
* replace special characters with _

In addition, there are no empty lines before #ifndef, between #ifndef and
one empty line before #endif. #endif is followed by a comment containing
the name of the guard macro.

py/grammar.h cannot use header guards by design, since it has to be
included multiple times in a single C file. Several other files also do not
need header guards as they are only used internally and guaranteed to be
included only once:
* MICROPY_MPHALPORT_H
* mpconfigboard.h
* mpconfigport.h
* mpthreadport.h
* pin_defs_*.h
* qstrdefs*.h
2017-07-18 11:57:39 +10:00
Damien George c0d9500eee py/objstr: Convert mp_uint_t to size_t (and use int) where appropriate. 2017-02-16 16:51:16 +11:00
Damien George 4ebdb1f2b2 py: Be more specific with MP_DECLARE_CONST_FUN_OBJ macros.
In order to have more fine-grained control over how builtin functions are
constructed, the MP_DECLARE_CONST_FUN_OBJ macros are made more specific,
with suffix of _0, _1, _2, _3, _VAR, _VAR_BETEEN or _KW.  These names now
match the MP_DEFINE_CONST_FUN_OBJ macros.
2016-10-21 16:26:01 +11:00
Damien George 5f3bda422a py: If str/bytes hash is 0 then explicitly compute it. 2016-09-02 14:49:50 +10:00
Paul Sokolovsky 1b5abfcaae py/objstr: Implement str.center().
Disabled by default, enabled in unix port. Need for this method easily
pops up when working with text UI/reporting, and coding workalike
manually again and again counter-productive.
2016-05-22 00:13:44 +03:00
Paul Sokolovsky c38809e26b py/objarray: Implement "in" operator for bytearray. 2016-02-14 18:57:11 +02:00
Damien George 5b3f0b7f39 py: Change first arg of type.make_new from mp_obj_t to mp_obj_type_t*.
The first argument to the type.make_new method is naturally a uPy type,
and all uses of this argument cast it directly to a pointer to a type
structure.  So it makes sense to just have it a pointer to a type from
the very beginning (and a const pointer at that).  This patch makes
such a change, and removes all unnecessary casting to/from mp_obj_t.
2016-01-11 00:49:27 +00:00
Damien George 4b72b3a133 py: Change type signature of builtin funs that take variable or kw args.
With this patch the n_args parameter is changed type from mp_uint_t to
size_t.
2016-01-11 00:49:27 +00:00
Damien George a0c97814df py: Change type of .make_new and .call args: mp_uint_t becomes size_t.
This patch changes the type signature of .make_new and .call object method
slots to use size_t for n_args and n_kw (was mp_uint_t.  Makes code more
efficient when mp_uint_t is larger than a machine word.  Doesn't affect
ports when size_t and mp_uint_t have the same size.
2016-01-11 00:48:41 +00:00
Damien George 999cedb90f py: Wrap all obj-ptr conversions in MP_OBJ_TO_PTR/MP_OBJ_FROM_PTR.
This allows the mp_obj_t type to be configured to something other than a
pointer-sized primitive type.

This patch also includes additional changes to allow the code to compile
when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of
mp_uint_t, and various casts.
2015-11-29 14:25:35 +00:00
Damien George c3f64d9799 py: Change qstr_* functions to use size_t as the type for str len arg. 2015-11-29 14:25:04 +00:00
Damien George 04353cc85e py: With obj repr "C", change raw str accessor from macro to function.
This saves around 1000 bytes (Thumb2 arch) because in repr "C" it is
costly to check and extract a qstr.  So making such check/extract a
function instead of a macro saves lots of code space.
2015-10-20 12:38:54 +01:00
Damien George 7f9d1d6ab9 py: Overhaul and simplify printf/pfenv mechanism.
Previous to this patch the printing mechanism was a bit of a tangled
mess.  This patch attempts to consolidate printing into one interface.

All (non-debug) printing now uses the mp_print* family of functions,
mainly mp_printf.  All these functions take an mp_print_t structure as
their first argument, and this structure defines the printing backend
through the "print_strn" function of said structure.

Printing from the uPy core can reach the platform-defined print code via
two paths: either through mp_sys_stdout_obj (defined pert port) in
conjunction with mp_stream_write; or through the mp_plat_print structure
which uses the MP_PLAT_PRINT_STRN macro to define how string are printed
on the platform.  The former is only used when MICROPY_PY_IO is defined.

With this new scheme printing is generally more efficient (less layers
to go through, less arguments to pass), and, given an mp_print_t*
structure, one can call mp_print_str for efficiency instead of
mp_printf("%s", ...).  Code size is also reduced by around 200 bytes on
Thumb2 archs.
2015-04-16 14:30:16 +00:00
Damien George 4dea922610 py: Adjust some spaces in code style/format, purely for consistency. 2015-04-09 15:29:54 +00:00
Paul Sokolovsky ac2f7a7f6a objstr: Add .splitlines() method.
splitlines() occurs ~179 times in CPython3 standard library, so was
deemed worthy to implement. The method has subtle semantic differences
from just .split("\n"). It is also defined as working for any end-of-line
combination, but this is currently not implemented - it works only with
LF line-endings (which should be OK for text strings on any platforms,
but not OK for bytes).
2015-04-04 00:09:48 +03:00
Paul Sokolovsky 8705171233 objstr: Expose mp_obj_str_split() for reuse in other modules. 2015-03-23 22:43:37 +02:00
Paul Sokolovsky 344e15b1ae objstr: Remove code duplication and unbreak Windows build.
There was really weird warning (promoted to error) when building Windows
port. Exact cause is still unknown, but it uncovered another issue:
8-bit and unicode str_make_new implementations should be mutually exclusive,
and not built at the same time. What we had is that bytes_decode() pulled
8-bit str_make_new() even for unicode build.
2015-01-23 02:15:56 +02:00
Paul Sokolovsky c114496641 objstr: Implement kwargs support for str.format(). 2015-01-04 00:26:31 +02:00
Damien George 51dfcb4bb7 py: Move to guarded includes, everywhere in py/ core.
Addresses issue #1022.
2015-01-01 20:32:09 +00:00
Damien George 39dc145478 py: Change [u]int to mp_[u]int_t in qstr.[ch], and some other places.
This should pretty much resolve issue #50.
2014-10-03 19:52:22 +01:00
Damien George cde0ca21bf py: Simplify JSON str printing (while still conforming to JSON spec).
The JSON specs are relatively flexible and allow us to use one function
to print strings, be they ascii, bytes or utf-8 encoded.
2014-09-25 17:35:56 +01:00
Damien George 4abff7500f py: Change uint to mp_uint_t in runtime.h, stackctrl.h, binary.h.
Part of code cleanup, working towards resolving issue #50.
2014-08-30 14:59:21 +01:00
Damien George 4d91723587 py: Remove use of int type in obj.h.
Part of code cleanup, working towards resolving issue #50.
2014-08-30 14:28:06 +01:00
Damien George ecc88e949c Change some parts of the core API to use mp_uint_t instead of uint/int.
Addressing issue #50, still some way to go yet.
2014-08-30 00:35:11 +01:00
Damien George 26a0d4f4f1 py: Change hash and len members of str from 16 bit to full word.
This allows to make strings longer than 64k.  It doesn't use any more
RAM with current GC because a str object still fits in a GC block.
2014-08-22 18:34:28 +01:00
Damien George 40f3c02682 Rename machine_(u)int_t to mp_(u)int_t.
See discussion in issue #50.
2014-07-03 13:25:24 +01:00
Damien George e04a44e2f6 py: Small comments, name changes, use of machine_int_t. 2014-06-28 10:27:23 +01:00
Paul Sokolovsky ea2c936c7e objstrunicode: Refactor str_index_to_ptr() following objstr. 2014-06-27 00:04:20 +03:00
Paul Sokolovsky cdc020da4b objstrunicode: Re-add buffer protocol back for now, required for io.StringIO. 2014-06-27 00:04:18 +03:00
Paul Sokolovsky 9731912ccb py: Prune unneeded code from objstrunicode, reuse code in objstr. 2014-06-27 00:04:18 +03:00
Damien George f600a6a085 py: Slightly improve efficiency of mp_obj_new_str; rename str_new.
Reorder interning logic in mp_obj_new_str, to be more efficient.

str_new is globally accessible, so should be prefixed with mp_obj_.
2014-05-25 22:34:34 +01:00
Paul Sokolovsky a47b64ae2d objstringio: Implement io.BytesIO.
Done in generalized manner, allowing any stream class to be specified as
working with bytes.
2014-05-15 07:28:19 +03:00
Damien George 04b9147e15 Add license header to (almost) all files.
Blanket wide to all .c and .h files.  Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.

Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
2014-05-03 23:27:38 +01:00
Damien George 897fe0c0d0 py: Add builtin functions bin and oct, and some tests for them. 2014-04-15 22:03:55 +01:00
Paul Sokolovsky 58676fc2c7 objstr: Allow to define statically allocated str objects.
Similar to tuples, lists, dicts. Statically allocated strings don't have hash
computed.
2014-04-14 01:45:06 +03:00