Working with ndarrays
=====================

The ``ndarray`` is the type that holds numerical data in ``ulab``.
Think of it as a typed, contiguous, n-dimensional buffer with a small
header that describes its shape, strides and element type. This page
walks through the most important things you can do with one.

Anatomy of an ndarray
---------------------

Internally, an ``ndarray`` is a small header followed by a pointer to
a contiguous block of data. The header records:

* ``dtype`` -- the element type;
* ``itemsize`` -- the number of bytes each element occupies;
* ``ndim`` -- the number of dimensions (rank);
* ``shape`` -- a tuple, the length along each axis;
* ``strides`` -- bytes to move along each axis;
* a ``len`` and a pointer to the data buffer.

This is the same memory layout as CPython ``numpy``. Two important
practical consequences:

* Operations like ``reshape``, ``transpose`` and slicing only
  manipulate the header (a handful of bytes), not the data buffer.
  They are essentially free.
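* The data is binary and packed, so an ``ndarray`` of ``uint8`` is
  roughly 4 to 8 times smaller than the equivalent list of Python
  ``int``, depending on whether the port stores each list slot as a
  4-byte or an 8-byte object reference.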
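
The header-only claim can be illustrated with a small pure-Python
model. This is a conceptual sketch, not ``ulab``'s C implementation:
``byte_offset`` is a hypothetical helper, and the ``bytes`` object
stands in for the data buffer of a 2x3 ``uint8`` array::

   def byte_offset(index, strides):
       """Byte offset of the element at `index`, given per-axis strides."""
       return sum(i * s for i, s in zip(index, strides))

   # Row-major 2x3 array of uint8: shape (2, 3), strides (3, 1).
   data = bytes([1, 2, 3, 4, 5, 6])
   shape, strides = (2, 3), (3, 1)

   # Row 1, column 2 of the original array.
   element = data[byte_offset((1, 2), strides)]

   # A transposed view only reverses shape and strides; `data` is untouched.
   t_shape, t_strides = shape[::-1], strides[::-1]

   # Row 2, column 1 of the transposed view is the same byte in `data`.
   t_element = data[byte_offset((2, 1), t_strides)]

Only the two small tuples differ between the original and the
transposed view; the six data bytes are never moved.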

For a quick textual dump including the data pointer, call
``np.ndinfo(a)``::

   from ulab import numpy as np

   a = np.array(range(5), dtype=np.float)
   np.ndinfo(a)
   # class: ndarray
   # shape: (5,)
   # strides: (8,)
   # itemsize: 8
   # data pointer: 0x...
   # type: float

The data-pointer line is especially useful when you want to confirm
that two arrays share data (a *view*, see below).

Creating arrays
---------------

From iterables
~~~~~~~~~~~~~~

Pass a list (or any iterable) to ``np.array``::

   from ulab import numpy as np

   a = np.array([1, 2, 3, 4, 5, 6])             # 1-D, dtype=float
   b = np.array([1, 2, 3, 4], dtype=np.uint8)   # 1-D, dtype=uint8

   m = np.array([[1, 2, 3],
                 [4, 5, 6]], dtype=np.int16)    # 2-D, shape (2, 3)

If the inner iterables have different lengths, ``ValueError`` is
raised. Mixed iterable types (``range`` + ``list``, etc.) are fine.

From other arrays
~~~~~~~~~~~~~~~~~

Passing one ``ndarray`` to ``np.array`` makes a copy. If the source
and destination dtypes match, the copy is fast (a straight
``memcpy``). If they differ, the elements are converted::

   a = np.array(range(5), dtype=np.uint8)
   b = np.array(a)                     # converted to float (default)
   c = np.array(a, dtype=np.uint8)     # raw copy, stays uint8

Note that the default ``dtype`` of ``np.array`` is always ``float``,
so re-wrapping an integer array without specifying ``dtype=`` always
incurs an iteration and conversion. When performance matters, pass
``dtype=`` explicitly.

Helper constructors
~~~~~~~~~~~~~~~~~~~

These mirror their CPython ``numpy`` equivalents::

   np.zeros((4, 4))
   np.ones(8, dtype=np.uint8)
   np.full((2, 3), 7, dtype=np.int16)
   np.eye(4)
   np.eye(4, M=6, k=-1, dtype=np.int16)
   np.arange(0, 10, 0.5)
   np.linspace(0, 1, num=11)
   np.logspace(0, 3, num=4)            # 1, 10, 100, 1000
   np.diag(np.array([1, 2, 3]))        # pass an ndarray
   np.empty((3, 3))                    # uninitialised (alias for zeros)

``np.frombuffer`` is especially useful on the camera. It wraps an
existing ``bytes``-like buffer as a typed array *without copying*,
and accepts ``offset=`` and ``count=``::

   buf = b'\x01\x02\x03\x04\x05\x06\x07\x08'
   v = np.frombuffer(buf, dtype=np.uint16)
   # array([513, 1027, 1541, 2055], dtype=uint16) on a little-endian MCU

   v2 = np.frombuffer(buf, dtype=np.uint8, offset=2, count=3)
   # array([3, 4, 5], dtype=uint8)

If the peripheral that produced the buffer uses a different
endianness from the microcontroller, use ``a.byteswap()`` to flip
each multi-byte element. ``a.byteswap()`` returns a new array;
``a.byteswap(inplace=True)`` modifies in place::

   a = np.frombuffer(buf, dtype=np.uint16)
   b = a.byteswap()
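
The effect of byte order can be checked with the standard ``struct``
module. It is used here only as a stand-in for ``np.frombuffer`` and
``byteswap``; ``swap16`` is a hypothetical helper, not part of
``ulab``::

   import struct

   buf = b'\x01\x02\x03\x04\x05\x06\x07\x08'

   # The same 8 bytes read as four 16-bit unsigned integers.
   little = struct.unpack('<4H', buf)    # (513, 1027, 1541, 2055)
   big = struct.unpack('>4H', buf)       # (258, 772, 1286, 1800)

   # Three single bytes starting at byte offset 2 (offset=2, count=3).
   window = struct.unpack_from('<3B', buf, 2)    # (3, 4, 5)

   def swap16(value):
       """Exchange the two bytes of a 16-bit unsigned integer."""
       return ((value & 0xFF) << 8) | (value >> 8)

   # Swapping every element turns the little-endian reading into the
   # big-endian one.
   swapped = tuple(swap16(value) for value in little)

If the values read from a peripheral look implausible, comparing them
against both readings is a quick way to spot a byte-order mismatch.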

Inspecting arrays
-----------------

Each array has the usual ``numpy``-style properties::

   a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.uint8)

   a.shape       # (2, 3)
   a.size        # 6  -- number of elements
   a.itemsize    # 1  -- bytes per element
   a.dtype       # dtype('uint8')
   a.strides     # (3, 1) -- bytes to step along each axis
   len(a)        # 2 -- always the length of the FIRST axis
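
For a freshly created, dense array the strides follow from ``shape``
and ``itemsize``. The sketch below shows that relationship, assuming
row-major storage; ``dense_strides`` is a hypothetical helper, not a
``ulab`` function::

   def dense_strides(shape, itemsize):
       """Strides in bytes for a dense, row-major array."""
       strides = []
       step = itemsize
       # Walk the axes from last to first: the last axis moves one
       # element at a time, each earlier axis skips a whole sub-block.
       for length in reversed(shape):
           strides.append(step)
           step *= length
       return tuple(reversed(strides))

   uint8_matrix = dense_strides((2, 3), 1)     # (3, 1), as shown above
   int16_matrix = dense_strides((3, 4), 2)     # (8, 2)

Views produced by slicing or transposing generally have different
strides, which is why the header stores them explicitly.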

You can also reshape an array in place by assigning a new shape tuple
to ``.shape``; ``a.reshape((...))``, covered in the next section,
returns a new array instead::

   a = np.array(range(9))
   a.shape = (3, 3)

Reshaping
---------

``reshape`` returns a new array (a *view*, when possible) with the
requested shape. The total number of elements must be unchanged::

   a = np.arange(12, dtype=np.uint8)
   m = a.reshape((3, 4))
   print(m)
   # array([[0, 1, 2, 3],
   #        [4, 5, 6, 7],
   #        [8, 9, 10, 11]], dtype=uint8)

Other shape-related methods:

* ``a.flatten()`` -- return a 1-D *copy* of the array. With
  ``order='C'`` (the default) the array is walked along the last axis
  first; with ``order='F'`` the first axis is walked first.
* ``a.flat`` -- iterator that walks every element regardless of rank,
  without allocating a flat copy. Useful for memory-tight loops.
* ``a.transpose()`` (or ``a.T``) -- swap axes. Implemented by flipping
  strides; no data is copied.
* ``a.copy()`` -- explicit deep copy of the data.
* ``a.tolist()`` -- convert to nested Python lists.
* ``a.tobytes()`` -- get the raw underlying bytes (raises
  ``ValueError`` on non-dense, i.e. sliced, arrays).

Example -- ``flatten`` versus ``flat``::

   m = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int8)

   for x in m.flat:                # walks 1, 2, 3, 4, 5, 6 (no copy)
       print(x)

   flat = m.flatten()              # 1-D copy, dtype preserved
   flat_f = m.flatten(order='F')   # 1, 4, 2, 5, 3, 6
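
The two traversal orders can be spelled out with nested lists. This
is plain Python, shown only to make the order of the loops explicit::

   rows = [[1, 2, 3], [4, 5, 6]]
   n_rows, n_cols = 2, 3

   # order='C': the last axis (columns) varies fastest.
   c_order = [rows[r][c] for r in range(n_rows) for c in range(n_cols)]

   # order='F': the first axis (rows) varies fastest.
   f_order = [rows[r][c] for c in range(n_cols) for r in range(n_rows)]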

Example -- transpose::

   a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.uint8)
   print(a.T)
   # array([[1, 4, 7],
   #        [2, 5, 8],
   #        [3, 6, 9]], dtype=uint8)

Slicing and indexing
--------------------

Single elements
~~~~~~~~~~~~~~~

Square-bracket indexing works like in standard Python and ``numpy``::

   a = np.arange(10, dtype=np.uint8)
   print(a[0], a[-1])      # 0 9
   print(a[1], a[-2])      # 1 8

   m = np.arange(9, dtype=np.uint8).reshape((3, 3))
   print(m[1, 1])          # 4 (single number, fully indexed)
   print(m[0])             # array([0, 1, 2], dtype=uint8) (a row)

When the number of indices is smaller than the rank, the result is a
*view* of reduced rank, not a scalar.

Slices (and views)
~~~~~~~~~~~~~~~~~~

A slice ``start:stop:step`` returns a *view* of the original array,
not a copy. The view shares the underlying data buffer -- modifying
the view modifies the source::

   a = np.arange(10, dtype=np.uint8)
   v = a[::2]              # array([0, 2, 4, 6, 8], dtype=uint8)
   v[0] = 99
   print(a)                # array([99, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8)

You can confirm a view shares data with the original by comparing the
``data pointer`` line in ``np.ndinfo``::

   np.ndinfo(a)            # data pointer: 0x...
   np.ndinfo(a[::2])       # SAME data pointer

If you want an independent buffer, call ``.copy()``::

   v = a[::2].copy()
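
The difference between a view and a copy can be modelled in a few
lines of plain Python. ``StridedView`` is a hypothetical class written
for this page; it is a sketch of the idea, not how ``ulab`` is
implemented::

   class StridedView:
       """1-D view: a header (offset, step, length) over a shared buffer."""

       def __init__(self, buffer, offset, step, length):
           self.buffer = buffer      # shared with the source, never copied
           self.offset = offset
           self.step = step
           self.length = length

       def __getitem__(self, i):
           return self.buffer[self.offset + i * self.step]

       def __setitem__(self, i, value):
           self.buffer[self.offset + i * self.step] = value

       def copy(self):
           """Collect the viewed elements into an independent buffer."""
           return bytearray(self[i] for i in range(self.length))


   source = bytearray(range(10))
   view = StridedView(source, offset=0, step=2, length=5)   # models a[::2]
   independent = view.copy()

   view[0] = 99            # writes through to `source`
   independent[1] = 77     # leaves `source` alone

The view costs four small fields regardless of how much data it
covers, whereas the copy allocates one byte per element.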

Slicing extends naturally to higher dimensions::

   m = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]], dtype=np.uint8)

   m[0]            # first row
   m[0, :2]        # first two elements of row 0
   m[:, 0]         # column 0 (still 2-D in ulab)
   m[-1]           # last row
   m[-1:-3:-1]     # rows in reverse, last two

Boolean indexing
~~~~~~~~~~~~~~~~

A boolean array of the same shape can be used to select elements
(currently only on 1-D arrays; higher-rank inputs raise
``NotImplementedError``)::

   a = np.arange(9, dtype=np.float)
   mask = a < 5
   print(a[mask])          # array([0.0, 1.0, 2.0, 3.0, 4.0])

The mask itself is just a normal ``ndarray`` of dtype ``bool``, so
you can build it with arbitrary expressions, even involving universal
functions::

   b = np.array([4, 4, 4, 3, 3, 3, 13, 13, 13], dtype=np.uint8)
   a = np.arange(9, dtype=np.uint8)
   print(a[a * a > np.sin(b) * 100.0])

Boolean masks also work on the *left* of an assignment, replacing
elements that satisfy the condition::

   a = np.arange(9, dtype=np.uint8)
   a[a < 3] = 99
   # array([99, 99, 99, 3, 4, 5, 6, 7, 8], dtype=uint8)

The right-hand side may be a scalar, or another array of matching
size::

   a = np.arange(9, dtype=np.uint8)
   b = np.array(range(9)) + 12
   a[b < 15] = b[b < 15]
   # array([12, 13, 14, 3, 4, 5, 6, 7, 8], dtype=uint8)
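
In plain Python terms, a mask pairs every element with a flag. The
sketch below uses lists instead of arrays and only models the
behaviour described above::

   a = list(range(9))
   mask = [x < 5 for x in a]             # one flag per element

   # Selection, like a[mask]: keep the elements whose flag is True.
   selected = [x for x, keep in zip(a, mask) if keep]

   # Masked assignment with a scalar, like a[a < 3] = 99.
   replaced = [99 if x < 3 else x for x in a]

   # Masked assignment from a sequence: the replacement values are
   # consumed in order, one per True flag.
   b = [x + 12 for x in range(9)]
   values = iter([y for y in b if y < 15])
   merged = [next(values) if y < 15 else x for x, y in zip(a, b)]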

Slice assignment
~~~~~~~~~~~~~~~~

You can also assign *into* a slice. The right-hand side may be a
scalar, another array, or a view::

   m = np.zeros((3, 3), dtype=np.uint8)
   m[0]      = 1                                  # whole row
   m[:, 2]   = 3                                  # whole column
   m[1, 1:3] = np.array([7, 8], dtype=np.uint8)   # partial row

Slice assignment is one of the most powerful tools for writing
allocation-free numerical code -- see :doc:`tricks` for examples.

dtypes and upcasting
--------------------

``ulab`` supports a smaller set of dtypes than CPython ``numpy``:
``uint8``, ``int8``, ``uint16``, ``int16``, ``float``, ``bool`` and
optionally ``complex``. The default is ``float``.

Two arrays with different dtypes can be operands of the same
operator. The result type follows ``ulab``'s upcasting rules:

==============  ==============  ===============
left            right           result
==============  ==============  ===============
``uint8``       ``int8``        ``int16``
``uint8``       ``int16``       ``int16``
``uint8``       ``uint16``      ``uint16``
``int8``        ``int16``       ``int16``
``int8``        ``uint16``      ``uint16``
``uint16``      ``int16``       ``float``
any             ``float``       ``float``
any             ``complex``     ``complex``
==============  ==============  ===============

The ``int8``/``uint16`` and ``uint16``/``int16`` rows differ from
CPython ``numpy``, where both would produce ``int32``. ``ulab`` does
not have ``int32``, so it either picks the widest available integer
or upcasts to ``float``.
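
The table can be transcribed into a small lookup for quick checks.
This is a sketch: ``result_dtype`` is a hypothetical helper, dtypes
are plain strings, and it assumes that the order of the operands does
not matter and that two operands of the same dtype keep that dtype::

   # Integer pairs copied from the table; frozenset makes the lookup
   # independent of operand order (an assumption of this sketch).
   _INTEGER_RULES = {
       frozenset(('uint8', 'int8')): 'int16',
       frozenset(('uint8', 'int16')): 'int16',
       frozenset(('uint8', 'uint16')): 'uint16',
       frozenset(('int8', 'int16')): 'int16',
       frozenset(('int8', 'uint16')): 'uint16',
       frozenset(('uint16', 'int16')): 'float',
   }

   def result_dtype(left, right):
       """Result dtype name for a binary operator, per the table."""
       if 'complex' in (left, right):
           return 'complex'
       if 'float' in (left, right):
           return 'float'
       if left == right:
           return left
       return _INTEGER_RULES[frozenset((left, right))]

   mixed_sign = result_dtype('uint8', 'int8')        # 'int16'
   widest_pair = result_dtype('int16', 'uint16')     # 'float'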

When the result of an operation on two integer arrays of the same
dtype does not fit that dtype, it *wraps* (``ulab`` does not promote
to a wider integer)::

   a = np.array([200, 200], dtype=np.uint8)
   b = np.array([100, 100], dtype=np.uint8)
   print(a + b)             # array([44, 44], dtype=uint8) -- wraps!

If you need a wider intermediate, cast first::

   c = np.array(a, dtype=np.uint16) + b
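
Wrapping is ordinary modular arithmetic. The sketch below reproduces
the numbers above in plain Python; ``wrap_unsigned`` is a hypothetical
helper::

   def wrap_unsigned(value, bits):
       """Reduce an integer to what an unsigned `bits`-wide element holds."""
       return value % (1 << bits)

   as_uint8 = wrap_unsigned(200 + 100, 8)      # 300 - 256 = 44
   as_uint16 = wrap_unsigned(200 + 100, 16)    # 300 fits, so it stays 300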

When a binary operator has a Python scalar on the other side, the
scalar is converted to a single-element array of the *smallest*
suitable dtype. ``123`` becomes a ``uint8`` array; ``-1000`` becomes
``int16``; a Python ``float`` becomes a ``float`` array.

Choose the dtype that matches the hardware that produced the data.
For an 8-bit sensor, ``np.uint8`` needs 4 to 8 times less RAM than
the ``float`` default, depending on whether the port uses 4-byte or
8-byte floats.

Iterating
---------

``ndarray`` instances are iterable. Iterating a 1-D array yields
scalars; iterating an n-D array yields ``(n-1)``-D *views*::

   a = np.array([1, 2, 3, 4, 5], dtype=np.uint8)
   for x in a:
       print(x)        # 1, 2, 3, 4, 5

   m = np.array([[0, 1, 2], [3, 4, 5]], dtype=np.uint8)
   for row in m:
       print(row)      # array([0, 1, 2]) then array([3, 4, 5])

Because the rows yielded by iterating a matrix are *views*, modifying
them modifies the source matrix.

To walk every element of an n-D array as scalars without flattening
it, use ``a.flat``.

Comparison operators
--------------------

The relational operators (``<``, ``>``, ``<=``, ``>=``, ``==``,
``!=``) are vectorised and return a ``bool`` array::

   a = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=np.uint8)
   print(a < 5)
   # array([True, True, True, True, False, False, False, False], dtype=bool)

When the scalar has to come first, use the function form::

   np.greater(5, a)        # 5 > a, element-wise

.. warning::

   The ``ndarray`` *must* be on the left of a relational operator
   when comparing to a scalar. ``a > 2`` works; ``2 < a`` raises
   ``TypeError``. Use ``np.greater``/``np.less``/``np.equal`` if you
   need the symmetric form (also recommended for CircuitPython, where
   the ``==``/``!=`` operators are not overloaded).

Pretty-printing
---------------

By default, arrays longer than 10 elements along the last axis are
abbreviated with ``...``. You can change this globally::

   np.set_printoptions(threshold=200)             # print up to 200 elements in full
   np.set_printoptions(threshold=10, edgeitems=2) # 2 items each side of the ellipsis
   np.get_printoptions()                          # {'threshold': 10, 'edgeitems': 2}
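
How the two options interact can be sketched in plain Python.
``abbreviate`` is a hypothetical helper; its defaults and its output
format are assumptions made for illustration and are not guaranteed
to match what ``ulab`` prints::

   def abbreviate(values, threshold=10, edgeitems=3):
       """Join values, eliding the middle if there are more than `threshold`."""
       items = [str(value) for value in values]
       if len(items) > threshold:
           head = items[:edgeitems]
           tail = items[len(items) - edgeitems:]
           items = head + ['...'] + tail
       return '[' + ', '.join(items) + ']'

   short = abbreviate(range(5))                  # below the threshold
   long = abbreviate(range(20), edgeitems=2)     # middle elided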

Views vs. copies, in summary
----------------------------

Views are *cheap*: a view is just a new header that points to the
same data buffer as the source. Copies are *expensive*: they
allocate a new buffer and walk through the source.

The following operations return *views*:

* slicing (``a[1:5]``, ``a[::2]``, ``m[:, 0]``);
* single-axis indexing of a higher-dimensional array (``m[0]``);
* iterating an n-D array;
* ``a.reshape(...)`` (when possible);
* ``a.transpose()`` / ``a.T``;
* ``np.frombuffer(buf, ...)``.

The following operations return *copies*:

* ``a.copy()``;
* ``a.flatten()``;
* boolean indexing (``a[mask]``);
* arithmetic (``a + b``, ``a * 2``, ``np.sin(a)``);
* ``np.array(a)`` (always copies).

Reach for ``.copy()`` only when you genuinely need an independent
buffer -- the camera has limited RAM and avoiding a copy is often
the difference between fitting and not fitting.

Where to go next
----------------

* :doc:`universal` -- element-wise math and ``np.vectorize``.
* :doc:`images` -- bridge between ``image.Image`` and ``ndarray``.
* :doc:`tricks` -- broadcasting, performance tips, advanced
  indexing.
* :doc:`programming` -- broadcasting internals and memory-efficient
  patterns.
* :doc:`/library/omv.ulab.numpy` -- the complete reference for the
  ``ndarray`` class (methods, properties, operators) and all numpy
  module-level functions.