8.3.2. Working with ndarrays

The ndarray is the type that holds numerical data in ulab. Think of it as a typed, contiguous, n-dimensional buffer with a small header that describes its shape, strides and element type. This page walks through the most important things you can do with one.

8.3.2.1. Anatomy of an ndarray

Internally, an ndarray is a small header followed by a pointer to a contiguous block of data. The header records:

  • dtype – the element type;

  • itemsize – the number of bytes each element occupies;

  • ndim – the number of dimensions (rank);

  • shape – a tuple giving the length along each axis;

  • strides – a tuple giving the number of bytes to step along each axis;

  • len (the number of elements) and a pointer to the data buffer.

This is the same memory layout as CPython numpy. Two important practical consequences:

  • Operations like reshape, transpose and slicing only manipulate the header (a handful of bytes), not the data buffer. They are essentially free.

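  • The data is binary and packed: a uint8 ndarray stores one byte per element, whereas a Python list stores one machine-word object reference per element (typically 4 bytes on a 32-bit microcontroller), so the ndarray is several times smaller than the equivalent list of Python int.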
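To make the role of shape, strides and itemsize concrete, here is a minimal plain-Python sketch of how a dense, row-major array can locate an element in its buffer. It does not call ulab, and the helper names c_strides and byte_offset are made up for this illustration:

```python
def c_strides(shape, itemsize):
    """Byte strides of a dense, row-major array with the given shape."""
    strides = []
    step = itemsize                  # the last axis steps one element at a time
    for length in reversed(shape):
        strides.append(step)
        step *= length               # each earlier axis skips a whole sub-block
    return tuple(reversed(strides))


def byte_offset(index, strides):
    """Distance in bytes from the start of the buffer to element `index`."""
    return sum(i * s for i, s in zip(index, strides))


strides = c_strides((2, 3), itemsize=1)    # a 2x3 array of 1-byte elements
print(strides)                             # (3, 1)
print(byte_offset((1, 2), strides))        # 5, the last of the 6 bytes
```

Under this model, a reshape or a transpose only needs to write new shape and strides values; the buffer is left alone.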

For a quick textual dump including the data pointer, call np.ndinfo(a):

from ulab import numpy as np

a = np.array(range(5), dtype=np.float)
np.ndinfo(a)
# class: ndarray
# shape: (5,)
# strides: (8,)
# itemsize: 8
# data pointer: 0x...
# type: float

The data-pointer line is especially useful when you want to confirm that two arrays share data (a view, see below).

8.3.2.2. Creating arrays

8.3.2.2.1. From iterables

Pass a list (or any iterable) to np.array:

from ulab import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])             # 1-D, dtype=float
b = np.array([1, 2, 3, 4], dtype=np.uint8)   # 1-D, dtype=uint8

m = np.array([[1, 2, 3],
              [4, 5, 6]], dtype=np.int16)    # 2-D, shape (2, 3)

If the inner iterables have different lengths, ValueError is raised. Mixed iterable types (range + list, etc.) are fine.

8.3.2.2.2. From other arrays

Passing one ndarray to np.array makes a copy. If the source and destination dtypes match, the copy is fast (a straight memcpy). If they differ, the elements are converted:

a = np.array(range(5), dtype=np.uint8)
b = np.array(a)                     # converted to float (default)
c = np.array(a, dtype=np.uint8)     # raw copy, stays uint8

Note that the default dtype of np.array is always float, so re-wrapping an integer array without specifying dtype= always incurs an iteration and conversion. When performance matters, pass dtype= explicitly.

8.3.2.2.3. Helper constructors

These mirror their CPython numpy equivalents:

np.zeros((4, 4))
np.ones(8, dtype=np.uint8)
np.full((2, 3), 7, dtype=np.int16)
np.eye(4)
np.eye(4, M=6, k=-1, dtype=np.int16)
np.arange(0, 10, 0.5)
np.linspace(0, 1, num=11)
np.logspace(0, 3, num=4)            # 1, 10, 100, 1000
np.diag([1, 2, 3])
np.empty((3, 3))                    # alias for zeros, so the entries are zero

np.frombuffer is especially useful on the camera. It wraps an existing bytes-like buffer as a typed array without copying, and accepts offset= and count=:

buf = b'\x01\x02\x03\x04\x05\x06\x07\x08'
v = np.frombuffer(buf, dtype=np.uint16)
# array([513, 1027, 1541, 2055], dtype=uint16) on a little-endian MCU

v2 = np.frombuffer(buf, dtype=np.uint8, offset=2, count=3)
# array([3, 4, 5], dtype=uint8)

If the peripheral that produced the buffer uses a different endianness from the microcontroller, use a.byteswap() to flip each multi-byte element. a.byteswap() returns a new array; a.byteswap(inplace=True) modifies in place:

a = np.frombuffer(buf, dtype=np.uint16)
b = a.byteswap()
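If you want to check on the desktop what a byte swap does to each 16-bit element, the standard struct module can decode the same buffer in both byte orders. This is only a sketch for verification; swap16 is a hypothetical helper, not a ulab function:

```python
import struct

buf = b'\x01\x02\x03\x04\x05\x06\x07\x08'

little = struct.unpack('<4H', buf)   # read as four little-endian uint16
big = struct.unpack('>4H', buf)      # read the same bytes as big-endian


def swap16(value):
    """Exchange the two bytes of a 16-bit unsigned integer."""
    return ((value & 0xFF) << 8) | (value >> 8)


print(little)                             # (513, 1027, 1541, 2055)
print(big)                                # (258, 772, 1286, 1800)
print(tuple(swap16(v) for v in little))   # (258, 772, 1286, 1800)
```

Swapping the bytes of every little-endian value gives exactly the big-endian reading, which is the transformation you need when the peripheral and the microcontroller disagree on byte order.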

8.3.2.3. Inspecting arrays

Each array has the usual numpy-style properties:

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.uint8)

a.shape       # (2, 3)
a.size        # 6  -- number of elements
a.itemsize    # 1  -- bytes per element
a.dtype       # dtype('uint8')
a.strides     # (3, 1) -- bytes to step along each axis
len(a)        # 2 -- always the length of the FIRST axis

You can reshape an array in place by assigning a new shape tuple to .shape. This is the in-place counterpart of a.reshape((...)), which returns a new array:

a = np.array(range(9))
a.shape = (3, 3)

8.3.2.4. Reshaping

reshape returns a new array (a view, when possible) with the requested shape. The total number of elements must be unchanged:

a = np.arange(12, dtype=np.uint8)
m = a.reshape((3, 4))
print(m)
# array([[0, 1, 2, 3],
#        [4, 5, 6, 7],
#        [8, 9, 10, 11]], dtype=uint8)

Other shape-related methods:

  • a.flatten() – return a 1-D copy of the array. With order='C' (the default) the array is walked along the last axis first; with order='F' the first axis is walked first.

  • a.flat – iterator that walks every element regardless of rank, without allocating a flat copy. Useful for memory-tight loops.

  • a.transpose() (or a.T) – swap axes. Implemented by flipping strides; no data is copied.

  • a.copy() – explicit deep copy of the data.

  • a.tolist() – convert to nested Python lists.

  • a.tobytes() – get the raw underlying bytes (raises ValueError on non-dense, i.e. sliced, arrays).

Example – flatten versus flat:

m = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int8)

for x in m.flat:                # walks 1, 2, 3, 4, 5, 6 (no copy)
    print(x)

flat = m.flatten()              # 1-D copy, dtype preserved
flat_f = m.flatten(order='F')   # 1, 4, 2, 5, 3, 6
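The two traversal orders can be modelled with nested lists in plain Python. The functions flatten_c and flatten_f below are illustrative stand-ins, not part of ulab:

```python
def flatten_c(rows):
    """Row-major order: finish each row before moving to the next."""
    return [x for row in rows for x in row]


def flatten_f(rows):
    """Column-major order: finish each column before moving to the next."""
    return [row[col] for col in range(len(rows[0])) for row in rows]


m = [[1, 2, 3], [4, 5, 6]]
print(flatten_c(m))   # [1, 2, 3, 4, 5, 6]
print(flatten_f(m))   # [1, 4, 2, 5, 3, 6]
```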

Example – transpose:

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.uint8)
print(a.T)
# array([[1, 4, 7],
#        [2, 5, 8],
#        [3, 6, 9]], dtype=uint8)
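The idea that a transpose only flips strides can be sketched in plain Python. The helper read_2d is hypothetical and assumes 1-byte elements; it reads the same nine bytes twice, once with row-major strides and once with the strides swapped:

```python
def read_2d(buf, shape, strides):
    """Read a 2-D array of 1-byte elements out of `buf` using byte strides."""
    rows, cols = shape
    return [[buf[r * strides[0] + c * strides[1]] for c in range(cols)]
            for r in range(rows)]


buf = bytes([1, 2, 3, 4, 5, 6, 7, 8, 9])   # the data of the 3x3 uint8 array

original = read_2d(buf, (3, 3), (3, 1))    # dense row-major strides
transposed = read_2d(buf, (3, 3), (1, 3))  # same buffer, strides swapped

print(original)     # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(transposed)   # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```

The buffer is never rewritten; only the two stride values change places.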

8.3.2.5. Slicing and indexing

8.3.2.5.1. Single elements

Square-bracket indexing works like in standard Python and numpy:

a = np.arange(10, dtype=np.uint8)
print(a[0], a[-1])      # 0 9
print(a[1], a[-2])      # 1 8

m = np.arange(9, dtype=np.uint8).reshape((3, 3))
print(m[1, 1])          # 4 (single number, fully indexed)
print(m[0])             # array([0, 1, 2], dtype=uint8) (a row)

When the number of indices is smaller than the rank, the result is a view of reduced rank, not a scalar.

8.3.2.5.2. Slices (and views)

A slice start:stop:step returns a view of the original array, not a copy. The view shares the underlying data buffer – modifying the view modifies the source:

a = np.arange(10, dtype=np.uint8)
v = a[::2]              # array([0, 2, 4, 6, 8], dtype=uint8)
v[0] = 99
print(a)                # array([99, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8)

You can confirm a view shares data with the original by comparing the data pointer line in np.ndinfo:

np.ndinfo(a)            # data pointer: 0x...
np.ndinfo(a[::2])       # SAME data pointer

If you want an independent buffer, call .copy():

v = a[::2].copy()
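The same view-versus-copy behaviour can be tried on the desktop with a bytearray and a strided memoryview. This is an analogy only, and it relies on CPython's memoryview, which accepts stepped slices:

```python
data = bytearray(range(10))          # plays the role of the array's buffer
view = memoryview(data)[::2]         # strided view, no bytes are copied
view[0] = 99                         # write through the view

independent = bytearray(view.tobytes())   # explicit copy, like .copy()
independent[0] = 0                        # does not touch `data`

print(list(data))          # [99, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list(view))          # [99, 2, 4, 6, 8]
print(list(independent))   # [0, 2, 4, 6, 8]
```

Writing through the view changes the source buffer, while the copy can be modified freely.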

Slicing extends naturally to higher dimensions:

m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]], dtype=np.uint8)

m[0]            # first row
m[0, :2]        # first two elements of row 0
m[:, 0]         # column 0 (still 2-D in ulab)
m[-1]           # last row
m[-1:-3:-1]     # rows in reverse, last two

8.3.2.5.3. Boolean indexing

A boolean array of the same shape can be used to select elements (currently only on 1-D arrays; higher-rank inputs raise NotImplementedError):

a = np.arange(9, dtype=np.float)
mask = a < 5
print(a[mask])          # array([0.0, 1.0, 2.0, 3.0, 4.0])

The mask itself is just a normal ndarray of dtype bool, so you can build it with arbitrary expressions, even involving universal functions:

b = np.array([4, 4, 4, 3, 3, 3, 13, 13, 13], dtype=np.uint8)
a = np.arange(9, dtype=np.uint8)
print(a[a * a > np.sin(b) * 100.0])

Boolean masks also work on the left of an assignment, replacing elements that satisfy the condition:

a = np.arange(9, dtype=np.uint8)
a[a < 3] = 99
# array([99, 99, 99, 3, 4, 5, 6, 7, 8], dtype=uint8)

The right hand side may be a scalar, or another array of matching size:

a = np.arange(9, dtype=np.uint8)
b = np.array(range(9)) + 12
a[b < 15] = b[b < 15]
# array([12, 13, 14, 3, 4, 5, 6, 7, 8], dtype=uint8)
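The semantics of masked assignment can be modelled with plain Python lists. The function masked_assign is a hypothetical helper, and unlike the in-place ulab assignment it returns a new list:

```python
def masked_assign(values, mask, replacement):
    """Return a copy of `values` with the masked positions replaced."""
    if isinstance(replacement, (int, float)):
        source = None                       # scalar: reuse it everywhere
    else:
        source = iter(replacement)          # sequence: consume it in order
    result = []
    for value, selected in zip(values, mask):
        if not selected:
            result.append(value)
        elif source is None:
            result.append(replacement)
        else:
            result.append(next(source))
    return result


a = list(range(9))
b = [x + 12 for x in range(9)]
mask = [x < 15 for x in b]                         # True for the first three
picked = [x for x, keep in zip(b, mask) if keep]   # models b[b < 15]

print(picked)                                      # [12, 13, 14]
print(masked_assign(a, mask, picked))              # [12, 13, 14, 3, 4, 5, 6, 7, 8]
print(masked_assign(a, [x < 3 for x in a], 99))    # [99, 99, 99, 3, 4, 5, 6, 7, 8]
```

An array on the right-hand side must supply exactly one value per True entry of the mask, which is why the example above uses the same mask on both sides.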

8.3.2.5.4. Slice assignment

You can also assign into a slice. The right-hand side may be a scalar, another array, or a view:

m = np.zeros((3, 3), dtype=np.uint8)
m[0]      = 1            # assign to whole row
m[:, 2]   = 3            # assign to whole column
m[1, 1:3] = np.array([7, 8], dtype=np.uint8)   # assign an array to a partial row

Slice assignment is one of the most powerful tools for writing allocation-free numerical code – see Tips, tricks and broadcasting for examples.

8.3.2.6. dtypes and upcasting

ulab supports a smaller set of dtypes than CPython numpy: uint8, int8, uint16, int16, float, bool and optionally complex. The default is float.

Two arrays with different dtypes can be operands of the same operator. The result type follows ulab’s upcasting rules:

left      right     result
uint8     int8      int16
uint8     int16     int16
uint8     uint16    uint16
int8      int16     int16
int8      uint16    uint16
uint16    int16     float
any       float     float
any       complex   complex

The int8 + uint16 and uint16 + int16 rules differ from CPython numpy, where both would produce int32. ulab does not have int32, so it either picks the widest available integer (uint16) or upcasts to float.
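The rules listed above can be written down as a small lookup, which is handy when you want to predict a result dtype before running code on the device. This is a plain-Python sketch, not ulab's implementation. It assumes that the rules do not depend on operand order and that two operands of the same dtype keep that dtype, and it does not cover bool:

```python
# Mixed-integer rules copied from the table; each pair is stored unordered.
INTEGER_RULES = {
    frozenset(('uint8', 'int8')): 'int16',
    frozenset(('uint8', 'int16')): 'int16',
    frozenset(('uint8', 'uint16')): 'uint16',
    frozenset(('int8', 'int16')): 'int16',
    frozenset(('int8', 'uint16')): 'uint16',
    frozenset(('uint16', 'int16')): 'float',
}


def result_dtype(left, right):
    """Name of the dtype produced by combining `left` and `right`."""
    if 'complex' in (left, right):
        return 'complex'
    if 'float' in (left, right):
        return 'float'
    if left == right:
        return left          # assumption: equal dtypes are kept as they are
    return INTEGER_RULES[frozenset((left, right))]   # assumption: symmetric


print(result_dtype('uint8', 'int8'))     # int16
print(result_dtype('int16', 'uint16'))   # float
print(result_dtype('int8', 'float'))     # float
```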

When the result of an operation on integer arrays does not fit into the result dtype, it wraps around (ulab does not promote to a wider integer):

a = np.array([200, 200], dtype=np.uint8)
b = np.array([100, 100], dtype=np.uint8)
print(a + b)             # array([44, 44], dtype=uint8) -- wraps!

If you need a wider intermediate, cast first:

c = np.array(a, dtype=np.uint16) + b
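The wrap-around is ordinary modular arithmetic, which you can reproduce in plain Python by masking the sum to the width of the element. The helper names below are made up for this sketch:

```python
def add_uint8(x, y):
    """Add two values the way an 8-bit unsigned element would store the sum."""
    return (x + y) & 0xFF        # keep only the low 8 bits


def add_uint16(x, y):
    """Same idea for a 16-bit unsigned element."""
    return (x + y) & 0xFFFF      # keep only the low 16 bits


print(add_uint8(200, 100))    # 44, because 300 - 256 = 44
print(add_uint16(200, 100))   # 300, which fits in 16 bits
```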

When a binary operator has a Python scalar on the other side, the scalar is converted to a single-element array of the smallest suitable dtype. 123 becomes a uint8 array; -1000 becomes int16; a Python float becomes a float array.
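A rough plain-Python model of that scalar conversion is sketched below. Only the three examples in the text are taken from the documentation; the exact range boundaries and the fallback to float for integers that fit no 16-bit type are assumptions, and scalar_dtype is a hypothetical name:

```python
def scalar_dtype(value):
    """Guess the dtype name a Python scalar would be wrapped in."""
    if isinstance(value, float):
        return 'float'
    if 0 <= value <= 255:
        return 'uint8'
    if -128 <= value < 0:
        return 'int8'
    if 0 <= value <= 65535:
        return 'uint16'
    if -32768 <= value < 0:
        return 'int16'
    return 'float'      # assumption: no integer dtype is wide enough


print(scalar_dtype(123))      # uint8
print(scalar_dtype(-1000))    # int16
print(scalar_dtype(1.5))      # float
```

Combined with the upcasting rules, this explains why adding a small positive integer to a uint8 array keeps the array uint8.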

Choose the dtype that matches the hardware that produced the data. For an 8-bit sensor, np.uint8 saves 4-8x RAM compared to the float default.

8.3.2.7. Iterating

ndarray instances are iterable. Iterating a 1-D array yields scalars; iterating an n-D array yields (n-1)-D views:

a = np.array([1, 2, 3, 4, 5], dtype=np.uint8)
for x in a:
    print(x)        # 1, 2, 3, 4, 5

m = np.array([[0, 1, 2], [3, 4, 5]], dtype=np.uint8)
for row in m:
    print(row)      # array([0, 1, 2]) then array([3, 4, 5])

Because the rows yielded by iterating a matrix are views, modifying them modifies the source matrix.

To walk every element of an n-D array as scalars without flattening it, use a.flat.

8.3.2.8. Comparison operators

The relational operators (<, >, <=, >=, ==, !=) are vectorised and return a bool array:

a = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=np.uint8)
print(a < 5)
# array([True, True, True, True, False, False, False, False], dtype=bool)

When the scalar has to appear on the left-hand side, use the function form instead:

np.greater(5, a)        # 5 > a, element-wise

Warning

The ndarray must be on the left of a relational operator when comparing to a scalar. a > 2 works; 2 < a raises TypeError. Use np.greater/np.less/np.equal if you need the symmetric form (also recommended for CircuitPython, where the ==/!= operators are not overloaded).

8.3.2.9. Pretty-printing

By default, arrays longer than 10 elements along the last axis are abbreviated with an ellipsis (...). You can change this globally:

np.set_printoptions(threshold=200)             # print up to 200 elements in full
np.set_printoptions(threshold=10, edgeitems=2) # 2 items each side of the ellipsis
np.get_printoptions()                          # {'threshold': 10, 'edgeitems': 2}
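How threshold and edgeitems interact can be modelled in a few lines of plain Python. The function abbreviate is a hypothetical helper, and its output format is only an approximation of what ulab prints:

```python
def abbreviate(items, threshold, edgeitems):
    """Format a 1-D sequence, eliding the middle when it is too long."""
    if len(items) <= threshold:
        shown = [str(x) for x in items]
    else:
        head = [str(x) for x in items[:edgeitems]]
        tail = [str(x) for x in items[len(items) - edgeitems:]]
        shown = head + ['...'] + tail
    return '[' + ', '.join(shown) + ']'


print(abbreviate(list(range(20)), threshold=10, edgeitems=2))
# [0, 1, ..., 18, 19]
print(abbreviate(list(range(5)), threshold=10, edgeitems=2))
# [0, 1, 2, 3, 4]
```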

8.3.2.10. Views vs. copies, in summary

Views are cheap: a view is just a new header that points to the same data buffer as the source. Copies are expensive: they allocate a new buffer and walk through the source.

The following operations return views:

  • slicing (a[1:5], a[::2], m[:, 0]);

  • single-axis indexing of a higher-dimensional array (m[0]);

  • iterating an n-D array;

  • a.reshape(...) (when possible);

  • a.transpose() / a.T;

  • np.frombuffer(buf, ...).

The following operations return copies:

  • a.copy();

  • a.flatten();

  • boolean indexing (a[mask]);

  • arithmetic (a + b, a * 2, np.sin(a));

  • np.array(a) (always copies).

Reach for .copy() only when you genuinely need an independent buffer – the camera has limited RAM and avoiding a copy is often the difference between fitting and not fitting.

8.3.2.11. Where to go next