9.5. Shape and strides
The data inside an ndarray is one
packed block of numbers. The descriptor in front of that
block decides how that flat block is read out as a
tensor. Understanding what the descriptor records is
what makes the indexing, slicing, and view machinery on
the following pages feel inevitable rather than
arbitrary.
9.5.1. What the descriptor records
Five values describe how to read the data block as a tensor:
a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.uint8)
a.ndim      # 2      - number of dimensions
a.shape     # (2, 3) - length along each dimension
a.itemsize  # 1      - bytes per element (from dtype)
a.size      # 6      - total number of elements
a.strides   # (3, 1) - step pattern through the buffer
The ndinfo() helper prints the shape, strides, and
item size in one call, along with the dtype and the
location of the underlying buffer; ndim and size follow
from the shape. Two arrays whose buffer locations match
are sharing memory:
np.ndinfo(a)
# class: ndarray
# shape: (2, 3)
# strides: (3, 1)
# itemsize: 1
# data pointer: 0x...
# type: uint8
9.5.2. Print truncation
Printing a large array shows only its first and last
few elements, with ... in the middle, so the IDE
terminal does not fill with thousands of values:
>>> print(np.arange(1000, dtype=np.uint16))
array([0, 1, 2, ..., 997, 998, 999], dtype=uint16)
set_printoptions() overrides the
thresholds when debugging needs the whole buffer:
np.set_printoptions(threshold=2000) # print up to 2000 elements in full
np.set_printoptions(edgeitems=10) # 10 items at each end, not 3
get_printoptions() reads the current
settings back as a dict.
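How the two options interact can be modelled in a few lines of plain Python. This is only a sketch of the behaviour described above, not the library's formatting code: summarize is a made-up name, the threshold of 10 is an assumed value, and whether the real comparison is "greater than" or "at least" is also an assumption.

```python
def summarize(values, threshold, edgeitems):
    """Format a 1-D sequence, eliding the middle when it exceeds `threshold` items."""
    if len(values) <= threshold:
        shown = [str(v) for v in values]
    else:
        # Keep `edgeitems` values from each end and elide everything between.
        head = [str(v) for v in values[:edgeitems]]
        tail = [str(v) for v in values[-edgeitems:]]
        shown = head + ["..."] + tail
    return "[" + ", ".join(shown) + "]"


data = list(range(1000))
print(summarize(data, threshold=10, edgeitems=3))
# [0, 1, 2, ..., 997, 998, 999]
print(summarize(data[:5], threshold=10, edgeitems=3))
# [0, 1, 2, 3, 4]
```

Raising threshold above the element count switches the output to the full listing; raising edgeitems widens the two ends that survive truncation.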
9.5.3. Strides explained
A stride is how many bytes to step in the data block
to move one element along a given axis. For the 2x3
uint8 array above, the strides are (3, 1):
moving down by one row jumps 3 bytes, moving right by
one column jumps 1 byte. That is the same as saying the
rows are stored back to back, left to right:
memory: [ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ]
        ^ row 0        ^ row 1
        <-- 3 bytes -->
To read a[i, j], numpy computes the byte offset
i * strides[0] + j * strides[1] from the start of
the data block and reads itemsize bytes from there.
The same formula extends to any number of dimensions.
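The offset formula can be checked by hand against a plain bytes object. The sketch below is ordinary Python, not library code: element_offset and read_u8 are illustrative names, and the bytes object stands in for the array's data block.

```python
# Illustrative model only: a bytes object stands in for the array's data block.
buf = bytes([1, 2, 3, 4, 5, 6])   # the 2x3 uint8 array from above
strides = (3, 1)                  # bytes to step per axis


def element_offset(index, strides):
    """Byte offset of one element: the sum of index[k] * strides[k]."""
    return sum(i * s for i, s in zip(index, strides))


def read_u8(buf, index, strides):
    """Read the uint8 element at `index` (itemsize is 1 byte)."""
    return buf[element_offset(index, strides)]


print(element_offset((1, 2), strides))  # 1*3 + 2*1 = 5
print(read_u8(buf, (1, 2), strides))    # 6, the value of a[1, 2]
```

Because element_offset sums over however many indices it is given, the same function covers 1-D through 4-D descriptors.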
This layout – rows stored end to end, with the last
axis varying fastest as you walk through memory – is
called row-major or C order. Every array
numpy allocates on the camera uses this layout;
the Fortran-order option that desktop numpy offers
appears only as the order='F' keyword on
flatten().
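For a densely packed row-major array the strides follow directly from the shape and the item size. The helper below is a minimal sketch in plain Python; c_order_strides is an illustrative name, not a library function.

```python
def c_order_strides(shape, itemsize):
    """Strides, in bytes, of a densely packed row-major array."""
    strides = []
    step = itemsize              # the last axis steps one element at a time
    for length in reversed(shape):
        strides.append(step)
        step *= length           # each earlier axis skips a whole block
    return tuple(reversed(strides))


print(c_order_strides((2, 3), 1))     # (3, 1)  - the uint8 array above
print(c_order_strides((2, 3), 2))     # (6, 2)  - same shape, 2-byte elements
print(c_order_strides((2, 3, 4), 4))  # (48, 16, 4)
```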
9.5.4. Row-major has consequences
Two things fall out of “rows stored back to back” that matter when shaping a buffer on the camera.
The last axis is contiguous. Walking a[0, 0] to
a[0, 1] touches the next byte over. Walking
a[0, 0] to a[1, 0] jumps across a whole row.
The last axis is the fast axis for whole-array math.
numpy on the camera always walks the last axis
innermost, regardless of which axis happens to be
longer. The desktop numpy library can reorder its
loops to follow the memory layout; the camera does
not, so a layout choice that desktop numpy would
have papered over still costs time here.
np.sum(m, axis=1) collapses the last axis and runs
in the contiguous direction; np.sum(m, axis=0) does
not. When you have a choice about how to lay out a
buffer, put the long axis last so operations along it
stay in the inner loop.
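The difference between the two directions shows up in the byte offsets each walk touches. The sketch below models that for a 2-D descriptor in plain Python; offsets_along is an illustrative name, and the code describes the access pattern only, not how the firmware implements sum.

```python
def offsets_along(axis, fixed, shape, strides):
    """Byte offsets visited when walking `axis` with the other index held at `fixed`."""
    other = 1 - axis                       # 2-D descriptors only
    base = fixed * strides[other]
    return [base + k * strides[axis] for k in range(shape[axis])]


shape, strides = (2, 3), (3, 1)            # the uint8 array from above

# Collapsing axis 1 walks along a row: consecutive bytes.
print(offsets_along(1, 0, shape, strides))  # [0, 1, 2]

# Collapsing axis 0 walks down a column: a whole-row jump at every step.
print(offsets_along(0, 0, shape, strides))  # [0, 3]
```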
If the layout starts out wrong,
transpose() (or the .T
shortcut) fixes it without copying the data – it just
swaps the shape and strides:
a = b.T # now iterates fast
The Performance page has the full discussion.
9.5.5. Reshape, transpose, slicing – descriptor edits
Any operation that only rewrites the descriptor is free.
reshape lays a new shape and strides over the same
data block. transpose reverses the shape and strides.
a[::2] doubles a stride and shortens that axis to
match. Each returns a view of the same underlying
buffer.
Anything that has to walk the data and write a new buffer is a copy. Views and copies covers which operations fall on which side; the rule for now is that “shape math” is free and “data math” is not.
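To see that these operations change only the descriptor, the sketch below reads one fixed bytes object through several (shape, strides) pairs. It is a plain-Python model, not library code: read_2d is an illustrative name, and the buffer is never modified or copied.

```python
def read_2d(buf, shape, strides):
    """Read a uint8 buffer as nested lists using a (shape, strides) descriptor."""
    return [[buf[i * strides[0] + j * strides[1]] for j in range(shape[1])]
            for i in range(shape[0])]


buf = bytes(range(1, 13))                  # 12 bytes holding 1..12
shape, strides = (4, 3), (3, 1)            # 4 rows of 3, row-major

print(read_2d(buf, shape, strides))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

# Transpose: reverse the shape and the strides.
print(read_2d(buf, shape[::-1], strides[::-1]))
# [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]

# a[::2]: half as many rows, first stride doubled.
print(read_2d(buf, (2, 3), (6, 1)))
# [[1, 2, 3], [7, 8, 9]]

# Reshape to 2x6: a new shape and strides over the same packed bytes.
print(read_2d(buf, (2, 6), (6, 1)))
# [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
```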
9.5.6. A note about ndim
numpy on the camera is built with a maximum
supported ndim of either 2 or 4, fixed in the
firmware. The suffix on ulab.__version__
(-2D, -4D) reports which. Operations that
would produce a higher-rank array raise
ValueError. The vast majority of camera-side
work is 1-D or 2-D, so the limit is rarely an issue.
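A script that must run on both builds can read the limit from the version string. The helper below is a sketch under assumptions: max_ndim_from_version is a made-up name, the sample strings are placeholders, and the only thing taken from the text above is that the string carries a -2D or -4D marker.

```python
def max_ndim_from_version(version):
    """Return 2 or 4 based on the -2D / -4D marker, or None if neither is present."""
    for marker, ndim in (("-2D", 2), ("-4D", 4)):
        if marker in version:
            return ndim
    return None


# Hypothetical version strings, for illustration only.
print(max_ndim_from_version("x.y.z-2D"))  # 2
print(max_ndim_from_version("x.y.z-4D"))  # 4
```

On the camera the argument would be ulab.__version__; a result of None means the string did not carry either marker and the limit should be checked some other way.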