9.7. Views and copies

A view is a second window onto the same data block as the source. No data is copied; the view holds a fresh descriptor (its own shape, strides, and dtype) but shares the buffer. Views are essentially free.

A copy asks the cam for a new buffer and walks the source filling it in. Copies cost both time and RAM.

Most of the shape-shifting methods produce views. Most of the data-transforming ones produce copies. Knowing which is which decides whether a hot loop runs the cam out of RAM.
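The distinction can be checked directly by writing to the source and seeing which results follow. This is a minimal sketch that assumes desktop NumPy as a stand-in for the camera's np module; the import line is an assumption and should be replaced with whatever import the firmware uses.

```python
import numpy as np  # assumption: desktop NumPy stands in for the camera's np module

a = np.arange(8, dtype=np.uint8)

v = a[2:6]          # slicing: a view, no element data is copied
c = a[2:6].copy()   # explicit copy: a new 4-element buffer is allocated

a[2] = 200          # write to the source

print(v[0])         # the view sees the write: 200
print(c[0])         # the copy still holds the old value: 2
```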

9.7.1. Reshape

reshape() returns an array of the requested shape. The total number of elements must be unchanged or ValueError is raised:

a = np.arange(12, dtype=np.uint8)
m = a.reshape((3, 4))

The result is a view: m and a share data, so m[0, 0] = 99 also changes a[0]. reshape() can only return a view when the requested layout is compatible with the source (see 9.7.6).
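Both behaviours, the shared buffer and the element-count check, can be seen in a few lines. The sketch assumes desktop NumPy in place of the camera's np module; the name rejected is introduced here purely for illustration.

```python
import numpy as np  # assumption: desktop NumPy stands in for the camera's np module

a = np.arange(12, dtype=np.uint8)
m = a.reshape((3, 4))

m[0, 0] = 99            # write through the reshaped array
print(a[0])             # the source changed as well

rejected = False        # hypothetical flag, only used to record the outcome
try:
    a.reshape((5, 3))   # 15 elements requested, only 12 available
except ValueError as exc:
    rejected = True
    print("reshape rejected:", exc)
```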

Assigning a new tuple to shape is shorthand for the same operation:

a = np.arange(9)
a.shape = (3, 3)

9.7.2. Transpose

transpose() (or the .T shortcut) reverses the order of the axes. It is implemented by reversing the shape and strides in the descriptor; no data is moved:

m = np.arange(6, dtype=np.uint8).reshape((2, 3))
t = m.T                  # shape (3, 2), shares m's buffer

A transposed view does not walk the data block contiguously. That is fine for the arithmetic and reductions that come later, but tobytes() raises ValueError on a transposed view, because the bytes in the buffer are not in the order the view’s shape suggests.
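The stride reversal can be inspected directly. This sketch assumes desktop NumPy as a stand-in and assumes the camera's ndarray exposes a strides attribute in the same way; if it does not, the write-through check at the end still applies.

```python
import numpy as np  # assumption: desktop NumPy stands in for the camera's np module

m = np.arange(6, dtype=np.uint8).reshape((2, 3))
t = m.T

# Strides are the number of bytes to step along each axis.
print(m.shape, m.strides)   # (2, 3) (3, 1)
print(t.shape, t.strides)   # (3, 2) (1, 3): same numbers, reversed order

t[2, 1] = 77                # t[i, j] addresses the same element as m[j, i]
print(m[1, 2])              # the write shows up in m
```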

9.7.3. Flatten and flat

flatten() returns a 1-D copy of the array:

f = m.flatten()          # new dense 1-D ndarray

Pass order='C' (the default) to read the elements row by row, with the last axis varying fastest, or order='F' to read them column by column, with the first axis varying fastest.
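The two orders are easiest to compare on a small matrix. A minimal sketch, again assuming desktop NumPy in place of the camera's np module; the names c_order and f_order are illustrative only.

```python
import numpy as np  # assumption: desktop NumPy stands in for the camera's np module

m = np.arange(6, dtype=np.uint8).reshape((2, 3))   # [[0, 1, 2], [3, 4, 5]]

c_order = m.flatten()            # row by row:       0, 1, 2, 3, 4, 5
f_order = m.flatten(order='F')   # column by column: 0, 3, 1, 4, 2, 5

c_order[0] = 99                  # flatten() copied, so m is untouched
print(c_order, f_order, m[0, 0])
```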

flat is the iterator form. It yields every element of an ndarray of any rank as a scalar, without allocating a flat copy:

for x in m.flat:
    print(x)

When the application needs to walk every element, prefer flat; when it needs a dense 1-D buffer to hand to another function, use flatten().
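As a rough illustration of that choice, the sketch below sums a matrix through flat and builds a dense buffer with flatten() only where one is needed. It assumes desktop NumPy as a stand-in for the camera's np module; total and dense are illustrative names.

```python
import numpy as np  # assumption: desktop NumPy stands in for the camera's np module

m = np.arange(6, dtype=np.uint8).reshape((2, 3))

# Walking every element: flat needs no second buffer.
total = 0
for x in m.flat:
    total += int(x)     # int() keeps the running sum out of uint8 range limits

# Handing a dense 1-D buffer to other code: flatten() allocates one.
dense = m.flatten()

print(total, dense.shape)
```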

9.7.4. Iteration

Iterating a 1-D array yields scalars; iterating an n-D array with n > 1 yields (n-1)-D views, one per index along the first axis:

m = np.array([[0, 1, 2], [3, 4, 5]], dtype=np.uint8)
for row in m:
    print(row)               # array([0, 1, 2]), array([3, 4, 5])

The rows yielded by iterating a matrix are views, so modifying them modifies the source.
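A short sketch of writing through the yielded rows, assuming desktop NumPy in place of the camera's np module:

```python
import numpy as np  # assumption: desktop NumPy stands in for the camera's np module

m = np.array([[0, 1, 2], [3, 4, 5]], dtype=np.uint8)

for row in m:
    row[0] = 255        # element assignment goes through the view into m

print(m[0, 0], m[1, 0])  # both first-column entries are now 255
```

Only element assignment writes through. Rebinding the loop variable (row = row * 2) creates a new array and leaves m unchanged.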

9.7.5. Copies

copy() is the explicit way to get an independent ndarray whose modifications do not affect the original. A new buffer is allocated and the source is walked into it:

c = a.copy()

Unlike copy(), tobytes() does not allocate a new data block: it returns a bytearray that shares memory with the array’s data block, so writes through the bytearray modify the array in place. It raises ValueError if the array is not dense (a sliced view, a transpose, …).

tolist() returns the contents as a possibly nested Python list. Useful for serialising small results; expensive for large ones, because every element becomes a separate Python object.
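For tolist(), a minimal sketch assuming desktop NumPy as a stand-in for the camera's np module; nested is an illustrative name.

```python
import numpy as np  # assumption: desktop NumPy stands in for the camera's np module

m = np.arange(6, dtype=np.uint8).reshape((2, 3))

nested = m.tolist()     # one inner list per row, elements are plain Python objects
print(nested)

nested[0][0] = 42       # the list is independent of the array
print(m[0, 0])          # still 0
```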

9.7.6. Which operations return which

The full rule is given by the two lists below.

The following operations return views:

  • slicing – a[1:5], a[::2], m[:, 0];

  • single-axis indexing of a higher-rank array – m[0];

  • iterating an n-D array;

  • reshape(), when the requested layout is compatible;

  • transpose() / .T;

  • frombuffer();

  • asarray(), when the dtype matches.
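Several of these can be verified by writing through the result and reading the source back. The sketch assumes desktop NumPy in place of the camera's np module and that frombuffer() and asarray() take the same arguments there; col, row, raw, f and same are illustrative names.

```python
import numpy as np  # assumption: desktop NumPy stands in for the camera's np module

m = np.arange(6, dtype=np.uint8).reshape((2, 3))

col = m[:, 0]           # slicing
col[1] = 50
print(m[1, 0])          # 50

row = m[0]              # single-axis indexing
row[2] = 60
print(m[0, 2])          # 60

raw = bytearray(4)      # frombuffer() wraps an existing buffer
f = np.frombuffer(raw, dtype=np.uint8)
f[0] = 7
print(raw[0])           # 7

same = np.asarray(m, dtype=np.uint8)   # dtype matches, so no copy is made
same[1, 1] = 70
print(m[1, 1])          # 70
```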

The following operations return copies:

  • copy();

  • flatten();

  • boolean indexing – a[mask];

  • arithmetic and element-wise functions – a + b, a * 2, np.sin(a);

  • array() – always copies, even from another array;

  • concatenate().
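The copying operations can be checked the opposite way: overwrite each result and confirm the source is untouched. A minimal sketch assuming desktop NumPy as a stand-in for the camera's np module; the variable names are illustrative.

```python
import numpy as np  # assumption: desktop NumPy stands in for the camera's np module

a = np.arange(6, dtype=np.uint8)

mask = a > 2
picked = a[mask]                 # boolean indexing
doubled = a * 2                  # arithmetic
rebuilt = np.array(a)            # array() from an existing array
joined = np.concatenate((a, a))  # concatenate()

# Overwrite the first element of every result.
for result in (picked, doubled, rebuilt, joined):
    result[0] = 111

print(a)                         # the source is unchanged
```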

Reach for an explicit copy only when an independent buffer is genuinely needed. On a camera with limited RAM, the difference between a view and a copy is often the difference between code that fits and code that does not.