The ISP pipeline
================

The *image signal processor* (ISP) is the hardware pipeline that turns the raw
pixel values from the sensor into a finished colour image. The on-sensor
:doc:`pixel-level corrections <../sensor/calibration>` are the first stages of
that pipeline. After those run, the rest of the pipeline does the colour
processing and output formatting in a fixed order on every frame.

.. figure:: ../figures/isp-pipeline.svg
   :alt: A vertical pipeline diagram with eight labelled boxes, top to bottom: statistics extraction, auto white balance, debayering, colour matrix correction, gamma correction, image scaling, image cropping, and pixel packing. An arrow at the top is labelled "corrected Bayer pixels" and an arrow at the bottom is labelled "finished frame".

   The colour-processing and output stages of the ISP. The pipeline runs each
   stage over every pixel in the frame before the next one starts.

The stages
----------

Each stage applies one well-defined transformation in turn. The order matters
-- later stages assume earlier ones have already run, and a couple of stages
take inputs from the *previous* frame's output as well.

1. **Statistics extraction** measures per-region average brightness and
   per-channel sums from the corrected Bayer frame. The numbers feed the
   auto-exposure, auto-gain, and auto-white-balance control loops, which then
   update the sensor's settings for the *next* frame.

2. **Auto-white-balance gains** scale each Bayer pixel by a per-colour
   multiplier -- red pixels by an R gain, green pixels by a G gain, blue pixels
   by a B gain -- pushing the scene's white reference toward neutral grey so
   the recorded colours look the way the eye saw them. The multipliers come
   from the previous frame's AWB statistics.

3. **Debayering** reconstructs the missing two colour channels at every pixel
   from the Bayer mosaic, turning one-channel-per-pixel raw data into
   three-channel RGB. (See :doc:`debayering`.)
   Everything after this stage runs on RGB pixels rather than on the Bayer
   mosaic.

4. **Colour matrix correction (CCM)** applies a 3x3 matrix multiply to each RGB
   pixel that maps the sensor's native red-green-blue response into a standard
   colour space. Each sensor's filters have their own spectral response, which
   is not exactly what any standard expects; the matrix is a per-sensor
   calibrated transform that turns "sensor RGB" into "standard RGB".

5. **Gamma correction** applies a non-linear curve to each channel that
   compresses the linear sensor signal into a perception-matched encoding. The
   eye notices differences between dark tones more than differences between
   bright tones, so an encoding that spends more of its bit budget on the dark
   end captures more visible detail at a given bit depth.

6. **Image scaling** resizes the frame from the sensor's native resolution to
   the target output resolution. Most applications run at less than the
   sensor's full pixel count, and scaling down reduces both the bandwidth and
   the memory pressure on everything that follows.

7. **Image cropping** extracts a sub-rectangle of the scaled frame and discards
   the pixels outside it. Cropping is used to capture a region of interest,
   match a specific aspect ratio, or drop a border the application does not
   need.

8. **Pixel packing** converts the per-channel internal representation
   (typically 10 or 12 bits per channel) into the chosen output format and
   writes the result to RAM.

The control-loop feedback
-------------------------

Stages 1 and 2 form a control loop that spans multiple frames. The statistics
extracted from frame N describe how bright the scene was and how its colour
balance sat in that frame; the auto-exposure, auto-gain, and
auto-white-balance controllers use those numbers to pick new exposure, gain,
and white-balance register values for frame N+1. The new values take effect on
the next frame's read-out, the new frame's statistics come back, and the loop
closes.
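
The one-frame delay in that loop can be illustrated with a small software
model. This is only a sketch under loud assumptions: it uses a grey-world rule
to pick the gains (the text above does not say which AWB algorithm a given ISP
uses), it treats each pixel as an ``(r, g, b)`` triple instead of a Bayer
mosaic, and every function name is hypothetical rather than part of any real
driver API.

```python
# Hypothetical model of the AWB feedback loop; not a real ISP or driver API.

def channel_means(frame):
    """Stand-in for stage 1: average R, G and B over a list of (r, g, b) pixels."""
    n = len(frame)
    return tuple(sum(px[c] for px in frame) / n for c in range(3))


def grey_world_gains(means):
    """Assumed AWB rule: pull the R and B means onto the G mean (G gain = 1)."""
    r, g, b = means
    # Guard against division by zero on an all-black channel.
    return (g / max(r, 1e-6), 1.0, g / max(b, 1e-6))


def apply_gains(frame, gains):
    """Stand-in for stage 2: multiply each channel by its gain."""
    return [tuple(v * k for v, k in zip(px, gains)) for px in frame]


def run_loop(raw_frames):
    """Process frames in order; frame N is corrected with gains from frame N-1."""
    gains = (1.0, 1.0, 1.0)  # nothing has been measured before the first frame
    corrected = []
    for raw in raw_frames:
        stats = channel_means(raw)                 # measured on frame N
        corrected.append(apply_gains(raw, gains))  # uses the previous frame's gains
        gains = grey_world_gains(stats)            # takes effect on frame N+1
    return corrected
```

Feeding this model three identical frames with a warm cast, for instance
``run_loop([[(200, 100, 50)] * 4] * 3)``, leaves the first frame uncorrected
and neutralises the cast from the second frame on, because the gains derived
from frame N are only applied to frame N+1.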
For a scene that does not change, the loop converges within a few frames and
stays at a constant setting. For a scene whose brightness or colour cast
changes -- the camera panning from indoors to a sunlit window, for example --
the loop tracks the change over several frames, and the user sees a brief
brightness or colour drift on the way to the new steady state.

Where the ISP runs
------------------

Two arrangements are common.

* An *on-sensor ISP* runs the whole pipeline inside the sensor chip and outputs
  a finished RGB image. The MCU just collects the result.

* An *off-sensor ISP* lives in the host MCU or SoC. The sensor outputs raw
  Bayer; the MCU's silicon (or its driver code) runs the pipeline before
  handing the finished frame to user code.

The split affects what output formats the sensor can hand the user directly. A
sensor with a full on-chip ISP lets the user pick from any finished format the
chip supports. A sensor without one outputs Bayer only, and the format
conversions happen in MCU silicon or software.
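
For the off-sensor case where driver code does the work, the per-pixel RGB
stages (colour matrix, gamma, and packing) might look roughly like the sketch
below. Everything specific in it is an assumption made for illustration: the
matrix values are invented rather than calibrated, the gamma curve is a plain
power law with exponent 1/2.2, RGB565 is picked as one possible output format,
and the function names are hypothetical.

```python
# Hypothetical per-pixel stages for a software ISP; values are illustrative only.

# Invented 3x3 matrix; a real one comes from per-sensor calibration.
# Each row sums to 1 so that neutral grey stays neutral.
EXAMPLE_CCM = (
    (1.5, -0.3, -0.2),
    (-0.2, 1.4, -0.2),
    (-0.1, -0.4, 1.5),
)


def clamp01(value):
    """Limit a channel value to the 0.0..1.0 range."""
    return min(max(value, 0.0), 1.0)


def apply_ccm(rgb, ccm):
    """Colour matrix correction: multiply a 3x3 matrix by the (r, g, b) vector."""
    return tuple(sum(ccm[row][col] * rgb[col] for col in range(3))
                 for row in range(3))


def gamma_encode(value, gamma=2.2):
    """Assumed power-law curve; lifts dark tones so they get more code values."""
    return clamp01(value) ** (1.0 / gamma)


def pack_rgb565(rgb):
    """Pixel packing: quantise 0.0..1.0 channels to 5, 6 and 5 bits in one word."""
    r, g, b = (clamp01(v) for v in rgb)
    return (round(r * 31) << 11) | (round(g * 63) << 5) | round(b * 31)


def process_pixel(rgb, ccm=EXAMPLE_CCM):
    """Run one linear RGB pixel (channels in 0.0..1.0) through CCM, gamma, packing."""
    corrected = apply_ccm(rgb, ccm)
    encoded = tuple(gamma_encode(v) for v in corrected)
    return pack_rgb565(encoded)
```

The stage order mirrors the list under "The stages": the matrix runs on linear
values, gamma follows, and quantisation to the output format comes last.
Scaling and cropping are left out because they act on whole frames rather than
on single pixels.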