---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
    format_version: 0.13
    jupytext_version: 1.11.2
kernelspec:
  display_name: Python 3
  language: python
  name: python3
---

```{code-cell}
---
tags: [remove-input, remove-output]
---

%matplotlib inline
%config InlineBackend.figure_format = 'retina'
```

# Images are numpy arrays

+++

Images are represented in ``scikit-image`` using standard ``numpy`` arrays.  This allows maximum interoperability with other libraries in the scientific Python ecosystem, such as ``matplotlib`` and ``scipy``.

Let's see how to build a grayscale image as a 2D array:

```{code-cell}
import numpy as np
from matplotlib import pyplot as plt

# A 500x500 array of random floats in [0, 1) is already a valid grayscale image
random_image = np.random.random([500, 500])

plt.imshow(random_image, cmap='gray')
plt.colorbar();
```

The same holds for "real-world" images:

```{code-cell}
from skimage import data

coins = data.coins()

print('Type:', type(coins))
print('dtype:', coins.dtype)
print('shape:', coins.shape)

plt.imshow(coins, cmap='gray');
```

A color image is a 3D array, where the last dimension has size 3 and represents the red, green, and blue channels:

```{code-cell}
cat = data.chelsea()
print("Shape:", cat.shape)
print("Values min/max:", cat.min(), cat.max())

plt.imshow(cat);
```

These are *just NumPy arrays*. E.g., we can make a red square by using standard array slicing and manipulation:

```{code-cell}
cat[10:110, 10:110, :] = [255, 0, 0]  # [red, green, blue]
plt.imshow(cat);
```

Images can also include transparent regions by adding a 4th channel, called an *alpha layer*, so that the last dimension has size 4.

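As a minimal sketch of the alpha layer, the snippet below builds a small synthetic RGB image and appends a fourth channel, giving an array of shape (rows, columns, 4). The array names and the choice of making the left half transparent are illustrative assumptions. For ``uint8`` images, an alpha value of 255 is taken to be fully opaque and 0 fully transparent.

```python
import numpy as np

# Synthetic 4x6 RGB image filled with pure red (uint8, values 0-255).
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
rgb[..., 0] = 255

# Alpha channel: 255 is fully opaque, 0 is fully transparent.
alpha = np.full(rgb.shape[:2], 255, dtype=np.uint8)
alpha[:, :3] = 0  # illustrative choice: make the left half transparent

# Append alpha as a 4th channel along the last axis -> (row, column, RGBA).
rgba = np.dstack([rgb, alpha])

print(rgba.shape)  # (4, 6, 4)
```

Passing an array like ``rgba`` to ``plt.imshow`` should render the transparent half as see-through against the axes background.
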
+++

## Other shapes, and their meanings

|Image type|Coordinates|
|:---|:---|
|2D grayscale|(row, column)|
|2D multichannel|(row, column, channel)|
|3D grayscale (or volumetric)|(plane, row, column)|
|3D multichannel|(plane, row, column, channel)|

+++

## Displaying images using matplotlib

```{code-cell}
from skimage import data

img0 = data.chelsea()
img1 = data.rocket()
```

```{code-cell}
import matplotlib.pyplot as plt

f, (ax0, ax1) = plt.subplots(1, 2, figsize=(20, 10))

ax0.imshow(img0)
ax0.set_title('Cat', fontsize=18)
ax0.axis('off')

ax1.imshow(img1)
ax1.set_title('Rocket', fontsize=18)
ax1.set_xlabel(r'Launching position $\alpha=320$')

# Overlay annotations: coordinates are given as (x, y) = (column, row)
ax1.vlines([202, 300], 0, img1.shape[0], colors='magenta', linewidth=3, label='Side tower position')
ax1.plot([168, 190, 200], [400, 200, 300], color='white', linestyle='--', label='Side angle')

ax1.legend();
```

For more on plotting, see the [Matplotlib documentation](https://matplotlib.org/gallery/index.html#images-contours-and-fields) and [pyplot API](https://matplotlib.org/api/pyplot_summary.html).

+++

## Data types and image values

In the literature, one finds different conventions for representing image values:

```
  0 - 255   where  0 is black, 255 is white
  0 - 1     where  0 is black, 1 is white
```

``scikit-image`` supports both conventions; the choice is determined by the data type of the array.

E.g., here, I generate two valid images:

```{code-cell}
linear0 = np.linspace(0, 1, 2500).reshape((50, 50))
linear1 = np.linspace(0, 255, 2500).reshape((50, 50)).astype(np.uint8)

print("Linear0:", linear0.dtype, linear0.min(), linear0.max())
print("Linear1:", linear1.dtype, linear1.min(), linear1.max())

fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(15, 15))
ax0.imshow(linear0, cmap='gray')
ax1.imshow(linear1, cmap='gray');
```

The library is designed in such a way that any data type is allowed as input, as long as the range is correct (0-1 for floating point images, 0-255 for unsigned bytes, 0-65535 for unsigned 16-bit integers).

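The ranges listed above follow directly from the dtype. Here is a small sketch, using only NumPy, of how one might check that an array respects the convention. The helpers `expected_range` and `in_expected_range` are hypothetical names introduced for illustration; they are not part of the scikit-image API, and signed integer types are deliberately left out.

```python
import numpy as np

def expected_range(dtype):
    """Return the (min, max) range assumed in the text above for `dtype`."""
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.floating):
        return 0.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
        return 0, int(np.iinfo(dtype).max)
    raise ValueError(f"no convention sketched here for dtype {dtype}")

def in_expected_range(image):
    """True if every value of `image` lies inside the range for its dtype."""
    lo, hi = expected_range(image.dtype)
    return bool(image.min() >= lo and image.max() <= hi)

good = np.linspace(0, 1, 2500).reshape((50, 50))   # float image in [0, 1]
bad = np.linspace(0, 255, 2500).reshape((50, 50))  # float dtype, but 0-255 values

print(expected_range(np.uint8))   # (0, 255)
print(expected_range(np.uint16))  # (0, 65535)
print(in_expected_range(good))    # True
print(in_expected_range(bad))     # False
```

The `bad` array shows the typical pitfall: values meant for the 0-255 convention stored in a floating point array.
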
+++

You can convert images between different representations by using ``img_as_float``, ``img_as_ubyte``, etc.:

```{code-cell}
from skimage import img_as_float, img_as_ubyte

image = data.chelsea()

image_ubyte = img_as_ubyte(image)
image_float = img_as_float(image)

print("type, min, max:", image_ubyte.dtype, image_ubyte.min(), image_ubyte.max())
print("type, min, max:", image_float.dtype, image_float.min(), image_float.max())
print()
print("231/255 =", 231/255.)
```

Your code would then typically look like this:

```{code-cell}
def my_function(any_image):
    float_image = img_as_float(any_image)
    # Proceed, knowing image is in [0, 1]
```

We recommend using the floating point representation, given that ``scikit-image`` mostly uses that format internally.

+++

## Image I/O

Most of the time, our input images won't come from the scikit-image example data sets; they will be files on disk, typically stored in JPEG or PNG format.  Since scikit-image operates on NumPy arrays, *any* image reader library that provides arrays will do.  Options include imageio, matplotlib, pillow, etc.

scikit-image conveniently wraps many of these in the `io` submodule, and will use whichever of the libraries mentioned above are installed:

```{code-cell}
from skimage import io

image = io.imread('../images/balloon.jpg')

print(type(image))
print(image.dtype)
print(image.shape)
print(image.min(), image.max())

plt.imshow(image);
```

We also have the ability to load multiple images, or multi-layer TIFF images:

```{code-cell}
ic = io.ImageCollection('../images/*.png:../images/*.jpg')

print('Type:', type(ic))

ic.files
```

```{code-cell}
import os

f, axes = plt.subplots(nrows=3, ncols=len(ic) // 3 + 1, figsize=(20, 5))

# subplots returns the figure and a 2D array of axes
# we use `axes.ravel()` to flatten it into a 1D array
axes = axes.ravel()

# hide the frame and ticks on every axis, including unused ones
for ax in axes:
    ax.axis('off')

for i, image in enumerate(ic):
    axes[i].imshow(image, cmap='gray')
    axes[i].set_title(os.path.basename(ic.files[i]))

plt.tight_layout()
```

## Aside: `enumerate`

`enumerate` gives us each element in a container, along with its position.

```{code-cell}
animals = ['cat', 'dog', 'leopard']
```

```{code-cell}
for i, animal in enumerate(animals):
    print('The animal in position {} is {}'.format(i, animal))
```

## Exercise: draw the letter H

Define a function that takes as input an RGB image and a pair of coordinates (row, column), and returns a copy with a green letter H overlaid at those coordinates. The coordinates point to the top-left corner of the H.

The arms and strut of the H should have a width of 3 pixels, and the H itself should have a height of 24 pixels and width of 20 pixels.

Start with the following template:

```{code-cell}
---
tags: [hide-output]
---

def draw_H(image, coords, color=(0, 255, 0)):
    out = image.copy()

    ...

    return out
```

Test your function like so:

```{code-cell}
---
tags: [remove-output]
---

cat = data.chelsea()
cat_H = draw_H(cat, (50, -50))
plt.imshow(cat_H);
```

## Exercise: visualizing RGB channels

Display the different color channels of the image, each as a gray-scale image.

Start with the following template:

```{code-cell}
---
tags: [raises-exception, remove-output]
---

# --- read in the image ---

image = plt.imread('../images/Bells-Beach.jpg')

# --- assign each color channel to a different variable ---

r = ...  # FIXME: grab channel from image...
g = ...  # FIXME
b = ...  # FIXME

# --- display the image and r, g, b channels ---

f, axes = plt.subplots(1, 4, figsize=(16, 5))

for ax in axes:
    ax.axis('off')

(ax_r, ax_g, ax_b, ax_color) = axes

ax_r.imshow(r, cmap='gray')
ax_r.set_title('red channel')

ax_g.imshow(g, cmap='gray')
ax_g.set_title('green channel')

ax_b.imshow(b, cmap='gray')
ax_b.set_title('blue channel')

# --- Here, we stack the R, G, and B layers again
#     to form a color image ---
ax_color.imshow(np.stack([r, g, b], axis=2))
ax_color.set_title('all channels');
```

Now, take a look at the following R, G, and B channels.  How would their combination look? (Write some code to confirm your intuition.)

```{code-cell}
from skimage import draw

red = np.zeros((300, 300))
green = np.zeros((300, 300))
blue = np.zeros((300, 300))

# Draw one circle outline (radius 100) per channel, each with a different center
r, c = draw.circle_perimeter(100, 100, 100, shape=red.shape)
red[r, c] = 1

r, c = draw.circle_perimeter(100, 200, 100, shape=green.shape)
green[r, c] = 1

r, c = draw.circle_perimeter(200, 150, 100, shape=blue.shape)
blue[r, c] = 1

f, axes = plt.subplots(1, 3)
for (ax, channel) in zip(axes, [red, green, blue]):
    ax.imshow(channel, cmap='gray')
    ax.axis('off')
```

## Exercise: Convert to grayscale ("black and white")

The *relative luminance* of an image is the intensity of light coming from each point. Different colors contribute differently to the luminance: it's very hard to have a bright, pure blue, for example. So, starting from an RGB image, the luminance is given by:

$$
Y = 0.2126R + 0.7152G + 0.0722B
$$

Use the matrix multiplication operator, `@` (available since Python 3.5), to convert an RGB image to a grayscale luminance image according to the formula above.

Compare your result to the one obtained with `skimage.color.rgb2gray`.

Change the coefficients to 1/3 (i.e., take the mean of the red, green, and blue channels) to see how that approach compares with `rgb2gray`.

```{code-cell}
---
tags: [raises-exception, remove-output]
---

from skimage import color, img_as_float

image = img_as_float(io.imread('../images/balloon.jpg'))

gray = color.rgb2gray(image)
my_gray = ...  # FIXME

# --- display the results ---

f, (ax0, ax1) = plt.subplots(1, 2, figsize=(10, 6))

ax0.imshow(gray, cmap='gray')
ax0.set_title('skimage.color.rgb2gray')

ax1.imshow(my_gray, cmap='gray')
ax1.set_title('my rgb2gray')
```
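
As a quick sanity check on the luminance formula above, the weights can be applied by hand to a few pure colors. This is a plain-arithmetic sketch for single pixels: the `luminance` helper is a hypothetical name, and it deliberately avoids the `@` operator so that the exercise itself is left to you. Pure green should come out much brighter than pure blue, and white should map to (approximately) 1 because the three coefficients sum to 1.

```python
def luminance(r, g, b):
    """Relative luminance of a single RGB pixel with channels in [0, 1]."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

pure_colors = {
    'red': (1, 0, 0),
    'green': (0, 1, 0),
    'blue': (0, 0, 1),
    'white': (1, 1, 1),
}

for name, (r, g, b) in pure_colors.items():
    mean = (r + g + b) / 3  # the "equal weights" alternative
    print(f'{name:>5}: luminance={luminance(r, g, b):.4f}, mean={mean:.4f}')
```

With equal weights, red, green, and blue all map to the same gray level, which is one way to see why the two approaches produce visibly different images.
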