OpenCV
Window, mouse and keyboard operation
Comprehensive example
Generate a 512 x 512 pure black canvas, create a window, attach a mouse callback to the window, and display the canvas in the window
import cv2
import numpy as np

# Generate a 512 x 512 pure black canvas
canvas = np.zeros(shape=(512, 512, 3), dtype=np.uint8)

# Create a window
cv2.namedWindow(winname='draw circle')

# Mouse callback: draw circles on the canvas
def onMouse(event, x, y, flags, param):
    """Left-button double click: draw a circle centered at the mouse position with the given radius"""
    if event == cv2.EVENT_LBUTTONDBLCLK:
        img, radius, thickness = param
        thickness = 1 if thickness <= 0 else thickness  # a negative thickness would draw a solid circle
        cv2.circle(img=img, center=(x, y), radius=radius,
                   color=(0, 0, 255), thickness=thickness)

img = canvas.copy()
radius = 100
thickness = 1

# Attach the mouse event to the window
cv2.setMouseCallback('draw circle', onMouse, param=[img, radius, thickness])

# Display the canvas through the window
while True:
    # The window and the canvas are decoupled, so the window must be refreshed
    # every time a circle is added
    cv2.imshow(winname='draw circle', mat=img)
    if cv2.waitKey(20) & 0xFF == 27:  # Esc key exits
        cv2.destroyWindow(winname='draw circle')
        break
cv2.destroyAllWindows()

Create window cv2.namedWindow(winname, flags=None)

notes:
""" namedWindow(winname[, flags]) > None . @brief Creates a window. ... . @param winname Name of the window in the window caption that may be used as a window identifier. . @param flags Flags of the window. The supported flags are: (cv::WindowFlags) """

Function: create a window with the specified name

Parameters:
winname: str Window name
flags: int Window flags (cv::WindowFlags), optional


Destroy window cv2.destroyWindow(winname)

notes:
""" destroyWindow(winname) > None . @brief Destroys the specified window. . . The function destroyWindow destroys the window with the given name. . . @param winname Name of the window to be destroyed. """

Function: destroy the window with the specified name

Parameters:
winname: str Window name


Destroy all windows cv2.destroyAllWindows()

notes:
""" destroyAllWindows() > None . @brief Destroys all of the HighGUI windows. . . The function destroyAllWindows destroys all of the opened HighGUI windows. """

Function: destroy all open windows

Parameter: None


Mouse callback event cv2.setMouseCallback()

notes:

""" setMouseCallback(windowName, onMouse [, param]) > None """


Function: add mouse events for a specified window

Parameters:
windowName: str Window name
onMouse: Function(event, x, y, flags, param) Mouse event callback function
param: optional extra data passed through to the callback

Write the mouse callback function onMouse:Function
def onMouse(event, x, y, flags, param):
    if event == cv2.EVENT_...:  # Check which mouse event was triggered
        pass                    # Perform the desired action

View mouse event
import cv2
events = [i for i in dir(cv2) if 'EVENT' in i]
print(events)
['EVENT_FLAG_ALTKEY',
 'EVENT_FLAG_CTRLKEY',
 'EVENT_FLAG_LBUTTON',
 'EVENT_FLAG_MBUTTON',
 'EVENT_FLAG_RBUTTON',
 'EVENT_FLAG_SHIFTKEY',
 'EVENT_LBUTTONDBLCLK',  # Left button double-click
 'EVENT_LBUTTONDOWN',    # Left button press
 'EVENT_LBUTTONUP',      # Left button release
 'EVENT_MBUTTONDBLCLK',
 'EVENT_MBUTTONDOWN',
 'EVENT_MBUTTONUP',
 'EVENT_MOUSEHWHEEL',
 'EVENT_MOUSEMOVE',
 'EVENT_MOUSEWHEEL',
 'EVENT_RBUTTONDBLCLK',  # Right button double-click
 'EVENT_RBUTTONDOWN',    # Right button press
 'EVENT_RBUTTONUP']      # Right button release


Keyboard cv2.waitKey(delay=None) -> int
Function: wait up to delay milliseconds for a key press (delay <= 0 or None waits indefinitely) and return the key code of the pressed key, or -1 if no key was pressed before the timeout.
Image reading, writing and display
Comprehensive example
import cv2

# 1. Read images
lena = cv2.imread(filename=r'F:\wuxin_convenient\Pictures\CV\dataimg\lena.jpeg',
                  flags=cv2.IMREAD_COLOR)         # Read a color image
lena_gray = cv2.imread(filename=r'F:\wuxin_convenient\Pictures\CV\dataimg\lena.jpeg',
                       flags=cv2.IMREAD_GRAYSCALE)  # Read a grayscale image

# 2. Display images
cv2.imshow(winname='img', mat=lena)
cv2.imshow(winname='img_gray', mat=lena_gray)
cv2.waitKey()
cv2.destroyAllWindows()

# 3. Write images
cv2.imwrite(filename='data/lena.png', img=lena)
cv2.imwrite(filename='data/lena_gray.png', img=lena_gray)

Image reading cv2.imread(filename, flags=None) -> numpy.ndarray

Function: read image data from the given path in the given reading mode

Parameters:
filename: str Image storage path
flags: Union[int, str] Image reading mode

flags values:
1 cv2.IMREAD_COLOR Load a BGR color image (default)
0 cv2.IMREAD_GRAYSCALE Load a grayscale image

View image reading mode
import cv2
flags = [i for i in dir(cv2) if 'IMREAD' in i]
print(flags)
['IMREAD_ANYCOLOR', 'IMREAD_ANYDEPTH', 'IMREAD_COLOR', 'IMREAD_GRAYSCALE', 'IMREAD_IGNORE_ORIENTATION', 'IMREAD_LOAD_GDAL', 'IMREAD_REDUCED_COLOR_2', 'IMREAD_REDUCED_COLOR_4', 'IMREAD_REDUCED_COLOR_8', 'IMREAD_REDUCED_GRAYSCALE_2', 'IMREAD_REDUCED_GRAYSCALE_4', 'IMREAD_REDUCED_GRAYSCALE_8', 'IMREAD_UNCHANGED']



Image write cv2.imwrite(filename, img, params=None)

Function: write image data (memory -> disk)

Parameters:
filename: str Image storage path
img: numpy.ndarray Image data


Image display cv2.imshow(winname, mat)

The function imshow displays an image in the specified window

Parameters:
winname: str Window name
mat: numpy.ndarray Image data

Image color operation
Comprehensive example
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('data/lena.jpeg')

# 1. Color space conversion
img_gray = cv2.cvtColor(src=img, code=cv2.COLOR_BGR2GRAY)  # Image graying (BGR -> gray)
img_hsv = cv2.cvtColor(src=img, code=cv2.COLOR_BGR2HSV)    # BGR -> HSV

# 2. BGR image channel suppression (on a copy, so img stays intact)
img_bgr = img.copy()
img_bgr[:, :, 0] = 0  # Blue channel suppression
img_bgr[:, :, 1] = 0  # Green channel suppression
img_bgr[:, :, 2] = 0  # Red channel suppression

# 3. HSV image channel adjustment (beware of uint8 overflow)
img_hsv[:, :, 0] += 10  # Hue channel (H component) offset
img_hsv[:, :, 1] = np.clip(img_hsv[:, :, 1] * 1.2, 0, 255).astype(np.uint8)  # Saturation (S) enhancement
img_hsv[:, :, 2] += 10  # Brightness channel (V component) enhancement

# 4. Threshold processing
t, img_binary = cv2.threshold(src=img, thresh=127, maxval=255, type=cv2.THRESH_BINARY)          # Binarization of the BGR color image
t, img_binary_inv = cv2.threshold(src=img, thresh=127, maxval=255, type=cv2.THRESH_BINARY_INV)  # Inverse binarization of the BGR color image
t, img_gray_binary = cv2.threshold(src=img_gray, thresh=127, maxval=255, type=cv2.THRESH_BINARY)          # Binarization of the gray image
t, img_gray_binary_inv = cv2.threshold(src=img_gray, thresh=127, maxval=255, type=cv2.THRESH_BINARY_INV)  # Inverse binarization of the gray image

# 5. Histogram equalization (on the brightness/V channel of the HSV image)
plt.hist(img_hsv[:, :, 2].ravel(), bins=256, range=[0, 255])  # Histogram of the un-equalized brightness channel
img_hsv[:, :, 2] = cv2.equalizeHist(img_hsv[:, :, 2])         # Equalize the brightness (V) channel
plt.hist(img_hsv[:, :, 2].ravel(), bins=256, range=[0, 255])  # Histogram of the equalized brightness channel

Color space transformation cv2.cvtColor(src, code, dst=None, dstCn=None) -> ndarray

notes:
""" cvtColor(src, code[, dst[, dstCn]]) > dst . @brief Converts an image from one color space to another. ... . @param src input image: 8bit unsigned, 16bit unsigned ( CV_16UC... ), or singleprecision . floatingpoint. . @param dst output image of the same size and depth as src. . @param code color space conversion code (see #ColorConversionCodes). . @param dstCn number of channels in the destination image; if the parameter is 0, the number of the . channels is derived automatically from src and code. . . @see @ref imgproc_color_conversions """

Function: color space conversion

Parameters:
src: ndarray Source image
code: cv2.COLOR_... Color space conversion code
dst: ndarray Target image (optional)
dstCn: int Number of channels in the destination image (0 = derived automatically)

View color space conversion encoding
import cv2
# codes = [i for i in dir(cv2) if i.startswith('COLOR_')]
codes = [i for i in dir(cv2) if i.startswith('COLOR_') and i.count('_') < 2 and len(i) < 15]
print(codes)
['COLOR_BGR2BGRA', 'COLOR_BGR2GRAY', 'COLOR_BGR2HLS', 'COLOR_BGR2HSV', 'COLOR_BGR2LAB', 'COLOR_BGR2LUV', 'COLOR_BGR2Lab', 'COLOR_BGR2Luv', 'COLOR_BGR2RGB', 'COLOR_BGR2RGBA', 'COLOR_BGR2XYZ', 'COLOR_BGR2YUV', 'COLOR_BGRA2BGR', 'COLOR_BGRA2RGB', 'COLOR_GRAY2BGR', 'COLOR_GRAY2RGB', 'COLOR_HLS2BGR', 'COLOR_HLS2RGB', 'COLOR_HSV2BGR', 'COLOR_HSV2RGB', 'COLOR_LAB2BGR', 'COLOR_LAB2LBGR', 'COLOR_LAB2LRGB', 'COLOR_LAB2RGB', 'COLOR_LBGR2LAB', 'COLOR_LBGR2LUV', 'COLOR_LBGR2Lab', 'COLOR_LBGR2Luv', 'COLOR_LRGB2LAB', 'COLOR_LRGB2LUV', 'COLOR_LRGB2Lab', 'COLOR_LRGB2Luv', 'COLOR_LUV2BGR', 'COLOR_LUV2LBGR', 'COLOR_LUV2LRGB', 'COLOR_LUV2RGB', 'COLOR_Lab2BGR', 'COLOR_Lab2LBGR', 'COLOR_Lab2LRGB', 'COLOR_Lab2RGB', 'COLOR_Luv2BGR', 'COLOR_Luv2LBGR', 'COLOR_Luv2LRGB', 'COLOR_Luv2RGB', 'COLOR_RGB2BGR', 'COLOR_RGB2BGRA', 'COLOR_RGB2GRAY', 'COLOR_RGB2HLS', 'COLOR_RGB2HSV', 'COLOR_RGB2LAB', 'COLOR_RGB2LUV', 'COLOR_RGB2Lab', 'COLOR_RGB2Luv', 'COLOR_RGB2RGBA', 'COLOR_RGB2XYZ', 'COLOR_RGB2YUV', 'COLOR_RGBA2BGR', 'COLOR_RGBA2RGB', 'COLOR_XYZ2BGR', 'COLOR_XYZ2RGB', 'COLOR_YUV2BGR', 'COLOR_YUV2RGB']
Commonly used codes: 'COLOR_BGR2GRAY' (BGR -> gray), 'COLOR_BGR2HSV' (BGR -> HSV), 'COLOR_GRAY2BGR' (gray -> BGR), 'COLOR_HSV2BGR' (HSV -> BGR).

Image graying
import cv2
img = cv2.imread('data/lena.jpeg')
dst = cv2.cvtColor(src=img, code=cv2.COLOR_BGR2GRAY)
Several approaches to image graying:
- Component method: use the brightness of one of the three color components as the gray value, choosing the component according to the application.
- Maximum method: use the maximum of the three color components as the gray value.
- Mean method: use the average of the three color components as the gray value.
- Weighted average method: weight the three components by importance (or other criteria) and average them.
- Human-eye weighting: the human eye is most sensitive to green, then red, and least sensitive to blue; weighting the BGR components accordingly gives f(i,j) = 0.11*B(i,j) + 0.59*G(i,j) + 0.30*R(i,j)
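The human-eye weighting can be sketched directly in NumPy (an illustrative hand-rolled version; cv2.cvtColor with cv2.COLOR_BGR2GRAY applies the same idea, with the slightly more precise weights 0.114/0.587/0.299):

```python
import numpy as np

def bgr_to_gray(img_bgr):
    """Human-eye weighted graying: f = 0.11*B + 0.59*G + 0.30*R."""
    weights = np.array([0.11, 0.59, 0.30])  # B, G, R weights
    # Dot every pixel's (B, G, R) triple with the weight vector
    return (img_bgr.astype(np.float64) @ weights).astype(np.uint8)

# A 1x2 BGR test image: one pure-green pixel, one pure-red pixel
img = np.array([[[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
gray = bgr_to_gray(img)  # green -> 150, red -> 76
```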


BGR image channel suppression
- Blue channel suppression: img_BGR[:,:,0] = 0
- Green channel suppression: img_BGR[:,:,1] = 0
- Red channel suppression: img_BGR[:,:,2] = 0

HSV image channel adjustment
- Hue channel (H component) adjustment: img_HSV[:,:,0] += offset
- Saturation channel (S component) adjustment: img_HSV[:,:,1] *= correction factor
- Brightness channel (V component) adjustment: img_HSV[:,:,2] += offset

Threshold processing cv2.threshold(src, thresh, maxval, type, dst=None) -> retval, ndarray

notes:
""" threshold(src, thresh, maxval, type[, dst]) > retval, dst . @brief Applies a fixedlevel threshold to each array element. ... . @param src input array (multiplechannel, 8bit or 32bit floating point). . @param dst output array of the same size and type and the same number of channels as src. . @param thresh threshold value. . @param maxval maximum value to use with the #THRESH_BINARY and #THRESH_BINARY_INV thresholding . types. . @param type thresholding type (see #ThresholdTypes). . @return the computed threshold value if Otsu's or Triangle methods used. . . @sa adaptiveThreshold, findContours, compare, min, max """

@brief: applies a fixed-level threshold to each array element.
- Binarization (THRESH_BINARY): pixels with gray value > threshold t are set to maxval; pixels with gray value <= t are set to 0.
- Inverse binarization (THRESH_BINARY_INV): the opposite of binarization: pixels with gray value > t are set to 0; pixels with gray value <= t are set to maxval.
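The two rules can be imitated in plain NumPy (an illustrative sketch of the THRESH_BINARY / THRESH_BINARY_INV behavior, not a replacement for cv2.threshold):

```python
import numpy as np

def threshold_binary(src, thresh, maxval):
    """THRESH_BINARY: pixels > thresh become maxval, all others become 0."""
    return np.where(src > thresh, maxval, 0).astype(src.dtype)

def threshold_binary_inv(src, thresh, maxval):
    """THRESH_BINARY_INV: pixels > thresh become 0, all others become maxval."""
    return np.where(src > thresh, 0, maxval).astype(src.dtype)

gray = np.array([[10, 127, 200]], dtype=np.uint8)
b = threshold_binary(gray, 127, 255)         # [[0, 0, 255]]
binv = threshold_binary_inv(gray, 127, 255)  # [[255, 255, 0]]
```

Note that a pixel exactly equal to the threshold (127 here) falls into the "<= t" branch, matching the rule stated above.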

@param:
src: ndarray Input image
thresh: int Threshold t
maxval: int Maximum gray value
type: cv2.THRESH_... Threshold processing type

Viewing threshold processing types
import cv2
threshold_types = [i for i in dir(cv2) if i.startswith('THRESH')]
print(threshold_types)
['THRESH_BINARY',      # Binarization threshold processing
 'THRESH_BINARY_INV',  # Inverse binarization threshold processing
 'THRESH_MASK',
 'THRESH_OTSU',
 'THRESH_TOZERO',
 'THRESH_TOZERO_INV',
 'THRESH_TRIANGLE',
 'THRESH_TRUNC']



Histogram equalization cv2.equalizeHist(src, dst=None) -> ndarray
Gray-level histogram: counts how often each gray level occurs in an image. Plotted with gray level on the horizontal axis and frequency on the vertical axis, it describes the relationship between frequency and gray level. It is an important image feature and reflects the gray-level distribution of the image.
$$
\begin{align}
& \text{For a digital image of size } MN \text{ with gray levels in the range } [0, L-1], \text{ the histogram is the discrete function: } h(r_k) = n_k \\
& \text{where:} \\
& \qquad r_k \text{ is the } k\text{-th gray value} \\
& \qquad n_k \text{ is the number of pixels in the image with gray value } r_k \\
& \text{Normalizing the histogram by the total number of pixels: } p(r_k) = \frac{n_k}{MN}
\end{align}
$$
Histogram equalization: an image transform grounded in probability theory. It considers the gray-value distribution of the whole image and remaps the gray value of every pixel so that the gray levels are spread evenly. It effectively enhances images that are too dark, too bright, or lacking in detail, giving the image high contrast and a wide tonal range.
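The remapping described above can be sketched in NumPy (a minimal illustration of the classic CDF-based formula, assuming the image has at least two distinct gray levels; cv2.equalizeHist performs an equivalent computation):

```python
import numpy as np

def equalize_hist(gray):
    """Map each gray level through the normalized cumulative histogram,
    spreading the occupied levels over the full [0, 255] range."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()  # First nonzero CDF value
    # Classic equalization formula: rescale the CDF to [0, 255]
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[gray]  # Apply the lookup table to every pixel

# Four pixels at levels 50/100/150/200 get stretched to 0/85/170/255
out = equalize_hist(np.array([[50, 100], [150, 200]], dtype=np.uint8))
```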
notes:
""" equalizeHist(src[, dst]) > dst . @brief Equalizes the histogram of a grayscale image. ... . @param src Source 8bit single channel image. Single channel image . @param dst Destination image of the same size and type as src . """

@brief: equalizes the histogram of a grayscale image.

@param:
src: ndarray Source image (single-channel, 8-bit)
dst: ndarray Target image of the same size and type as src

Image morphological operation
Straightness / parallelism transformation
- Straightness: after an affine transformation, straight lines in the image remain straight lines.
- Parallelism: after an affine transformation, parallel lines remain parallel lines.
Comprehensive example
import cv2

img = cv2.imread('data/lena.jpeg')

# 1. Mirror (flip)
img_flip0 = cv2.flip(img, 0)        # Flip around the x axis
img_flip1 = cv2.flip(img, 1)        # Flip around the y axis
img_flip_minus = cv2.flip(img, -1)  # Flip around both the x and y axes

# 2. Affine transformation
translated_img = translate(src=img, x=10, y=20)  # Custom translation (defined below)
rotated_img = rotate(src=img, angle=90)          # Custom rotation (defined below)

# 3. Perspective transformation (see below)

# 4. Zoom (note: dsize is (width, height) and must be integers)
h, w = img.shape[:2]
equScaleDown_dst = cv2.resize(src=img, dsize=(w // 2, h // 2))     # Proportional reduction
nonEquScaleDown_dst = cv2.resize(src=img, dsize=(w - 50, h - 50))  # Non-proportional reduction
equScaleUp_dst = cv2.resize(src=img, dsize=(w * 2, h * 2))         # Proportional enlargement
nonEquScaleUp_dst = cv2.resize(src=img, dsize=(w + 50, h + 50))    # Non-proportional enlargement

# 5. Cropping
centerCrop_dst = center_crop(im=img, w=50, h=50)  # Custom center crop (defined below)
randomCrop_dst = random_crop(im=img, w=50, h=50)  # Custom random crop (defined below)

Mirror (flip) cv2.flip(src, flipCode, dst=None) -> ndarray

notes:
""" flip(src, flipCode[, dst]) > dst . @brief Flips a 2D array around vertical, horizontal, or both axes. ... . @param src input array. . @param dst output array of the same size and type as src. . @param flipCode a flag to specify how to flip the array; 0 means . flipping around the xaxis and positive value (for example, 1) means . flipping around yaxis. Negative value (for example, 1) means flipping . around both axes. . @sa transpose , repeat , completeSymm """

@brief: flips a 2D array around the vertical axis, the horizontal axis, or both axes.

@param:
src: ndarray Source image
flipCode: Union[-1, 0, 1] Flip axis
dst: ndarray Target image of the same size and type as src

flipCode values:
-1 Flip around both the x axis and the y axis
0 Flip around the x axis
1 Flip around the y axis


Affine transformation cv2.warpAffine(src, M, dsize, dst=None, flags=None, borderMode=None, borderValue=None)
Affine transformation: the image undergoes translation, rotation, and other operations through a series of geometric transformations, and the transformation preserves the straightness and parallelism of the image.
In computer graphics, homogeneous coordinates are introduced so that translation, rotation, and scaling can all be represented uniformly by matrices. (A 2x2 matrix cannot describe translation; only the 3x3 matrix form can describe 2D translation, rotation, and scaling uniformly. Likewise, a 4x4 matrix must be used to describe 3D transformations uniformly.)
At the application level, an affine transformation is determined by three fixed vertices.
[Figure: the three fixed vertices of an affine transformation]
notes:
""" warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]]) > dst . @brief Applies an affine transformation to an image. ... . @param src input image. . @param dst output image that has the size dsize and the same type as src . . @param M \f$2\times 3\f$ transformation matrix. . @param dsize size of the output image. ... . @sa warpPerspective, resize, remap, getRectSubPix, transform """

@brief: apply affine transformation to the image.

@param:
src: ndarray Source image
M: ndarray 2x3 transformation matrix
dsize: Union[Tuple, List] Size of the output image
dst: ndarray Target image (optional)
flags, borderMode, borderValue: optional

Fixed three vertex affine transformation
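A fixed-three-vertex affine transform maps three source points to three destination points, and the 2x3 matrix can be solved from those pairs. Below is a NumPy sketch of what cv2.getAffineTransform computes (the resulting matrix M is what you would pass to cv2.warpAffine):

```python
import numpy as np

def affine_matrix(src_pts, dst_pts):
    """Solve the 2x3 affine matrix M so that M @ [x, y, 1] = [x', y']
    for each of the three point pairs (the same idea as cv2.getAffineTransform)."""
    A = np.hstack([np.asarray(src_pts, dtype=np.float64),
                   np.ones((3, 1))])            # 3x3: rows [x, y, 1]
    B = np.asarray(dst_pts, dtype=np.float64)   # 3x2: rows [x', y']
    # Solve A @ X = B; M is the transpose of X (a 2x3 matrix)
    return np.linalg.solve(A, B).T

# A pure translation by (10, 20): each destination point = source point + (10, 20)
src = [[0, 0], [1, 0], [0, 1]]
dst = [[10, 20], [11, 20], [10, 21]]
M = affine_matrix(src, dst)  # expect [[1, 0, 10], [0, 1, 20]]
```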

Translation transformation
def translate(src, x: int, y: int):
    """Translate an image using a translation matrix.
    :param src: input image
    :param x: horizontal translation distance
    :param y: vertical translation distance
    :return: the translated image
    """
    h, w = src.shape[:2]  # Height and width of the original image (shape = (height, width, channels))
    M = np.float32([[1, 0, x], [0, 1, y]])  # Translation matrix
    # Affine transformation; note dsize is (width, height)
    shifted_dst = cv2.warpAffine(src=src, M=M, dsize=(w, h))
    return shifted_dst

Calculate translation matrix

Two dimensional translation matrix
[Figure: two-dimensional translation]
$$
\begin{align}
& \text{Point } P \text{ is translated in the } x \text{ and } y \text{ directions to point } P', \text{ giving:} \\
& \qquad x' = x + t_x \\
& \qquad y' = y + t_y \\
& \text{Written in matrix form:} \\
& \qquad \begin{bmatrix} x' \\ y' \end{bmatrix} =
\begin{bmatrix}
1 & 0 & t_x \\
0 & 1 & t_y
\end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \\
& \text{Introducing homogeneous coordinates:} \\
& \qquad \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} =
\begin{bmatrix}
1 & 0 & t_x \\
0 & 1 & t_y \\
0 & 0 & 1
\end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\end{align}
$$
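The homogeneous form can be verified numerically with a minimal NumPy sketch:

```python
import numpy as np

tx, ty = 10, 20
# 3x3 homogeneous translation matrix
T = np.array([[1, 0, tx],
              [0, 1, ty],
              [0, 0, 1]])
p = np.array([3, 4, 1])  # The point (3, 4) in homogeneous coordinates
p_translated = T @ p     # -> (13, 24, 1)
```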



Rotation transformation
def rotate(src, angle, center=None, scale=1.0):
    """Image rotation transformation
    :param src: input image
    :param angle: rotation angle
    :param center: center of rotation
    :param scale: scale factor
    :return: the rotated image
    """
    h, w = src.shape[:2]  # Height and width of the original image
    # Default the rotation center to the image center
    if center is None:
        center = (w / 2, h / 2)
    # Compute the rotation matrix
    M = cv2.getRotationMatrix2D(center, angle, scale)
    # Affine transformation using the rotation matrix
    rotated_dst = cv2.warpAffine(src, M, (w, h))
    return rotated_dst

Calculate the rotation matrix cv2.getRotationMatrix2D(center, angle, scale)

notes:
""" getRotationMatrix2D(center, angle, scale) > retval . @brief Calculates an affine matrix of 2D rotation. ... . . The transformation maps the rotation center to itself. If this is not the target, adjust the shift. . . @param center Center of the rotation in the source image. . @param angle Rotation angle in degrees. Positive values mean counterclockwise rotation (the . coordinate origin is assumed to be the topleft corner). . @param scale Isotropic scale factor. . . @sa getAffineTransform, warpAffine, transform """

Two-dimensional rotation principle: a 2D rotation is performed around a fixed point in the plane.

2D rotation around origin
[Figure: 2D rotation around the origin]
$$
\begin{align}
& \text{As shown in the figure, point } v \text{ rotates by angle } \theta \text{ around the origin to give point } v'. \\
& \text{Let } v = (x, y), \text{ let } r \text{ be the distance from the origin to } v, \text{ and let } \phi \text{ be the angle between the vector from the origin to } v \text{ and the x axis.} \\
& \text{Derive the coordinates } (x', y') \text{ of } v': \\
& \qquad x = r\cos\phi, \quad y = r\sin\phi \\
& \qquad x' = r\cos(\theta + \phi), \quad y' = r\sin(\theta + \phi) \\
& \text{Expanding with the angle-sum identities:} \\
& \qquad x' = r\cos\theta\cos\phi - r\sin\theta\sin\phi \\
& \qquad y' = r\sin\theta\cos\phi + r\cos\theta\sin\phi \\
& \text{Substituting the expressions for } x \text{ and } y: \\
& \qquad x' = x\cos\theta - y\sin\theta \\
& \qquad y' = x\sin\theta + y\cos\theta \\
& \text{Written in matrix form:} \\
& \qquad \begin{bmatrix} x' \\ y' \end{bmatrix} =
\begin{bmatrix}
\cos\theta & -\sin\theta \\
\sin\theta & \cos\theta
\end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
\end{align}
$$
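The final matrix form can be checked numerically (a minimal NumPy sketch):

```python
import numpy as np

def rotation_matrix(theta):
    """2x2 rotation matrix for a counterclockwise rotation by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

# Rotating (1, 0) by 90 degrees counterclockwise gives (0, 1)
v = rotation_matrix(np.pi / 2) @ np.array([1, 0])
```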
2D rotation around any point
Idea:
1. First translate the rotation point to the origin
2. Perform the rotation around the origin
3. Translate the rotation point back to its original position
[Figure: 2D rotation around an arbitrary point]
$$
\begin{align}
& \text{Rotation around an arbitrary point requires two translations.} \\
& \text{Let } T(t_x, t_y) \text{ be the translation taking the origin to the rotation point, } T(-t_x, -t_y) \text{ the translation taking the rotation point to the origin, and } R(\theta) \text{ the rotation matrix.} \\
& \text{Describing points as column vectors, the whole computation is:} \\
& \qquad v' = T(t_x, t_y)\, R(\theta)\, T(-t_x, -t_y)\, v = M v \\
& \text{The resulting rotation matrix } M \text{ is:} \\
& \qquad M =
\begin{bmatrix}
1 & 0 & t_x \\
0 & 1 & t_y \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
\cos\theta & -\sin\theta & 0 \\
\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & -t_x \\
0 & 1 & -t_y \\
0 & 0 & 1
\end{bmatrix}
\end{align}
$$
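The composition T(t_x, t_y) R(theta) T(-t_x, -t_y) can be checked numerically (a minimal NumPy sketch):

```python
import numpy as np

def rotate_about(theta, tx, ty):
    """M = T(tx, ty) @ R(theta) @ T(-tx, -ty): rotation about the point (tx, ty)."""
    c, s = np.cos(theta), np.sin(theta)
    T_fwd = np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]])
    R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
    T_back = np.array([[1, 0, -tx], [0, 1, -ty], [0, 0, 1]])
    return T_fwd @ R @ T_back

# Rotate (2, 1) by 90 degrees counterclockwise about (1, 1): expect (1, 2)
M = rotate_about(np.pi / 2, 1, 1)
p = M @ np.array([2, 1, 1])
```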





Perspective transformation cv2.warpPerspective(src, M, dsize, dst=None, flags=None, borderMode=None, borderValue=None)
A perspective transformation transforms the image based on four fixed vertices.
[Figure: the four fixed vertices of a perspective transformation]
notes:
""" warpPerspective(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]]) > dst . @brief Applies a perspective transformation to an image. ... . @param src input image. . @param dst output image that has the size dsize and the same type as src . . @param M \f$3\times 3\f$ transformation matrix. . @param dsize size of the output image. ... . @sa warpAffine, resize, remap, getRectSubPix, perspectiveTransform """

@brief: apply perspective transformation to the image.

@param:
src: ndarray Source image
M: ndarray 3x3 transformation matrix
dsize: Union[Tuple, List] Size of the output image

Fixed four vertex perspective transform
def perspective(src, src_points, dst_points):
    # Compute the 3x3 perspective matrix from four point pairs
    M = cv2.getPerspectiveTransform(src=src_points, dst=dst_points)
    h, w = src.shape[:2]
    perspective_dst = cv2.warpPerspective(src=src, M=M, dsize=(w, h))
    return perspective_dst

Calculate the perspective matrix cv2.getPerspectiveTransform(src, dst, solveMethod=None)

notes:
""" getPerspectiveTransform(src, dst[, solveMethod]) > retval . @brief Calculates a perspective transform from four pairs of the corresponding points. ... . @param src Coordinates of quadrangle vertices in the source image. . @param dst Coordinates of the corresponding quadrangle vertices in the destination image. . @param solveMethod method passed to cv::solve (#DecompTypes) . . @sa findHomography, warpPerspective, perspectiveTransform """




Zoom cv2.resize(src, dsize, dst=None, fx=None, fy=None, interpolation=None)

notes:
""" resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]) > dst . @brief Resizes an image. . @param src input image. . @param dst output image; it has the size dsize (when it is nonzero) or the size computed from . src.size(), fx, and fy; the type of dst is the same as of src. . @param dsize output image size; if it equals zero, it is computed as:... . @param interpolation interpolation method, see #InterpolationFlags . . @sa warpAffine, warpPerspective, remap """

(non) equal scale reduction

(non) equal scale magnification

View interpolation method
import cv2
interpolations = [i for i in dir(cv2) if i.startswith('INTER_')]
print(interpolations)
['INTER_AREA',
 'INTER_BITS',
 'INTER_BITS2',
 'INTER_CUBIC',
 'INTER_LANCZOS4',
 'INTER_LINEAR',        # Bilinear interpolation
 'INTER_LINEAR_EXACT',
 'INTER_MAX',           # Maximum interpolation
 'INTER_NEAREST',       # Nearest-neighbor interpolation
 'INTER_NEAREST_EXACT',
 'INTER_TAB_SIZE',
 'INTER_TAB_SIZE2']

Bilinear interpolation: the gray level at a given position is estimated from its four nearest neighbors.
$$
\begin{align}
& \text{Let } (x, y) \text{ be the position to estimate and } v(x, y) \text{ its gray value; using the four nearest} \\
& \text{neighbors, the estimate takes the form:} \\
& \qquad v(x, y) = ax + by + cxy + d \\
& \text{where the four coefficients are determined from the four nearest-neighbor pixel values.}
\end{align}
$$
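The four-neighbor estimate can be written out directly (an illustrative NumPy sketch in the equivalent area-weighted form; positions are assumed to lie strictly inside the image so that all four neighbors exist):

```python
import numpy as np

def bilinear(img, x, y):
    """Estimate the gray value at a non-integer position (x, y)
    from its four nearest neighbors."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = x0 + 1, y0 + 1
    dx, dy = x - x0, y - y0
    # Each neighbor is weighted by the area of the opposite sub-rectangle
    return (img[y0, x0] * (1 - dx) * (1 - dy) +
            img[y0, x1] * dx * (1 - dy) +
            img[y1, x0] * (1 - dx) * dy +
            img[y1, x1] * dx * dy)

img = np.array([[0, 100],
                [100, 200]], dtype=np.float64)
v = bilinear(img, 0.5, 0.5)  # Center of the 2x2 patch -> 100.0
```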





Crop array slice

Center clipping
def center_crop(im, w, h):
    """Crop a w x h window from the center of the image."""
    # Image center position (note shape = (height, width, channels))
    center_y, center_x = im.shape[0] // 2, im.shape[1] // 2
    # Top-left corner of the crop region
    start_x = center_x - w // 2
    start_y = center_y - h // 2
    new_img = im[start_y:start_y + h, start_x:start_x + w, :]
    return new_img

Random clipping
def random_crop(im, w, h):
    """Crop a w x h window at a random position."""
    # Random top-left corner, bounded so the crop stays inside the image
    start_x = np.random.randint(0, im.shape[1] - w)
    start_y = np.random.randint(0, im.shape[0] - h)
    # im[row, column, channel] = im[height, width, channel]; writing im[h, w]
    # takes all channels by default
    new_img = im[start_y:start_y + h, start_x:start_x + w, :]
    return new_img

Image arithmetic calculation
Comprehensive example
import cv2
- Image addition: array addition
  Uses:
  - Watermark overlay
  - Denoising (cumulative averaging over multiple frames)
- Image subtraction: array subtraction
  Uses:
  - Finding differences between images
  - Background elimination and motion-trajectory detection over consecutive frames
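One practical detail behind these operations: plain NumPy arithmetic on uint8 images wraps around modulo 256, while cv2.add / cv2.subtract saturate at the 0..255 bounds. A minimal NumPy-only sketch of the difference (cv2.add itself is not called here):

```python
import numpy as np

a = np.uint8([[250]])
b = np.uint8([[10]])

# Plain ndarray addition is modular: 250 + 10 = 260 wraps to 4
wrap = a + b

# Saturating addition (the behavior of cv2.add) clips at 255 instead
sat = np.clip(a.astype(np.int16) + b.astype(np.int16), 0, 255).astype(np.uint8)
```

For image work the saturating version is almost always what is wanted, which is why cv2.add is preferred over `+` for uint8 images.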
Erosion and dilation (morphological filtering)
Image dilation and erosion are two basic morphological operations, mainly used to find the maximal and minimal regions in an image.
import cv2
import numpy as np

img = cv2.imread('data/lena.jpeg')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
t, img_binary = cv2.threshold(img_gray, 127, 255, type=cv2.THRESH_BINARY)

# Define the structuring element (kernel)
kernel = np.ones((5, 5), np.uint8)

# 1. Image erosion
eroded_img_binary = cv2.erode(src=img_binary, kernel=kernel, iterations=5)
eroded2_img_binary = cv2.morphologyEx(src=img_binary, op=cv2.MORPH_ERODE, kernel=kernel, iterations=5)

# 2. Image dilation
dilated_img_binary = cv2.dilate(src=img_binary, kernel=kernel, iterations=5)
dilated2_img_binary = cv2.morphologyEx(src=img_binary, op=cv2.MORPH_DILATE, kernel=kernel, iterations=5)

# 3. Opening (erode then dilate: separates elements, removes external noise)
open_img_binary = cv2.dilate(src=eroded_img_binary, kernel=kernel, iterations=5)
open2_img_binary = cv2.morphologyEx(src=img_binary, op=cv2.MORPH_OPEN, kernel=kernel, iterations=5)

# 4. Closing (dilate then erode: connects elements, removes internal noise)
close_img_binary = cv2.erode(src=dilated_img_binary, kernel=kernel, iterations=5)
close2_img_binary = cv2.morphologyEx(src=img_binary, op=cv2.MORPH_CLOSE, kernel=kernel, iterations=5)

# 5. Morphological gradient (edges of the foreground)
gradient_img_binary = dilated_img_binary - eroded_img_binary

# 6. Top-hat operation (external noise: original minus opening)
tophat_img_binary = img_binary - open_img_binary

# 7. Black-hat operation (internal noise: closing minus original)
blackhat_img_binary = close_img_binary - img_binary

Morphological extension cv2.morphologyEx(src, op, kernel, dst=None, anchor=None, iterations=None, borderType=None, borderValue=None)
Description: integrated advanced morphological transformation function
notes:
""" morphologyEx(src, op, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) > dst . @brief Performs advanced morphological transformations. ... . @param src Source image. The number of channels can be arbitrary. The depth should be one of . CV_8U, CV_16U, CV_16S, CV_32F or CV_64F. . @param dst Destination image of the same size and type as source image. . @param op Type of a morphological operation, see #MorphTypes . @param kernel Structuring element. It can be created using #getStructuringElement. . @param anchor Anchor position with the kernel. Negative values mean that the anchor is at the . kernel center. . @param iterations Number of times erosion and dilation are applied. . @param borderType Pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported. . @param borderValue Border value in case of a constant border. The default value has a special . meaning. . @sa dilate, erode, getStructuringElement . @note The number of iterations is the number of times erosion or dilatation operation will be applied. . For instance, an opening operation (#MORPH_OPEN) with two iterations is equivalent to apply . successively: erode > erode > dilate > dilate (and not erode > dilate > erode > dilate). """

@brief: perform advanced morphological transformation.

@param:
src: ndarray Source image
op: cv2.MORPH_... Morphological operation type
kernel: ndarray Structuring element (convolution kernel)
anchor: Tuple Anchor position; the default (-1, -1) means the kernel center
iterations: int Number of iterations (default 1)

View the morphological operation types
import cv2
morphs = [i for i in dir(cv2) if i.startswith('MORPH_')]
print(morphs)
['MORPH_BLACKHAT',  # Black-hat operation
 'MORPH_CLOSE',     # Closing operation
 'MORPH_CROSS',
 'MORPH_DILATE',    # Dilation operation
 'MORPH_ELLIPSE',
 'MORPH_ERODE',     # Erosion operation
 'MORPH_GRADIENT',  # Morphological gradient
 'MORPH_HITMISS',
 'MORPH_OPEN',      # Opening operation
 'MORPH_RECT',
 'MORPH_TOPHAT']    # Top-hat operation


Image corrosion cv2.erode(src, kernel, dst=None, anchor=None, iterations=None, borderType=None, borderValue=None)
Description: shrinks and thins the highlighted (white) regions of a binary image; the result has smaller highlighted regions than the original, i.e. the highlights are eroded away. (Color and grayscale images can also be eroded, but the effect is less obvious.)
Purpose: "shrink" or "thin" the foreground (highlighted regions) of a binary image, for edge denoising and element separation.
Principle:
$$
\begin{align}
& \text{Erosion operator: } \ominus. \text{ Erosion is defined as:} \\
& \qquad A \ominus B = \{x \mid B_x \subseteq A\} \\
& \text{Definition: image } A \text{ is eroded by structuring element (convolution template) } B: \text{ as } B \\
& \text{slides over } A, \text{ the minimum pixel value in the area covered by } B \text{ replaces the value} \\
& \text{of the anchor (reference) pixel.} \\
& \text{Erosion process: the template (convolution kernel) scans the original image pixel by pixel} \\
& \text{around its center point; for a binary image, if all pixels covered by the kernel are 1,} \\
& \text{the output pixel is 1, otherwise 0 (0 dark, 1 bright). It follows that the larger the} \\
& \text{kernel and the more iterations, the deeper the erosion and the more the highlighted} \\
& \text{regions shrink.}
\end{align}
$$
notes:
""" erode(src, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) > dst . @brief Erodes an image by using a specific structuring element. ... . @param src input image; the number of channels can be arbitrary, but the depth should be one of . CV_8U, CV_16U, CV_16S, CV_32F or CV_64F. . @param dst output image of the same size and type as src. . @param kernel structuring element used for erosion; if `element=Mat()`, a `3 x 3` rectangular . structuring element is used. Kernel can be created using #getStructuringElement. . @param anchor position of the anchor within the element; default value (1, 1) means that the . anchor is at the element center. . @param iterations number of times erosion is applied. . @param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported. . @param borderValue border value in case of a constant border . @sa dilate, morphologyEx, getStructuringElement """


Image dilation cv2.dilate(src, kernel, dst=None, anchor=None, iterations=None, borderType=None, borderValue=None)
Description: expands the highlighted (white) regions of a binary image. The result is larger than the highlighted area of the original image, i.e. the highlighted area is dilated. (Color and grayscale images can also be dilated, but the effect is less obvious.)
Purpose: expand the foreground (highlighted regions) of a binary image, for internal denoising and connecting elements (e.g. filling holes left after image segmentation).
Principle:
$$
\begin{align}
&\text{Dilation operator: } \oplus\text{; the dilation operation is defined as:}\\
&\qquad A \oplus B=\{\,x \mid B_x \cap A \neq \varnothing\,\}\\
&\text{Meaning: image } A \text{ dilated by structuring element (template) } B\text{: slide } B \text{ over } A\text{,}\\
&\qquad \text{take the maximum pixel value within the region covered by } B\text{, and replace the}\\
&\qquad \text{anchor pixel with that maximum.}\\
&\text{Dilation process: the template (kernel) scans the image pixel by pixel at its center point;}\\
&\qquad \text{for the pixels covered by the kernel, the output is 1 if at least one value is 1, otherwise 0}\\
&\qquad \text{(0 is dark, 1 is bright).}\\
&\text{Hence the larger the kernel and the more iterations, the deeper the dilation and the more}\\
&\qquad \text{the highlighted regions expand.}
\end{align}
$$
notes:
""" dilate(src, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) > dst . @brief Dilates an image by using a specific structuring element. ... . @param src input image; the number of channels can be arbitrary, but the depth should be one of . CV_8U, CV_16U, CV_16S, CV_32F or CV_64F. . @param dst output image of the same size and type as src. . @param kernel structuring element used for dilation; if elemenat=Mat(), a 3 x 3 rectangular . structuring element is used. Kernel can be created using #getStructuringElement . @param anchor position of the anchor within the element; default value (1, 1) means that the . anchor is at the element center. . @param iterations number of times dilation is applied. . @param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not suported. . @param borderValue border value in case of a constant border . @sa erode, morphologyEx, getStructuringElement """


Image opening operation
Description: the image is eroded first and then dilated
Purpose: denoising.
- Foreground: remove external noise; separate elements.
- Background: eliminate small highlighted areas.
(figure: image opening operation)

Image closure operation
Description: the image is dilated first and then eroded
Purpose: denoising.
- Foreground: remove internal noise; connect elements (eliminate small black holes in the foreground).
(figure: image closing operation)

Morphological gradient
Description: the difference between the dilated and eroded images (dilation minus erosion)
Purpose: highlights the outline around highlighted regions, offering a starting point for contour detection.

Top hat operation
Description: the difference between the original image and its opening (erosion then dilation)
Purpose: background extraction (extracting small bright blocks)

Black hat operation
Description: the difference between the closing of the image and the original image
Purpose: extraction of small black holes in the foreground
Image gradient processing
In the process of image enhancement, various image smoothing algorithms are usually used to eliminate noise. The common noise of image mainly includes additive noise, multiplicative noise and quantization noise.
Generally speaking, the energy of the image is mainly concentrated in its lowfrequency part, the frequency band of the noise is mainly in the highfrequency part, and the image edge information is also mainly concentrated in its highfrequency part.
This will lead to the blurring of image edge and image contour after smoothing of the original image. In order to reduce the impact of such adverse effects, it is necessary to use image sharpening technology to make the edge of the image clear. The purpose of image sharpening processing is to make the edge, contour and details of the image clear:
- The fundamental reason the smoothed image becomes blurred is that it has undergone an averaging or integration operation, so the inverse operation (differentiation) can make it clear again. Differentiation computes the rate of change of a signal; by the differentiation property of the Fourier transform, it strengthens high-frequency components.
- Considered in the frequency domain, the essence of image blur is that high-frequency components are attenuated, so a high-pass filter can be used to make the image clear.
However, an image to be sharpened must have a high signal-to-noise ratio; otherwise sharpening lowers it further, amplifying noise more than signal. Therefore noise is generally removed or reduced before sharpening.
Original image → smoothing → blurred edges/contours → sharpening → clear edges/contours/details
Image gradient: the rate at which pixel values change. At object edges the gray value changes sharply and the gradient is large; in smoother parts of an object the gray value changes little and the gradient is small. (The gradient is large in edge regions with large gray changes, small where gray changes gently, and zero where gray is uniform. Sharpening increases the gradient; blurring decreases it.)

An image can be viewed as a two-dimensional discrete function, and the rate of change of image gray levels is expressed by differentiation.

In calculus, the first-order differential of a one-dimensional function $f(x)$ and the first-order partial differentials of a two-dimensional function $f(x,y)$ are defined as:
$$
\frac{df(x)}{dx}=\lim_{\Delta x \to 0} \frac{f(x+\Delta x)-f(x)}{\Delta x},\qquad
\frac{\partial f(x,y)}{\partial x}=\lim_{\Delta x \to 0} \frac{f(x+\Delta x,y)-f(x,y)}{\Delta x},\qquad
\frac{\partial f(x,y)}{\partial y}=\lim_{\Delta y \to 0} \frac{f(x,y+\Delta y)-f(x,y)}{\Delta y}
$$
The image is a two-dimensional discrete function (sampled pixel by pixel), so $\Delta x$ and $\Delta y$ are at least 1. The image gradient is therefore computed from forward differences along the two axes, giving the gradient magnitude $M$ of the image:
$$
\begin{align}
& M(x,y)=\sqrt{g_x^2+g_y^2},\quad \text{often approximated as } M(x,y)\approx |g_x|+|g_y| \\
&\text{where } g_x = f(x+1,y)-f(x,y),\quad g_y = f(x,y+1)-f(x,y)
\end{align}
$$
Template operation:

Template (convolution kernel): a small n*n matrix $w$ (n is usually odd, and n is called the template size) whose entries are called weights. During computation the template's anchor point (usually its center) is aligned with each pixel $p$ of the image in turn, and one operation takes as input the template weights $w$ and the image pixel values $i$ covered by the template.

Template convolution (weighted summation):
$$
p=\frac{\sum_{i=1}^n \sum_{j=1}^n w_{i,j}\, i_{i,j}}{\sum_{i=1}^n \sum_{j=1}^n w_{i,j}}
$$
Template sorting:
- Maximum template sort: $p=\max(sorted(i))$
- Minimum template sort: $p=\min(sorted(i))$
- Median template sort: $p=\mathrm{median}(sorted(i))$
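A minimal pure-NumPy sketch of the weighted-sum template operation defined above (the function name template_weighted_mean is hypothetical, and border pixels are simply skipped for brevity):

```python
import numpy as np

def template_weighted_mean(img, w):
    """Weighted-sum template operation p = sum(w*i)/sum(w).

    Sketch of the formula above; border pixels are left unchanged.
    Assumes a square template with odd size n.
    """
    n = w.shape[0]
    r = n // 2
    out = img.astype(float).copy()
    for y in range(r, img.shape[0] - r):
        for x in range(r, img.shape[1] - r):
            # Pixels covered by the template, anchored at (y, x)
            patch = img[y - r:y + r + 1, x - r:x + r + 1]
            out[y, x] = np.sum(w * patch) / np.sum(w)
    return out

img = np.array([[10, 10, 10],
                [10, 100, 10],
                [10, 10, 10]], dtype=float)
w = np.ones((3, 3))  # all-ones weights reduce to a plain neighborhood mean
print(template_weighted_mean(img, w)[1, 1])  # (8*10 + 100) / 9 = 20.0
```

With all-ones weights this is exactly the mean filter described in the next section; changing `w` gives weighted-average or Gaussian-style templates.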
Comprehensive example
import cv2

img = cv2.imread('data/lena.jpeg')
Image smoothing (denoising, blur)
Comprehensive example
import cv2

img = cv2.imread('data/lena.jpeg')

# 1. Mean filtering
meanBlur_img = cv2.blur(src=img, ksize=(3, 3))

# 2. Median filtering
medianBlur_img = cv2.medianBlur(src=img, ksize=3)

# 3. Gaussian filtering
im_gaussian_blur = cv2.GaussianBlur(src=img, ksize=(5, 5), sigmaX=3)

Mean filtering cv2.blur(src, ksize, dst=None, anchor=None, borderType=None)
Description: mean filter refers to a filter with template weight of 1. It takes the neighborhood average of pixels as the output result through convolution operation.
$$
\begin{align}
&\text{For example, a } 3\times 3 \text{ mean filter (ksize=(3,3)):}\quad
\begin{bmatrix}
1 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 1
\end{bmatrix}
\end{align}
$$
Purpose: smooths the image, but the image becomes more blurred as the template size increases.
Characteristics:
- Advantages: fast and algorithmically simple
- Disadvantages: cannot remove noise, only weakly attenuates it

notes:
""" blur(src, ksize[, dst[, anchor[, borderType]]]) > dst . @brief Blurs an image using the normalized box filter. ... . @param src input image; it can have any number of channels, which are processed independently, but . the depth should be CV_8U, CV_16U, CV_16S, CV_32F or CV_64F. . @param dst output image of the same size and type as src. . @param ksize blurring kernel size. : tuple . @param anchor anchor point; default value Point(1,1) means that the anchor is at the kernel . center. . @param borderType border mode used to extrapolate pixels outside of the image, see #BorderTypes. #BORDER_WRAP is not supported. . @sa boxFilter, bilateralFilter, GaussianBlur, medianBlur """

Median filter cv2.medianBlur(src, ksize, dst=None)
Description: median filter belongs to the filter of template sorting operation. It outputs the sorted median value of pixels in the neighborhood instead of the original pixel value.
Purpose: denoising (good suppression effect on salt and pepper noise) while preserving the sharpness of the original image (sharp edge and less blur as far as possible); However, it will destroy the linear relationship in the image and is not suitable for processing precision images with more point and line details.
notes:
""" medianBlur(src, ksize[, dst]) > dst . @brief Blurs an image using the median filter. ... . @note The median filter uses #BORDER_REPLICATE internally to cope with border pixels, see #BorderTypes . . @param src input 1, 3, or 4channel image; when ksize is 3 or 5, the image depth should be . CV_8U, CV_16U, or CV_32F, for larger aperture sizes, it can only be CV_8U. . @param dst destination array of the same size and type as src. . @param ksize aperture linear size; it must be odd and greater than 1, for example: 3, 5, 7 ... : integer . @sa bilateralFilter, blur, boxFilter, GaussianBlur """


Gaussian filter cv2.GaussianBlur(src, ksize, sigmaX, dst=None, sigmaY=None, borderType=None)
Description: the template of Gaussian filter determines the template coefficient according to the Gaussian distribution. The weight near the center is greater than that at the edge.
$$
\begin{align}
&\text{For example, a } 5\times 5 \text{ Gaussian filter (ksize=(5,5)):}\quad
\begin{bmatrix}
1 & 4 & 7 & 4 & 1 \\
4 & 16 & 26 & 16 & 4 \\
7 & 26 & 41 & 26 & 7 \\
4 & 16 & 26 & 16 & 4 \\
1 & 4 & 7 & 4 & 1
\end{bmatrix}
\end{align}
$$
Purpose: smooth and denoise, and reduce the influence of the increase of template size on image fuzziness in mean filtering.
notes:
""" GaussianBlur(src, ksize, sigmaX[, dst[, sigmaY[, borderType]]]) > dst . @brief Blurs an image using a Gaussian filter. . . The function convolves the source image with the specified Gaussian kernel. Inplace filtering is . supported. . . @param src input image; the image can have any number of channels, which are processed . independently, but the depth should be CV_8U, CV_16U, CV_16S, CV_32F or CV_64F. . @param dst output image of the same size and type as src. . @param ksize Gaussian kernel size. ksize.width and ksize.height can differ but they both must be : tuple . positive and odd. Or, they can be zero's and then they are computed from sigma. . @param sigmaX Gaussian kernel standard deviation in X direction. . @param sigmaY Gaussian kernel standard deviation in Y direction; if sigmaY is zero, it is set to be . equal to sigmaX, if both sigmas are zeros, they are computed from ksize.width and ksize.height, . respectively (see #getGaussianKernel for details); to fully control the result regardless of . possible future modifications of all this semantics, it is recommended to specify all of ksize, . sigmaX, and sigmaY. . @param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported. . . @sa sepFilter2D, filter2D, blur, boxFilter, bilateralFilter, medianBlur """

@param:
sigmaX: float  Standard deviation of the Gaussian kernel along the X axis
sigmaY: float  Standard deviation along the Y axis; defaults to sigmaX when 0

Image sharpening (edge detection)
Sharpening highlights details (boundaries), so pixels at edges need to be strengthened (for example, by using the gradient value directly as the pixel's gray level or RGB component).
Sharpening reduces blur by enhancing high-frequency components, strengthening the image's edges, contours and fine detail, and increasing gray-level contrast, which aids later target recognition and processing. Note that sharpening enhances not only edges but also image noise.
An edge-detection operator examines each pixel's neighborhood and quantifies the rate of gray-level change, usually including a determination of direction. Most operators are convolution-based, using directional templates.
For edge detection, pixels whose gray level exceeds a set threshold are set to 0 and the rest to 255 (or vice versa), according to the chosen convention.
Comprehensive example
import cv2

im = cv2.imread('data/lena.png', 0)  # read as grayscale

# First-order differential sharpening: 1. Sobel filtering
im_sobel = cv2.Sobel(im, ddepth=cv2.CV_64F, dx=1, dy=1, ksize=5)
cv2.imshow('im_sobel', im_sobel)

# Second-order differential sharpening: 1. Laplacian filtering
im_lap = cv2.Laplacian(im, cv2.CV_64F)
cv2.imshow('im_lap', im_lap)
cv2.imshow('im_lap-im', im_lap - im)

# Multi-stage algorithm: 1. Canny filtering
im_canny = cv2.Canny(im, 50, 240)
im_blur_canny = cv2.Canny(cv2.GaussianBlur(im, ksize=(3, 3), sigmaX=0.5), 50, 240)
First order differential sharpening

Roberts operator

Sobel operator cv2.Sobel(src, ddepth, dx, dy, dst=None, ksize=None, scale=None, delta=None, borderType=None)
Description: the simplest approximation of the gradient is
$$
g_x=(w_7+2w_8+w_9)-(w_1+2w_2+w_3),\qquad g_y=(w_3+2w_6+w_9)-(w_1+2w_4+w_7)
$$
- The operators are
$$
\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}
$$
- The gradient image is computed as $M(x,y)=|g_x|+|g_y|$
Purpose:

notes:
""" Sobel(src, ddepth, dx, dy[, dst[, ksize[, scale[, delta[, borderType]]]]]) > dst . @brief Calculates the first, second, third, or mixed image derivatives using an extended Sobel operator. ... . @param src input image. . @param dst output image of the same size and the same number of channels as src . . @param ddepth output image depth, see @ref filter_depths "combinations"; in the case of . 8bit input images it will result in truncated derivatives. . @param dx order of the derivative x. . @param dy order of the derivative y. . @param ksize size of the extended Sobel kernel; it must be 1, 3, 5, or 7. . @param scale optional scale factor for the computed derivative values; by default, no scaling is . applied (see #getDerivKernels for details). . @param delta optional delta value that is added to the results prior to storing them in dst. . @param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported. . @sa Scharr, Laplacian, sepFilter2D, filter2D, GaussianBlur, cartToPolar """

Prewitt operator
Second order differential sharpening

Laplacian operator cv2.Laplacian(src, ddepth, dst=None, ksize=None, scale=None, delta=None, borderType=None)
Description: the Laplacian is defined as
$$
\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = f(x+1,y)+f(x-1,y)+f(x,y+1)+f(x,y-1)-4f(x,y)
$$
- The operator is
$$
\begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}
$$
- The differential image is computed by correlation/convolution with this template
Purpose:

notes:
""" Laplacian(src, ddepth[, dst[, ksize[, scale[, delta[, borderType]]]]]) > dst . @brief Calculates the Laplacian of an image. ... . @param src Source image. . @param dst Destination image of the same size and the same number of channels as src . . @param ddepth Desired depth of the destination image. . @param ksize Aperture size used to compute the secondderivative filters. See #getDerivKernels for . details. The size must be positive and odd. . @param scale Optional scale factor for the computed Laplacian values. By default, no scaling is . applied. See #getDerivKernels for details. . @param delta Optional delta value that is added to the results prior to storing them in dst . . @param borderType Pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported. . @sa Sobel, Scharr """

LoG edge operator

Laplacian of Gaussian (LoG) operator
Multistage algorithm

Canny algorithm cv2.Canny(image, threshold1, threshold2, edges=None, apertureSize=None, L2gradient=None)
Steps:
- Image denoising: prevents false edges from being detected.
- Compute the image gradient: obtains the set of all candidate edges.
- Non-maximum suppression: keeps only the local maximum of gradient magnitude along the gradient direction, thinning edges (multi-pixel-wide edges → single-pixel-wide edges).
- Double-threshold filtering (hysteresis): keeps strong edges, completes unclosed edges, and discards weak edges.

notes:
""" Canny(image, threshold1, threshold2[, edges[, apertureSize[, L2gradient]]]) > edges . @brief Finds edges in an image using the Canny algorithm @cite Canny86 . ... . @param image 8bit input image. . @param edges output edge map; single channels 8bit image, which has the same size as image . . @param threshold1 first threshold for the hysteresis procedure. Low threshold . @param threshold2 second threshold for the hysteresis procedure. High threshold ... """
Custom filtering
Comprehensive example
import numpy as np
import cv2

im = cv2.imread('data/lena.jpg')

# Custom arithmetic-mean operator (smoothing)
simple_mean_kernel = np.array([
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1]
], dtype=float) / 9

# Smoothing filter (ddepth=-1 keeps the source depth)
im_filter = cv2.filter2D(src=im, ddepth=-1, kernel=simple_mean_kernel)

# Custom two-axis Laplacian operator
laplacian_x_y = np.array([
    [0, -1, 0],
    [-1, 4, -1],
    [0, -1, 0]
]) / 3.0  # scaled down to prevent the result from being too dark

# Sharpening filter
im_sharpen_2 = cv2.filter2D(im, -1, laplacian_x_y)

Two-dimensional filtering cv2.filter2D(src, ddepth, kernel, dst=None, anchor=None, delta=None, borderType=None)

notes:
""" filter2D(src, ddepth, kernel[, dst[, anchor[, delta[, borderType]]]]) > dst . @brief Convolves an image with the kernel. ... . @param src input image. . @param dst output image of the same size and the same number of channels as src. . @param ddepth desired depth of the destination image, see @ref filter_depths "combinations" . @param kernel convolution kernel (or rather a correlation kernel), a singlechannel floating point . matrix; if you want to apply different kernels to different channels, split the image into . separate color planes using split and process them individually. . @param anchor anchor of the kernel that indicates the relative position of a filtered point within . the kernel; the anchor should lie within the kernel; default value (1,1) means that the anchor . is at the kernel center. . @param delta optional value added to the filtered pixels before storing them in dst. . @param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported. . @sa sepFilter2D, dft, matchTemplate """

theory
Spatial domain processing
Operate on image pixels, which can be divided into two categories:

Grayscale transformation: operates on individual pixels of the image, mainly for contrast and threshold processing (point processing techniques)

Spatial filtering: operations that improve the image, such as sharpening, by processing the neighborhood of each pixel (neighborhood processing techniques)
Spatial domain processing representation
$$
\begin{align}
& g(x,y) = T[f(x,y)] \\
&\text{where:}\\
&\qquad f(x,y) \text{ is the input image}\\
&\qquad g(x,y) \text{ is the processed output image}\\
&\qquad T \text{ is an operator on } f \text{ defined over a neighborhood of the point } (x,y)\text{,}\\
&\qquad\qquad \text{applicable to a single image or to an image set}\\
&\qquad \text{the typical neighborhood is a small rectangle centered on } (x,y)\text{; when it contains}\\
&\qquad\qquad \text{pixels outside the image, two treatments are used:}\\
&\qquad\qquad [1]\ \text{ignore the outside neighbors}\\
&\qquad\qquad [2]\ \text{pad the outside neighbors (generally with 0)}
\end{align}
$$
This process is called spatial filtering, and the neighborhood together with predefined operations is called spatial filter (spatial mask / core / template / window)
wave filtering
The word filter is borrowed from frequency domain processing, which means to accept (pass) or reject a certain frequency component. The classification methods are as follows:
 Low pass filter: a filter that accepts (passes) low frequencies and blurs (smoothes) an image with its final effect.
- High pass filter: a filter that accepts (passes) high frequencies, with the final effect of sharpening the image (enhancing edges and details).
Spatial filter
The spatial filter has linear and nonlinear operations (while the frequency domain filter cannot do nonlinear filtering):
 Linear spatial filter: performs linear operations on image pixels. There are two types of operations (but convolution is always confused with correlation):
 Correlation: the process of moving the filter template over the image and calculating the sum of the product of each position.
 Correlation is a function of filter displacement.
- Correlating a function (filter) with a discrete unit impulse (a function that is all zeros except for a single 1) produces a copy of the function rotated 180°, located at the position of the impulse (the single 1).
- For an image (a two-dimensional function), the correlation of an m*n filter $w(x,y)$ with an image $f(x,y)$ is written, for each pixel $(x,y)$ of the image:
$$
w(x,y)\ ☆\ f(x,y)=\sum_{s=-m//2}^{m//2}\ \sum_{t=-n//2}^{n//2} w(s,t)\,f(x+s,y+t)
$$
 Convolution: the filter is first rotated 180 ° and then correlated.
- The filter is rotated in advance and then correlated; equivalently, convolving the filter with the unit impulse yields an unrotated copy of the function at the impulse.
- The two-dimensional image convolution operation is written, for each pixel $(x,y)$ of the image:
$$
w(x,y)\ ★\ f(x,y)=\sum_{s=-m//2}^{m//2}\ \sum_{t=-n//2}^{n//2} w(s,t)\,f(x-s,y-t)
$$
 Nonlinear spatial filter:
 Statistical sorting: the response of this filter is based on the sorting of pixels contained in the image area surrounded by the filter, and the value determined by the statistical sorting result is used to replace the value of the central pixel.
Generation of spatial filter template
Smoothing spatial filter
Smoothing filters are used for blur processing and noise reduction. Blur processing is often used in preprocessing tasks (such as removing trivial details in the image and bridging the gap between lines / curves before large target extraction, also known as denoising). The grayscale of smaller objects is mixed with the background, and larger objects become like "spots" and easy to detect. Blur an image to get a rough description of ROI.
Smooth linear (mean) spatial filter
The output (response) of a smoothing linear spatial filter is the simple average of the pixels in the neighborhood covered by the filter template, hence the name mean filter. Because typical random noise consists of sharp gray-level changes, replacing each pixel with the average gray value of its template-defined neighborhood reduces those sharp changes. However, image edges are also composed of sharp gray-level changes, so mean filtering has the negative side effect of blurring edges.
Main applications of mean filtering:
 Remove irrelevant details in the image. This "irrelevant" refers to the pixel area which is smaller than the size of the filter template.
 Get a rough description of ROI, so that the gray level of smaller objects is fused with the background, and larger objects become like "spots" and easy to detect. (the size of the template is determined by the size of the smaller object to be fused by the background)
Classification of mean filter:
 Arithmetic mean filter
- Weighted average filter: weights are set by distance from the template anchor (center); the farther a position is from the anchor, the smaller its weight, which reduces blurring during smoothing.
Smoothing nonlinear (statistical ordering) spatial filter
Classification of statistical sorting filter:
 Median filter: replace the value of the pixel with the median of the gray level in the neighborhood of the pixel.
- Application: excellent at removing certain kinds of random noise (impulse / salt-and-pepper noise, superimposed on the image as black and white dots), while blurring the image less than a linear smoothing filter of the same size.
- Method: an m*m median filter removes isolated pixel clusters that are brighter or darker than their neighbors and occupy fewer than $m^2/2$ pixels; "remove" means they are forced to the median gray level of the neighborhood, while larger clusters are much less affected.
 Maximum filter
 Minimum filter
Sharpening Spatial Filters
The main purpose of sharpening is to highlight gray-level transitions. Edges in digital images often resemble ramp transitions in gray level. (Noise is also a gray-level discontinuity, so sharpening an image directly, without smoothing first, easily produces pseudo-edges.)
Averaging in smoothing is analogous to integration, so logically its inverse, sharpening, should be analogous to spatial differentiation. Moreover, differentiation of any order is a linear operation, so sharpening operators defined by differentiation are linear operators.
The sharpening operator is defined and realized by digital differentiation (the response intensity of the differential operator is directly proportional to the mutation degree of the point where the image is operated by the operator), and the image differential enhances the edge or other mutations (such as noise) and weakens the area with slow gray change.
The differential operator only emphasizes the sudden change of gray level in the image, not the region where the gray level changes slowly. Differential sharpening produces an image that superimposes light gray edges and abrupt points on a dark background.
The sum of all coefficients of a differential operator template is 0, so the response in regions of constant gray level is 0 (a black background), consistent with the behavior of the differential itself.
Digital image differential (derivative) definition
First-order differential of a digital image $f(x,y)$
Basic definition:
$$
\frac{\partial f}{\partial x} = f(x+1,y)-f(x,y),\qquad \frac{\partial f}{\partial y} = f(x,y+1)-f(x,y)
$$
Definition of terms:
- The differential value in constant gray regions is zero
- The differential value is nonzero at the onset of a gray step or ramp
- The differential value along a gray ramp is nonzero
Second-order differential of a digital image $f(x,y)$
Basic definition:
$$
\frac{\partial^2 f}{\partial x^2} = f(x+1,y)+f(x-1,y)-2f(x,y),\qquad \frac{\partial^2 f}{\partial y^2} = f(x,y+1)+f(x,y-1)-2f(x,y)
$$
Definition of terms:
 In the constant region, the differential value is zero
 The differential value is nonzero at the beginning of the grayscale step \ slope
 The differential value along the gray slope is nonzero
For enhancing edges, the first-order derivative is generally inferior to the second-order derivative:
 First-order derivative: because the derivative is nonzero along a ramp, first-order differentiation of an image produces thicker edges.
 Second-order derivative: the gray-level ramp/step is bracketed by zero responses at its onset and end, so second-order differentiation produces a one-pixel-wide double edge; it is stronger than the first-order derivative at enhancing fine detail.
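The double-edge response can be seen with a second-order central difference on the same kind of 1-D profile (a sketch; the array is a made-up stand-in):

```python
import numpy as np

# 1-D gray-level profile: flat region, ramp of constant slope, step
f = np.array([5, 5, 5, 6, 7, 8, 8, 20, 20], dtype=np.int32)

# Second-order central difference: f(x+1) + f(x-1) - 2 f(x)
d2 = f[2:] + f[:-2] - 2 * f[1:-1]
print(d2.tolist())  # [0, 1, 0, 0, -1, 12, -12]
```

It is zero along the constant-slope ramp, responds with opposite signs at the ramp's onset and end, and produces a double (positive/negative) response at the step.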
Implementation of the second-order derivative of a two-dimensional function and its application to image sharpening
Method: define a discrete formula for the second-order derivative, then construct a filter template based on that formula
Isotropic filter: a filter whose response is independent of the direction of the discontinuities in the image it acts on (i.e., it is rotation-invariant: rotating the image and then filtering gives the same result as filtering and then rotating)
Laplacian operator: the simplest isotropic derivative operator

For a two-dimensional image $f(x,y)$, the Laplacian over the $x$ and $y$ axes is defined as: $\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = f(x+1,y)+f(x-1,y)+f(x,y+1)+f(x,y-1)-4f(x,y)$. The formula is implemented as a filter template (a linear filter with modified weights is still computed via correlation/convolution):

$$
\begin{bmatrix} (x-1,y-1) & (x,y-1) & (x+1,y-1) \\ (x-1,y) & (x,y) & (x+1,y) \\ (x-1,y+1) & (x,y+1) & (x+1,y+1) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}
$$
Adding the two diagonal directions, the Laplacian is defined as: $\nabla^2 f = f(x+1,y)+f(x-1,y)+f(x,y+1)+f(x,y-1)+f(x-1,y-1)+f(x+1,y+1)+f(x+1,y-1)+f(x-1,y+1)-8f(x,y)$
Filter template implementation (the result is isotropic for rotations in increments of 45°):

$$
\begin{bmatrix} (x-1,y-1) & (x,y-1) & (x+1,y-1) \\ (x-1,y) & (x,y) & (x+1,y) \\ (x-1,y+1) & (x,y+1) & (x+1,y+1) \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{bmatrix}
$$
Equivalent Laplacian templates with the opposite sign (combined with the original image by subtraction):
$$
\begin{bmatrix}
0 & -1 & 0 \\
-1 & 4 & -1 \\
0 & -1 & 0
\end{bmatrix},\quad
\begin{bmatrix}
-1 & -1 & -1 \\
-1 & 8 & -1 \\
-1 & -1 & -1
\end{bmatrix}
$$
Basic methods of Laplacian image enhancement (restoring the background features while keeping the Laplacian sharpening effect):
 Original image − image filtered with a negative-center Laplacian
 Original image + image filtered with a positive-center Laplacian
The method is defined as: $g(x,y) = f(x,y) + c[\nabla^2 f(x,y)]$, where $c$ is ±1, matching the sign of the center coefficient of the Laplacian template.
Implementation of the first-order derivative of a two-dimensional function and its application to (nonlinear) image sharpening
Gradient definition: in image processing the first-order derivative is implemented via the gradient magnitude. The gradient of $f(x,y)$ at coordinates $(x,y)$ is defined as the two-element vector: $\nabla f = \mathrm{grad}(f) = \begin{bmatrix} g_x \\ g_y \end{bmatrix} = \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix}$, which points in the direction of the maximum rate of change of $f$ at $(x,y)$.
Magnitude (length) of the gradient vector $\nabla f$: $M(x,y) = \mathrm{mag}(\nabla f) = \sqrt{g_x^2 + g_y^2}$, or approximately $|g_x| + |g_y|$, giving the rate of change in the gradient direction at $(x,y)$.
At this point we have: $f(x,y)$, the original image, and $M(x,y)$, the gradient image (the same size as the original, recording the rate of change at each pixel; the components of the gradient vector are derivatives and therefore linear operators, but the gradient magnitude is not linear because of the squaring and square root).
Discrete approximations of the formula:
Assume the neighborhood weight matrix is $\begin{bmatrix} w_1 & w_2 & w_3 \\ w_4 & w_5 & w_6 \\ w_7 & w_8 & w_9 \end{bmatrix}$

Earliest, simplest approximation of the first-order derivative: $g_x = f(x+1,y)-f(x,y) = w_6 - w_5$, $g_y = f(x,y+1)-f(x,y) = w_8 - w_5$

Roberts' simplest approximation using cross differences: $g_x = f(x+1,y+1)-f(x,y) = w_9 - w_5$, $g_y = f(x,y+1)-f(x+1,y) = w_8 - w_6$
Gradient image: $M(x,y) = \sqrt{(w_9-w_5)^2 + (w_8-w_6)^2}$, or $|w_9-w_5| + |w_8-w_6|$
The simplest approximation using a 3 * 3 template:
$g_x = (w_7 + 2w_8 + w_9) - (w_1 + 2w_2 + w_3)$, $g_y = (w_3 + 2w_6 + w_9) - (w_1 + 2w_4 + w_7)$
Gradient image: $M(x,y) = |(w_7+2w_8+w_9)-(w_1+2w_2+w_3)| + |(w_3+2w_6+w_9)-(w_1+2w_4+w_7)|$
Form the filter templates:

Roberts cross-gradient operators:
$$
\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \text{ and } \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}
$$
Even-sized templates have no center of symmetry and are awkward to implement.
Sobel operators (3 * 3 templates):
$$
\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} \text{ and } \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}
$$
Frequency domain processing
Operates on the Fourier transform of the image rather than on the image itself
Image contour
Two basic approaches based on gray-level images: discontinuity and similarity
 Discontinuity: segment the image based on abrupt gray-level changes
 Similarity: segment the image into similar regions according to a set of predefined criteria
Image contours: finding, drawing, and fitting contours
Although edge detection can detect edges, most of the detected edges are discontinuous. An image contour connects the edges into a whole for subsequent computation (obtaining the size, position, orientation, and other information of the target image).
Find and draw profiles
Comprehensive example
```python
import cv2
import numpy as np

im = cv2.imread("data/3.png")
cv2.imshow("orig", im)
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
ret, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# 1. Find contours
contours, hierarchy = cv2.findContours(binary,                 # binarized image
                                       cv2.RETR_EXTERNAL,      # detect only the outer contours
                                       cv2.CHAIN_APPROX_NONE)  # store all contour points

# 2. Draw contours
im_cnt = cv2.drawContours(im,            # image to draw on
                          contours,      # list of contour points
                          -1,            # draw all contours
                          (0, 0, 255),   # contour color: red (BGR)
                          2)             # contour thickness (-1 fills the contour)

cv2.imshow("im_cnt", im_cnt)
cv2.waitKey()
cv2.destroyAllWindows()
```
Contour: a sequence of points that represents, in some way, a curve in the image.

Find contours: cv2.findContours(image, mode, method, contours=None, hierarchy=None, offset=None) -> contours: List[ndarray], hierarchy (OpenCV 3.x also returns the image as an additional first return value)

notes:
""" findContours(image, mode, method[, contours[, hierarchy[, offset]]]) > contours, hierarchy . @brief Finds contours in a binary image. ... . @param image Source, an 8bit singlechannel image. Nonzero pixels are treated as 1's. Zero . pixels remain 0's, so the image is treated as binary . You can use #compare, #inRange, #threshold , . #adaptiveThreshold, #Canny, and others to create a binary image out of a grayscale or color one. . If mode equals to #RETR_CCOMP or #RETR_FLOODFILL, the input can also be a 32bit integer image of labels (CV_32SC1). . @param contours Detected contours. Each contour is stored as a vector of points (e.g. . std::vector<std::vector<cv::Point> >). . @param hierarchy Optional output vector (e.g. std::vector<cv::Vec4i>), containing information about the image topology. It has . as many elements as the number of contours. For each ith contour contours[i], the elements . hierarchy[i][0] , hierarchy[i][1] , hierarchy[i][2] , and hierarchy[i][3] are set to 0based indices . in contours of the next and previous contours at the same hierarchical level, the first child . contour and the parent contour, respectively. If for the contour i there are no next, previous, . parent, or nested contours, the corresponding elements of hierarchy[i] will be negative. . @param mode Contour retrieval mode, see #RetrievalModes . @param method Contour approximation method, see #ContourApproximationModes . @param offset Optional offset by which every contour point is shifted. This is useful if the . contours are extracted from the image ROI and then they should be analyzed in the whole image . context. """

@param:
image: the source image. A grayscale image is treated as binary: nonzero pixels are taken as 1, zero pixels remain 0.
In practice, convert the image to a binary image beforehand using threshold processing or similar functions.
The processed image must be a binary grayscale image with a black background and a white foreground.
mode: contour retrieval mode
 cv2.RETR_EXTERNAL Only the outer contours are detected
 cv2.RETR_LIST All contours are detected; no hierarchical relationships are established
 cv2.RETR_CCOMP All contours are retrieved and organized into a two-level hierarchy: the top level holds the outer boundaries, the second level the boundaries of the holes
 cv2.RETR_TREE All contours are detected and a full hierarchical tree is established
method: contour approximation (representation) method
 cv2.CHAIN_APPROX_NONE Stores all contour points; the pixel positions of two adjacent points differ by at most 1, i.e. $max(abs(x_1-x_2), abs(y_1-y_2)) <= 1$
 cv2.CHAIN_APPROX_SIMPLE Simplified format: compresses horizontal, vertical, and diagonal segments and keeps only their end points
 cv2.CHAIN_APPROX_TC89_L1 A variant of the Teh-Chin chain approximation algorithm
 cv2.CHAIN_APPROX_TC89_KCOS A variant of the Teh-Chin chain approximation algorithm
@return:
image: OpenCV 3.x only; identical to the input image parameter (OpenCV 4.x does not return it).
contours: the detected contours.
Returns a list of contour data. Each contour consists of several points (each contour is an ndarray of points with ndim=3).
For example, contours[i] is the i-th contour (indexing from 0), contours[i][j] is the j-th point of the i-th contour, and each point consists of (x, y).
hierarchy: the topological information of the image (reflecting the contour hierarchy).
Contours in an image may lie at different positions; for example, one contour may be inside another. In that case we call the outer contour the parent and the inner contour the child.
Based on this relationship, parent-child relationships are established among all contours in the image. Each contour contours[i] corresponds to four elements describing its position in the hierarchy, of the form [Next, Previous, First_Child, Parent]: the index of the next contour at the same level, the index of the previous contour at the same level, the index of the first child contour, and the index of the parent contour (a value of -1 means none).


Draw contours: cv2.drawContours(image, contours, contourIdx, color, thickness=None, lineType=None, hierarchy=None, maxLevel=None, offset=None) -> image: ndarray

notes:
""" drawContours(image, contours, contourIdx, color[, thickness[, lineType[, hierarchy[, maxLevel[, offset]]]]]) > image . @brief Draws contours outlines or filled contours. ... . @param image Destination image. . @param contours All the input contours. Each contour is stored as a point vector. . @param contourIdx Parameter indicating a contour to draw. If it is negative, all the contours are drawn. . @param color Color of the contours. . @param thickness Thickness of lines the contours are drawn with. If it is negative (for example, . thickness=#FILLED ), the contour interiors are drawn. . @param lineType Line connectivity. See #LineTypes . @param hierarchy Optional information about hierarchy. It is only needed if you want to draw only . some of the contours (see maxLevel ). . @param maxLevel Maximal level for drawn contours. If it is 0, only the specified contour is drawn. . If it is 1, the function draws the contour(s) and all the nested contours. If it is 2, the function . draws the contours, all the nested contours, all the nestedtonested contours, and so on. This . parameter is only taken into account when there is hierarchy available. . @param offset Optional contour shift parameter. Shift all the drawn contours by the specified . \f$\texttt{offset}=(dx,dy)\f$ . . @note When thickness=#FILLED, the function is designed to handle connected components with holes correctly . even when no hierarchy date is provided. This is done by analyzing all the outlines together . using evenodd rule. This may give incorrect results if you have a joint collection of separately retrieved . contours. In order to solve this problem, you need to call #drawContours separately for each subgroup . of contours, or iterate over the collection using contourIdx parameter. """

@param:
image: the image on which to draw the contours
contours: the contours to draw; same type as the contours output of cv2.findContours() (a list of point arrays)
contourIdx: the index of the contour to draw; tells cv2.drawContours() whether to draw one contour or all of them.
If the parameter is a non-negative integer, the contour with that index in contours is drawn; if it is negative (usually -1), all contours are drawn.
color: the drawing color in BGR format (0~255, 0~255, 0~255)
thickness: line thickness (negative values, e.g. -1, fill the contour)

Contour fitting
Contour fitting: practical contour computations do not require the complete curve; instead, an approximating polygon (or other primitive) close to the contour is used in its place.
Comprehensive example
""" """ import cv2 import numpy as np im = cv2.imread('../data/cloud.png', 1) adp1 = im.copy() adp2 = im.copy() im_gray = cv2.cvtColor(im, code=cv2.COLOR_BGR2GRAY) t, im_binary = cv2.threshold(im_gray, 127, 255, cv2.THRESH_BINARY) # Contour lookup img, contours, hie = cv2.findContours(im_binary, mode=cv2.RETR_LIST, method=cv2.CHAIN_APPROX_NONE) # Generate ellipse positioning information according to contour ellipse_data = cv2.fitEllipse(contours[0]) print(ellipse_data) # ((Center) (short radius, long radius) (angle)) cv2.ellipse(im, ellipse_data, (0, 0, 255), 3) cv2.imshow('im', im) # Generate rectangular positioning information according to the contour rectangle_data = cv2.boundingRect(contours[0]) print(rectangle_data) # (upper left starting point x,y rectangle width, height) x, y, w, h = rectangle_data brcnt = np.array([[x, y], [x + w, y], [x + w, y + h], [x, y + h]]) # Orderly manufacturing coordinates cv2.drawContours(im, [brcnt], 1, (0, 255, 0), 3) cv2.imshow('im', im) # Generate circular positioning information according to the contour circle_data = cv2.minEnclosingCircle(contours[0]) print(circle_data) # ((center x,y) radius) (x, y), radius = circle_data center = int(x), int(y) radius = int(radius) cv2.circle(im, center, radius, (255, 0, 0), 3) # center, radius must be integer cv2.imshow('im', im) # Generate polygon positioning information according to contour # Precision 1 # adp1 = im.copy() epsilon1 = 0.005 * cv2.arcLength(contours[0], True) approx1 = cv2.approxPolyDP(contours[0], epsilon1, True) cv2.drawContours(adp1, [approx1], 0, (0, 0, 255), 2) cv2.imshow('adp1', adp1) # Precision 2 # adp2= im.copy() epsilon2 = 0.01 * cv2.arcLength(contours[0], True) approx2 = cv2.approxPolyDP(contours[0], epsilon2, True) cv2.drawContours(adp2, [approx2], 0, (255, 0, 0), 2) cv2.imshow('adp2', adp2) cv2.waitKey() cv2.destroyAllWindows()

Rectangular bounding box

Generate rectangle positioning information from a contour: cv2.boundingRect(array)

notes:
""" boundingRect(array) > retval . @brief Calculates the upright bounding rectangle of a point set or nonzero pixels of grayscale image. . . The function calculates and returns the minimal upright bounding rectangle for the specified point set or . nonzero pixels of grayscale image. . . @param array Input grayscale image or 2D point set, stored in std::vector or Mat. """


Draw a rectangle cv2.rectangle(img, pt1, pt2, color, thickness=None, lineType=None, shift=None) or cv2.drawContours()

notes:
""" rectangle(img, pt1, pt2, color[, thickness[, lineType[, shift]]]) > img . @brief Draws a simple, thick, or filled upright rectangle. ... . @param img Image. . @param pt1 Vertex of the rectangle. . @param pt2 Vertex of the rectangle opposite to pt1 . . @param color Rectangle color or brightness (grayscale image). . @param thickness Thickness of lines that make up the rectangle. Negative values, like #FILLED, . mean that the function has to draw a filled rectangle. . @param lineType Type of the line. See #LineTypes . @param shift Number of fractional bits in the point coordinates. ... """



Minimum enclosing circle

Generate minimum enclosing circle positioning information from a contour: cv2.minEnclosingCircle(points)

notes:
""" minEnclosingCircle(points) > center, radius . @brief Finds a circle of the minimum area enclosing a 2D point set. ... . @param points Input vector of 2D points, stored in std::vector\<\> or Mat . @param center Output center of the circle. . @param radius Output radius of the circle. """


Draw a circle cv2.circle(img, center, radius, color, thickness=None, lineType=None, shift=None)

notes:
""" circle(img, center, radius, color[, thickness[, lineType[, shift]]]) > img . @brief Draws a circle. ... . @param img Image where the circle is drawn. . @param center Center of the circle. . @param radius Radius of the circle. . @param color Circle color. . @param thickness Thickness of the circle outline, if positive. Negative values, like #FILLED, . mean that a filled circle is to be drawn. . @param lineType Type of the circle boundary. See #LineTypes . @param shift Number of fractional bits in the coordinates of the center and in the radius value. """



Best-fit ellipse

Generate best-fit ellipse positioning information from a contour: cv2.fitEllipse(points)

notes:
""" fitEllipse(points) > retval . @brief Fits an ellipse around a set of 2D points. ... . @param points Input 2D point set, stored in std::vector\<\> or Mat """


Draw ellipse cv2.ellipse(img, center, axes, angle, startAngle, endAngle, color, thickness=None, lineType=None, shift=None)

notes:
""" ellipse(img, center, axes, angle, startAngle, endAngle, color[, thickness[, lineType[, shift]]]) > img . @brief Draws a simple or thick elliptic arc or fills an ellipse sector. ... . @param img Image. . @param center Center of the ellipse. . @param axes Half of the size of the ellipse main axes. . @param angle Ellipse rotation angle in degrees. . @param startAngle Starting angle of the elliptic arc in degrees. . @param endAngle Ending angle of the elliptic arc in degrees. . @param color Ellipse color. . @param thickness Thickness of the ellipse arc outline, if positive. Otherwise, this indicates that . a filled ellipse sector is to be drawn. . @param lineType Type of the ellipse boundary. See #LineTypes . @param shift Number of fractional bits in the coordinates of the center and values of axes. ... """



Approximating polygon

Generate approximation accuracy based on contour perimeter
epsilon = 0.005 * cv2.arcLength(contours[0], True)

Generate polygon positioning information cv2.approxPolyDP(curve, epsilon, closed, approxCurve=None) according to contour and accuracy

notes:
""" approxPolyDP(curve, epsilon, closed[, approxCurve]) > approxCurve . @brief Approximates a polygonal curve(s) with the specified precision. ... . @param curve Input vector of a 2D point stored in std::vector or Mat . @param approxCurve Result of the approximation. The type should match the type of the input curve. . @param epsilon Parameter specifying the approximation accuracy. This is the maximum distance . between the original curve and its approximation. . @param closed If true, the approximated curve is closed (its first and last vertices are . connected). Otherwise, it is not closed. """


Draw polygons using cv2.drawContours()

Contour information
 Contour perimeter cv.arcLength(curve, closed)
 Contour area cv2.contourArea(contour, oriented=None)
Image preprocessing in AI
The purpose of image preprocessing is to make image data better suited to training AI models
Data augmentation (extending the dataset)
Zoom, stretch, add noise, flip, rotate, translate, crop, adjust contrast, transform channels
Increases the number of images and enhances image variety