OpenCV-Python Image Processing Common Methods Summary (Part One)

The cover image is by Pexels on Pixabay; I added the OpenCV logo to it.

After finishing my graduation project, I did not touch anything related to image processing for a while. After joining the company, I started learning and working as a front-end developer (which is why I have Vue study notes). When I came back to image processing, I realized that I had completely forgotten how to call the functions, so I decided to put together a list.

I usually import OpenCV as cv (import cv2 as cv), so the calls below use the cv prefix. In the function formats, dst is the output image and src is the input image (the one read with imread); in Python both are NumPy arrays.

Getting the size of an image#

By printing src.shape, you can see that the result is (height, width, number of channels). So to get the height and width of the image, you can use the following statement:

src_height, src_width = src.shape[0:2]
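As a small sketch of why the slice matters, the example below uses synthetic NumPy arrays in place of real images (the 480 x 640 size is just an assumption for illustration). A grayscale image has no channel axis, so slicing the first two entries of shape works for both cases, while unpacking three values would fail on the grayscale one.

```python
import numpy as np

# Synthetic stand-ins for arrays returned by cv.imread (sizes are assumed).
color = np.zeros((480, 640, 3), dtype=np.uint8)  # height 480, width 640, 3 channels
gray = np.zeros((480, 640), dtype=np.uint8)      # grayscale: only (height, width)

# Slicing the first two entries works regardless of the number of channels.
src_height, src_width = color.shape[:2]
gray_height, gray_width = gray.shape[:2]

print(src_height, src_width)    # 480 640
print(gray_height, gray_width)  # 480 640
```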

Image resizing function resize()#

Common function format:

dst = cv.resize(src, dsize)

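Where dsize is a tuple of the form (width, height) giving the size of the output image in pixels, for example (int(src_width / 2), int(src_height / 2)). It is the target size, not a scaling factor, and the width comes first, which is the reverse of the order in src.shape. This example halves both dimensions, so the aspect ratio is preserved. Adjust the values as needed.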

Image color conversion function cvtColor()#

Common function format:

dst = cv.cvtColor(src, colorCode)

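The colorCode values are constants defined in the library, such as cv.COLOR_BGR2GRAY. Note that imread returns images in BGR channel order, so cv.COLOR_BGR2GRAY, rather than cv.COLOR_RGB2GRAY, is the right code for converting a freshly loaded image to grayscale.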

This function is commonly used for grayscale conversion, etc.

Image denoising method GaussianBlur()#

Common function format:

dst = cv.GaussianBlur(src, ksize, sigmaX)

ksize is the size of the convolution kernel, given as a tuple of positive odd integers such as (3, 3) or (5, 5). Put simply, it is the neighborhood size: each output pixel is computed from the pixels inside that window around it. In general, the larger the ksize, the blurrier the result.

sigmaX is σX, the standard deviation of the Gaussian kernel in the X direction. It is a required parameter. If sigmaY is not specified, it is taken to be equal to sigmaX, and if sigmaX is also 0, both are computed from ksize.

Image binarization function threshold()#

Common function format:

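retval, dst = cv.threshold(src, thresh, maxval, type)

In Python, threshold returns a tuple: retval is the threshold value that was actually used, and dst is the thresholded image. thresh is the threshold value, and maxval is the value assigned to pixels that pass the test; it only takes effect when type is cv.THRESH_BINARY or cv.THRESH_BINARY_INV.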

type is the binarization method, which has corresponding values in the library, as shown below:

cv.THRESH_BINARY: Sets the value to maxval if the current point is greater than thresh, otherwise sets it to 0.

cv.THRESH_BINARY_INV: Sets the value to 0 if the current point is greater than the threshold, otherwise sets it to maxval.

cv.THRESH_TRUNC: Sets the value to the threshold if the current point is greater than the threshold, otherwise does not change it.

cv.THRESH_TOZERO: Does not change the value if the current point is greater than the threshold, otherwise sets it to 0.

cv.THRESH_TOZERO_INV: Sets the value to 0 if the current point is greater than the threshold, otherwise does not change it.

Canny edge detection Canny()#

Common function format:

dst = cv.Canny(src, thresh1, thresh2)

Here thresh1 and thresh2 are hysteresis thresholds applied to the gradient magnitude, not to the raw pixel values. Pixels whose gradient is below thresh1 are rejected as non-edges, pixels whose gradient is above thresh2 are accepted as edges, and pixels in between are accepted only if they are connected to a pixel that has already been accepted as an edge.

Contour detection findContours() and contour drawing drawContours()#

Common function format:

contours, hierarchy = cv.findContours(src, mode, method)
dst = cv.drawContours(src, contours, contourIdx, color, thickness)

In findContours, mode is the contour retrieval mode; for example, cv.RETR_TREE retrieves all contours and builds their complete hierarchy. method is the contour approximation method; for example, cv.CHAIN_APPROX_SIMPLE compresses straight segments and keeps only their end points, so each contour is stored with as few points as possible. The two return values shown here match OpenCV 4.x; OpenCV 3.x returns three values, with the image first.

In drawContours, contours is the list of contours found in the previous step, and contourIdx selects which contour to draw; -1 draws all of them. color is a tuple such as (255, 0, 0), which is blue for an image loaded with imread, since OpenCV stores channels in BGR order. thickness is the optional line thickness. Note that drawContours draws directly on the image passed as its first argument, so pass a copy if the original must be kept.

Usually, contour detection is performed on the image after edge detection, and the contour effect obtained is generally good.

Hough transform HoughLines()#

Common function format:

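lines = cv.HoughLines(src, rho, theta, thresh)

The output is an array of detected lines, each given in polar form as (rho, theta), or None if no line is found. The rho parameter is the distance resolution in pixels, and theta is the angle resolution in radians. thresh is the minimum number of accumulator votes a line needs in order to be returned. In practice, unless there are special circumstances, I set them to 1, np.pi / 180, and 0. Keep in mind that a threshold of 0 lets through nearly every candidate line, so a larger value is needed when only prominent lines should be returned.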

This function is used for line detection. A typical application is deskewing: detect a line, then rotate the image by the angle corresponding to the line's slope. The following code does this:

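import math

import cv2 as cv
import numpy as np
from scipy import ndimage

# img_canny: edge image from cv.Canny; img_resize: the image to be rotated
lines = cv.HoughLines(img_canny, 1, np.pi / 180, 0)
# In OpenCV 3 and later, lines has shape (N, 1, 2), so lines[0] holds only the first (rho, theta) pair
for rho, theta in lines[0]:
    a = np.cos(theta)
    b = np.sin(theta)
    # (x0, y0) is the point on the line closest to the origin
    x0 = a * rho
    y0 = b * rho
    # Two points far apart along the line direction (-b, a)
    x1 = int(x0 + 1000 * (-b))
    y1 = int(y0 + 1000 * a)
    x2 = int(x0 - 1000 * (-b))
    y2 = int(y0 - 1000 * a)
    # Skip vertical lines (division by zero) and horizontal lines (nothing to rotate)
    if x1 == x2 or y1 == y2:
        continue
    t = float(y2 - y1) / (x2 - x1)  # slope of the line
    rotate_angle = math.degrees(math.atan(t))
    img_result = ndimage.rotate(img_resize, rotate_angle)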