Computer vision

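Reference books

  • Computer Vision: Algorithms and Applications, 2nd Edition, by Richard Szeliski: theoretical foundations
  • Digital Image Processing, Third Edition, by Rafael C. Gonzalez and Richard E. Woods: emphasis on signal processing
  • Learning OpenCV, by Adrian Kaehler and Gary Bradski: the classic O'Reilly animal book; read it alongside the first book; emphasis on implementation, doubles as an API manual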

Image formation

3D-to-2D mapping

Camera intrinsics

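  • Intrinsic matrix: $K = \begin{bmatrix} f & s & c_x \\ 0 & a f & c_y \\ 0 & 0 & 1 \end{bmatrix}$
    • $s$: skew
    • $a$: aspect ratio
    • In many applications this simplifies to $K = \begin{bmatrix} f & 0 & c_x \\ 0 & f & c_y \\ 0 & 0 & 1 \end{bmatrix}$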

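Projecting 3D space onto the 2D image plane

  • Let the image-plane coordinates be $(u, v)$, the 3D coordinates be $(x, y, z)$, and the camera intrinsic matrix be $K$ (for example, in its simplified form); then $[u, v, 1]^T = \frac{1}{z} K [x, y, z]^T$
  • Code example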
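import numpy as np

def proj_to_2d(k, pt_3d):
    ## project a 3D point to 2D using the camera's intrinsic matrix
    ## [u,v,1]^T = (1/z)*(k*[x,y,z]^T)
    k = np.array(k, dtype=float).reshape((3,3))
    pt_3d = np.asarray(pt_3d, dtype=float)
    px = np.dot(k, pt_3d.reshape((3,1)))
    ## divide by the depth z and truncate to integer pixel indices
    px = (int(px[0,0]/pt_3d[2]), int(px[1,0]/pt_3d[2]))
    return px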

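Inverse mapping from 2D to 3D

Perspective-n-Point (PnP) pose computation

Inverse mapping from a depth map

  • For a depth map such as the one produced by a RealSense camera, which has the same size as the RGB image and stores the depth $z$ as the value of the corresponding pixel, the mapping is the inverse of the projection from 3D space onto the 2D image plane. (The code below also applies minus signs to flip the axis directions, presumably to convert from the optical frame, with x right, y down, z forward, to a frame with x forward, y left, z up.)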
import numpy as np
import open3d as o3d

## this assignment belongs in the class's __init__
self.pcd = o3d.geometry.PointCloud()

def img_callback(self, data):
    ## self.k is indexed as a flattened row-major 3x3 intrinsic matrix:
    ## k[0]=fx, k[2]=cx, k[4]=fy, k[5]=cy
    if not self.is_k_empty:
        self.height = data.height
        self.width = data.width

        np_cloud = np.zeros((self.height*self.width,3))
        # print(self.k)
        for iy in range(self.height):
            for ix in range(self.width):
                idx = iy*self.width+ix
                ## two bytes per pixel, low byte first; divided by 1000 to scale the raw value
                z = (data.data[idx*2+1]*256+data.data[idx*2])/1000.0
                if z!=0:
                    ## x, y are on the camera plane, z is the depth
                    #np_cloud[idx][0] = z*(ix-self.k[2])/self.k[0] #x
                    #np_cloud[idx][1] = z*(iy-self.k[5])/self.k[4] #y
                    #np_cloud[idx][2] = z
                    ## same coordinate as `/camera/depth/image_rect_raw`
                    ## y (left & right), z (up & down) are on the camera plane, x is the depth
                    np_cloud[idx][1] = -z*(ix-self.k[2])/self.k[0]
                    np_cloud[idx][2] = -z*(iy-self.k[5])/self.k[4]
                    np_cloud[idx][0] = z

        self.pcd.points = o3d.utility.Vector3dVector(np_cloud)
        ## publish as a ROS message
        # header = Header()
        # header.stamp = rospy.Time.now()
        # header.frame_id = "camera_depth_frame"
        # fields = self.FIELDS_XYZ

        # pc2_data = pc2.create_cloud(header, fields, np.asarray(np_cloud))
        # self.pub.publish(pc2_data)
        self.is_data_updated = True
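
The per-pixel loop in the callback above can also be written with NumPy array operations. The following is only a sketch under stated assumptions: the helper name `depth_to_points` is hypothetical, the depth image is assumed to be already decoded to metres, and the output uses the optical-frame convention of the commented-out variant (x and y on the image plane, z is the depth) rather than the x-forward convention that the callback finally uses.

```python
import numpy as np

def depth_to_points(depth_m, k):
    """Back-project a depth image (metres) to an (N, 3) array of points.

    k is the 3x3 intrinsic matrix flattened row-major, so that
    k[0]=fx, k[2]=cx, k[4]=fy, k[5]=cy (same indexing as the callback).
    """
    depth_m = np.asarray(depth_m, dtype=float)
    height, width = depth_m.shape
    fx, cx, fy, cy = k[0], k[2], k[4], k[5]
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    x = depth_m * (u - cx) / fx
    y = depth_m * (v - cy) / fy
    points = np.stack([x, y, depth_m], axis=-1).reshape(-1, 3)
    # Drop pixels without a depth reading (z == 0).
    return points[points[:, 2] > 0]

# Tiny illustrative input: a 3x3 depth image with two valid pixels.
k = [100.0, 0.0, 1.0, 0.0, 100.0, 1.0, 0.0, 0.0, 1.0]
depth = np.array([[0.0, 2.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0]])
pts = depth_to_points(depth, k)
```

Unlike the callback, this version drops invalid pixels instead of leaving them at the origin, so the output has one row per valid pixel.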

Color spaces

RGB

HSV/HSL

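  • The three parameters are hue (H), saturation (S), and value (V, i.e. brightness)
  • Choosing the value ranges when detecting color features in the HSV color space:
    • First capture an image and use the HsvRangeTool utility to help find the lower and upper bounds
    • Adjust the H range to select the target color
    • Adjust the S range to control how much white is admitted (keep a reasonable minimum saturation)
    • Adjust the V range to control how much black is admitted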
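
The range selection described above can be sketched as follows. This is a dependency-light illustration, not the workflow of HsvRangeTool: the helper `hsv_mask` is hypothetical, it uses the standard-library `colorsys` module with H, S, and V each scaled to [0, 1], and the bounds are made-up values. With OpenCV one would typically use `cv2.cvtColor` followed by `cv2.inRange`, where 8-bit hue spans 0 to 179.

```python
import colorsys
import numpy as np

def hsv_mask(rgb, lower, upper):
    """Return a boolean mask of pixels whose HSV values lie in [lower, upper].

    rgb: (H, W, 3) uint8 image; lower/upper: (h, s, v) tuples, each in [0, 1].
    """
    rgb = np.asarray(rgb, dtype=float) / 255.0
    height, width, _ = rgb.shape
    mask = np.zeros((height, width), dtype=bool)
    for iy in range(height):
        for ix in range(width):
            hsv = colorsys.rgb_to_hsv(*rgb[iy, ix])
            mask[iy, ix] = all(lo <= c <= hi
                               for lo, c, hi in zip(lower, hsv, upper))
    return mask

# Four pixels: red, green, white, black.
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[255, 255, 255], [0, 0, 0]]], dtype=np.uint8)
# The H range brackets green (hue 1/3); the lower S bound rejects white,
# and the lower V bound rejects black.
green = hsv_mask(img, lower=(0.25, 0.5, 0.2), upper=(0.45, 1.0, 1.0))
```

Only the green pixel passes: red fails the H range, white fails the S bound, and black fails the V bound.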

LAB

  • The three parameters are lightness (L), A (the magenta-green axis), and B (the yellow-blue axis)

LUV

  • L is the lightness and U and V are the chromaticity coordinates; the space is obtained by transforming the CIE XYZ space

Image processing

Filters

  • Image smoothing filters: mean filter, median filter, Gaussian filter
  • Sobel
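
As a rough illustration of two of the filters listed above, the sketch below applies a 3x3 mean kernel and the horizontal Sobel kernel to a synthetic step edge. The helper `filter3x3` is hypothetical and replicates border pixels, which is only one of several possible border policies. The median filter is nonlinear and cannot be written as a kernel, so it is not shown.

```python
import numpy as np

def filter3x3(img, kernel):
    """Correlate a 2D image with a 3x3 kernel, replicating the border pixels."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + img.shape[0],
                                           dx:dx + img.shape[1]]
    return out

MEAN = np.full((3, 3), 1.0 / 9.0)
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

# A vertical step edge: left half dark, right half bright.
step = np.zeros((5, 6))
step[:, 3:] = 10.0
smoothed = filter3x3(step, MEAN)    # the edge is blurred over three columns
grad_x = filter3x3(step, SOBEL_X)   # responds only next to the edge
```

The mean filter spreads the step over neighboring columns, while the Sobel response is nonzero only in the two columns adjacent to the edge.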

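More neighborhood operators

Morphology

  • erosion
  • dilation
  • majority
  • opening: erosion followed by dilation. When the image contains many noise specks, opening removes the small ones while roughly preserving the area of the target
  • closing: dilation followed by erosion. In some cases (for example a threshold set too high, or a poor-quality source image) many pixels that belong to the target are discarded, so a target that should be one solid region ends up with gaps. Dilation first reconnects the fragments, and the subsequent erosion restores the original outline and size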
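
The behavior of opening and closing described above can be checked on small binary images. This is a minimal sketch with a fixed 3x3 square structuring element; the function names are hypothetical, and the image is padded with background, so objects touching the border would be eroded there.

```python
import numpy as np

def _neighborhoods(binary):
    """Stack the 9 shifted copies of a boolean image seen through a 3x3 window."""
    padded = np.pad(binary, 1, mode="constant", constant_values=False)
    h, w = binary.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])

def erode(binary):
    # A pixel survives only if its whole 3x3 neighborhood is foreground.
    return _neighborhoods(binary).all(axis=0)

def dilate(binary):
    # A pixel is set if any pixel in its 3x3 neighborhood is foreground.
    return _neighborhoods(binary).any(axis=0)

def opening(binary):
    return dilate(erode(binary))

def closing(binary):
    return erode(dilate(binary))

noisy = np.zeros((7, 7), dtype=bool)
noisy[1:4, 1:4] = True    # the target
noisy[5, 5] = True        # an isolated noise pixel
opened = opening(noisy)   # noise removed, target keeps its size

broken = np.zeros((7, 7), dtype=bool)
broken[1:4, 1:4] = True
broken[2, 2] = False      # a hole inside the target
closed = closing(broken)  # hole filled, outline unchanged
```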

Feature detection and matching

Feature points

Edges

Edge detection

Canny
LoG (Laplacian of Gaussian) and DoG (Difference of Gaussians)

Edge linking

Line detection

Hough transforms
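
A minimal sketch of the voting step of the Hough transform for lines, assuming the normal form rho = x*cos(theta) + y*sin(theta) and a binary edge map as input. The function name and the one-degree, one-pixel accumulator resolution are illustrative choices, and peak extraction is left out.

```python
import numpy as np

def hough_accumulator(edges, n_theta=180):
    """Vote in (rho, theta) space for every nonzero pixel of a binary edge map.

    Returns (acc, thetas, diag); row r of acc corresponds to rho = r - diag.
    """
    edges = np.asarray(edges)
    ys, xs = np.nonzero(edges)
    diag = int(np.ceil(np.hypot(edges.shape[0], edges.shape[1])))
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * diag + 1, n_theta), dtype=int)
    cols = np.arange(n_theta)
    for x, y in zip(xs, ys):
        # Each edge pixel votes once per theta, for the rho of the line
        # through it with that normal direction.
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, cols] += 1
    return acc, thetas, diag

edges = np.zeros((10, 10), dtype=bool)
edges[3, :] = True                   # a horizontal line y = 3
acc, thetas, diag = hough_accumulator(edges)
votes_on_line = acc[3 + diag, 90]    # the cell for rho = 3, theta = 90 degrees
```

All ten pixels of the line vote for the cell at rho = 3, theta = 90 degrees. Because of the coarse quantization, a few neighboring theta cells collect the same count, which is why practical implementations add some form of peak suppression.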

Segmentation

Threshold-based segmentation

Fixed-threshold segmentation

Histogram bimodal method

Prewitt et al.

Iterative threshold segmentation
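
One common form of iterative threshold selection starts from the global mean, splits the pixels into two classes, and moves the threshold to the midpoint of the two class means until it stops changing. The sketch below assumes that variant; the function name, `eps`, and `max_iter` are illustrative choices rather than anything prescribed by these notes.

```python
import numpy as np

def iterative_threshold(src, eps=0.5, max_iter=100):
    """Estimate a global threshold by iterating on the two class means."""
    src = np.asarray(src, dtype=float)
    thresh = src.mean()
    for _ in range(max_iter):
        low, high = src[src <= thresh], src[src > thresh]
        if low.size == 0 or high.size == 0:
            break  # all pixels fell into one class; keep the current threshold
        new_thresh = 0.5 * (low.mean() + high.mean())
        converged = abs(new_thresh - thresh) < eps
        thresh = new_thresh
        if converged:
            break
    return thresh

img = np.array([[10, 12, 11, 200],
                [9, 13, 205, 210]])
t = iterative_threshold(img)
binary = np.where(img > t, 255, 0)
```

For this toy image the dark pixels average 11 and the bright ones 205, so the threshold settles at their midpoint.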

Adaptive threshold segmentation

Otsu's method (OTSU, maximum between-class variance)
  • Implementation of Otsu's method
import numpy as np

def otsu(src):
    """
    input: a single channel 8-bit image
    output: a binary image, using otsu thresholding
    """
    hist, _ = np.histogram(src, bins=list(range(257)))

    # OTSU method to estimate the threshold
    hist_norm = hist/hist.sum()
    Q = hist_norm.cumsum()
    bins = np.arange(256)
    fn_min = np.inf
    thresh = -1
    for i in range(255):
        # class 1 = levels 0..i, class 2 = levels i+1..255,
        # so the split index must be i+1 to agree with Q[i]
        p1,p2 = np.hsplit(hist_norm,[i+1]) # probabilities
        q1,q2 = Q[i],Q[255]-Q[i] # cum sum of classes
        if q1 < 1.e-6 or q2 < 1.e-6:
            continue
        b1,b2 = np.hsplit(bins,[i+1]) # intensity levels
        # finding means and variances
        m1,m2 = np.sum(p1*b1)/q1, np.sum(p2*b2)/q2
        v1,v2 = np.sum(((b1-m1)**2)*p1)/q1,np.sum(((b2-m2)**2)*p2)/q2
        # within-class variance, the quantity to minimize
        fn = v1*q1 + v2*q2
        if fn < fn_min:
            fn_min = fn
            thresh = i

    # print("Threshold is %d"%thresh)
    # ## if you want to check the histogram
    # plt.hist(src.ravel(),256,[0,256])
    # plt.show()

    # pixels above the threshold become 255, the rest 0
    binary = np.where(src > thresh, 255.0, 0.0)

    return binary
Mean method

Optimal threshold

Applying clustering algorithms

K-means and GMM
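
A minimal sketch of K-means used for segmentation, clustering pixels by intensity alone. It is plain Lloyd iteration with random initialization; the function name, the fixed iteration count, and the seed are illustrative assumptions, and GMM is not covered here.

```python
import numpy as np

def kmeans(points, k, n_iter=20, seed=0):
    """Cluster (N, D) points into k groups; returns (labels, centers)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct samples.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers

img = np.array([[10, 12, 11, 200],
                [9, 13, 205, 210]])
# Treat every pixel as a 1-D feature vector (its intensity).
labels, centers = kmeans(img.reshape(-1, 1), k=2)
segmented = labels.reshape(img.shape)
```

Which cluster receives label 0 depends on the initialization, so downstream code should compare labels rather than rely on their numeric values. Color or position can be added as extra feature columns in the same way.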

DBSCAN (Density-Based Spatial Clustering of Applications with Noise)

Superpixels

  • https://cm_westwood.gitee.io/image_processing_homework/#header-n54
  • Original proposal: Ren, Malik. Learning a classification model for segmentation[C]. international conference on computer vision, 2003: 10-17.
  • Evaluation criteria: Den Bergh M V, Boix X, Roig G, et al. SEEDS: Superpixels Extracted Via Energy-Driven Sampling[J]. International Journal of Computer Vision, 2015, 111(3): 298-314.
Classic algorithm: SLIC
  • Stutz D, Hermans A, Leibe B, et al. Superpixels: An evaluation of the state-of-the-art[J]. Computer Vision and Image Understanding, 2018: 1-27.
Improved algorithms: SEEDS, ETPS
  • Yao J, Boben M, Fidler S, et al. Real-time coarse-to-fine topologically preserving segmentation[C]. computer vision and pattern recognition, 2015: 2947-2955.

  • https://zhuanlan.zhihu.com/p/30732385

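    • Threshold-based segmentation
      • Fixed-threshold segmentation
      • Histogram bimodal method, Prewitt et al.
      • Iterative threshold segmentation
      • Adaptive threshold segmentation
        • Otsu's method (OTSU, maximum between-class variance)
        • Mean method
      • Optimal threshold
    • Edge-based segmentation
      • Canny edge detector
        • Requires two parameters, maxVal and minVal (the upper and lower hysteresis thresholds)
      • Harris corner detector
      • SIFT detector
      • SURF detector
    • Region-based segmentation
      • Seeded region growing, Levine et al.
      • Region splitting and merging, Gonzalez et al.
      • Watershed, Meyer et al.
    • Graph-based segmentation
      • GraphCut
    • Segmentation based on energy functionals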

Feature-based alignment

Feature detection

Haar cascades

  • Mainly used for face detection?

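Scale-Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF)

  • SIFT - Scale-Invariant Feature Transform
  • SURF - Speeded-Up Robust Features
    • An accelerated variant of SIFT
    • Scale invariant

One advantage of SIFT over SURF, however, is that SIFT computes its feature points with floating-point kernels, so its features are generally considered to be localized more precisely in both space and scale. SIFT is therefore worth considering when matching must be extremely accurate and speed is not a concern.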

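Binary Robust Independent Elementary Features (BRIEF) and Oriented FAST and Rotated BRIEF (ORB)

ORB does not by itself solve scale invariance; implementations approximate it by detecting features on an image pyramid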

FAST

  • Features from Accelerated Segment Test
  • Fast to compute

Harris corner detection

iterative algorithms

  • iterative algorithms
    • ICP
      • Go-ICP
  • RANSAC
    • RANdom SAmple Consensus (Fischler and Bolles 1981, Random sample consensus: A paradigm for model fitting with application to image analysis and automated cartography)

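      An algorithm that, given a sample data set containing outliers, estimates the parameters of a mathematical model of the data and thereby identifies the valid samples (inliers)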
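
The RANSAC idea described above can be sketched for the simplest model, a 2D line y = a*x + b. The function name, iteration count, and inlier tolerance are illustrative assumptions; a practical implementation would derive the number of iterations from the expected inlier ratio.

```python
import numpy as np

def ransac_line(points, n_iter=200, inlier_tol=0.1, seed=0):
    """Fit y = a*x + b to 2D points with RANSAC; returns (a, b, inlier_mask)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        # 1. Draw a minimal sample: two points define a line.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue  # vertical pair: not representable as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # 2. Count the points that agree with this candidate model.
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        mask = residuals < inlier_tol
        # 3. Keep the candidate with the largest consensus set.
        if mask.sum() > best_mask.sum():
            best_mask = mask
    # 4. Refit on the consensus set with ordinary least squares.
    a, b = np.polyfit(points[best_mask, 0], points[best_mask, 1], deg=1)
    return a, b, best_mask

# Ten points on y = 2x + 1 plus three gross outliers.
xs = np.arange(10, dtype=float)
inliers = np.column_stack([xs, 2.0 * xs + 1.0])
outliers = np.array([[2.0, 30.0], [7.0, -20.0], [5.0, 50.0]])
a, b, mask = ransac_line(np.vstack([inliers, outliers]))
```

A plain least-squares fit over all thirteen points would be pulled toward the outliers, whereas the consensus step excludes them before the final fit.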

Common methods for measuring the properties of image regions

Image matching for stereo vision

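Similarity measures / template matching

In the formulas below, $I_1$ is the template, $I_2$ is the search image, the sums run over the pixels $(i, j)$ of the template window $W$, $(x, y)$ is the candidate offset, and $\bar{I}_1$ and $\bar{I}_2(x, y)$ are the mean intensities of the template and of the current window.

Sum of Absolute Differences (SAD)

  • Formula: $\sum_{(i,j) \in W} |I_1(i,j) - I_2(x+i, y+j)|$

Sum of Squared Differences (SSD)

  • Formula: $\sum_{(i,j) \in W} (I_1(i,j) - I_2(x+i, y+j))^2$

Zero-mean SAD

  • Formula: $\sum_{(i,j) \in W} |I_1(i,j) - \bar{I}_1 - I_2(x+i, y+j) + \bar{I}_2(x,y)|$

Locally scaled SAD

  • Formula: $\sum_{(i,j) \in W} |I_1(i,j) - \frac{\bar{I}_1}{\bar{I}_2(x,y)} I_2(x+i, y+j)|$

Normalized Cross Correlation (NCC)

  • Formula: $\frac{\sum_{(i,j) \in W} I_1(i,j) I_2(x+i, y+j)}{\sqrt{\sum_{(i,j) \in W} I_1(i,j)^2 \sum_{(i,j) \in W} I_2(x+i, y+j)^2}}$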
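
A brute-force sketch of template matching with three of these measures (SAD, SSD, and NCC). The function names are hypothetical, the search is exhaustive, and NCC is negated so that all three can be minimized by the same loop. The zero-mean and locally scaled variants are omitted for brevity.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences: 0 for identical patches."""
    return np.abs(a - b).sum()

def ssd(a, b):
    """Sum of squared differences: 0 for identical patches."""
    return ((a - b) ** 2).sum()

def ncc(a, b):
    """Normalized cross correlation: 1 for proportional patches."""
    return (a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum())

def match_template(image, template, cost=sad):
    """Slide template over image; return (row, col) of the lowest-cost window."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    best, best_pos = np.inf, None
    for y in range(image.shape[0] - th + 1):
        for x in range(image.shape[1] - tw + 1):
            c = cost(image[y:y + th, x:x + tw], template)
            if c < best:
                best, best_pos = c, (y, x)
    return best_pos

image = np.arange(36, dtype=float).reshape(6, 6)
template = image[2:4, 3:5]   # a patch cut from row 2, column 3
pos_sad = match_template(image, template, sad)
pos_ssd = match_template(image, template, ssd)
# NCC is a similarity, so negate it to reuse the minimizing search.
pos_ncc = match_template(image, template, lambda a, b: -ncc(a, b))
```

All three searches recover the position the patch was cut from. In stereo matching the same costs are evaluated along a scanline rather than over the whole image.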

Uncategorized

The main differences between the PFH and FPFH formulations are summarized below:
1.  the FPFH does not fully interconnect all neighbors of $p_q$, as can be seen from the figure, and is thus missing some value pairs which might contribute to capturing the geometry around the query point;
2.  the PFH models a precisely determined surface around the query point, while the FPFH includes additional point pairs outside the **r** radius sphere (though at most **2r** away);
3.  because of the re-weighting scheme, the FPFH combines SPFH values and recaptures some of the point neighboring value pairs;
4.  the overall complexity of FPFH is greatly reduced, making it possible to use it in real-time applications;
5.  the resultant histogram is simplified by decorrelating the values, that is, simply creating _d_ separate feature histograms, one for each feature dimension, and concatenating them together (see figure below).