Computer vision
Reference books
- Computer Vision: Algorithms and Applications, 2nd Edition, by Richard Szeliski (theoretical foundations)
- Digital Image Processing, Third Edition, by Rafael C. Gonzalez and Richard E. Woods (emphasizes signal processing)
- Learning OpenCV, by Adrian Kaehler and Gary Bradski (the classic O'Reilly "animal book"; pairs well with the first title, focusing on implementation and API usage)
Image formation
3D-to-2D mapping
Camera intrinsics
- Intrinsic matrix: $K = \begin{bmatrix} f & s & c_x \\ 0 & a f & c_y \\ 0 & 0 & 1 \end{bmatrix}$, where $s$ is the skew and $a$ the aspect ratio; in many applications this is simplified to $s = 0$, $a = 1$, i.e. $K = \begin{bmatrix} f & 0 & c_x \\ 0 & f & c_y \\ 0 & 0 & 1 \end{bmatrix}$
Projection from 3D space onto the 2D image plane
- Let the image-plane coordinates be $(u, v)$, the 3D point be $(x, y, z)$, and the camera intrinsics be $K$ (e.g. the simplified form above with $s = 0$, $a = 1$); then $[u, v, 1]^T = \frac{1}{z} K \, [x, y, z]^T$
- Code example:
```python
import numpy as np

def proj_to_2d(k, pt_3d):
    """Project a 3D point onto the image plane using the camera's intrinsic matrix:
    [u, v, 1]^T = (1/z) * K * [x, y, z]^T
    """
    k = np.asarray(k, dtype=float).reshape((3, 3))
    pt_3d = np.asarray(pt_3d, dtype=float).reshape((3, 1))
    px = k @ pt_3d                    # homogeneous image coordinates, scaled by z
    u = int(px[0, 0] / pt_3d[2, 0])   # divide by depth to normalize
    v = int(px[1, 0] / pt_3d[2, 0])
    return (u, v)
```
Inverse mapping from 2D to 3D
Perspective-n-Point (PnP) pose estimation
- The object's dimensions are known
- OpenCV's solvePnP() function can be called to solve it
- Example application: pose estimation of a QR code (a sketch follows below)
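A minimal sketch of calling solvePnP for a square marker of known side length; the side length, corner pixel coordinates, and intrinsics below are made-up placeholder values, not from the original notes:

```python
import numpy as np
import cv2

s = 0.05  # marker side length in meters (assumed)
# marker corners in the object frame (z = 0 plane)
obj_pts = np.array([[0, 0, 0], [s, 0, 0], [s, s, 0], [0, s, 0]], dtype=np.float64)
# the same corners as detected in the image (placeholder pixel coordinates)
img_pts = np.array([[320, 240], [400, 242], [398, 320], [318, 318]], dtype=np.float64)
K = np.array([[600, 0, 320], [0, 600, 240], [0, 0, 1]], dtype=np.float64)  # assumed intrinsics
dist = np.zeros(5)  # assume no lens distortion
ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, dist)
# rvec is a Rodrigues rotation vector, tvec the translation of the object frame in the camera frame
```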
Inverse mapping from a depth image
- A depth image such as the one a RealSense produces has the same size as the RGB image, with the depth $z$ stored as each pixel's value; recovering the 3D point is the inverse of the 3D-to-2D projection: $x = z (u - c_x) / f_x$, $y = z (v - c_y) / f_y$. (In practice, as in the code below, the two camera-plane axes are also negated to flip their direction?)
```python
import numpy as np
import open3d as o3d

# inside a ROS node class; self.k holds the flattened 3x3 intrinsic matrix
# from the camera_info topic, and self.pcd = o3d.geometry.PointCloud()
def img_callback(self, data):
    if not self.is_k_empty:
        self.height = data.height
        self.width = data.width
        np_cloud = np.zeros((self.height * self.width, 3))
        for iy in range(self.height):
            for ix in range(self.width):
                idx = iy * self.width + ix
                # 16-bit little-endian depth in millimeters, converted to meters
                z = (data.data[idx * 2 + 1] * 256 + data.data[idx * 2]) / 1000.0
                if z != 0:
                    ## if x, y should lie on the camera plane and z be the depth:
                    # np_cloud[idx][0] = z * (ix - self.k[2]) / self.k[0]  # x
                    # np_cloud[idx][1] = z * (iy - self.k[5]) / self.k[4]  # y
                    # np_cloud[idx][2] = z
                    ## same coordinate frame as `/camera/depth/image_rect_raw`:
                    ## y (left/right) and z (up/down) lie on the camera plane, x is the depth
                    np_cloud[idx][1] = -z * (ix - self.k[2]) / self.k[0]
                    np_cloud[idx][2] = -z * (iy - self.k[5]) / self.k[4]
                    np_cloud[idx][0] = z
        self.pcd.points = o3d.utility.Vector3dVector(np_cloud)
        ## to publish as a ROS PointCloud2 message:
        # header = Header()
        # header.stamp = rospy.Time.now()
        # header.frame_id = "camera_depth_frame"
        # pc2_data = pc2.create_cloud(header, self.FIELDS_XYZ, np.asarray(np_cloud))
        # self.pub.publish(pc2_data)
        self.is_data_updated = True
```
Color spaces
RGB
HSV/HSL
- The three channels are hue (H), saturation (S), and value (V)
- How to pick the value ranges when doing color-feature detection in HSV space (see the sketch below):
    - First capture a sample image and use the tool HsvRangeTool to help find the upper and lower bounds
    - Adjust the H range to select the feature color
    - Adjust S to select the white component (keep an appropriate amount of saturation)
    - Adjust V to select the black component
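A minimal HSV-thresholding sketch with OpenCV; the image path and the bounds below are placeholder values (roughly a red-ish target) that would normally come from a tool like HsvRangeTool:

```python
import numpy as np
import cv2

img = cv2.imread("sample.jpg")                # hypothetical input image
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
lower = np.array([0, 120, 70])                # placeholder lower bound (H, S, V)
upper = np.array([10, 255, 255])              # placeholder upper bound
mask = cv2.inRange(hsv, lower, upper)         # 255 where the pixel falls inside the range
result = cv2.bitwise_and(img, img, mask=mask)
```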
LAB
- The three channels are luminosity (L), A (magenta to green), and B (yellow to blue)
LUV
- L is the object's lightness; U and V encode chromaticity; derived from the CIE XYZ space
Image processing
Filters
- Image smoothing filters: mean, median, and Gaussian filtering (see the sketch below)
- Sobel
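A minimal sketch of the smoothing filters and the Sobel operator in OpenCV; the image path and kernel sizes are arbitrary placeholder choices:

```python
import cv2

gray = cv2.imread("sample.jpg", cv2.IMREAD_GRAYSCALE)    # hypothetical input
mean_f = cv2.blur(gray, (5, 5))                          # mean (box) filter
median_f = cv2.medianBlur(gray, 5)                       # median filter, good against salt-and-pepper noise
gauss_f = cv2.GaussianBlur(gray, (5, 5), 0)              # Gaussian filter; sigma derived from kernel size
sobel_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)     # horizontal gradient
sobel_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)     # vertical gradient
```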
More neighborhood operators
Morphology
- erosion
- dilation
- majority
- opening: erosion followed by dilation; when there is a lot of noise, it removes small specks while preserving the area of the target
- closing: dilation followed by erosion; in some cases (e.g. a high threshold, or a poor-quality source image) too many pixels that actually belong to the target get removed, leaving gaps in what should be a single connected region; dilation first reconnects the fragments, then erosion restores the target's outline and size (see the sketch below)
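A minimal sketch of the morphological operators with OpenCV; the input path and structuring-element size are placeholders:

```python
import numpy as np
import cv2

binary = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)       # hypothetical binary image
kernel = np.ones((5, 5), np.uint8)                          # structuring element (placeholder size)
eroded = cv2.erode(binary, kernel)
dilated = cv2.dilate(binary, kernel)
opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)   # erosion then dilation: removes specks
closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)  # dilation then erosion: fills gaps
```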
Feature detection and matching
Feature points
Edges
Edge detection
Canny
LoG and DoG
Edge linking
Line detection
Hough transforms
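A minimal Canny + probabilistic Hough transform sketch in OpenCV; the thresholds and image path are placeholder values that need tuning per image:

```python
import numpy as np
import cv2

gray = cv2.imread("sample.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical input
edges = cv2.Canny(gray, 50, 150)                        # minVal=50, maxVal=150 (placeholders)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                        threshold=80, minLineLength=30, maxLineGap=10)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        print("line from", (x1, y1), "to", (x2, y2))
```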
Segmentation
Threshold-based segmentation
Fixed thresholding
Histogram bimodal method (Prewitt et al.)
Iterative thresholding
Adaptive thresholding
Otsu's method (maximum between-class variance)
- An implementation of Otsu:
```python
import numpy as np

def otsu(src):
    """
    input: a single-channel (grayscale) image
    output: a binary image, thresholded with Otsu's method
    """
    hist, _ = np.histogram(src, bins=list(range(257)))
    # Otsu's method: pick the threshold that minimizes the within-class
    # variance (equivalently, maximizes the between-class variance)
    hist_norm = hist / hist.sum()
    Q = hist_norm.cumsum()
    bins = np.arange(256)
    fn_min = np.inf
    thresh = -1
    for i in range(1, 256):
        p1, p2 = np.hsplit(hist_norm, [i])       # split into classes [0, i) and [i, 256)
        q1, q2 = Q[i - 1], Q[255] - Q[i - 1]     # class probabilities (cumulative sums)
        if q1 < 1.e-6 or q2 < 1.e-6:
            continue
        b1, b2 = np.hsplit(bins, [i])            # bin values of each class
        # class means and variances
        m1, m2 = np.sum(p1 * b1) / q1, np.sum(p2 * b2) / q2
        v1 = np.sum(((b1 - m1) ** 2) * p1) / q1
        v2 = np.sum(((b2 - m2) ** 2) * p2) / q2
        # within-class variance to minimize
        fn = v1 * q1 + v2 * q2
        if fn < fn_min:
            fn_min = fn
            thresh = i
    # to inspect the histogram (requires matplotlib):
    # plt.hist(src.ravel(), 256, [0, 256]); plt.show()
    # binarize with the estimated threshold
    binary = np.where(src > thresh, 255, 0).astype(np.uint8)
    return binary
```
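For comparison, a quick sketch of OpenCV's built-in Otsu thresholding; `gray` is assumed to be a grayscale uint8 image:

```python
import cv2
# retval is the threshold Otsu selected, binary the thresholded image
retval, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```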
Mean method
Optimal threshold
Clustering-based methods
K-means和GMM
DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
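A minimal sketch of color segmentation via k-means with OpenCV, clustering pixels in color space and repainting each with its cluster center; the image path and cluster count are placeholder choices:

```python
import numpy as np
import cv2

img = cv2.imread("scene.jpg")                    # hypothetical input
data = img.reshape((-1, 3)).astype(np.float32)   # one row per pixel, BGR features
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
k = 4                                            # number of clusters (assumed)
_, labels, centers = cv2.kmeans(data, k, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
seg = centers[labels.flatten()].astype(np.uint8).reshape(img.shape)
```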
Superpixels
- https://cm_westwood.gitee.io/image_processing_homework/#header-n54
- Original proposal: Ren, Malik. Learning a classification model for segmentation[C]. International Conference on Computer Vision, 2003: 10-17.
- Evaluation criteria: Van den Bergh M, Boix X, Roig G, et al. SEEDS: Superpixels Extracted via Energy-Driven Sampling[J]. International Journal of Computer Vision, 2015, 111(3): 298-314.
Classic algorithm: SLIC
- Stutz D, Hermans A, Leibe B, et al. Superpixels: An evaluation of the state-of-the-art[J]. Computer Vision and Image Understanding, 2018: 1-27.
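A minimal SLIC sketch, assuming scikit-image is installed (the original notes don't name a library); the image path and parameters are placeholders:

```python
import numpy as np
from skimage import io
from skimage.segmentation import slic

img = io.imread("scene.jpg")                          # hypothetical input
labels = slic(img, n_segments=200, compactness=10)    # one integer label per superpixel
print(labels.shape, np.unique(labels).size)           # label map size, superpixel count
```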
Improved algorithms: SEEDS, ETPS
- Yao J, Boben M, Fidler S, et al. Real-time coarse-to-fine topologically preserving segmentation[C]. Computer Vision and Pattern Recognition, 2015: 2947-2955.
- https://zhuanlan.zhihu.com/p/30732385
- Threshold-based segmentation
    - Fixed thresholding
    - Histogram bimodal method (Prewitt et al.)
    - Iterative thresholding
    - Adaptive thresholding
    - Otsu's method (maximum between-class variance)
    - Mean method
    - Optimal threshold
- Edge-based segmentation
    - Canny edge detector
        - takes two parameters, maxVal and minVal
    - Harris corner detector
    - SIFT detector
    - SURF detector
- Region-based segmentation
    - Seeded region growing, Levine et al.
    - Region split-and-merge, Gonzalez et al.
    - Watershed, Meyer et al.
- Graph-theoretic segmentation
    - GraphCut
- Energy-functional-based segmentation
Feature-based alignment
Feature detection
Haar cascades
- mainly used for face detection?
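A minimal face-detection sketch, assuming the opencv-python package, which ships the pretrained cascade files under cv2.data.haarcascades; the image path is a placeholder:

```python
import cv2

cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
gray = cv2.imread("face.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:                            # one bounding box per detected face
    print("face at", x, y, w, h)
```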
Scale-Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF)
- SIFT: Scale-Invariant Feature Transform
- SURF: Speeded-Up Robust Features
    - an accelerated version of SIFT
    - scale-invariant
That said, SIFT's advantage over SURF is that, because SIFT computes feature points with a floating-point kernel, its features are generally considered to be more precisely localized in both space and scale; SIFT is therefore worth considering when matching must be extremely accurate and matching speed is not a concern.
Binary Robust Independent Elementary Features (BRIEF) and Oriented FAST and Rotated BRIEF (ORB)
ORB does not address scale invariance
FAST
- Features from Accelerated Segment Test
- fast
Harris corner detection
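A minimal ORB detection-and-matching sketch with OpenCV, tying the detectors above to feature matching; the image paths and feature count are placeholders:

```python
import cv2

img1 = cv2.imread("left.jpg", cv2.IMREAD_GRAYSCALE)    # hypothetical image pair
img2 = cv2.imread("right.jpg", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create(nfeatures=500)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
# Hamming distance suits ORB's binary descriptors; crossCheck enforces one-to-one matches
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
```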
Iterative algorithms
- ICP
    - Go-ICP
- RANSAC
    - RANdom SAmple Consensus (Fischler and Bolles 1981, Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography)
    - an algorithm that, given a sample set contaminated with outliers, estimates the parameters of a mathematical model and identifies the valid (inlier) samples (see the sketch below)
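A minimal RANSAC sketch for 2D line fitting, to make the idea concrete; the iteration count and inlier tolerance are arbitrary placeholder choices:

```python
import numpy as np

def ransac_line(pts, n_iters=100, inlier_tol=0.05):
    """Fit a 2D line y = a*x + b to an (N, 2) point array with outliers;
    return the model with the largest inlier set."""
    rng = np.random.default_rng(0)
    best_model, best_inliers = None, 0
    for _ in range(n_iters):
        i, j = rng.choice(len(pts), size=2, replace=False)  # minimal sample: 2 points
        (x1, y1), (x2, y2) = pts[i], pts[j]
        if x1 == x2:
            continue                                        # skip degenerate vertical pairs
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # count points whose vertical residual is within the tolerance
        inliers = np.sum(np.abs(pts[:, 1] - (a * pts[:, 0] + b)) < inlier_tol)
        if inliers > best_inliers:
            best_model, best_inliers = (a, b), inliers
    return best_model
```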
Common tools for measuring properties of image regions
- https://www.mathworks.com/help/images/ref/regionprops.html
- https://docs.opencv.org/4.x/d1/d32/tutorial_py_contour_properties.html
Image matching for stereo vision
Similarity measures / template matching
In the formulas below, $I_1$ is the template (reference block), $I_2$ the search image, and $\bar{I}_1$, $\bar{I}_2$ are the block means.
Sum of Absolute Differences (SAD)
- $\mathrm{SAD}(x, y) = \sum_{i,j} \left| I_1(i, j) - I_2(x + i, y + j) \right|$
Sum of Squared Differences (SSD)
- $\mathrm{SSD}(x, y) = \sum_{i,j} \left( I_1(i, j) - I_2(x + i, y + j) \right)^2$
Zero-mean SAD
- $\mathrm{ZSAD}(x, y) = \sum_{i,j} \left| \left( I_1(i, j) - \bar{I}_1 \right) - \left( I_2(x + i, y + j) - \bar{I}_2 \right) \right|$
Locally scaled SAD
- $\mathrm{LSAD}(x, y) = \sum_{i,j} \left| I_1(i, j) - \frac{\bar{I}_1}{\bar{I}_2} I_2(x + i, y + j) \right|$
Normalized Cross-Correlation (NCC)
- $\mathrm{NCC}(x, y) = \frac{\sum_{i,j} I_1(i, j) \, I_2(x + i, y + j)}{\sqrt{\sum_{i,j} I_1(i, j)^2 \cdot \sum_{i,j} I_2(x + i, y + j)^2}}$
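A small numpy sketch of the block-similarity measures above, assuming two equally sized grayscale blocks already converted to float (uint8 subtraction would wrap around):

```python
import numpy as np

def sad(b1, b2):
    return np.sum(np.abs(b1 - b2))

def ssd(b1, b2):
    return np.sum((b1 - b2) ** 2)

def zsad(b1, b2):
    return np.sum(np.abs((b1 - b1.mean()) - (b2 - b2.mean())))

def ncc(b1, b2):
    return np.sum(b1 * b2) / np.sqrt(np.sum(b1 ** 2) * np.sum(b2 ** 2))
```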
Uncategorized
- Fast Point Feature Histograms (FPFH)
    - https://www.cvl.iis.u-tokyo.ac.jp/class2016/2016w/papers/6.3DdataProcessing/Rusu_FPFH_ICRA2009.pdf
Differences between PFH and FPFH
The main differences between the PFH and FPFH formulations are summarized below:
1. the FPFH does not fully interconnect all neighbors of the query point $p_q$, and is thus missing some value pairs which might contribute to capturing the geometry around the query point;
2. the PFH models a precisely determined surface around the query point, while the FPFH includes additional point pairs outside the **r** radius sphere (though at most **2r** away);
3. because of the re-weighting scheme, the FPFH combines SPFH values and recaptures some of the point neighboring value pairs;
4. the overall complexity of FPFH is greatly reduced, making it feasible for real-time applications;
5. the resulting histogram is simplified by decorrelating the values, that is, simply creating *d* separate feature histograms, one for each feature dimension, and concatenating them together.