What are features
Feature Extraction
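The main purpose of feature extraction is dimensionality reduction!
The main purpose of feature extraction is dimensionality reduction!
The main purpose of feature extraction is dimensionality reduction!
Important things are worth saying three times!!!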
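This post looks at what a feature is. When I first started graduate school, reading papers and running examples from the internet, I could never grasp what the word "feature" really meant. I could not build a clear, concrete, numerical, quantifiable picture of a feature in my head. Only later, after learning more, understanding more, and practicing more, did I arrive at some understanding of my own. This post records how that understanding developed, in the hope that it gives my future self some inspiration and is of some help to whoever reads it. That would be enough.
While learning OpenCV, the very first chapter introduces what features are and why corners are important.
For example, when we see any animal or any common object, we can immediately judge what it is. How is that result produced, and what steps happen along the way? I suspect this is most likely where feature engineering comes from: the wish to capture how the brain really thinks and to implement it in code. For now, though, it is hard to say that we recognize things by means of features as computer science currently defines them, and it is just as hard to describe how the human brain finds such features. Still, the field of computer vision has produced some fruitful work.
Take the image below, for example: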
The image is very simple. At the top of the image, six small image patches are given. The question for you is to find the exact location of these patches in the original image. How many correct results can you find?
A and B are flat surfaces and they are spread over a lot of area. It is difficult to find the exact location of these patches.
C and D are much simpler. They are edges of the building. You can find an approximate location, but the exact location is still difficult. This is because the pattern is the same everywhere along the edge. Perpendicular to the edge, however, it is different. An edge is therefore a better feature than a flat area, but not good enough (it is good in a jigsaw puzzle for comparing the continuity of edges).
Finally, E and F are some corners of the building, and they can be easily found, because at the corners, wherever you move the patch, it will look different. So they can be considered good features. So now we move on to a simpler (and widely used) image for better understanding.
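In other words:
The image is simple: six small patches are shown at the top, and the task is to locate each one exactly in the original image. How many can you place correctly?
A and B are flat surfaces that cover large areas, so the exact location of these patches is hard to pin down.
C and D are easier. They lie on the edges of the building, so you can find an approximate location, but not an exact one: the pattern looks the same everywhere along the edge and only changes when you move across it. An edge is therefore a better feature than a flat area, but still not good enough (although edges work well in a jigsaw puzzle, where you compare edge continuity).
Finally, E and F are corners of the building, and they are easy to find in the original image: shift a corner patch in any direction and it no longer looks the same. That makes corners the most suitable features.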
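To make the patch-location puzzle concrete, here is a minimal sketch in plain Python. It does not use the building photo from the tutorial. Instead it assumes a tiny synthetic image (a dark background with one bright square), and the helper names `make_image`, `crop`, and `count_matches` are made up for this illustration. It simply counts how many positions in the image look exactly like a given patch.

```python
def make_image(size=12, start=4):
    """Synthetic image (an assumption for this sketch): background 0,
    with a bright square (1) covering rows >= start and cols >= start."""
    return [[1 if r >= start and c >= start else 0 for c in range(size)]
            for r in range(size)]


def crop(img, top, left, k=3):
    """Cut the k x k patch whose top-left pixel is (top, left)."""
    return [row[left:left + k] for row in img[top:top + k]]


def count_matches(img, patch):
    """Count the positions where the image is identical to the patch."""
    k = len(patch)
    n = len(img)
    return sum(
        1
        for top in range(n - k + 1)
        for left in range(n - k + 1)
        if crop(img, top, left, k) == patch
    )


img = make_image()
flat_patch = crop(img, 0, 0)    # pure background, like patches A and B
edge_patch = crop(img, 3, 7)    # straddles the top edge of the square, like C and D
corner_patch = crop(img, 3, 3)  # straddles the corner of the square, like E and F

flat_count = count_matches(img, flat_patch)
edge_count = count_matches(img, edge_patch)
corner_count = count_matches(img, corner_patch)
print("flat:", flat_count, "edge:", edge_count, "corner:", corner_count)
```

In this toy image the flat patch fits in many places, the edge patch fits at several places along the edge, and the corner patch fits in exactly one place, which is the same ordering the passage describes.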
Now let's look at a simpler (and widely used) image for better understanding. The image is shown below:
Just like above, the blue patch is a flat area and is difficult to find and track. Wherever you move the blue patch, it looks the same. The black patch has an edge. If you move it in the vertical direction (i.e. along the gradient), it changes. Moved along the edge (parallel to the edge), it looks the same. The red patch is a corner. Wherever you move the patch, it looks different, which means it is unique. So basically, corners are considered to be good features in an image. (Not just corners; in some cases blobs are considered good features.)
So now we have answered our question, "what are these features?". But the next question arises: how do we find them? Or how do we find the corners? We answered that in an intuitive way, i.e., look for the regions in images which have maximum variation when moved (by a small amount) in all directions around them. This will be translated into code in the coming chapters. Finding these image features is called Feature Detection.
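As before, the blue patch is a flat area, and it is hard to say exactly where it sits in the original image, because wherever you move it within the blue region it looks the same. The black patch contains an edge: move it vertically (across the edge) and it changes; move it along the edge and it looks the same. The red patch is a corner: wherever you move it, it looks different, which means it is unique. So corners are considered good features in an image (and not just corners; in some cases blobs are considered good features too).
So we have now answered our question, "what are features?".
But that raises the next question: how do we find them? Or rather, how do we find the corners? We answered that intuitively as well: look for regions of the image that change the most when shifted by a small amount in every direction around them. Later chapters turn this idea into code. Finding these image features is called Feature Detection.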
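The intuitive rule above (look for regions that change a lot when shifted slightly in every direction) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the algorithm OpenCV implements: the synthetic image, the helper names `patch_ssd` and `min_shift_change`, the 3 x 3 patch size, and the one-pixel shifts are all choices made for this example. The score is the smallest sum of squared differences over the eight one-pixel shifts, in the spirit of Moravec's classic corner measure.

```python
def make_image(size=12, start=4):
    """Synthetic image (an assumption for this sketch): background 0,
    with a bright square (1) covering rows >= start and cols >= start."""
    return [[1 if r >= start and c >= start else 0 for c in range(size)]
            for r in range(size)]


def patch_ssd(img, top, left, dy, dx, k=3):
    """Sum of squared differences between a k x k patch and the same
    patch shifted by (dy, dx). The patch must stay inside the image."""
    total = 0
    for r in range(top, top + k):
        for c in range(left, left + k):
            diff = img[r + dy][c + dx] - img[r][c]
            total += diff * diff
    return total


# The eight one-pixel shifts: up, down, left, right, and the diagonals.
SHIFTS = [(-1, 0), (1, 0), (0, -1), (0, 1),
          (-1, -1), (-1, 1), (1, -1), (1, 1)]


def min_shift_change(img, top, left, k=3):
    """Smallest change over all shifts. It is large only when the patch
    changes in every direction, which is the corner case."""
    return min(patch_ssd(img, top, left, dy, dx, k) for dy, dx in SHIFTS)


img = make_image()
flat_score = min_shift_change(img, 7, 7)    # inside the square: flat
edge_score = min_shift_change(img, 3, 7)    # on the top edge of the square
corner_score = min_shift_change(img, 3, 3)  # on the corner of the square

# The edge patch changes across the edge but not along it.
edge_vertical = patch_ssd(img, 3, 7, 1, 0)
edge_horizontal = patch_ssd(img, 3, 7, 0, 1)

print("flat:", flat_score, "edge:", edge_score, "corner:", corner_score)
print("edge moved vertically:", edge_vertical,
      "edge moved horizontally:", edge_horizontal)
```

Taking the minimum over the shifts is the design choice that separates edges from corners: an edge has at least one direction (along the edge) in which nothing changes, so its score drops to zero, while a corner changes in every direction.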
Now let's introduce one more term: Feature Description.
We found the features in the images. Once you have found a feature, you should be able to find the same one in other images. How is this done? We take a region around the feature and explain it in our own words, like "the upper part is blue sky, the lower part is a region from a building, on that building there is glass, etc.", and then search for the same area in the other images. Basically, you are describing the feature. Similarly, a computer should also describe the region around the feature so that it can find it in other images. This description is called Feature Description. Once you have the features and their descriptions, you can find the same features in all images and align them, stitch them together, or do whatever you want.
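Once we have found features in one image, we should be able to find the same ones in other images. How? We take a region around the feature and describe it in our own words, such as "the upper part is blue sky, the lower part belongs to a building, and the building has glass on it", and then search for the same region in the other image. In effect, we are describing the feature. A computer likewise has to describe the region around a feature so that it can find the same feature in other images. This description is called Feature Description. Once you have the features and their descriptions, you can find the same features across images and align them, stitch them together, or do whatever you want.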
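The describe-then-search idea can be sketched as follows. This is a toy illustration, not how OpenCV's descriptors work: the "descriptor" here is simply the raw 3 x 3 patch flattened into a list, the two images are synthetic (the second is the first with the bright square moved), and the names `describe`, `distance`, and `best_match` are invented for this example. Practical descriptors are built to tolerate changes in lighting, scale, and rotation, which this sketch ignores.

```python
def make_image(size, sq_top, sq_left):
    """Synthetic image (an assumption for this sketch): background 0,
    with a bright square (1) covering rows >= sq_top and cols >= sq_left."""
    return [[1 if r >= sq_top and c >= sq_left else 0 for c in range(size)]
            for r in range(size)]


def describe(img, top, left, k=3):
    """Toy descriptor: the k x k patch flattened into a list of intensities."""
    return [img[r][c]
            for r in range(top, top + k)
            for c in range(left, left + k)]


def distance(d1, d2):
    """Sum of squared differences between two descriptors."""
    return sum((a - b) ** 2 for a, b in zip(d1, d2))


def best_match(img, descriptor, k=3):
    """Position in img whose descriptor is closest to the given one."""
    n = len(img)
    candidates = [(top, left)
                  for top in range(n - k + 1)
                  for left in range(n - k + 1)]
    return min(candidates,
               key=lambda p: distance(describe(img, p[0], p[1], k), descriptor))


img1 = make_image(12, 4, 4)  # square corner at row 4, col 4
img2 = make_image(12, 6, 5)  # same scene, square moved down 2 and right 1

corner_pos = (3, 3)                          # corner patch found in img1
corner_desc = describe(img1, *corner_pos)    # describe it
match = best_match(img2, corner_desc)        # search for it in img2
offset = (match[0] - corner_pos[0], match[1] - corner_pos[1])
print("matched at:", match, "offset between images:", offset)
```

The recovered offset is the kind of information that alignment and stitching build on: once the same feature is located in both images, the displacement between the two positions says how one image has to be moved to line up with the other.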
So in this module, we look at the different algorithms in OpenCV for finding features, describing them, matching them, and so on.