This Computer Vision stuff is taking some time to get comfortable with. Most learning resources seem to expect you to have some background in image processing. Maybe that is typically a reasonable assumption to make on their part, but it’s something I am not particularly familiar with. This has meant spending some time brushing up on the basics.
To this end, after some searching, I found this course on Udacity. The lecture videos were recorded by Dr. Aaron Bobick (then at Georgia Tech). I’ve found Dr. Bobick’s enthusiasm for the subject matter infectious and I would recommend this course to anyone who might be interested in Computer Vision.
In an early lesson you learn about how, for the purposes of Computer Vision, images should be thought of as functions. I didn’t understand what this meant at first, but it seems key to understanding how Computer Vision algorithms work. I’ll try to relay what I’ve learned.
Here’s a simple image function:
f(x,y) => i
The function takes an x and a y coordinate, and returns a value that describes the intensity of the light at that position in the image.
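Here's a minimal sketch of that idea in Python, assuming a hypothetical grayscale file called "photo.png" and the Pillow and NumPy libraries:

```python
import numpy as np
from PIL import Image

# Load the image and convert it to grayscale so each pixel is a single
# intensity value rather than an RGB triple.
img = np.asarray(Image.open("photo.png").convert("L"))

def f(x, y):
    """Return the light intensity at column x, row y (0 = black, 255 = white)."""
    # NumPy arrays are indexed [row, column], i.e. [y, x].
    return img[y, x]

print(f(10, 20))  # intensity at x=10, y=20
```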
So, this image:
… could also be described like this:
We can see that the places in the image where the light intensity changes quickly create large peaks and steep valleys. This lets us see (and think about) the image in ways we typically wouldn't at first glance.
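If you want to generate a plot like this yourself, here's a rough sketch using matplotlib's 3D surface plotting (again assuming the hypothetical "photo.png" from the earlier snippet):

```python
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# Load the grayscale image as floats so the intensities plot cleanly.
img = np.asarray(Image.open("photo.png").convert("L"), dtype=float)

# Build an (x, y) grid matching the image dimensions.
ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]

# Plot intensity as height: bright regions become peaks, dark ones valleys.
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(xs, ys, img, cmap="gray")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("intensity")
plt.show()
```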
Here is a look at a blurred version of the image, and its associated 3D topographic brightness-intensity map:
The topographic map shows just how much smoother the transitions between peaks and valleys have become. Viewing the blurred image this way makes it more obvious what effect, mathematically, a blurring algorithm has on the pixels' brightness values, and it makes the kinds of manipulation we might need for feature detection easier to imagine.
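To make that smoothing concrete, here's a small sketch that blurs the same hypothetical "photo.png" with SciPy's Gaussian filter (a stand-in for whatever blur produced the image above) and compares how abruptly intensity changes between neighbouring pixels before and after:

```python
import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

img = np.asarray(Image.open("photo.png").convert("L"), dtype=float)
blurred = gaussian_filter(img, sigma=5)

# The largest jump between horizontally adjacent pixels shrinks after
# blurring, which is the "smoother transition between peaks and valleys".
print(np.abs(np.diff(img, axis=1)).max())      # sharp image
print(np.abs(np.diff(blurred, axis=1)).max())  # blurred image
```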