by: https://x.com/deeplearnerd
In the previous blog, we set the stage for this series by exploring some foundational concepts. We discussed the difference between how humans perceive images and how computers process them, delved into the idea of noise in images, touched upon transformations and their applications, and got a glimpse into the basics of image processing.
In this segment, I'll be covering:
When we as humans look at a photograph, our brain instantly recognises objects by their boundaries - where one object ends and another begins. In images, these boundaries are represented by something much simpler: sudden changes in pixel intensity.
Look at this image of Kakashi (my favourite anime character, btw). At Kakashi's boundary, pixel values change rapidly from one colour to another (we'll call this intensity). These abrupt changes in pixel values are what we call edges. They can represent object boundaries, changes in texture, or sudden variations in depth and lighting.
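To make the idea concrete, here's a minimal sketch (with made-up intensity values) of a single row of pixels crossing an edge. Differencing neighbouring pixels produces a value near zero in flat regions and a large spike exactly where the intensity jumps:

```python
import numpy as np

# A 1D slice of pixel intensities crossing an edge:
# a dark region (20) on the left, a bright region (200) on the right.
row = np.array([20, 20, 20, 20, 200, 200, 200, 200])

# The difference between neighbouring pixels is large only at the edge.
diff = np.diff(row)
print(diff)  # → [  0   0   0 180   0   0   0]
```

The spike of 180 at index 3 marks the edge; everywhere else the difference is 0. Edge detectors generalise this idea to two dimensions.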
Edge detection is relevant in CV for:
Before we dive into the foundations, here's a topic called Grad-CAM that I covered in an earlier blog; it works on similar concepts and might pique your interest: Peeking Inside the Black Box: Visualising CNNs with Grad-CAM
When we talk about finding edges in images, we're really hunting for places where pixel values change dramatically. Let's break this down mathematically:
For a grayscale image $I(x,y)$, the gradient is made up of the partial derivatives in both directions: