by: https://x.com/deeplearnerd
In the previous blog, we set the stage for this series by exploring some foundational concepts. We discussed the difference between how humans perceive images and how computers process them, delved into the idea of noise in images, touched upon transformations and their applications, and got a glimpse into the basics of image processing.
In this segment, I'll be covering:
When we as humans look at a photograph, our brain instantly recognises objects by their boundaries - where one object ends and another begins. In images, these boundaries are represented by something much simpler: sudden changes in pixel intensity.
Look at this image of Kakashi (my favourite anime character, btw). At Kakashi's boundary, pixel values change rapidly from one colour to another (we'll call this intensity). These abrupt changes in pixel values are what we call edges. They can represent object boundaries, changes in texture, or sudden variations in depth and lighting.
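To make the idea concrete, here's a minimal sketch (with made-up intensity values) of a single row of pixels crossing an edge. Differencing neighbouring pixels produces a value near zero in flat regions and a large spike exactly where the intensity jumps:

```python
import numpy as np

# A 1D slice of pixel intensities crossing an edge:
# a dark region (20) on the left, a bright region (200) on the right.
row = np.array([20, 20, 20, 20, 200, 200, 200, 200])

# The difference between neighbouring pixels is large only at the edge.
diff = np.diff(row)
print(diff)  # → [  0   0   0 180   0   0   0]
```

The spike of 180 at index 3 marks the edge; everywhere else the difference is 0. Edge detectors generalise this idea to two dimensions.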
Edge detection is relevant in CV for:
Before we dive into the foundations, here's a topic called Grad-CAM that I covered in an earlier blog; it works on similar concepts and might pique your interest: Peeking Inside the Black Box: Visualising CNNs with Grad-CAM
When we talk about finding edges in images, we're really hunting for places where pixel values change dramatically. Let's break this down mathematically:
For a grayscale image $I(x,y)$, the gradient is made up of the partial derivatives in both directions: