How I Built an Intruder Detection and Alert System Using HoG and SVM (From Idea to Working Prototype)
Security cameras are everywhere now – in homes, offices, schools and streets.
But most of them do the same thing: record everything and hope someone checks the footage later.
I wanted to build something a little smarter.
Instead of just recording video, I wanted a system that could:
- Detect when a person enters a restricted area, and
- Trigger an alert in real time
That’s how my project “Intruder Detection and Alert System using HoG and SVM” started.
In this post, I’ll share:
- Why I decided to build this system
- What it actually does (in simple language)
- The core ideas behind it: HoG and SVM
- How well it worked in practice
- What I plan to improve next
This is the first part of a small series where I’ll go deeper into the technical details, but let’s start with the story and high-level overview.
The Motivation: From Passive Cameras to Active Alerts
Imagine this situation:
You have a camera installed outside your door or in a lab/server room. A person enters the area at midnight, but:
- Nobody is watching the live feed
- The footage is saved, but only reviewed after something goes wrong
In other words, the system is reactive, not proactive.
My goal was to turn a normal camera into a simple, intelligent intruder detector that could:
- Continuously monitor the video feed
- Detect if a human is present in the frame
- Classify that as a potential intruder event
- Trigger an alert (like a sound, message, or log entry)
I didn’t want to jump straight into heavy deep learning models. Instead, I chose a more classic computer vision + machine learning approach, using:
- HoG (Histogram of Oriented Gradients) for feature extraction
- SVM (Support Vector Machine) for classification
This combination has been widely used for human detection in images and works quite well if implemented properly.
High-Level Overview: How the System Works
Here’s the basic flow of my Intruder Detection and Alert System:
1. Video Input
   - A camera (or a video file) is connected to the system.
   - The system reads frames one by one.
2. Pre-processing
   - Frames are resized and converted to grayscale (to simplify calculations).
3. Feature Extraction with HoG
   - For each frame (or for regions in the frame), I compute HoG features, which basically describe the shape and edges in the image.
4. Classification with SVM
   - The HoG features are passed into a trained SVM model.
   - The SVM decides whether the region contains a person or no person.
5. Decision & Alert
   - If a person is detected in a restricted area or within a defined region of interest:
     - Mark it as an intruder event
     - Trigger an alert (sound/log/message)
6. Display / Logging
   - The frame with detection bounding boxes is displayed.
   - Events can be logged with timestamps for later review.
It’s like turning a regular video stream into a real-time person detector + alarm system.
What Is HoG (Histogram of Oriented Gradients) – In Simple Terms
HoG stands for Histogram of Oriented Gradients. That sounds complicated, but the intuition is simple:
- Images are made of pixels.
- HoG looks at how the brightness changes across those pixels (gradients).
- It captures edges and directions – for example, vertical lines (like a person’s body), horizontal lines (like shoulders), diagonal edges, etc.
- By combining these patterns over small regions, HoG creates a feature vector that represents the shape of what’s in the image.
Why this is useful for human detection:
- Humans have a consistent shape: head, shoulders, torso, legs.
- Even if the person is wearing different clothes, the outline/edges are similar.
- HoG captures this outline in a way that a machine learning model can work with.
In later parts of this series, I’ll explain how HoG works step by step, but for now, you can remember it as:
HoG converts an image into a numerical description of its edges and shapes.
This numerical description (feature vector) is what we feed into the SVM.
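To make "numerical description" concrete, here is a small sketch (illustrative, not taken from the project code) that computes a HoG vector with scikit-image for a standard 64×128 person-detection window; a random array stands in for a real grayscale image patch:

```python
import numpy as np
from skimage.feature import hog

# A common person-detection window is 64x128 pixels (width x height).
window = np.random.rand(128, 64)   # stand-in for a grayscale image patch

features = hog(
    window,
    orientations=9,           # 9 gradient-direction bins per histogram
    pixels_per_cell=(8, 8),   # histograms computed over 8x8-pixel cells
    cells_per_block=(2, 2),   # blocks of 2x2 cells are normalized together
)

# 15 x 7 block positions x (2*2 cells per block) x 9 bins = 3780 numbers
print(features.shape)         # (3780,)
```

So one image patch becomes a single vector of 3780 numbers describing its edge patterns – exactly the kind of input an SVM can consume.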
What Is SVM (Support Vector Machine) – In Simple Terms
Once we have the HoG features, we need a model that can say:
- “This looks like a person”
- “This looks like background/no person”
This is where SVM (Support Vector Machine) comes in.
SVM is a classification algorithm in machine learning. Its job is to:
- Look at examples of features labeled as “person” and “not person”
- Learn a boundary that separates the two classes as clearly as possible
- During testing, use this learned boundary to classify new, unseen examples
You can imagine it like:
- Plotting all the examples on a graph (based on their features)
- Drawing the best possible separating line (or hyperplane) between the two groups
- Choosing the line that maximizes the “margin” between the classes – in other words, the line the model can be most confident about
In later posts, I’ll explain:
- How SVM learns
- What support vectors are
- Why margin matters
- Which kernel I used for this project
For now, think of SVM as:
A smart decision-maker that uses the HoG features to decide: “Intruder” or “No intruder”.
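As a toy illustration of that decision-making (two-dimensional points instead of real 3780-dimensional HoG vectors), here is how a linear SVM learns such a boundary with scikit-learn; the cluster positions and labels are made up for the example:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Toy stand-in for HoG vectors: class "person" (1) clusters around (2, 2),
# class "not person" (0) clusters around (-2, -2).
rng = np.random.default_rng(0)
person     = rng.normal(loc=2.0,  scale=0.5, size=(50, 2))
background = rng.normal(loc=-2.0, scale=0.5, size=(50, 2))

X = np.vstack([person, background])
y = np.array([1] * 50 + [0] * 50)

clf = LinearSVC(C=1.0)   # linear kernel: finds a max-margin hyperplane
clf.fit(X, y)

# New, unseen points are classified by which side of the boundary they fall on.
print(clf.predict([[2.1, 1.9], [-2.2, -1.8]]))   # → [1 0]
```

In the real system the only differences are the dimensionality of the feature vectors and where the labels come from; the learning principle is identical.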
How I Built and Trained the System (High-Level)
Here’s a simplified version of my build process:
1. Data Collection
   - I collected images/frames containing persons and no persons.
   - These could be from my own camera or from public datasets.
2. Feature Extraction (Offline)
   - For each image, I extracted HoG features.
   - These features became my training data.
3. Training the SVM Model
   - I used the features + labels (“person” vs “not person”) to train the SVM classifier.
   - This step teaches the model how to recognize a person based on HoG features.
4. Integration with Live Video
   - After the model performed well on test data, I integrated it with a live video feed using OpenCV.
   - For each frame, I applied the same feature extraction and classification process.
5. Alert Mechanism
   - When the model detected an intruder, it triggered an action – for example, showing a warning on the screen, logging the event, or playing a sound.
How Well Did It Perform?
For a classic computer vision + ML approach (without deep learning), the results were quite promising:
- It correctly detected intruders in many normal lighting conditions
- It worked well when the person was clearly visible and not too far from the camera
- It was efficient enough to run in near real time on a normal machine
However, there were also challenges:
- Performance dropped in low-light conditions
- If the person was very small in the frame (far away), detection became harder
- Rapid movement or partial occlusion sometimes caused missed detections
- Background changes (like moving curtains, shadows) occasionally led to false alarms
I’ll share more detailed performance metrics, testing setup, and failure cases in a separate post, but in general:
The system proved that a traditional HoG + SVM pipeline can still be a solid baseline for simple intruder detection tasks.
What I Learned from This Project
Beyond the code and algorithms, this project taught me a lot about:
- Breaking a complex problem into steps: input → features → model → output
- The importance of data quality for machine learning
- How small implementation details (frame size, parameters, thresholds) can have a huge impact on performance
- The difference between something that works on paper and something that works reliably in the real world
It also opened the door to future improvements:
- Trying deep learning-based detectors (modern CNNs) and comparing them against the HoG+SVM baseline
- Adding tracking, not just detection (to follow intruders across frames)
- Integrating with IoT devices for remote alerts (SMS, email, app notification)