🚗 Detecting Parking Lot Occupancy with Deep Learning and Faster R-CNN

In this post, we’ll walk through how to build a parking lot occupancy detection model using PyTorch and the Faster R-CNN object detection architecture. We’ll use a COCO-style dataset (in this case, PKLot) that contains images of parking lots with annotations identifying parked vehicles.

We'll explore how the model works, how to prepare your dataset and dataloader, and how to train and visualize predictions.


📦 Objective

The goal is to detect whether a parking space is occupied by identifying cars in images. The system will learn to detect vehicles using annotated bounding boxes and classify each parking spot accordingly.


🧠 Model Architecture: Faster R-CNN

Faster R-CNN is a popular two-stage object detection model:

  1. Region Proposal Network (RPN) suggests object-like regions.

  2. An RoI head pools features for each proposal, then classifies it and refines its bounding box.

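We use torchvision's Faster R-CNN with a ResNet-50 backbone and a Feature Pyramid Network (FPN), starting from COCO-pretrained weights. It offers a reasonable trade-off between accuracy and speed for this kind of task.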


📁 Dataset: PKLot in COCO Format

The PKLot dataset contains labeled images of parking lots. The annotations follow the COCO format, which allows us to easily use tools like pycocotools and integrate with torchvision’s detection models.
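For orientation, a COCO annotation file is a single JSON object whose main lists are images, annotations, and categories. The sketch below builds a tiny hand-written example and groups the annotations by image, which is roughly the kind of index pycocotools builds when it loads a file. Every value in it (file names, ids, category names) is made up for illustration and is not taken from PKLot.

```python
from collections import defaultdict

# A tiny, hand-written COCO-style structure. All values are illustrative.
coco_example = {
    "images": [
        {"id": 1, "file_name": "lot_a.jpg", "width": 640, "height": 640},
        {"id": 2, "file_name": "lot_b.jpg", "width": 640, "height": 640},
    ],
    "annotations": [
        # bbox is [x, y, width, height] in pixels
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [100, 120, 40, 60]},
        {"id": 11, "image_id": 1, "category_id": 2, "bbox": [150, 120, 40, 60]},
        {"id": 12, "image_id": 2, "category_id": 2, "bbox": [30, 200, 50, 70]},
    ],
    "categories": [
        {"id": 1, "name": "space-empty"},      # hypothetical category name
        {"id": 2, "name": "space-occupied"},   # hypothetical category name
    ],
}


def group_annotations_by_image(coco_dict):
    """Map each image id to the list of annotations that belong to it."""
    grouped = defaultdict(list)
    for ann in coco_dict["annotations"]:
        grouped[ann["image_id"]].append(ann)
    return dict(grouped)


annotations_by_image = group_annotations_by_image(coco_example)
print({img_id: len(anns) for img_id, anns in annotations_by_image.items()})
# {1: 2, 2: 1}
```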

# Dataset class using COCO format
class CocoDetectionDataset(Dataset):
    def __init__(self, img_folder, ann_file, transforms=None):
        self.coco = COCO(ann_file)
        self.img_folder = img_folder
        self.ids = list(sorted(self.coco.imgs.keys()))
        self.transforms = transforms

    def __getitem__(self, index):
        ...

This class loads images and their annotations, converts bounding boxes from COCO's [x, y, width, height] format to [x_min, y_min, x_max, y_max], and prepares the target labels.
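The box conversion described above is small enough to show on its own. This is a minimal sketch using plain Python lists; `coco_to_corners` and `is_valid_box` are illustrative helper names, not part of pycocotools or torchvision. The validity check is included because torchvision's detection models reject boxes with non-positive width or height during training.

```python
def coco_to_corners(bbox):
    """Convert a COCO [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x_min, y_min, width, height = bbox
    return [x_min, y_min, x_min + width, y_min + height]


def is_valid_box(box):
    """Return True when a corner-format box has positive width and height."""
    x_min, y_min, x_max, y_max = box
    return x_max > x_min and y_max > y_min


print(coco_to_corners([100, 120, 40, 60]))  # [100, 120, 140, 180]
print(is_valid_box(coco_to_corners([5, 5, 0, 10])))  # False (zero width)
```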


🧪 Transformations

We use basic preprocessing:

transform = T.Compose([
    T.ToTensor()
])

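ToTensor converts a PIL image into a float tensor of shape [C, H, W] with values scaled to the range [0, 1], which is the input format torchvision's detection models expect.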


🔁 DataLoader and Training Setup

dataset = CocoDetectionDataset(root_dir, ann_path, transforms=transform)
data_loader = DataLoader(dataset, batch_size=4, shuffle=True, collate_fn=lambda x: tuple(zip(*x)))

We define a simple collate_fn to handle the variable-sized targets typical of detection tasks.
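To see what that collate function does, here is a small sketch that uses plain strings and dicts as stand-ins for image tensors and targets. `detection_collate` is an illustrative name for a named equivalent of the lambda above. It transposes a list of (image, target) pairs into a tuple of images and a tuple of targets, without trying to stack them into one tensor.

```python
def detection_collate(batch):
    """Turn [(img, target), ...] into ((img, ...), (target, ...)) without stacking."""
    return tuple(zip(*batch))


# Stand-ins for real samples: each image has a different number of boxes.
batch = [
    ("img_a", {"boxes": [[0, 0, 10, 10]]}),
    ("img_b", {"boxes": [[1, 1, 5, 5], [2, 2, 8, 8]]}),
]

images, targets = detection_collate(batch)
print(images)                                # ('img_a', 'img_b')
print([len(t["boxes"]) for t in targets])    # [1, 2]
```

A named, module-level function like this can be passed as `collate_fn=detection_collate`. Unlike a lambda, it can also be pickled, which may matter if you later set `num_workers` above zero on platforms that start worker processes with spawn.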

Then, we initialize the model:

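from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Newer torchvision releases prefer weights="DEFAULT" over pretrained=True
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
num_classes = len(dataset.coco.getCatIds()) + 1  # +1 for background
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)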
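One detail worth checking is how `num_classes` relates to the category ids in your annotation file. torchvision's detection models reserve label 0 for background, and every label in the targets has to be smaller than `num_classes`. Counting categories and adding one works when the ids are exactly 1..N. The sketch below shows a more defensive alternative based on the largest id. `required_num_classes` is a hypothetical helper, and the id lists are made-up examples rather than PKLot's actual ids.

```python
def required_num_classes(category_ids):
    """Smallest num_classes that makes every category id a valid label index."""
    if not category_ids:
        raise ValueError("expected at least one category id")
    return max(category_ids) + 1


for ids in ([1, 2], [0, 1, 2], [1, 3]):
    print(ids, "len + 1 =", len(ids) + 1, "| max + 1 =", required_num_classes(ids))
# [1, 2] len + 1 = 3 | max + 1 = 3
# [0, 1, 2] len + 1 = 4 | max + 1 = 3
# [1, 3] len + 1 = 3 | max + 1 = 4
```

In the last case, counting categories would give a head that is too small for label 3. If your file uses id 0 for a real category, keep in mind that boxes with that label would be treated as background.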

And set up our training:

params = [p for p in model.parameters() if p.requires_grad]  # only trainable weights
optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)

🔁 Training Loop (Simplified)

We loop through the dataset and train for a few epochs:

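for imgs, targets in data_loader:
    # Skip batches that contain an image without any boxes
    if any(t["boxes"].nelement() == 0 for t in targets):
        continue
    ...
    loss_dict = model(imgs, targets)
    losses = sum(loss for loss in loss_dict.values())
    optimizer.zero_grad()  # clear gradients left over from the previous step
    losses.backward()
    optimizer.step()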

Each iteration moves the batch to the device (the elided step), sums the individual loss terms that the model returns in training mode, and updates the weights.
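Skipping a whole batch because one image has no boxes also discards the other images in that batch. One alternative, which is not what this post's code does, is to drop unannotated images from the dataset's id list once, before training. The sketch below assumes a callable that returns the number of annotations for an image id. `ids_with_boxes` and `fake_counts` are hypothetical names; with pycocotools the callable could be something like `lambda i: len(coco.getAnnIds(imgIds=i))`.

```python
def ids_with_boxes(image_ids, count_annotations):
    """Keep only the image ids for which count_annotations(id) is positive."""
    return [img_id for img_id in image_ids if count_annotations(img_id) > 0]


# Stand-in for an annotation index: image id -> number of boxes (made-up values).
fake_counts = {1: 3, 2: 0, 3: 1}

kept = ids_with_boxes(sorted(fake_counts), fake_counts.get)
print(kept)  # [1, 3]
```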


🎯 Visualizing Predictions

Once training is done, we can test the model on new images and visualize the results.

def visualize_prediction(model, image_path, device, threshold=0.5):
    ...

This function loads an image, passes it through the model, and draws bounding boxes on the image for any detected objects with a score above the threshold.
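The thresholding step can be isolated from the plotting. Here is a minimal sketch on plain Python lists; in the real function the boxes, scores, and labels come from the model's prediction dict as tensors. `filter_detections` is an illustrative helper name and the numbers are invented.

```python
def filter_detections(boxes, scores, labels, threshold=0.5):
    """Return (box, score, label) triples whose score is at least the threshold."""
    return [
        (box, score, label)
        for box, score, label in zip(boxes, scores, labels)
        if score >= threshold
    ]


# Made-up predictions for illustration.
boxes = [[10, 10, 50, 60], [70, 15, 110, 65], [130, 20, 170, 70]]
scores = [0.92, 0.41, 0.50]
labels = [2, 1, 2]

kept = filter_detections(boxes, scores, labels, threshold=0.5)
print(len(kept))  # 2
```

Raising the threshold trades missed detections for fewer false positives, so it is worth trying a few values on held-out images.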

Replace the test image path with an actual file from the test set:

test_image_path = '/kaggle/input/pklot-dataset/test/sample_image.jpg'
visualize_prediction(model, test_image_path, device)

You’ll see a bounding box around each detection, labelled with the predicted class ID and its confidence score.


✅ Summary

  • We used Faster R-CNN to build a powerful parking occupancy detection system.

  • The dataset was in COCO format, making it easy to work with standard PyTorch tools.

  • We built a training and visualization pipeline using PyTorch and torchvision.

This project can be extended by:

  • Using a custom parking spot detection post-process.

  • Deploying the model via a web API.

  • Counting empty vs. occupied spots in real-time.
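As a starting point for the counting idea, the sketch below tallies detections per class after thresholding. The label-to-name mapping is an assumption made for illustration: check the categories list in your annotation file for the real ids. `count_spots` and `LABEL_NAMES` are hypothetical names.

```python
from collections import Counter

# Hypothetical mapping from label id to a readable name; verify against your dataset.
LABEL_NAMES = {1: "empty", 2: "occupied"}


def count_spots(labels, scores, threshold=0.5, label_names=LABEL_NAMES):
    """Count detections per class name, ignoring those below the score threshold."""
    counts = Counter()
    for label, score in zip(labels, scores):
        if score >= threshold:
            counts[label_names.get(label, "unknown")] += 1
    return dict(counts)


# Made-up model output for a single image.
print(count_spots(labels=[1, 2, 2, 1], scores=[0.9, 0.8, 0.3, 0.6]))
# {'empty': 2, 'occupied': 1}
```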


💡 Source Code

Here’s the full source code used in this project:

import os
import json
import torch
import torchvision
import torchvision.transforms as T
from torch.utils.data import Dataset, DataLoader
from PIL import Image
from pycocotools.coco import COCO
import numpy as np

# ---- Dataset Class ----
class CocoDetectionDataset(Dataset):
    def __init__(self, img_folder, ann_file, transforms=None):
        self.coco = COCO(ann_file)
        self.img_folder = img_folder
        self.ids = list(sorted(self.coco.imgs.keys()))
        self.transforms = transforms

    def __getitem__(self, index):
        coco = self.coco
        img_id = self.ids[index]
        ann_ids = coco.getAnnIds(imgIds=img_id)
        anns = coco.loadAnns(ann_ids)

        path = coco.loadImgs(img_id)[0]['file_name']
        img = Image.open(os.path.join(self.img_folder, path)).convert("RGB")

        boxes = []
        labels = []

        for ann in anns:
            bbox = ann['bbox']
            xmin, ymin, width, height = bbox
            boxes.append([xmin, ymin, xmin + width, ymin + height])
            labels.append(ann['category_id'])

        # reshape keeps the expected (N, 4) shape even when an image has no boxes
        boxes = torch.as_tensor(boxes, dtype=torch.float32).reshape(-1, 4)
        labels = torch.as_tensor(labels, dtype=torch.int64)

        target = {
            "boxes": boxes,
            "labels": labels,
            "image_id": torch.tensor([img_id])
        }

        if self.transforms:
            img = self.transforms(img)

        return img, target

    def __len__(self):
        return len(self.ids)

# ---- Simple Transforms ----
transform = T.Compose([
    T.ToTensor()
])

# ---- Paths ----
root_dir = '/kaggle/input/pklot-dataset/train'
ann_path = '/kaggle/input/pklot-dataset/train/_annotations.coco.json'

# ---- Data Loaders ----
dataset = CocoDetectionDataset(root_dir, ann_path, transforms=transform)
data_loader = DataLoader(dataset, batch_size=4, shuffle=True, collate_fn=lambda x: tuple(zip(*x)))

# ---- Model ----
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
num_classes = len(dataset.coco.getCatIds()) + 1  # add 1 for background
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor(in_features, num_classes)

# ---- Training Setup ----
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)

params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)

# ---- Training Loop (Simple) ----
model.train()
for epoch in range(2):  # keep small for test
    print(f"\n=== Epoch {epoch+1} ===")
    step = 0  # step counter for each epoch

    for imgs, targets in data_loader:
        # Skip the batch if any image in it has no boxes
        if any(tgt["boxes"].nelement() == 0 for tgt in targets):
            continue

        imgs = [img.to(device) for img in imgs]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        loss_dict = model(imgs, targets)
        losses = sum(loss for loss in loss_dict.values())

        optimizer.zero_grad()
        losses.backward()
        optimizer.step()

        # 🟢 Print progress every 10 steps
        if step % 10 == 0:
            print(f"[Epoch {epoch+1} | Step {step}] Loss: {losses.item():.4f}")
        step += 1

    print(f"✅ Epoch {epoch+1} completed.")
    print(f"Epoch {epoch+1} last-batch loss: {losses.item():.4f}")

def visualize_prediction(model, image_path, device, threshold=0.5):
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches

    # Load and preprocess the image
    image = Image.open(image_path).convert("RGB")
    transform = T.Compose([T.ToTensor()])
    img_tensor = transform(image).unsqueeze(0).to(device)

    # Set model to eval mode
    model.eval()
    with torch.no_grad():
        prediction = model(img_tensor)[0]

    # Plot the image
    fig, ax = plt.subplots(1, figsize=(12, 8))
    ax.imshow(image)

    # Draw bounding boxes above threshold
    for box, score, label in zip(prediction["boxes"], prediction["scores"], prediction["labels"]):
        if score >= threshold:
            x_min, y_min, x_max, y_max = box.cpu().numpy()
            rect = patches.Rectangle((x_min, y_min), x_max - x_min, y_max - y_min,
                                     linewidth=2, edgecolor='r', facecolor='none')
            ax.add_patch(rect)
            ax.text(x_min, y_min - 10, f'{label.cpu().item()} ({score.cpu().item():.2f})',
                    color='red', fontsize=12, backgroundcolor='white')

    plt.title(f"Detections (threshold={threshold})")
    plt.axis('off')
    plt.show()

# Example usage
test_image_path = '/kaggle/input/pklot-dataset/test/2012-09-15_07_02_01_jpg.rf.079ed65a302f63446ed4f99348787027.jpg'  # Replace with a real image path
visualize_prediction(model, test_image_path, device)

📂 How to Use Kaggle Datasets in a Notebook

When you’re using Kaggle Notebooks, you have direct access to any Kaggle-hosted dataset without downloading anything manually. Here's how to include a dataset in your notebook:

🪜 Step-by-step Guide:

  1. Go to your notebook on Kaggle.

  2. On the right sidebar, find the "Add data" button and click it.

  3. You can search for your dataset in the search bar (e.g., search for “PKLot”).

  4. Click “Add” next to the dataset you want.

  5. Once added, you’ll see it mounted in your environment under a directory like:

    /kaggle/input/pklot-dataset/
    

    This path can now be used directly in your code.


🧪 Example from This Project

Here’s how we reference the dataset path after adding the dataset in the Kaggle interface:

# Image and annotation paths inside the Kaggle environment
root_dir = '/kaggle/input/pklot-dataset/train'
ann_path = '/kaggle/input/pklot-dataset/train/_annotations.coco.json'

You don’t need to download or extract ZIP files yourself. Kaggle handles that in the background once you add the dataset to the notebook.
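If you are unsure how a dataset is laid out once it is mounted, a quick way to check is to walk the input directory and print a few paths. This is a small sketch using only the standard library. `list_dataset_files` is an illustrative helper name, and the extensions and limit are arbitrary choices.

```python
import os


def list_dataset_files(root, extensions=(".json", ".jpg"), limit=10):
    """Return up to `limit` file paths under `root` that end with one of `extensions`."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(extensions):
                found.append(os.path.join(dirpath, name))
                if len(found) >= limit:
                    return found
    return found


# Only prints something when run inside a Kaggle notebook with data attached.
if os.path.isdir("/kaggle/input"):
    for path in list_dataset_files("/kaggle/input"):
        print(path)
```

Looking at the printed paths is usually enough to confirm where the annotation JSON and the image folders are before hard-coding them.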


