## High‑Level Overview
This project is a desktop application built with PySide6 for performing drone SD‑card forensics. It recovers visible and deleted evidence, analyses metadata and GPS logs, and runs a CNN‑based tampering detector with Grad‑CAM visualisation on suspect images.
- `run.py`: minimal launcher that creates a `QApplication` and shows `MainWindow`.
- `ui_desktop/main.py`: alternative entry point that ensures the project root is on `sys.path` and opens `MainWindow`.
`ui_desktop/main_window.MainWindow` derives from `QMainWindow`.
- Left side: `QListWidget` sidebar hosting page names: Dashboard, Data Recovery Engine, Image Tampering Detection, EXIF Correlation & GPS Viewer, Correlation & Timeline, Chain of Custody, Settings.
- Right side: `QStackedWidget` with page widgets in exactly the same order.
- Navigation: when the current row changes, the stack index switches and a forensic chain event is logged via `core.forensic_chain.log_event`.
- `DashboardPage`: overview of recovered evidence folders.
- `RecoveryEnginePage`: SD‑card data recovery pipeline (visible files + deep scan).
- `TamperingPage`: CNN + metadata‑based image tampering analysis.
- `MetadataPage`: EXIF and log‑based GPS correlation with map view.
- `CorrelationPage`: placeholder for future correlation/timeline module.
- `ChainOfCustodyPage`: live evidence log and CSV export.
- `SettingsPage`: app‑wide light/dark theme toggle.
Implemented in `ui_desktop/pages/dashboard_page.py`.
It presents a table of past recovery sessions stored under `recovery_output/`.
- Each row represents one evidence folder named like `evidence_000001_YYYYMMDD_HHMMSS`.
- Columns: Evidence Folder name, Created (timestamp encoded in the name), total file count, Status, and Actions.
- The “Open Folder” action uses `QDesktopServices.openUrl` to open the folder in the OS file explorer.
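The Created column is decoded from the folder name itself. A minimal sketch of that parsing, assuming the naming convention above (`parse_evidence_name` is an illustrative name, not the project's):

```python
import re
from datetime import datetime

# Hypothetical helper: parse an evidence folder name such as
# "evidence_000001_20240315_142530" into its numeric ID and creation time.
_NAME_RE = re.compile(r"^evidence_(\d+)_(\d{8}_\d{6})$")

def parse_evidence_name(name: str):
    m = _NAME_RE.match(name)
    if not m:
        return None  # not an evidence folder
    evidence_id = int(m.group(1))
    created = datetime.strptime(m.group(2), "%Y%m%d_%H%M%S")
    return evidence_id, created
```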
### `RecoveryEnginePage`

```python
from PySide6.QtCore import QThread, Signal

class RecoveryWorker(QThread):
    # Custom signals; self.drive, self.out_dir and self._cancel are set in
    # __init__ (elided here). The declarations are assumed from the usage below.
    log_signal = Signal(str)
    progress_signal = Signal(int)
    finished = Signal(dict)

    def run(self):
        def log_cb(m): self.log_signal.emit(m)
        def prog_cb(v): self.progress_signal.emit(v)
        stats = run_recovery(
            self.drive,
            self.out_dir,
            progress_cb=prog_cb,
            log_cb=log_cb,
            cancel_cb=lambda: self._cancel,
        )
        self.finished.emit(stats)
```
- Drive selection via `QComboBox`: enumerates available drive letters and shows capacity in GB.
- “Run Recovery” spawns a background `QThread` (`RecoveryWorker`) that calls `recovery_engine.sd_recovery.run_recovery`.
- Progress is shown with a `QProgressBar`, percentage label, ETA label, and a small spinner animation.
- Two log panes: a user‑friendly log and a technical log; messages are received over Qt signals.
- Recovered evidence is browsed via a `QTreeWidget` rooted at the created evidence folder, allowing double‑click to open individual files.
- Export features: “Export Selected” and “Export ALL Evidence”, either into a generic directory or a case‑specific storage area if provided.
### `sd_recovery.py`

```python
def run_recovery(drive_letter, out_folder, progress_cb=None, log_cb=None, cancel_cb=None):
    drive_root = _normalize_drive(drive_letter)
    evidence_dir = _next_evidence_folder(Path(out_folder))
    stats = {}
    # Phase 1: copy visible files (progress 0-60 %).
    stats.update(_copy_visible_files(
        drive_root=Path(drive_root),
        evidence_dir=evidence_dir,
        progress_cb=progress_cb,
        log_cb=log_cb,
        cancel_cb=cancel_cb,
        progress_start=0,
        progress_end=60,
    ))
    # Phase 2: raw carving (progress 60-100 %); total_bytes (the raw volume
    # size) is computed between the two phases and elided in this excerpt.
    deep_stats = _raw_carve(
        drive_root=drive_root,
        evidence_dir=evidence_dir,
        total_bytes=total_bytes,
        progress_cb=progress_cb,
        log_cb=log_cb,
        cancel_cb=cancel_cb,
        progress_start=60,
        progress_end=100,
    )
    stats.update(deep_stats)
    return stats
```
- Normalises the target drive letter to a Windows root such as `D:\`.
- Creates an incrementing evidence folder under the chosen output directory using the pattern `evidence_<ID>_YYYYMMDD_HHMMSS`.
- Visible file copy: walks the drive and copies forensic‑relevant extensions (images, videos, logs, CSV, XLSX, ZIP, etc.) into `visible_files/`, tracking totals and updating progress.
- Deep scan / raw carving: uses a raw volume path like `\\.\D:` and scans in 8 MB chunks, keeping a 2 MB tail buffer.
- Signature table supports JPEG (`FF D8 FF ... FF D9`), PNG (header and `IEND` chunk), GIF, and generic ZIP/XLSX fragments.
- Carved outputs: deleted images go into `deleted_images/`, Excel/ZIP fragments into `deleted_images/carved_xlsx_*.xlsx`, and text‑like GPS/log fragments into `deleted_logs/log_fragment_*.txt`.
- An additional helper, `jpeg_carver.py`, demonstrates a simpler JPEG‑only carving loop driven by `raw_access.read_raw_drive`.
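The JPEG entry in the signature table boils down to scanning for the `FF D8 FF` start‑of‑image marker and the `FF D9` end‑of‑image marker. A simplified in‑memory sketch of that idea (the real pipeline works on 8 MB chunks read from the raw volume; `carve_jpegs` is an illustrative name):

```python
JPEG_SOI = b"\xff\xd8\xff"  # start-of-image marker prefix
JPEG_EOI = b"\xff\xd9"      # end-of-image marker

def carve_jpegs(buf: bytes):
    """Return every byte range in buf that looks like a complete JPEG."""
    out = []
    pos = 0
    while True:
        start = buf.find(JPEG_SOI, pos)
        if start == -1:
            break
        end = buf.find(JPEG_EOI, start + len(JPEG_SOI))
        if end == -1:
            break  # file may continue into the next chunk (kept in a tail buffer)
        out.append(buf[start:end + len(JPEG_EOI)])
        pos = end + len(JPEG_EOI)
    return out
```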
Implemented in `ui_desktop/pages/chain_custody_page.py` and backed by `core.forensic_chain`.
- Maintains a table of log entries with columns: Timestamp (UTC), Action, Page, and Machine ID.
- Uses a timer to auto‑refresh the table every 2 seconds so new events appear in near real‑time.
- Listens to the global logging system using `register_listener` and removes itself via `unregister_listener` on close.
- Exports the log to CSV: either into a case‑specific path or via a user‑selected Save dialog.
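The register/unregister pattern can be sketched as a small in‑memory listener registry. The function names follow those mentioned above, but the bodies are assumptions for illustration, not the project's implementation:

```python
from datetime import datetime, timezone

_listeners = []
_events = []

def register_listener(cb):
    _listeners.append(cb)

def unregister_listener(cb):
    if cb in _listeners:
        _listeners.remove(cb)

def log_event(action: str, page: str, machine_id: str = "LAB-01"):
    # Record the event and notify every registered listener.
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "page": page,
        "machine_id": machine_id,
    }
    _events.append(event)
    for cb in list(_listeners):  # iterate a copy: a callback may unregister itself
        cb(event)
    return event
```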
### `MetadataPage`
- Selects a still image (`*.jpg` / `*.jpeg` / `*.png`) and an optional folder with flight logs.
- Shows a scaled preview of the image and a tree view of the selected logs folder (focused on `.csv` and `.txt` files).
- Runs a multi‑step correlation pipeline to find the best GPS fix for the selected image.
- Results, debug information, and any error conditions are written into a read‑only text area.
- Resolved locations are rendered onto an interactive Google Maps iframe using the configured API key.
- The resulting fix is encapsulated in a small data class: latitude, longitude, source description, and human‑readable details.
- Validity checks ensure latitude is in [-90, 90] and longitude in [-180, 180].
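A minimal version of such a data class might look as follows; the field names are assumptions based on the description above:

```python
from dataclasses import dataclass

@dataclass
class GpsFix:
    latitude: float
    longitude: float
    source: str       # e.g. "EXIF", "DJI log", "bruteforce CSV" (illustrative values)
    details: str = ""

    def is_valid(self) -> bool:
        # Latitude must lie in [-90, 90], longitude in [-180, 180].
        return -90.0 <= self.latitude <= 90.0 and -180.0 <= self.longitude <= 180.0
```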
```python
from typing import Optional

import exifread

def _try_exif_gps(self, path: str) -> Optional[GpsFix]:
    with open(path, "rb") as f:
        tags = exifread.process_file(f, details=False)
    gps_lat = tags.get("GPS GPSLatitude")
    gps_lon = tags.get("GPS GPSLongitude")
    lat_ref = tags.get("GPS GPSLatitudeRef")
    lon_ref = tags.get("GPS GPSLongitudeRef")
    if gps_lat is None or gps_lon is None:
        return None

    def conv(v):
        # Convert an EXIF DMS rational triple to decimal degrees.
        d, m, s = v.values
        d = float(d.num) / float(d.den)
        m = float(m.num) / float(m.den)
        s = float(s.num) / float(s.den)
        return d + (m / 60.0) + (s / 3600.0)

    lat, lon = conv(gps_lat), conv(gps_lon)
    if lat_ref and str(lat_ref) == "S":
        lat = -lat
    if lon_ref and str(lon_ref) == "W":
        lon = -lon
    # ...validate the ranges and build a GpsFix (elided).
```
- Step 1 – EXIF GPS: reads EXIF tags via `exifread`, extracts `GPSLatitude` / `GPSLongitude` as DMS, converts to decimal, applies the N/S/E/W reference, and validates the coordinates.
- Step 2 – DJI structured logs: in the logs folder, looks for `image_ids.csv` and `gps.csv` by name, matches rows by filename to obtain a time or index, then scans `gps.csv` for the closest record in time/index containing valid lat/lon columns.
- Step 3 – Bruteforce search: for any CSV in the tree, heuristically discovers lat/lon columns by header names; for TXT files, uses a regex to find pairs of decimal numbers that fall in valid lat/lon ranges.
- If a fix is found, it is applied and logged; otherwise a “No GPS found” message is shown and an empty informational map is rendered.
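Step 3's TXT heuristic can be sketched with a regex that extracts adjacent decimal pairs and keeps only those in valid GPS ranges (the pattern and function name are illustrative, not the project's exact code):

```python
import re

# Two signed decimals separated by a comma and/or whitespace.
_PAIR_RE = re.compile(r"(-?\d{1,3}\.\d+)[,\s]+(-?\d{1,3}\.\d+)")

def find_latlon_candidates(text: str):
    """Return (lat, lon) pairs whose values fall inside valid GPS ranges."""
    out = []
    for a, b in _PAIR_RE.findall(text):
        lat, lon = float(a), float(b)
        if -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0:
            out.append((lat, lon))
    return out
```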
Implemented across `ml_module/model.py`, `ml_module/train.py`, `ml_module/infer.py`, and `ml_module/grad_cam.py`, and used by `ui_desktop/pages/tampering_page.py`.
### `TamperNet`
`TamperNet` wraps a pre‑trained `torchvision.models.efficientnet_b0`.
The network is initialised with ImageNet weights, and its final classifier is replaced with a new linear layer that matches the number of tampering classes (typically 2: original vs. manipulated).
```python
import torch.nn as nn
from torchvision import models

class TamperNet(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        # EfficientNet-B0 backbone initialised with ImageNet weights.
        self.backbone = models.efficientnet_b0(
            weights=models.EfficientNet_B0_Weights.IMAGENET1K_V1
        )
        # Swap the final linear layer for the tampering classes.
        in_features = self.backbone.classifier[1].in_features
        self.backbone.classifier[1] = nn.Linear(in_features, num_classes)

    def forward(self, x):
        return self.backbone(x)
```
| Block | Type | Role |
|---|---|---|
| Stem | Conv + BN + SiLU | Initial feature extraction and down‑sampling. |
| MBConv stack | Depthwise separable convs with squeeze‑and‑excitation | Efficient multi‑scale feature extraction through repeated inverted residual blocks. |
| Head conv | 1×1 Conv + BN + SiLU | Aggregates rich spatial features before pooling. |
| Global pooling | AdaptiveAvgPool2d | Reduces each channel to a single activation (spatial average). |
| Classifier | Dropout + Linear(in_features, num_classes) | Maps global features to tampering class logits (replaced for this project). |
### `train.py`

```python
train_loader, val_loader, classes = get_dataloaders(batch_size)
model = TamperNet(num_classes=len(classes)).to(DEVICE)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=lr)

for epoch in range(num_epochs):
    model.train()
    for imgs, labels in train_loader:
        imgs, labels = imgs.to(DEVICE), labels.to(DEVICE)
        outputs = model(imgs)
        loss = criterion(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```
- Uses a train/validation split built from `torchvision.datasets.ImageFolder` located under `dataset/train` and `dataset/val`.
- Image transforms: resize to 299×299, random horizontal flip and small rotation for augmentation, then convert to tensor.
- Optimisation: Cross‑Entropy loss and Adam optimiser with a configurable learning rate.
- For each epoch: iterates over training batches, performs forward + backward + step, accumulates loss, then evaluates accuracy on the validation set.
- After training, saves a checkpoint `tamper_detector.pth` containing both the model weights and the ordered class list.
```python
import torch
from PIL import Image

def analyze_image(pil_image: Image.Image):
    model, classes = _load_model()
    x = transform(pil_image).unsqueeze(0).to(DEVICE)
    features = []
    gradients = []

    def save_features(module, input, output):
        features.append(output)

    def save_gradients(module, grad_input, grad_output):
        gradients.append(grad_output[0])

    # Hook the last EfficientNet convolutional block.
    target_layer = model.backbone.features[-1][0]
    target_layer.register_forward_hook(save_features)
    target_layer.register_backward_hook(save_gradients)

    output = model(x)
    probs = torch.softmax(output, dim=1)[0]
    class_idx = int(torch.argmax(probs).item())
    model.zero_grad()
    output[0, class_idx].backward()
```
- Inference (`infer.py`): loads the checkpoint, rebuilds `TamperNet` with the correct output size, and exposes `classify_image(path)`, which returns the predicted label and confidence after a softmax.
- Grad‑CAM (`grad_cam.py`): re‑loads the same checkpoint and attaches forward and backward hooks to the last EfficientNet convolutional block (`model.backbone.features[-1][0]`).
- For a given image: performs a forward pass, selects the predicted class logit, back‑propagates, and records feature maps and gradients at the hook layer.
- Computes per‑channel weights as the mean gradient and forms a class activation map by weighted summation over feature channels, then normalises and resizes it to the original image shape.
- Converts the CAM into a colour heatmap and overlays it on the original image using OpenCV; both the overlay and the raw heatmap are returned.
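The weighting step can be expressed without any framework: per‑channel weights are the spatial mean of the gradients, and the CAM is the ReLU of the weighted channel sum. A pure‑Python sketch on nested lists (the real code operates on torch tensors and then normalises/resizes the map):

```python
def grad_cam_map(features, gradients):
    """features, gradients: [C][H][W] nested lists for one image.
    Returns an HxW class activation map (ReLU of the weighted channel sum)."""
    C, H, W = len(features), len(features[0]), len(features[0][0])
    # Per-channel weight: mean gradient over all spatial positions.
    weights = [sum(sum(row) for row in gradients[c]) / (H * W) for c in range(C)]
    cam = [[0.0] * W for _ in range(H)]
    for c in range(C):
        for i in range(H):
            for j in range(W):
                cam[i][j] += weights[c] * features[c][i][j]
    # ReLU: keep only regions that contribute positively to the class.
    return [[max(v, 0.0) for v in row] for row in cam]
```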
### `TamperingPage`
- Lets the examiner select an image, then triggers the full tampering analysis.
- Calls `analyze_image` to obtain the model label, confidence, overlay image, and heatmap.
- Runs parallel metadata analysis of EXIF tags to detect signs of editing software, missing EXIF, or absent GPS.
- Combines ML and metadata results into a final verdict category such as: “High confidence image tampering”, “Content manipulation suspected”, “Metadata manipulation suspected”, or “No tampering detected”.
- Displays the original and heatmap overlays side‑by‑side inside the Qt UI.
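The fusion of ML and metadata results can be sketched as a simple precedence rule; the threshold, label strings, and flag names are assumptions for illustration, not the project's exact logic:

```python
def combine_verdict(ml_label: str, ml_confidence: float,
                    editing_software: bool, missing_exif: bool) -> str:
    """Fuse the CNN result with metadata flags into one verdict string."""
    ml_suspicious = ml_label == "manipulated" and ml_confidence >= 0.5
    meta_suspicious = editing_software or missing_exif
    if ml_suspicious and meta_suspicious:
        return "High confidence image tampering"
    if ml_suspicious:
        return "Content manipulation suspected"
    if meta_suspicious:
        return "Metadata manipulation suspected"
    return "No tampering detected"
```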
- `SettingsPage` receives the `QApplication` instance and exposes a single “Enable Dark Mode” checkbox.
- Two style constants (`DARK_STYLE` and `LIGHT_STYLE`) are applied as application‑wide Qt style sheets.
- Dark mode is enabled by default so the UI matches a forensic‑lab aesthetic.
- `recovery_engine/raw_access.read_raw_drive`: a low‑level helper to read arbitrary byte ranges from a Windows physical drive path, used by JPEG carving.
- `recovery_engine/scan_existing.scan_existing_files`: a simplified scan that categorises existing images, videos, and text logs on a drive.
- `ml_module/dataset_loader.get_dataloaders`: builds training and validation `DataLoader` instances for the tampering CNN.
- `ui_desktop/pages/correlation_page.CorrelationPage`: currently a placeholder widget for future correlation and timeline visualisation.
In summary, the project combines SD‑card recovery, chain‑of‑custody logging, metadata and GPS correlation, and a modern CNN‑based tampering detector into a single PySide6 desktop toolkit aimed at drone‑related digital forensics. The GUI orchestrates these subsystems into a workflow that starts from an SD card and ends with visually explainable tampering evidence and an auditable event log.