autopilot/models/SiaN/README_homography.md
2026-02-16 19:07:31 +03:00

Homography Estimation System

This system provides a complete pipeline for estimating homography matrices between Google and Yandex map images using deep learning.

Overview

A homography is a 3×3 projective transform that maps points in one image plane onto another. Estimating it aligns images of the same area from different sources (Google Maps and Yandex Maps in this case). The system includes:

  1. Dataset handling - Loading and preprocessing image pairs
  2. Data augmentation - Homography-based augmentation for robust training
  3. CNN model - Deep learning model for homography estimation
  4. Training pipeline - Complete training and evaluation workflow
  5. Inference tools - Tools for using trained models on new data

Installation

Prerequisites

  • Python 3.8+
  • PyTorch 1.9+ and torchvision
  • OpenCV (opencv-python)
  • Pillow
  • NumPy
  • Matplotlib, tqdm, TensorBoard

Install dependencies

pip install torch torchvision opencv-python pillow numpy matplotlib tqdm tensorboard

Dataset Structure

The system expects image pairs in the following format:

dataset/
├── 0000_google.png
├── 0000_yandex.png
├── 0001_google.png
├── 0001_yandex.png
└── ...

Each pair consists of:

  • {idx:04d}_google.png - Google map image
  • {idx:04d}_yandex.png - Yandex map image
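
The naming convention above can be checked before training with a short script. This is a minimal sketch using only the standard library; `find_image_pairs` is a hypothetical helper written for illustration and is not part of `models/homography.py`.

```python
from pathlib import Path


def find_image_pairs(root_dir):
    """Return sorted (index, google_path, yandex_path) tuples for complete pairs."""
    root = Path(root_dir)
    pairs = []
    for google_path in sorted(root.glob("*_google.png")):
        # "0000_google.png" -> "0000"
        idx = google_path.name[: -len("_google.png")]
        yandex_path = root / f"{idx}_yandex.png"
        # Skip indices whose Yandex counterpart is missing.
        if yandex_path.exists():
            pairs.append((idx, google_path, yandex_path))
    return pairs
```

Comparing `len(find_image_pairs(...))` with the number of `*_google.png` files is a quick way to spot incomplete pairs.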

Quick Start

1. Explore the dataset

from models.homography import HomographyDataset

dataset = HomographyDataset(
    root_dir=r"C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images",
    augment=True,
    image_size=(256, 256)
)

print(f"Found {len(dataset)} image pairs")
sample = dataset[0]
print(f"Sample homography matrix:\n{sample['homography'].numpy()}")

2. Train a model

python models/train_homography.py \
  --data_dir "C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images" \
  --epochs 50 \
  --batch_size 16 \
  --lr 1e-3 \
  --output_dir "runs/my_experiment"

3. Perform inference

python models/infer_homography.py \
  --model_path "runs/my_experiment/checkpoint_best.pth" \
  --mode single \
  --google_path "path/to/google.png" \
  --yandex_path "path/to/yandex.png" \
  --output_vis "alignment_result.png"

File Structure

models/
├── homography.py              # Dataset class and data loaders
├── homography_cnn.py          # CNN model architecture
├── train_homography.py        # Training script
├── infer_homography.py        # Inference script
├── example_homography.py      # Example usage
├── homography.ipynb           # Jupyter notebook (empty)
└── README_homography.md       # This file

Model Architecture

The homography estimation model (HomographyCNN) consists of:

  1. Dual encoders - Extract features separately from the Google and Yandex images
  2. Residual blocks - Enable deeper feature learning
  3. Fusion layers - Combine the features from both images
  4. Regression head - Predicts the 3×3 homography matrix

Key Features:

  • Residual connections for stable training
  • Optional batch normalization
  • Dropout for regularization
  • Geometric consistency loss

Training Configuration

Default training parameters:

  • Optimizer: Adam with learning rate 1e-3
  • Loss function: Combined matrix + geometric + regularization loss
  • Batch size: 32
  • Image size: 256x256
  • Train/val split: 80/20
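
An 80/20 split is easiest to reproduce when the shuffle is seeded. The sketch below shows one way to do this over pair indices; `split_indices` and its `seed` parameter are assumptions made for illustration, and the training script may split the data differently.

```python
import random


def split_indices(num_pairs, val_fraction=0.2, seed=0):
    """Shuffle pair indices with a fixed seed and split them into train/val lists."""
    indices = list(range(num_pairs))
    random.Random(seed).shuffle(indices)
    num_val = int(round(num_pairs * val_fraction))
    # The first num_val shuffled indices go to validation, the rest to training.
    return indices[num_val:], indices[:num_val]


train_idx, val_idx = split_indices(100)
print(len(train_idx), len(val_idx))  # 80 20
```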

Inference Modes

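The inference script supports three modes. The commands below assume the working directory is models/; from the repository root, call models/infer_homography.py as in Quick Start: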

1. Single image pair

python infer_homography.py --mode single \
  --google_path google.png --yandex_path yandex.png

2. Dataset evaluation

python infer_homography.py --mode dataset \
  --dataset_dir path/to/dataset --num_samples 100

3. Batch processing

python infer_homography.py --mode batch \
  --input_dir path/to/input --output_dir path/to/output

Evaluation Metrics

The system computes several metrics:

  • Matrix MSE: Mean squared error of homography matrix elements
  • Corner error: Average pixel error at image corners
  • Geometric consistency: Warping error across grid points
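
The three metrics can be written down in a few lines of NumPy. This is a sketch under stated assumptions, not the code used by the evaluation script: the function names, the 256x256 default size, the 16x16 grid, and the choice to scale both matrices so that H[2, 2] = 1 before comparing them are all illustrative.

```python
import numpy as np


def warp_points(H, points):
    """Apply a 3x3 homography to an (N, 2) array of (x, y) pixel coordinates."""
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    warped = homogeneous @ np.asarray(H, dtype=float).T
    return warped[:, :2] / warped[:, 2:3]  # divide by w to return to pixels


def matrix_mse(H_pred, H_true):
    """MSE over the nine matrix elements, after scaling each matrix so H[2, 2] == 1."""
    H_pred = np.asarray(H_pred, dtype=float)
    H_true = np.asarray(H_true, dtype=float)
    diff = H_pred / H_pred[2, 2] - H_true / H_true[2, 2]
    return float(np.mean(diff ** 2))


def corner_error(H_pred, H_true, width=256, height=256):
    """Mean Euclidean distance (pixels) between the four warped image corners."""
    corners = [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]]
    delta = warp_points(H_pred, corners) - warp_points(H_true, corners)
    return float(np.linalg.norm(delta, axis=1).mean())


def grid_error(H_pred, H_true, width=256, height=256, grid_size=16):
    """Mean warping distance (pixels) over a grid_size x grid_size lattice."""
    xs = np.linspace(0, width - 1, grid_size)
    ys = np.linspace(0, height - 1, grid_size)
    grid = np.stack(np.meshgrid(xs, ys), axis=-1).reshape(-1, 2)
    delta = warp_points(H_pred, grid) - warp_points(H_true, grid)
    return float(np.linalg.norm(delta, axis=1).mean())
```

Corner and grid errors are measured in pixels, so they are usually easier to interpret than the matrix MSE, whose elements mix translation, rotation, and perspective terms of very different magnitudes.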

Data Augmentation

The dataset applies homography-based augmentation:

  • Random rotation (-30° to 30°)
  • Random scaling (0.8x to 1.2x)
  • Random translation (-50 to 50 pixels)
  • Small perspective distortion
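
The listed ranges can be composed into a single random 3×3 matrix. The following NumPy sketch shows one possible composition (rotation and scaling about the image centre, then a perspective term, then translation). The function name, the order of composition, and the `max_perspective` magnitude are assumptions; the dataset class may build its augmentation differently.

```python
import numpy as np


def random_homography(rng, width=256, height=256, max_angle_deg=30.0,
                      scale_range=(0.8, 1.2), max_shift_px=50.0,
                      max_perspective=1e-4):
    """Sample a homography from rotation, scale, translation and perspective ranges."""
    angle = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))
    scale = rng.uniform(*scale_range)
    tx, ty = rng.uniform(-max_shift_px, max_shift_px, size=2)
    px, py = rng.uniform(-max_perspective, max_perspective, size=2)

    # Rotate and scale about the image centre rather than the top-left corner.
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    to_origin = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=float)
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    rot_scale = np.array([[scale * cos_a, -scale * sin_a, 0],
                          [scale * sin_a, scale * cos_a, 0],
                          [0, 0, 1]], dtype=float)
    # Small entries in the bottom row introduce perspective distortion.
    perspective = np.array([[1, 0, 0], [0, 1, 0], [px, py, 1]], dtype=float)
    # Move back to the centre and add the random translation.
    from_origin = np.array([[1, 0, cx + tx], [0, 1, cy + ty], [0, 0, 1]], dtype=float)

    H = from_origin @ perspective @ rot_scale @ to_origin
    return H / H[2, 2]  # fix the scale so that H[2, 2] == 1


rng = np.random.default_rng(0)
H = random_homography(rng)
```

Because the same matrix serves as the ground-truth label, warping one image of a pair with `H` yields a training sample whose target homography is known exactly.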

Integration with Autopilot System

The homography estimation can be integrated into the autopilot system:

from models.infer_homography import HomographyInference

# Initialize inference
inference = HomographyInference(model_path="path/to/model.pth")

# During flight loop
homography = inference.predict(google_img, yandex_img)
# Use homography to update drone position
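
How the predicted matrix feeds into the position update depends on autopilot code that this README does not show. As one illustration, the sketch below measures how far the homography moves the image centre, in pixels. The helper `center_offset`, the 256x256 default size, and the assumption that the matrix is a 3×3 array mapping pixel coordinates of one image into the other are all hypothetical.

```python
import numpy as np


def center_offset(H, width=256, height=256):
    """Return (dx, dy): how far H moves the image centre, in pixels."""
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    # Homogeneous coordinates: append 1, multiply, then divide by w.
    u, v, w = np.asarray(H, dtype=float) @ np.array([cx, cy, 1.0])
    if abs(w) < 1e-12:
        raise ValueError("image centre maps to infinity under this homography")
    return u / w - cx, v / w - cy
```

Converting this pixel offset into metres additionally requires the map scale, which is outside the scope of this sketch.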

Troubleshooting

Common Issues

  1. Out of memory

    • Reduce batch size
    • Use smaller image size
    • Enable gradient checkpointing
  2. Poor convergence

    • Adjust learning rate
    • Increase model capacity
    • Add more data augmentation
  3. Inference errors

    • Check the image format: the model expects RGB, whereas OpenCV's cv2.imread returns BGR
    • Verify model was trained with same image size
    • Ensure proper normalization
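
The three inference checks can be bundled into one preprocessing step. The sketch below is an assumption-laden illustration: `to_model_input` is a hypothetical helper, the input is assumed to be a BGR array as returned by `cv2.imread`, and scaling to [0, 1] stands in for whatever normalization the model was actually trained with.

```python
import numpy as np


def to_model_input(image_bgr, expected_size=(256, 256)):
    """Convert an HxWx3 uint8 BGR image to a 3xHxW float32 RGB array in [0, 1]."""
    image_bgr = np.asarray(image_bgr)
    if image_bgr.ndim != 3 or image_bgr.shape[2] != 3:
        raise ValueError(f"expected an HxWx3 colour image, got shape {image_bgr.shape}")
    # expected_size is (height, width) and must match the training image size.
    if image_bgr.shape[:2] != tuple(expected_size):
        raise ValueError(f"expected size {expected_size}, got {image_bgr.shape[:2]}")
    image_rgb = image_bgr[:, :, ::-1]  # reverse the channel axis: BGR -> RGB
    scaled = image_rgb.astype(np.float32) / 255.0  # assumed normalization
    return np.ascontiguousarray(scaled.transpose(2, 0, 1))  # HWC -> CHW
```

Failing loudly on a size mismatch is deliberate here: silently resizing can hide the fact that a checkpoint was trained at a different resolution.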

Debugging Tips

  • Use example_homography.py to test components
  • Enable TensorBoard for training visualization
  • Check homography matrix normalization: a homography is defined only up to scale, so the bottom-right element H[2, 2] should be ≈ 1
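
The normalization check in the last tip can be automated. This is a minimal sketch; `normalize_homography` and its `eps` threshold are illustrative names and values, not part of the project code.

```python
import numpy as np


def normalize_homography(H, eps=1e-8):
    """Rescale a 3x3 homography so that its bottom-right element equals 1."""
    H = np.asarray(H, dtype=float).reshape(3, 3)
    # A homography is defined up to scale, but a near-zero H[2, 2]
    # cannot be rescaled reliably.
    if abs(H[2, 2]) < eps:
        raise ValueError("H[2, 2] is too close to zero to normalize")
    return H / H[2, 2]
```

Applying this to both the prediction and the ground truth before comparing them keeps element-wise errors from being dominated by an arbitrary scale factor.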

Advanced Usage

Custom Model Architecture

from models.homography_cnn import create_homography_model

model = create_homography_model(
    model_type="cnn",
    input_size=(512, 512),
    hidden_channels=128,
    num_blocks=6,
    dropout_rate=0.5
)

Custom Loss Function

from models.homography_cnn import HomographyLoss

loss_fn = HomographyLoss(
    matrix_weight=0.7,
    geometric_weight=0.3,
    reg_weight=0.05,
    grid_size=16
)

Transfer Learning

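import torch

# Load pretrained weights; `model` must already be built with the same
# architecture (for example via create_homography_model above)
checkpoint = torch.load("pretrained.pth", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])

# Fine-tune on new data with the Google encoder frozen
for param in model.google_encoder.parameters():
    param.requires_grad = False  # Frozen parameters receive no gradient updates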

Performance Tips

  1. Training speed

    • Use multiple DataLoader worker processes (num_workers > 0) for data loading
    • Enable mixed precision training
    • Use gradient accumulation for larger effective batch sizes
  2. Memory efficiency

    • Use gradient checkpointing
    • Implement progressive resizing
    • Use memory-efficient optimizers
  3. Inference speed

    • Use TensorRT or ONNX for deployment
    • Implement model quantization
    • Use batch inference when possible

Future Improvements

  1. Model enhancements

    • Transformer-based architecture
    • Multi-scale feature fusion
    • Uncertainty estimation
  2. Training improvements

    • Self-supervised pre-training
    • Curriculum learning
    • Adversarial training
  3. Deployment features

    • Real-time inference optimization
    • Mobile deployment
    • Web API

Citation

If you use this system in your research, please cite:

@software{homography_estimation_2024,
  title = {Homography Estimation for Map Alignment},
  author = {Autopilot Team},
  year = {2024},
  url = {https://github.com/your-repo/homography-estimation}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For questions and issues:

  1. Check the troubleshooting section
  2. Review the example scripts
  3. Open an issue on GitHub
  4. Contact the development team

Last updated: October 2024