Homography Estimation System
This system provides a complete pipeline for estimating homography matrices between Google and Yandex map images using deep learning.
Overview
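A homography is a 3x3 projective transformation that maps pixel coordinates in one image to the corresponding coordinates in another. Estimating it aligns imagery of the same area from two sources (here, Google Maps and Yandex Maps). The system includes: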
- Dataset handling - Loading and preprocessing image pairs
- Data augmentation - Homography-based augmentation for robust training
- CNN model - Deep learning model for homography estimation
- Training pipeline - Complete training and evaluation workflow
- Inference tools - Tools for using trained models on new data
Installation
Prerequisites
- Python 3.8+
- PyTorch 1.9+
- OpenCV
- PIL/Pillow
- NumPy
Install dependencies
pip install torch torchvision opencv-python pillow numpy matplotlib tqdm tensorboard
Dataset Structure
The system expects image pairs in the following format:
dataset/
├── 0000_google.png
├── 0000_yandex.png
├── 0001_google.png
├── 0001_yandex.png
└── ...
Each pair consists of:
- {idx:04d}_google.png - Google map image
- {idx:04d}_yandex.png - Yandex map image
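The naming scheme above can be checked before training with a short script. The following is a minimal sketch, not the loader used by HomographyDataset: `find_image_pairs` is a hypothetical helper that lists only the indices for which both images are present.

```python
import re
from pathlib import Path

# Filenames follow {idx:04d}_google.png / {idx:04d}_yandex.png.
_PAIR_RE = re.compile(r"^(\d{4,})_(google|yandex)\.png$")


def find_image_pairs(root_dir):
    """Return sorted (idx, google_path, yandex_path) tuples for complete pairs."""
    found = {}
    for path in Path(root_dir).iterdir():
        match = _PAIR_RE.match(path.name)
        if match:
            idx, source = int(match.group(1)), match.group(2)
            found.setdefault(idx, {})[source] = path
    # Keep only indices that have both a Google and a Yandex image.
    return [
        (idx, sides["google"], sides["yandex"])
        for idx, sides in sorted(found.items())
        if "google" in sides and "yandex" in sides
    ]
```

Comparing the length of the returned list with `len(dataset)` is a quick way to spot missing or misnamed files.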
Quick Start
1. Explore the dataset
from models.homography import HomographyDataset
dataset = HomographyDataset(
    root_dir=r"C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images",
    augment=True,
    image_size=(256, 256)
)
print(f"Found {len(dataset)} image pairs")
sample = dataset[0]
print(f"Sample homography matrix:\n{sample['homography'].numpy()}")
2. Train a model
python models/train_homography.py \
--data_dir "C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images" \
--epochs 50 \
--batch_size 16 \
--lr 1e-3 \
--output_dir "runs/my_experiment"
3. Perform inference
python models/infer_homography.py \
--model_path "runs/my_experiment/checkpoint_best.pth" \
--mode single \
--google_path "path/to/google.png" \
--yandex_path "path/to/yandex.png" \
--output_vis "alignment_result.png"
File Structure
models/
├── homography.py # Dataset class and data loaders
├── homography_cnn.py # CNN model architecture
├── train_homography.py # Training script
├── infer_homography.py # Inference script
├── example_homography.py # Example usage
├── homography.ipynb # Jupyter notebook (empty)
└── README_homography.md # This file
Model Architecture
The homography estimation model (HomographyCNN) consists of:
- Dual encoders - Separate feature extraction for Google and Yandex images
- Residual blocks - For deep feature learning
- Fusion layers - Combine features from both images
- Regression head - Predict 3x3 homography matrix
Key Features:
- Residual connections for stable training
- Batch normalization options
- Dropout for regularization
- Geometric consistency loss
Training Configuration
Default training parameters:
- Optimizer: Adam with learning rate 1e-3
- Loss function: Combined matrix + geometric + regularization loss
- Batch size: 32
- Image size: 256x256
- Train/val split: 80/20
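For reference, an 80/20 split can be made reproducible by shuffling indices with a fixed seed. This is a sketch only: `split_indices` and its `seed` parameter are hypothetical, and the training script may split the data differently.

```python
import random


def split_indices(num_pairs, val_fraction=0.2, seed=0):
    """Shuffle pair indices with a fixed seed and split them into train/val lists."""
    indices = list(range(num_pairs))
    random.Random(seed).shuffle(indices)
    num_val = int(round(num_pairs * val_fraction))
    # The first num_val shuffled indices go to validation, the rest to training.
    return sorted(indices[num_val:]), sorted(indices[:num_val])
```

Keeping the seed fixed across runs means validation metrics from different experiments are computed on the same image pairs.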
Inference Modes
The inference script supports three modes:
1. Single image pair
python models/infer_homography.py --mode single \
--google_path google.png --yandex_path yandex.png
2. Dataset evaluation
python models/infer_homography.py --mode dataset \
--dataset_dir path/to/dataset --num_samples 100
3. Batch processing
python models/infer_homography.py --mode batch \
--input_dir path/to/input --output_dir path/to/output
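To make the relationship between modes and flags explicit, here is an illustrative argparse sketch using the flag names shown in this README. It is not the parser in infer_homography.py: which flags each mode requires (`REQUIRED_BY_MODE`) is an assumption inferred from the commands above.

```python
import argparse

# Assumed per-mode requirements, inferred from the example commands.
REQUIRED_BY_MODE = {
    "single": ["google_path", "yandex_path"],
    "dataset": ["dataset_dir"],
    "batch": ["input_dir", "output_dir"],
}


def build_parser():
    parser = argparse.ArgumentParser(description="Homography inference (illustrative sketch)")
    parser.add_argument("--mode", choices=sorted(REQUIRED_BY_MODE), required=True)
    parser.add_argument("--model_path")
    parser.add_argument("--google_path")
    parser.add_argument("--yandex_path")
    parser.add_argument("--output_vis")
    parser.add_argument("--dataset_dir")
    parser.add_argument("--num_samples", type=int)
    parser.add_argument("--input_dir")
    parser.add_argument("--output_dir")
    return parser


def parse_args(argv=None):
    """Parse arguments and reject a mode whose required flags are missing."""
    parser = build_parser()
    args = parser.parse_args(argv)
    missing = [name for name in REQUIRED_BY_MODE[args.mode] if getattr(args, name) is None]
    if missing:
        parser.error(f"--mode {args.mode} requires: " + ", ".join("--" + m for m in missing))
    return args
```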
Evaluation Metrics
The system computes several metrics:
- Matrix MSE: Mean squared error of homography matrix elements
- Corner error: Average pixel error at image corners
- Geometric consistency: Warping error across grid points
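The three metrics can be written down compactly with NumPy. The sketch below is one plausible reading of the descriptions above, not the system's own evaluation code: all function names are hypothetical, and the 256x256 image size and 16x16 grid are borrowed from defaults mentioned elsewhere in this README.

```python
import numpy as np


def matrix_mse(h_pred, h_true):
    """Mean squared error over the nine matrix elements."""
    diff = np.asarray(h_pred, dtype=float) - np.asarray(h_true, dtype=float)
    return float(np.mean(diff ** 2))


def warp_points(h, points):
    """Apply a 3x3 homography to an (N, 2) array of (x, y) points."""
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    warped = homogeneous @ np.asarray(h, dtype=float).T
    # Divide by the third homogeneous coordinate to return to pixel coordinates.
    return warped[:, :2] / warped[:, 2:3]


def mean_warp_error(h_pred, h_true, points):
    """Mean Euclidean distance (pixels) between points warped by each matrix."""
    diff = warp_points(h_pred, points) - warp_points(h_true, points)
    return float(np.mean(np.linalg.norm(diff, axis=1)))


def corner_points(width=256, height=256):
    """The four image corners, used for the corner error."""
    return np.array(
        [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]],
        dtype=float,
    )


def grid_points(width=256, height=256, grid_size=16):
    """A regular grid of points, used for the geometric consistency error."""
    xs = np.linspace(0, width - 1, grid_size)
    ys = np.linspace(0, height - 1, grid_size)
    return np.stack(np.meshgrid(xs, ys), axis=-1).reshape(-1, 2)
```

With these helpers, the corner error is `mean_warp_error(h_pred, h_true, corner_points())` and the grid-based error is the same call with `grid_points()`. Unlike matrix MSE, both are measured in pixels, which makes them easier to interpret.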
Data Augmentation
The dataset applies homography-based augmentation:
- Random rotation (-30° to 30°)
- Random scaling (0.8x to 1.2x)
- Random translation (-50 to 50 pixels)
- Small perspective distortion
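A random homography with these ranges can be composed from elementary 3x3 matrices. This is a sketch under stated assumptions rather than the dataset's actual augmentation code: `random_homography` is a hypothetical name, rotation and scaling are applied about the image center, the composition order is one of several reasonable choices, and `max_perspective` is an assumed magnitude for the "small" perspective terms.

```python
import numpy as np


def random_homography(rng, width=256, height=256, max_perspective=1e-4):
    """Sample a 3x3 homography from the augmentation ranges listed above."""
    angle = np.deg2rad(rng.uniform(-30.0, 30.0))   # rotation in radians
    scale = rng.uniform(0.8, 1.2)                  # isotropic scale
    tx, ty = rng.uniform(-50.0, 50.0, size=2)      # translation in pixels
    px, py = rng.uniform(-max_perspective, max_perspective, size=2)

    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    to_origin = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=float)
    from_origin = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]], dtype=float)
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    rot_scale = np.array(
        [[scale * cos_a, -scale * sin_a, 0],
         [scale * sin_a, scale * cos_a, 0],
         [0, 0, 1]],
        dtype=float,
    )
    perspective = np.array([[1, 0, 0], [0, 1, 0], [px, py, 1]], dtype=float)
    translate = np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

    # Rightmost matrix is applied first: center, rotate/scale, distort, un-center, shift.
    h = translate @ from_origin @ perspective @ rot_scale @ to_origin
    return h / h[2, 2]  # normalize so the last element is exactly 1
```

A generator created with `np.random.default_rng(seed)` makes the sampled transforms reproducible.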
Integration with Autopilot System
The homography estimation can be integrated into the autopilot system:
from models.infer_homography import HomographyInference
# Initialize inference
inference = HomographyInference(model_path="path/to/model.pth")
# During flight loop
homography = inference.predict(google_img, yandex_img)
# Use homography to update drone position
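How the predicted matrix feeds into a position update depends on the autopilot, but the basic building block is mapping a pixel through the homography. The sketch below assumes the prediction can be converted to a 3x3 NumPy array (for a CPU PyTorch tensor, call `.numpy()` first) and that it maps Google-image pixels to Yandex-image pixels; the mapping direction and both helper names are assumptions.

```python
import numpy as np


def map_pixel(homography, x, y):
    """Map one (x, y) pixel through a 3x3 homography."""
    h = np.asarray(homography, dtype=float).reshape(3, 3)
    u, v, w = h @ np.array([x, y, 1.0])
    if abs(w) < 1e-12:
        raise ValueError("Point maps to infinity under this homography")
    return float(u / w), float(v / w)


def center_offset(homography, width=256, height=256):
    """Pixel displacement of the image center under the homography."""
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    mx, my = map_pixel(homography, cx, cy)
    return mx - cx, my - cy
```

The resulting offset is in pixels; converting it to a ground distance requires the map scale, which is outside the scope of this sketch.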
Troubleshooting
Common Issues
- Out of memory
  - Reduce batch size
  - Use smaller image size
  - Enable gradient checkpointing
- Poor convergence
  - Adjust learning rate
  - Increase model capacity
  - Add more data augmentation
- Inference errors
  - Check image formats (must be RGB)
  - Verify model was trained with same image size
  - Ensure proper normalization
Debugging Tips
- Use example_homography.py to test components
- Enable TensorBoard for training visualization
- Check homography matrix normalization (last element should be ~1)
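A homography is only defined up to scale, so two matrices that differ by a constant factor describe the same warp. Dividing by the last element makes matrices directly comparable. A minimal sketch, with `normalize_homography` as a hypothetical helper:

```python
import numpy as np


def normalize_homography(h, eps=1e-8):
    """Scale a homography (3x3 or flat 9-vector) so that its last element is 1."""
    h = np.asarray(h, dtype=float).reshape(3, 3)
    if abs(h[2, 2]) < eps:
        raise ValueError("Last element is close to zero; matrix cannot be normalized this way")
    return h / h[2, 2]
```

If the last element of a prediction is far from 1 before normalization, matrix-level metrics such as MSE can look poor even when the implied warp is accurate.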
Advanced Usage
Custom Model Architecture
from models.homography_cnn import create_homography_model
model = create_homography_model(
    model_type="cnn",
    input_size=(512, 512),
    hidden_channels=128,
    num_blocks=6,
    dropout_rate=0.5
)
Custom Loss Function
from models.homography_cnn import HomographyLoss
loss_fn = HomographyLoss(
    matrix_weight=0.7,
    geometric_weight=0.3,
    reg_weight=0.05,
    grid_size=16
)
Transfer Learning
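import torch
# Load pretrained weights into an existing model (e.g. one built with create_homography_model)
checkpoint = torch.load("pretrained.pth", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])
# Fine-tune on new data with the Google encoder frozen
for param in model.google_encoder.parameters():
    param.requires_grad = False  # Exclude encoder weights from gradient updates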
Performance Tips
- Training speed
  - Use multiple DataLoader worker processes (num_workers > 0) for data loading
  - Enable mixed precision training
  - Use gradient accumulation for larger effective batch sizes
- Memory efficiency
  - Use gradient checkpointing
  - Implement progressive resizing
  - Use memory-efficient optimizers
- Inference speed
  - Use TensorRT or ONNX for deployment
  - Implement model quantization
  - Use batch inference when possible
Future Improvements
- Model enhancements
  - Transformer-based architecture
  - Multi-scale feature fusion
  - Uncertainty estimation
- Training improvements
  - Self-supervised pre-training
  - Curriculum learning
  - Adversarial training
- Deployment features
  - Real-time inference optimization
  - Mobile deployment
  - Web API
Citation
If you use this system in your research, please cite:
@software{homography_estimation_2024,
title = {Homography Estimation for Map Alignment},
author = {Autopilot Team},
year = {2024},
url = {https://github.com/your-repo/homography-estimation}
}
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
For questions and issues:
- Check the troubleshooting section
- Review the example scripts
- Open an issue on GitHub
- Contact the development team
Last updated: October 2024