# Homography Estimation System

This system provides a complete pipeline for estimating homography matrices between Google and Yandex map images using deep learning.

## Overview

Homography estimation is crucial for aligning images from different sources (Google Maps and Yandex Maps in this case). The system includes:

1. **Dataset handling** - Loading and preprocessing image pairs
2. **Data augmentation** - Homography-based augmentation for robust training
3. **CNN model** - Deep learning model for homography estimation
4. **Training pipeline** - Complete training and evaluation workflow
5. **Inference tools** - Tools for using trained models on new data

## Installation

### Prerequisites

- Python 3.8+
- PyTorch 1.9+
- OpenCV
- PIL/Pillow
- NumPy

### Install dependencies

```bash
pip install torch torchvision opencv-python pillow numpy matplotlib tqdm tensorboard
```

## Dataset Structure

The system expects image pairs in the following format:

```
dataset/
├── 0000_google.png
├── 0000_yandex.png
├── 0001_google.png
├── 0001_yandex.png
└── ...
```

Each pair consists of:

- `{idx:04d}_google.png` - Google map image
- `{idx:04d}_yandex.png` - Yandex map image

## Quick Start

### 1. Explore the dataset

```python
from models.homography import HomographyDataset

dataset = HomographyDataset(
    root_dir=r"C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images",
    augment=True,
    image_size=(256, 256)
)

print(f"Found {len(dataset)} image pairs")
sample = dataset[0]
print(f"Sample homography matrix:\n{sample['homography'].numpy()}")
```

### 2. Train a model

```bash
python models/train_homography.py \
    --data_dir "C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images" \
    --epochs 50 \
    --batch_size 16 \
    --lr 1e-3 \
    --output_dir "runs/my_experiment"
```

### 3. Perform inference

```bash
python models/infer_homography.py \
    --model_path "runs/my_experiment/checkpoint_best.pth" \
    --mode single \
    --google_path "path/to/google.png" \
    --yandex_path "path/to/yandex.png" \
    --output_vis "alignment_result.png"
```

## File Structure

```
models/
├── homography.py          # Dataset class and data loaders
├── homography_cnn.py      # CNN model architecture
├── train_homography.py    # Training script
├── infer_homography.py    # Inference script
├── example_homography.py  # Example usage
├── homography.ipynb       # Jupyter notebook (empty)
└── README_homography.md   # This file
```

## Model Architecture

The homography estimation model (`HomographyCNN`) consists of:

1. **Dual encoders** - Separate feature extraction for Google and Yandex images
2. **Residual blocks** - For deep feature learning
3. **Fusion layers** - Combine features from both images
4. **Regression head** - Predict 3x3 homography matrix

### Key Features:

- Residual connections for stable training
- Batch normalization options
- Dropout for regularization
- Geometric consistency loss

## Training Configuration

Default training parameters:

- **Optimizer**: Adam with learning rate 1e-3
- **Loss function**: Combined matrix + geometric + regularization loss
- **Batch size**: 32
- **Image size**: 256x256
- **Train/val split**: 80/20

## Inference Modes

The inference script supports three modes:

### 1. Single image pair

```bash
python infer_homography.py --mode single \
    --google_path google.png --yandex_path yandex.png
```

### 2. Dataset evaluation

```bash
python infer_homography.py --mode dataset \
    --dataset_dir path/to/dataset --num_samples 100
```

### 3. Batch processing

```bash
python infer_homography.py --mode batch \
    --input_dir path/to/input --output_dir path/to/output
```

## Evaluation Metrics

The system computes several metrics:

- **Matrix MSE**: Mean squared error of homography matrix elements
- **Corner error**: Average pixel error at image corners
- **Geometric consistency**: Warping error across grid points

## Data Augmentation

The dataset applies homography-based augmentation:

- Random rotation (-30° to 30°)
- Random scaling (0.8x to 1.2x)
- Random translation (-50 to 50 pixels)
- Small perspective distortion

## Integration with Autopilot System

The homography estimation can be integrated into the autopilot system:

```python
from models.infer_homography import HomographyInference

# Initialize inference
inference = HomographyInference(model_path="path/to/model.pth")

# During flight loop
homography = inference.predict(google_img, yandex_img)
# Use homography to update drone position
```

## Troubleshooting

### Common Issues

1. **Out of memory**
   - Reduce batch size
   - Use smaller image size
   - Enable gradient checkpointing

2. **Poor convergence**
   - Adjust learning rate
   - Increase model capacity
   - Add more data augmentation

3. **Inference errors**
   - Check image formats (must be RGB)
   - Verify model was trained with same image size
   - Ensure proper normalization

### Debugging Tips

- Use `example_homography.py` to test components
- Enable TensorBoard for training visualization
- Check homography matrix normalization (the last, bottom-right element should be ~1)

## Advanced Usage

### Custom Model Architecture

```python
from models.homography_cnn import create_homography_model

model = create_homography_model(
    model_type="cnn",
    input_size=(512, 512),
    hidden_channels=128,
    num_blocks=6,
    dropout_rate=0.5
)
```

### Custom Loss Function

```python
from models.homography_cnn import HomographyLoss

loss_fn = HomographyLoss(
    matrix_weight=0.7,
    geometric_weight=0.3,
    reg_weight=0.05,
    grid_size=16
)
```

### Transfer Learning

```python
import torch

# `model` is an existing HomographyCNN instance, e.g. built with create_homography_model()
# Load pretrained model
checkpoint = torch.load("pretrained.pth")
model.load_state_dict(checkpoint["model_state_dict"])

# Fine-tune on new data with the Google encoder frozen
for param in model.google_encoder.parameters():
    param.requires_grad = False
```

## Performance Tips

1. **Training speed**
   - Use multiple data-loading worker processes
   - Enable mixed precision training
   - Use gradient accumulation for larger effective batch sizes

2. **Memory efficiency**
   - Use gradient checkpointing
   - Implement progressive resizing
   - Use memory-efficient optimizers

3. **Inference speed**
   - Use TensorRT or ONNX for deployment
   - Implement model quantization
   - Use batch inference when possible

## Future Improvements

1. **Model enhancements**
   - Transformer-based architecture
   - Multi-scale feature fusion
   - Uncertainty estimation

2. **Training improvements**
   - Self-supervised pre-training
   - Curriculum learning
   - Adversarial training

3. **Deployment features**
   - Real-time inference optimization
   - Mobile deployment
   - Web API

## Citation

If you use this system in your research, please cite:

```bibtex
@software{homography_estimation_2024,
  title = {Homography Estimation for Map Alignment},
  author = {Autopilot Team},
  year = {2024},
  url = {https://github.com/your-repo/homography-estimation}
}
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Support

For questions and issues:

1. Check the troubleshooting section
2. Review the example scripts
3. Open an issue on GitHub
4. Contact the development team

---

*Last updated: October 2024*
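
As an illustration of the corner-error metric listed under Evaluation Metrics, the following NumPy sketch warps the four image corners with a predicted and a reference homography and averages the pixel distance between the results. The names `warp_points` and `corner_error` are hypothetical and are not taken from this repository; the metric computed by the scripts may differ in details such as coordinate normalization.

```python
import numpy as np


def warp_points(H, points):
    """Apply a 3x3 homography to an (N, 2) array of (x, y) points."""
    points = np.asarray(points, dtype=np.float64)
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.hstack([points, ones])            # (N, 3) homogeneous coords
    warped = homogeneous @ np.asarray(H, dtype=np.float64).T
    return warped[:, :2] / warped[:, 2:3]              # divide by the w component


def corner_error(H_pred, H_true, width=256, height=256):
    """Mean Euclidean distance (pixels) between corners warped by each matrix."""
    corners = np.array(
        [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]],
        dtype=np.float64,
    )
    diff = warp_points(H_pred, corners) - warp_points(H_true, corners)
    return float(np.linalg.norm(diff, axis=1).mean())


# Hypothetical example: the prediction is off by a pure (3, 4) pixel translation.
H_true = np.eye(3)
H_pred = np.array([[1.0, 0.0, 3.0],
                   [0.0, 1.0, 4.0],
                   [0.0, 0.0, 1.0]])
print(corner_error(H_pred, H_true))  # every corner moves by (3, 4), so this prints 5.0
```

Because `warp_points` divides by the homogeneous coordinate, the result does not change if either matrix is multiplied by a non-zero constant, which is why a corner-based error is often easier to interpret than an element-wise matrix MSE.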
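
The augmentation ranges listed under Data Augmentation can be combined into a single random homography roughly as follows. This is a minimal sketch, not the code in `homography.py`: the function name `random_homography`, the choice to rotate and scale about the image center, and the perspective magnitude `max_perspective=1e-4` are all assumptions made for illustration.

```python
import numpy as np


def random_homography(rng, width=256, height=256, max_perspective=1e-4):
    """Sample a 3x3 homography from rotation, scale, translation and perspective."""
    angle = np.deg2rad(rng.uniform(-30.0, 30.0))       # rotation: -30 to 30 degrees
    scale = rng.uniform(0.8, 1.2)                      # scaling: 0.8x to 1.2x
    tx, ty = rng.uniform(-50.0, 50.0, size=2)          # translation: -50 to 50 px
    px, py = rng.uniform(-max_perspective, max_perspective, size=2)  # assumed range

    # Move the image center to the origin so rotation and scale act about it.
    cx, cy = width / 2.0, height / 2.0
    to_origin = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=np.float64)
    from_origin = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]], dtype=np.float64)

    cos_a, sin_a = np.cos(angle), np.sin(angle)
    similarity = np.array([[scale * cos_a, -scale * sin_a, tx],
                           [scale * sin_a,  scale * cos_a, ty],
                           [0.0, 0.0, 1.0]])
    perspective = np.array([[1.0, 0.0, 0.0],
                            [0.0, 1.0, 0.0],
                            [px, py, 1.0]])

    H = from_origin @ perspective @ similarity @ to_origin
    return H / H[2, 2]                                 # normalize so H[2, 2] == 1


rng = np.random.default_rng(0)
H = random_homography(rng)
print(H)
```

Normalizing by `H[2, 2]` keeps the sampled matrices consistent with the debugging tip that the last element of a homography should be close to 1.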