# Homography Estimation System

This system provides a complete pipeline for estimating homography matrices between Google and Yandex map images using deep learning.

## Overview

Homography estimation is crucial for aligning images from different sources (Google Maps and Yandex Maps in this case). The system includes:

1. **Dataset handling** - Loading and preprocessing image pairs
2. **Data augmentation** - Homography-based augmentation for robust training
3. **CNN model** - Deep learning model for homography estimation
4. **Training pipeline** - Complete training and evaluation workflow
5. **Inference tools** - Tools for using trained models on new data

## Installation

### Prerequisites
- Python 3.8+
- PyTorch 1.9+
- OpenCV
- PIL/Pillow
- NumPy

### Install dependencies
```bash
pip install torch torchvision opencv-python pillow numpy matplotlib tqdm tensorboard
```

## Dataset Structure

The system expects image pairs in the following format:
```
dataset/
├── 0000_google.png
├── 0000_yandex.png
├── 0001_google.png
├── 0001_yandex.png
└── ...
```

Each pair consists of:
- `{idx:04d}_google.png` - Google map image
- `{idx:04d}_yandex.png` - Yandex map image

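The naming convention above can be checked before training. The following is a minimal sketch, not the loader used by `HomographyDataset`: the helper name `find_image_pairs` and the regular expression are assumptions made for illustration. It lists only the indices for which both images are present.

```python
import re
from pathlib import Path

# Matches names such as "0000_google.png"; this pattern is an assumption
# derived from the naming convention, not taken from the project code.
PAIR_RE = re.compile(r"^(\d{4})_(google|yandex)\.png$")


def find_image_pairs(root_dir):
    """Return sorted (index, google_path, yandex_path) tuples for complete pairs."""
    found = {}
    for path in Path(root_dir).iterdir():
        match = PAIR_RE.match(path.name)
        if match:
            idx, source = int(match.group(1)), match.group(2)
            found.setdefault(idx, {})[source] = path
    # Keep only indices for which both the Google and the Yandex image exist
    return [
        (idx, files["google"], files["yandex"])
        for idx, files in sorted(found.items())
        if "google" in files and "yandex" in files
    ]
```

Running it on the dataset directory and comparing the result with `len(dataset)` is a quick way to spot missing or misnamed files.
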
## Quick Start

### 1. Explore the dataset
```python
from models.homography import HomographyDataset

dataset = HomographyDataset(
    root_dir=r"C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images",
    augment=True,
    image_size=(256, 256)
)

print(f"Found {len(dataset)} image pairs")
sample = dataset[0]
print(f"Sample homography matrix:\n{sample['homography'].numpy()}")
```

### 2. Train a model
```bash
python models/train_homography.py \
    --data_dir "C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images" \
    --epochs 50 \
    --batch_size 16 \
    --lr 1e-3 \
    --output_dir "runs/my_experiment"
```

### 3. Perform inference
```bash
python models/infer_homography.py \
    --model_path "runs/my_experiment/checkpoint_best.pth" \
    --mode single \
    --google_path "path/to/google.png" \
    --yandex_path "path/to/yandex.png" \
    --output_vis "alignment_result.png"
```

## File Structure

```
models/
├── homography.py          # Dataset class and data loaders
├── homography_cnn.py      # CNN model architecture
├── train_homography.py    # Training script
├── infer_homography.py    # Inference script
├── example_homography.py  # Example usage
├── homography.ipynb       # Jupyter notebook (empty)
└── README_homography.md   # This file
```

## Model Architecture

The homography estimation model (`HomographyCNN`) consists of:

1. **Dual encoders** - Separate feature extraction for Google and Yandex images
2. **Residual blocks** - For deep feature learning
3. **Fusion layers** - Combine features from both images
4. **Regression head** - Predicts the 3x3 homography matrix

### Key Features
- Residual connections for stable training
- Optional batch normalization
- Dropout for regularization
- Geometric consistency loss

## Training Configuration

Default training parameters:
- **Optimizer**: Adam with learning rate 1e-3
- **Loss function**: Combined matrix + geometric + regularization loss
- **Batch size**: 32
- **Image size**: 256x256
- **Train/val split**: 80/20

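To illustrate the 80/20 split, here is a minimal sketch of a reproducible index split. The function name `split_indices` and the fixed seed are assumptions for illustration; the training script may split the data differently.

```python
import random


def split_indices(num_pairs, val_fraction=0.2, seed=42):
    """Shuffle pair indices reproducibly and split them into train and validation lists."""
    indices = list(range(num_pairs))
    random.Random(seed).shuffle(indices)  # local RNG, global random state is untouched
    num_val = int(round(num_pairs * val_fraction))
    return indices[num_val:], indices[:num_val]


train_idx, val_idx = split_indices(100)
print(len(train_idx), len(val_idx))  # 80 20
```

Fixing the seed keeps the validation set identical across runs, which makes experiments comparable.
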
## Inference Modes

The inference script supports three modes. As in the Quick Start, the commands are run from the repository root and take the trained checkpoint via `--model_path`:

### 1. Single image pair
```bash
python models/infer_homography.py --mode single \
    --model_path runs/my_experiment/checkpoint_best.pth \
    --google_path google.png --yandex_path yandex.png
```

### 2. Dataset evaluation
```bash
python models/infer_homography.py --mode dataset \
    --model_path runs/my_experiment/checkpoint_best.pth \
    --dataset_dir path/to/dataset --num_samples 100
```

### 3. Batch processing
```bash
python models/infer_homography.py --mode batch \
    --model_path runs/my_experiment/checkpoint_best.pth \
    --input_dir path/to/input --output_dir path/to/output
```

## Evaluation Metrics

The system computes several metrics:
- **Matrix MSE**: Mean squared error over the elements of the homography matrix
- **Corner error**: Average pixel error at the four image corners after warping
- **Geometric consistency**: Warping error averaged across a grid of points

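The three metrics can be sketched with NumPy as follows. This is an illustrative interpretation of the descriptions above, not the project's evaluation code: the function names, the 256x256 default size, and the 8x8 grid are assumptions.

```python
import numpy as np


def project_points(H, points):
    """Apply a 3x3 homography to an (N, 2) array of (x, y) points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # divide by the projective coordinate


def matrix_mse(H_pred, H_true):
    """Mean squared error over the nine matrix elements."""
    return float(np.mean((H_pred - H_true) ** 2))


def corner_error(H_pred, H_true, width=256, height=256):
    """Mean Euclidean distance (pixels) between the four warped image corners."""
    corners = np.array(
        [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]],
        dtype=float,
    )
    diff = project_points(H_pred, corners) - project_points(H_true, corners)
    return float(np.mean(np.linalg.norm(diff, axis=1)))


def grid_error(H_pred, H_true, width=256, height=256, grid_size=8):
    """Mean warping distance (pixels) over a regular grid_size x grid_size grid."""
    xs, ys = np.meshgrid(
        np.linspace(0, width - 1, grid_size),
        np.linspace(0, height - 1, grid_size),
    )
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1)
    diff = project_points(H_pred, grid) - project_points(H_true, grid)
    return float(np.mean(np.linalg.norm(diff, axis=1)))
```

Corner and grid errors are measured in pixels, so they are usually easier to interpret than the matrix MSE, whose elements have very different scales.
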
## Data Augmentation

The dataset applies homography-based augmentation:
- Random rotation (-30° to 30°)
- Random scaling (0.8x to 1.2x)
- Random translation (-50 to 50 pixels)
- Small perspective distortion

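One way to combine these transformations into a single matrix is sketched below. It uses the ranges listed above, but the function name `random_homography`, the rotation about the image center, the composition order, and the `max_perspective` magnitude are assumptions; the dataset class may build its augmentation differently.

```python
import numpy as np


def random_homography(rng, image_size=(256, 256), max_perspective=1e-4):
    """Sample a 3x3 homography from rotation, scale, translation and perspective."""
    width, height = image_size
    cx, cy = width / 2.0, height / 2.0

    angle = np.deg2rad(rng.uniform(-30.0, 30.0))  # rotation in radians
    scale = rng.uniform(0.8, 1.2)
    tx, ty = rng.uniform(-50.0, 50.0, size=2)     # translation in pixels
    cos_a, sin_a = scale * np.cos(angle), scale * np.sin(angle)

    # Move the image center to the origin so rotation and scaling act about it
    to_origin = np.array([[1.0, 0.0, -cx], [0.0, 1.0, -cy], [0.0, 0.0, 1.0]])
    rot_scale = np.array([[cos_a, -sin_a, 0.0], [sin_a, cos_a, 0.0], [0.0, 0.0, 1.0]])
    # Small perspective terms live in the first two entries of the bottom row
    perspective = np.eye(3)
    perspective[2, :2] = rng.uniform(-max_perspective, max_perspective, size=2)
    # Move back to the image center and apply the translation
    from_origin = np.array([[1.0, 0.0, cx + tx], [0.0, 1.0, cy + ty], [0.0, 0.0, 1.0]])

    H = from_origin @ perspective @ rot_scale @ to_origin
    return H / H[2, 2]  # normalize so the bottom-right element is 1


rng = np.random.default_rng(0)
print(random_homography(rng))
```

Passing an explicit `numpy.random.Generator` keeps the augmentation reproducible when a seed is fixed.
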
## Integration with Autopilot System

The homography estimation can be integrated into the autopilot system:

```python
from models.infer_homography import HomographyInference

# Initialize inference once, before the flight loop starts
inference = HomographyInference(model_path="path/to/model.pth")

# During flight loop: google_img and yandex_img are the current image pair
homography = inference.predict(google_img, yandex_img)
# Use homography to update drone position
```

## Troubleshooting

### Common Issues

1. **Out of memory**
   - Reduce batch size
   - Use smaller image size
   - Enable gradient checkpointing

2. **Poor convergence**
   - Adjust learning rate
   - Increase model capacity
   - Add more data augmentation

3. **Inference errors**
   - Check image formats (must be RGB)
   - Verify that the model was trained with the same image size
   - Ensure inputs are normalized the same way as during training

### Debugging Tips

- Use `example_homography.py` to test components
- Enable TensorBoard for training visualization
- Check homography matrix normalization (the last element, `H[2, 2]`, should be ~1)

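A homography is only defined up to scale, which is why the last element is conventionally fixed to 1. The sketch below shows one way to check and enforce this on a predicted matrix; the helper name `normalize_homography` and the `eps` threshold are assumptions for illustration.

```python
import numpy as np


def normalize_homography(H, eps=1e-8):
    """Rescale H so that H[2, 2] == 1; raises if that element is close to zero."""
    H = np.asarray(H, dtype=float)
    if abs(H[2, 2]) < eps:
        raise ValueError("H[2, 2] is too close to zero to normalize")
    return H / H[2, 2]


H = np.array([[2.0, 0.0, 10.0], [0.0, 2.0, -6.0], [0.0, 0.0, 2.0]])
print(normalize_homography(H))
```

Two matrices that differ only by a scale factor describe the same warp, so comparing predictions after normalization avoids misleading matrix errors.
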
## Advanced Usage

### Custom Model Architecture
```python
from models.homography_cnn import create_homography_model

model = create_homography_model(
    model_type="cnn",
    input_size=(512, 512),
    hidden_channels=128,
    num_blocks=6,
    dropout_rate=0.5
)
```

### Custom Loss Function
```python
from models.homography_cnn import HomographyLoss

loss_fn = HomographyLoss(
    matrix_weight=0.7,
    geometric_weight=0.3,
    reg_weight=0.05,
    grid_size=16
)
```

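### Transfer Learning
```python
import torch

# `model` is a HomographyCNN instance (see Custom Model Architecture)
# Load pretrained weights; map_location="cpu" also works on machines without a GPU
checkpoint = torch.load("pretrained.pth", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])

# Fine-tune on new data with the Google encoder frozen
for param in model.google_encoder.parameters():
    param.requires_grad = False  # Frozen parameters receive no gradient updates
```
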
## Performance Tips

1. **Training speed**
   - Use multiple worker processes for data loading (`num_workers` in the PyTorch `DataLoader`)
   - Enable mixed precision training
   - Use gradient accumulation for larger effective batch sizes

2. **Memory efficiency**
   - Use gradient checkpointing
   - Implement progressive resizing
   - Use memory-efficient optimizers

3. **Inference speed**
   - Use TensorRT or ONNX for deployment
   - Implement model quantization
   - Use batch inference when possible

## Future Improvements

1. **Model enhancements**
   - Transformer-based architecture
   - Multi-scale feature fusion
   - Uncertainty estimation

2. **Training improvements**
   - Self-supervised pre-training
   - Curriculum learning
   - Adversarial training

3. **Deployment features**
   - Real-time inference optimization
   - Mobile deployment
   - Web API

## Citation

If you use this system in your research, please cite:

```bibtex
@software{homography_estimation_2024,
  title = {Homography Estimation for Map Alignment},
  author = {Autopilot Team},
  year = {2024},
  url = {https://github.com/your-repo/homography-estimation}
}
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Support

For questions and issues:
1. Check the troubleshooting section
2. Review the example scripts
3. Open an issue on GitHub
4. Contact the development team

---

*Last updated: October 2024*