autopilot/models/SiaN/README_homography.md
2026-02-16 19:07:31 +03:00


# Homography Estimation System
This system provides a complete pipeline for estimating homography matrices between Google and Yandex map images using deep learning.
## Overview
Homography estimation is crucial for aligning images from different sources (Google Maps and Yandex Maps in this case). The system includes:
1. **Dataset handling** - Loading and preprocessing image pairs
2. **Data augmentation** - Homography-based augmentation for robust training
3. **CNN model** - Deep learning model for homography estimation
4. **Training pipeline** - Complete training and evaluation workflow
5. **Inference tools** - Tools for using trained models on new data
## Installation
### Prerequisites
- Python 3.8+
- PyTorch 1.9+
- OpenCV
- PIL/Pillow
- NumPy
### Install dependencies
```bash
pip install torch torchvision opencv-python pillow numpy matplotlib tqdm tensorboard
```
## Dataset Structure
The system expects image pairs in the following format:
```
dataset/
├── 0000_google.png
├── 0000_yandex.png
├── 0001_google.png
├── 0001_yandex.png
└── ...
```
Each pair consists of:
- `{idx:04d}_google.png` - Google map image
- `{idx:04d}_yandex.png` - Yandex map image
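The naming convention above can be checked with a few lines of standard-library Python. This is a minimal sketch, not the loader used by `HomographyDataset`; the helper name `find_image_pairs` is hypothetical, and skipping indices that lack a Yandex counterpart is an assumption about the desired behavior.

```python
from pathlib import Path


def find_image_pairs(root_dir):
    """Return sorted (google_path, yandex_path) tuples for indices that have both files.

    Hypothetical helper: illustrates the `{idx}_google.png` / `{idx}_yandex.png`
    convention only; the real dataset class may behave differently.
    """
    root = Path(root_dir)
    pairs = []
    for google_path in sorted(root.glob("*_google.png")):
        idx = google_path.name[: -len("_google.png")]  # e.g. "0000"
        yandex_path = root / f"{idx}_yandex.png"
        if yandex_path.exists():  # assumption: silently skip incomplete pairs
            pairs.append((google_path, yandex_path))
    return pairs
```

Calling `find_image_pairs("dataset")` on the layout shown above would return one tuple per index, which is a quick way to spot missing files before training.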
## Quick Start
### 1. Explore the dataset
```python
from models.homography import HomographyDataset
dataset = HomographyDataset(
    root_dir=r"C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images",
    augment=True,
    image_size=(256, 256)
)
print(f"Found {len(dataset)} image pairs")
sample = dataset[0]
print(f"Sample homography matrix:\n{sample['homography'].numpy()}")
```
### 2. Train a model
```bash
python models/train_homography.py \
    --data_dir "C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images" \
    --epochs 50 \
    --batch_size 16 \
    --lr 1e-3 \
    --output_dir "runs/my_experiment"
```
### 3. Perform inference
```bash
python models/infer_homography.py \
    --model_path "runs/my_experiment/checkpoint_best.pth" \
    --mode single \
    --google_path "path/to/google.png" \
    --yandex_path "path/to/yandex.png" \
    --output_vis "alignment_result.png"
```
## File Structure
```
models/
├── homography.py # Dataset class and data loaders
├── homography_cnn.py # CNN model architecture
├── train_homography.py # Training script
├── infer_homography.py # Inference script
├── example_homography.py # Example usage
├── homography.ipynb # Jupyter notebook (empty)
└── README_homography.md # This file
```
## Model Architecture
The homography estimation model (`HomographyCNN`) consists of:
1. **Dual encoders** - Separate feature extraction for Google and Yandex images
2. **Residual blocks** - For deep feature learning
3. **Fusion layers** - Combine features from both images
4. **Regression head** - Predict 3x3 homography matrix
### Key Features:
- Residual connections for stable training
- Optional batch normalization
- Dropout for regularization
- Geometric consistency loss
## Training Configuration
Default training parameters:
- **Optimizer**: Adam with learning rate 1e-3
- **Loss function**: Combined matrix + geometric + regularization loss
- **Batch size**: 32
- **Image size**: 256x256
- **Train/val split**: 80/20
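The 80/20 split can be illustrated with a seeded shuffle of sample indices. This is a sketch under assumptions: the function name `split_indices` and the seed are hypothetical, and the training script may split the data differently (for example with a PyTorch utility).

```python
import random


def split_indices(num_samples, val_fraction=0.2, seed=42):
    """Shuffle sample indices reproducibly and split them into train and validation lists."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # local RNG, so global random state is untouched
    num_val = int(round(num_samples * val_fraction))
    return indices[num_val:], indices[:num_val]


train_idx, val_idx = split_indices(100)
print(len(train_idx), len(val_idx))  # 80 20
```

Fixing the seed keeps the validation set stable across runs, which makes metrics from different experiments comparable.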
## Inference Modes
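The inference script supports three modes. The commands below are abbreviated: they assume the current directory is `models/` and omit the `--model_path` argument that Quick Start step 3 passes explicitly.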
### 1. Single image pair
```bash
python infer_homography.py --mode single \
    --google_path google.png --yandex_path yandex.png
```
### 2. Dataset evaluation
```bash
python infer_homography.py --mode dataset \
    --dataset_dir path/to/dataset --num_samples 100
```
### 3. Batch processing
```bash
python infer_homography.py --mode batch \
    --input_dir path/to/input --output_dir path/to/output
```
## Evaluation Metrics
The system computes several metrics:
- **Matrix MSE**: Mean squared error of homography matrix elements
- **Corner error**: Average pixel error at image corners
- **Geometric consistency**: Warping error across grid points
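The first two metrics are straightforward to express in NumPy. The sketch below is illustrative only and may not match the project's implementation: the function names are hypothetical, and placing the corners at `(0, 0)` through `(width - 1, height - 1)` is an assumption.

```python
import numpy as np


def warp_points(H, points):
    """Apply a 3x3 homography to an (N, 2) array of (x, y) points."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])  # homogeneous coordinates
    warped = pts_h @ H.T
    return warped[:, :2] / warped[:, 2:3]  # perspective divide


def matrix_mse(H_pred, H_true):
    """Mean squared error over the nine matrix elements."""
    return float(np.mean((H_pred - H_true) ** 2))


def corner_error(H_pred, H_true, width=256, height=256):
    """Mean Euclidean distance between the four image corners warped by each matrix."""
    corners = np.array(
        [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]],
        dtype=float,
    )
    diff = warp_points(H_pred, corners) - warp_points(H_true, corners)
    return float(np.mean(np.linalg.norm(diff, axis=1)))


H_true = np.eye(3)
H_pred = np.array([[1.0, 0.0, 3.0], [0.0, 1.0, 4.0], [0.0, 0.0, 1.0]])  # shift by (3, 4)
print(matrix_mse(H_pred, H_true))    # (3**2 + 4**2) / 9
print(corner_error(H_pred, H_true))  # every corner moves by 5 pixels
```

Corner error is usually easier to interpret than matrix MSE because it is measured in pixels, whereas the matrix elements mix quantities of very different scale.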
## Data Augmentation
The dataset applies homography-based augmentation:
- Random rotation (-30° to 30°)
- Random scaling (0.8x to 1.2x)
- Random translation (-50 to 50 pixels)
- Small perspective distortion
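A random homography with these ranges can be composed from elementary 3x3 matrices. The following is a sketch rather than the dataset's actual code: the function name, the rotation about the image center, the composition order, and the `max_perspective` magnitude are all assumptions.

```python
import numpy as np


def random_homography(rng, size=256, max_angle_deg=30.0, scale_range=(0.8, 1.2),
                      max_shift=50.0, max_perspective=1e-4):
    """Sample a 3x3 homography combining rotation, scale, translation, and perspective."""
    angle = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))
    scale = rng.uniform(*scale_range)
    tx, ty = rng.uniform(-max_shift, max_shift, size=2)
    px, py = rng.uniform(-max_perspective, max_perspective, size=2)

    c = size / 2.0  # rotate and scale about the image center (assumption)
    to_origin = np.array([[1, 0, -c], [0, 1, -c], [0, 0, 1]], dtype=float)
    from_origin = np.array([[1, 0, c], [0, 1, c], [0, 0, 1]], dtype=float)
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    rot_scale = np.array([[scale * cos_a, -scale * sin_a, 0],
                          [scale * sin_a, scale * cos_a, 0],
                          [0, 0, 1]])
    perspective = np.array([[1, 0, 0], [0, 1, 0], [px, py, 1]])
    shift = np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]])

    # Matrices apply right to left: center, rotate/scale, distort, un-center, translate.
    H = shift @ from_origin @ perspective @ rot_scale @ to_origin
    return H / H[2, 2]  # normalize so the bottom-right element is 1


rng = np.random.default_rng(seed=0)
print(np.round(random_homography(rng), 4))
```

The sampled matrix serves as the ground-truth label for the augmented pair; warping the image itself would additionally require a resampling routine such as OpenCV's `cv2.warpPerspective`.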
## Integration with Autopilot System
The homography estimation can be integrated into the autopilot system:
```python
from models.infer_homography import HomographyInference
# Initialize inference
inference = HomographyInference(model_path="path/to/model.pth")
# During flight loop
homography = inference.predict(google_img, yandex_img)
# Use homography to update drone position
```
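How the predicted matrix feeds into position estimation is not specified here. As one possible illustration, the sketch below maps the image center through a homography and reports the resulting pixel displacement. The function names are hypothetical, and both the direction of the mapping (which image is the source) and any conversion from pixels to ground units are assumptions left to the caller.

```python
import numpy as np


def map_point(H, x, y):
    """Map a single pixel (x, y) through a 3x3 homography."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w  # perspective divide


def center_offset(H, width=256, height=256):
    """Pixel displacement of the image center under H."""
    cx, cy = width / 2.0, height / 2.0
    mx, my = map_point(H, cx, cy)
    return float(mx - cx), float(my - cy)


# Example with a pure translation, standing in for a predicted matrix
H = np.array([[1.0, 0.0, 12.0], [0.0, 1.0, -7.0], [0.0, 0.0, 1.0]])
print(center_offset(H))  # (12.0, -7.0)
```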
## Troubleshooting
### Common Issues
1. **Out of memory**
   - Reduce batch size
   - Use smaller image size
   - Enable gradient checkpointing
2. **Poor convergence**
   - Adjust learning rate
   - Increase model capacity
   - Add more data augmentation
3. **Inference errors**
   - Check image formats (must be RGB)
   - Verify model was trained with same image size
   - Ensure proper normalization
### Debugging Tips
- Use `example_homography.py` to test components
- Enable TensorBoard for training visualization
- Check homography matrix normalization: the bottom-right element `H[2, 2]` should be close to 1
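A homography is defined only up to a scale factor, so two matrices that differ by a constant multiple describe the same mapping. Dividing by the bottom-right element makes them directly comparable. A minimal sketch, with a hypothetical helper name and an assumed tolerance:

```python
import numpy as np


def normalize_homography(H, eps=1e-8):
    """Rescale H so that H[2, 2] == 1; raise if that element is numerically zero."""
    H = np.asarray(H, dtype=float)
    if abs(H[2, 2]) < eps:  # eps is an assumed tolerance
        raise ValueError("H[2, 2] is close to zero; cannot normalize")
    return H / H[2, 2]


print(normalize_homography(2.0 * np.eye(3)))  # identity matrix
```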
## Advanced Usage
### Custom Model Architecture
```python
from models.homography_cnn import create_homography_model
model = create_homography_model(
    model_type="cnn",
    input_size=(512, 512),
    hidden_channels=128,
    num_blocks=6,
    dropout_rate=0.5
)
```
### Custom Loss Function
```python
from models.homography_cnn import HomographyLoss
loss_fn = HomographyLoss(
    matrix_weight=0.7,
    geometric_weight=0.3,
    reg_weight=0.05,
    grid_size=16
)
```
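To make the role of each weight concrete, here is a NumPy sketch of how such a combined loss could be assembled. It is not the `HomographyLoss` implementation: the function names are hypothetical, the geometric term is assumed to be the mean distance between grid points warped by the two matrices, and the regularization term is assumed to penalize deviation of `H[2, 2]` from 1.

```python
import numpy as np


def warp_points(H, points):
    """Apply a 3x3 homography to an (N, 2) array of (x, y) points."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    warped = pts_h @ H.T
    return warped[:, :2] / warped[:, 2:3]


def combined_loss(H_pred, H_true, matrix_weight=0.7, geometric_weight=0.3,
                  reg_weight=0.05, grid_size=16, image_size=256):
    """Weighted sum of matrix, geometric, and regularization terms (illustrative)."""
    # Matrix term: element-wise mean squared error
    matrix_term = np.mean((H_pred - H_true) ** 2)

    # Geometric term: mean displacement over a grid_size x grid_size point grid
    axis = np.linspace(0.0, image_size - 1.0, grid_size)
    xs, ys = np.meshgrid(axis, axis)
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1)
    diff = warp_points(H_pred, grid) - warp_points(H_true, grid)
    geometric_term = np.mean(np.linalg.norm(diff, axis=1))

    # Regularization term (assumed form): keep the prediction normalized
    reg_term = (H_pred[2, 2] - 1.0) ** 2

    return float(matrix_weight * matrix_term
                 + geometric_weight * geometric_term
                 + reg_weight * reg_term)


H_true = np.eye(3)
H_pred = np.array([[1.0, 0.0, 3.0], [0.0, 1.0, 4.0], [0.0, 0.0, 1.0]])
print(combined_loss(H_pred, H_true))
```

Raising `geometric_weight` relative to `matrix_weight` shifts the objective toward pixel-space accuracy, while a larger `grid_size` samples the warp more densely at higher cost.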
### Transfer Learning
```python
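import torch

# `model` is an existing model instance, e.g. from create_homography_model(...)
# Load pretrained weights
checkpoint = torch.load("pretrained.pth", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])
# Fine-tune on new data with the Google encoder frozen
for param in model.google_encoder.parameters():
    param.requires_grad = False  # Freeze encoder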
```
## Performance Tips
1. **Training speed**
   - Use multiple `DataLoader` worker processes for data loading
   - Enable mixed precision training
   - Use gradient accumulation for larger effective batch sizes
2. **Memory efficiency**
   - Use gradient checkpointing
   - Implement progressive resizing
   - Use memory-efficient optimizers
3. **Inference speed**
   - Use TensorRT or ONNX for deployment
   - Implement model quantization
   - Use batch inference when possible
## Future Improvements
1. **Model enhancements**
   - Transformer-based architecture
   - Multi-scale feature fusion
   - Uncertainty estimation
2. **Training improvements**
   - Self-supervised pre-training
   - Curriculum learning
   - Adversarial training
3. **Deployment features**
   - Real-time inference optimization
   - Mobile deployment
   - Web API
## Citation
If you use this system in your research, please cite:
```bibtex
@software{homography_estimation_2024,
title = {Homography Estimation for Map Alignment},
author = {Autopilot Team},
year = {2024},
url = {https://github.com/your-repo/homography-estimation}
}
```
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Support
For questions and issues:
1. Check the troubleshooting section
2. Review the example scripts
3. Open an issue on GitHub
4. Contact the development team
---
*Last updated: October 2024*