
# Homography Estimation System

This system provides a complete pipeline for estimating homography matrices between Google and Yandex map images using deep learning.

## Overview

A homography is a 3x3 projective transform that maps points on one image plane to another, so estimating it lets the system align images from different sources (here, Google Maps and Yandex Maps). The system includes:

1. **Dataset handling** - Loading and preprocessing image pairs
2. **Data augmentation** - Homography-based augmentation for robust training
3. **CNN model** - Deep learning model for homography estimation
4. **Training pipeline** - Complete training and evaluation workflow
5. **Inference tools** - Tools for using trained models on new data

## Installation

### Prerequisites
- Python 3.8+
- PyTorch 1.9+
- OpenCV
- PIL/Pillow
- NumPy

### Install dependencies
```bash
pip install torch torchvision opencv-python pillow numpy matplotlib tqdm tensorboard
```

## Dataset Structure

The system expects image pairs in the following format:
```
dataset/
├── 0000_google.png
├── 0000_yandex.png
├── 0001_google.png
├── 0001_yandex.png
└── ...
```

Each pair consists of:
- `{idx:04d}_google.png` - Google map image
- `{idx:04d}_yandex.png` - Yandex map image
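
The naming convention above can be checked before training. The following is a minimal sketch of how complete pairs might be collected from a directory; `find_image_pairs` and `PAIR_PATTERN` are hypothetical names used for illustration and are not part of `HomographyDataset`, whose actual loading logic may differ.

```python
import re
from pathlib import Path

# Matches names such as 0000_google.png or 0123_yandex.png
PAIR_PATTERN = re.compile(r"^(\d{4})_(google|yandex)\.png$")

def find_image_pairs(root_dir):
    """Return sorted (index, google_path, yandex_path) tuples for complete pairs."""
    found = {}
    for path in Path(root_dir).iterdir():
        match = PAIR_PATTERN.match(path.name)
        if match:
            idx, source = int(match.group(1)), match.group(2)
            found.setdefault(idx, {})[source] = path
    # Keep only indices for which both images are present
    return [
        (idx, files["google"], files["yandex"])
        for idx, files in sorted(found.items())
        if "google" in files and "yandex" in files
    ]
```

Indices with only one of the two images are skipped, which makes incomplete downloads easy to spot by comparing the result length against the number of files.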

## Quick Start

### 1. Explore the dataset
```python
from models.homography import HomographyDataset

dataset = HomographyDataset(
    root_dir=r"C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images",
    augment=True,
    image_size=(256, 256)
)

print(f"Found {len(dataset)} image pairs")
sample = dataset[0]
print(f"Sample homography matrix:\n{sample['homography'].numpy()}")
```

### 2. Train a model
```bash
python models/train_homography.py \
  --data_dir "C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images" \
  --epochs 50 \
  --batch_size 16 \
  --lr 1e-3 \
  --output_dir "runs/my_experiment"
```

### 3. Perform inference
```bash
python models/infer_homography.py \
  --model_path "runs/my_experiment/checkpoint_best.pth" \
  --mode single \
  --google_path "path/to/google.png" \
  --yandex_path "path/to/yandex.png" \
  --output_vis "alignment_result.png"
```

## File Structure

```
models/
├── homography.py              # Dataset class and data loaders
├── homography_cnn.py          # CNN model architecture
├── train_homography.py        # Training script
├── infer_homography.py        # Inference script
├── example_homography.py      # Example usage
├── homography.ipynb           # Jupyter notebook (empty)
└── README_homography.md       # This file
```

## Model Architecture

The homography estimation model (`HomographyCNN`) consists of:

1. **Dual encoders** - Separate feature extraction for Google and Yandex images
2. **Residual blocks** - For deep feature learning
3. **Fusion layers** - Combine features from both images
4. **Regression head** - Predict 3x3 homography matrix

### Key Features
- Residual connections for stable training
- Optional batch normalization
- Dropout for regularization
- Geometric consistency loss

## Training Configuration

Default training parameters:
- **Optimizer**: Adam with learning rate 1e-3
- **Loss function**: Combined matrix + geometric + regularization loss
- **Batch size**: 32
- **Image size**: 256x256
- **Train/val split**: 80/20
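
For reference, an 80/20 split over pair indices could look like the sketch below. The function name `split_indices`, the fixed seed, and the shuffle-then-slice approach are assumptions for illustration; the training script may split the data differently.

```python
import random

def split_indices(num_pairs, val_fraction=0.2, seed=42):
    """Shuffle pair indices and split them into train and validation lists."""
    indices = list(range(num_pairs))
    # A seeded generator keeps the split reproducible between runs
    random.Random(seed).shuffle(indices)
    num_val = int(round(num_pairs * val_fraction))
    return indices[num_val:], indices[:num_val]

train_idx, val_idx = split_indices(100)
print(len(train_idx), len(val_idx))  # 80 20
```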

## Inference Modes

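The inference script supports three modes. The commands below assume they are run from inside `models/`; from the repository root, use `models/infer_homography.py` as in Quick Start.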

### 1. Single image pair
```bash
python infer_homography.py --mode single \
  --google_path google.png --yandex_path yandex.png
```

### 2. Dataset evaluation
```bash
python infer_homography.py --mode dataset \
  --dataset_dir path/to/dataset --num_samples 100
```

### 3. Batch processing
```bash
python infer_homography.py --mode batch \
  --input_dir path/to/input --output_dir path/to/output
```

## Evaluation Metrics

The system computes several metrics:
- **Matrix MSE**: Mean squared error of homography matrix elements
- **Corner error**: Average pixel error at image corners
- **Geometric consistency**: Warping error across grid points
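
The three metrics can be sketched with NumPy as follows. This is an illustrative version under stated assumptions, not the project's evaluation code: the function names, the 256x256 default size, and the 8x8 grid are chosen here for the example.

```python
import numpy as np

def warp_points(H, points):
    """Apply a 3x3 homography to an (N, 2) array of (x, y) points."""
    points = np.asarray(points, dtype=np.float64)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    warped = homogeneous @ np.asarray(H, dtype=np.float64).T
    # Divide by the third coordinate to return to pixel coordinates
    return warped[:, :2] / warped[:, 2:3]

def matrix_mse(H_pred, H_true):
    """Mean squared error over the nine matrix elements."""
    return float(np.mean((np.asarray(H_pred) - np.asarray(H_true)) ** 2))

def corner_error(H_pred, H_true, width=256, height=256):
    """Mean Euclidean distance between the four warped image corners."""
    corners = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    diff = warp_points(H_pred, corners) - warp_points(H_true, corners)
    return float(np.mean(np.linalg.norm(diff, axis=1)))

def grid_error(H_pred, H_true, width=256, height=256, grid_size=8):
    """Mean warping error over a regular grid_size x grid_size grid of points."""
    xs, ys = np.meshgrid(
        np.linspace(0, width - 1, grid_size),
        np.linspace(0, height - 1, grid_size),
    )
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1)
    diff = warp_points(H_pred, grid) - warp_points(H_true, grid)
    return float(np.mean(np.linalg.norm(diff, axis=1)))

# Example: the prediction is off by a pure translation of (3, 4) pixels
H_true = np.eye(3)
H_pred = np.array([[1.0, 0.0, 3.0], [0.0, 1.0, 4.0], [0.0, 0.0, 1.0]])
print(matrix_mse(H_pred, H_true))
print(corner_error(H_pred, H_true))
print(grid_error(H_pred, H_true))
```

Corner and grid errors are measured in pixels, so they are usually easier to interpret than the matrix MSE, whose elements have very different scales.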

## Data Augmentation

The dataset applies homography-based augmentation:
- Random rotation (-30° to 30°)
- Random scaling (0.8x to 1.2x)
- Random translation (-50 to 50 pixels)
- Small perspective distortion
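
One way to build such an augmentation is to sample each component and compose them into a single matrix. The sketch below uses the ranges listed above; the function name `random_homography`, the `max_perspective` value, the composition order, and rotating about the image center are assumptions, and the dataset class may do this differently.

```python
import numpy as np

def random_homography(width=256, height=256, max_perspective=1e-4, rng=None):
    """Sample a random 3x3 homography from rotation, scale, shift, perspective."""
    rng = np.random.default_rng() if rng is None else rng
    angle = np.deg2rad(rng.uniform(-30.0, 30.0))    # rotation in radians
    scale = rng.uniform(0.8, 1.2)                   # isotropic scale
    tx, ty = rng.uniform(-50.0, 50.0, size=2)       # translation in pixels
    px, py = rng.uniform(-max_perspective, max_perspective, size=2)

    # Rotate and scale about the image center rather than the top-left corner
    cx, cy = width / 2.0, height / 2.0
    to_origin = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=np.float64)
    from_origin = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]], dtype=np.float64)
    cos_a, sin_a = scale * np.cos(angle), scale * np.sin(angle)
    rot_scale = np.array([[cos_a, -sin_a, 0], [sin_a, cos_a, 0], [0, 0, 1]])
    perspective = np.array([[1, 0, 0], [0, 1, 0], [px, py, 1]])
    translate = np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]])

    H = translate @ from_origin @ perspective @ rot_scale @ to_origin
    # Normalize so that the bottom-right element is exactly 1
    return H / H[2, 2]

H = random_homography(rng=np.random.default_rng(0))
print(H)
```

Because the sampled matrix is known, it can serve as the ground-truth label for the warped image pair.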

## Integration with Autopilot System

The homography estimation can be integrated into the autopilot system:

```python
from models.infer_homography import HomographyInference

# Initialize inference
inference = HomographyInference(model_path="path/to/model.pth")

# During flight loop
homography = inference.predict(google_img, yandex_img)
# Use homography to update drone position
```
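
As one possible way to use the result, the sketch below computes how far the image center moves under the predicted homography, which gives a pixel offset between the two views. It assumes the matrix maps Google pixel coordinates to Yandex pixel coordinates and that images are 256x256; `center_offset` is a hypothetical helper, and converting the pixel offset to a position update depends on the map scale, which is not covered here.

```python
import numpy as np

def center_offset(H, width=256, height=256):
    """Pixel displacement of the image center under homography H."""
    center = np.array([width / 2.0, height / 2.0, 1.0])
    mapped = np.asarray(H, dtype=np.float64) @ center
    mapped = mapped[:2] / mapped[2]  # back to pixel coordinates
    return mapped - center[:2]

# Example with a pure translation of (12, -7) pixels
H = np.array([[1.0, 0.0, 12.0], [0.0, 1.0, -7.0], [0.0, 0.0, 1.0]])
dx, dy = center_offset(H)
print(dx, dy)  # 12.0 -7.0
```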

## Troubleshooting

### Common Issues

1. **Out of memory**
   - Reduce batch size
   - Use smaller image size
   - Enable gradient checkpointing

2. **Poor convergence**
   - Adjust learning rate
   - Increase model capacity
   - Add more data augmentation

3. **Inference errors**
   - Check image formats (must be RGB)
   - Verify model was trained with same image size
   - Ensure proper normalization

### Debugging Tips

- Use `example_homography.py` to test components
- Enable TensorBoard for training visualization
- Check homography matrix normalization (the bottom-right element, `H[2, 2]`, should be ~1)
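
The normalization check can be done by dividing the matrix by its bottom-right element, since a homography is only defined up to scale. A minimal sketch, where `normalize_homography` and the `eps` threshold are illustrative choices rather than part of the project code:

```python
import numpy as np

def normalize_homography(H, eps=1e-8):
    """Scale a 3x3 homography so that its bottom-right element equals 1."""
    H = np.asarray(H, dtype=np.float64).reshape(3, 3)
    if abs(H[2, 2]) < eps:
        raise ValueError("H[2, 2] is close to zero; cannot normalize")
    return H / H[2, 2]

H_raw = np.array([[2.0, 0.0, 10.0], [0.0, 2.0, -4.0], [0.0, 0.0, 2.0]])
print(normalize_homography(H_raw))
```

Comparing a predicted matrix before and after this step is a quick way to see whether the model output drifts away from the expected scale.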

## Advanced Usage

### Custom Model Architecture
```python
from models.homography_cnn import create_homography_model

model = create_homography_model(
    model_type="cnn",
    input_size=(512, 512),
    hidden_channels=128,
    num_blocks=6,
    dropout_rate=0.5
)
```

### Custom Loss Function
```python
from models.homography_cnn import HomographyLoss

loss_fn = HomographyLoss(
    matrix_weight=0.7,
    geometric_weight=0.3,
    reg_weight=0.05,
    grid_size=16
)
```

### Transfer Learning
```python
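import torch

# `model` is an existing model instance, e.g. from create_homography_model(...)
# Load pretrained weights (map_location="cpu" avoids requiring a GPU to load)
checkpoint = torch.load("pretrained.pth", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])

# Fine-tune on new data with the Google encoder frozen
for param in model.google_encoder.parameters():
    param.requires_grad = False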
```

## Performance Tips

1. **Training speed**
   - Use multiple `DataLoader` worker processes (`num_workers`) for data loading
   - Enable mixed precision training
   - Use gradient accumulation for larger effective batch sizes

2. **Memory efficiency**
   - Use gradient checkpointing
   - Implement progressive resizing
   - Use memory-efficient optimizers

3. **Inference speed**
   - Use TensorRT or ONNX for deployment
   - Implement model quantization
   - Use batch inference when possible

## Future Improvements

1. **Model enhancements**
   - Transformer-based architecture
   - Multi-scale feature fusion
   - Uncertainty estimation

2. **Training improvements**
   - Self-supervised pre-training
   - Curriculum learning
   - Adversarial training

3. **Deployment features**
   - Real-time inference optimization
   - Mobile deployment
   - Web API

## Citation

If you use this system in your research, please cite:

```bibtex
@software{homography_estimation_2024,
  title = {Homography Estimation for Map Alignment},
  author = {Autopilot Team},
  year = {2024},
  url = {https://github.com/your-repo/homography-estimation}
}
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Support

For questions and issues:
1. Check the troubleshooting section
2. Review the example scripts
3. Open an issue on GitHub
4. Contact the development team

---

*Last updated: October 2024*