
russian_proger 6d4208d100 feat: add SiaN model

2026-02-16 19:07:31 +03:00


Homography Estimation System

This system provides a complete pipeline for estimating homography matrices between Google and Yandex map images using deep learning.

Overview

Homography estimation is crucial for aligning images from different sources (Google Maps and Yandex Maps in this case). The system includes:

Dataset handling - Loading and preprocessing image pairs
Data augmentation - Homography-based augmentation for robust training
CNN model - Deep learning model for homography estimation
Training pipeline - Complete training and evaluation workflow
Inference tools - Tools for using trained models on new data

Installation

Prerequisites

Python 3.8+
PyTorch 1.9+
OpenCV
PIL/Pillow
NumPy

Install dependencies

pip install torch torchvision opencv-python pillow numpy matplotlib tqdm tensorboard

Dataset Structure

The system expects image pairs in the following format:

dataset/
├── 0000_google.png
├── 0000_yandex.png
├── 0001_google.png
├── 0001_yandex.png
└── ...

Each pair consists of:

{idx:04d}_google.png - Google map image
{idx:04d}_yandex.png - Yandex map image
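The naming convention above can be checked before training with a small helper. This is a minimal sketch, not the actual logic of HomographyDataset; the function name `find_image_pairs` and the choice to skip incomplete pairs are assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches names such as "0000_google.png" or "0000_yandex.png".
PAIR_NAME = re.compile(r"^(\d{4,})_(google|yandex)\.png$")


def find_image_pairs(root_dir):
    """Return sorted (index, google_path, yandex_path) tuples for complete pairs."""
    found = {}
    for path in Path(root_dir).iterdir():
        match = PAIR_NAME.match(path.name)
        if match:
            idx, source = int(match.group(1)), match.group(2)
            found.setdefault(idx, {})[source] = path
    # Keep only indices where both images are present (hypothetical policy).
    return [
        (idx, files["google"], files["yandex"])
        for idx, files in sorted(found.items())
        if "google" in files and "yandex" in files
    ]
```

Calling `find_image_pairs("dataset")` and comparing the result length with the number of files is a quick way to spot images that lack a partner.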

Quick Start

1. Explore the dataset

from models.homography import HomographyDataset

dataset = HomographyDataset(
    root_dir=r"C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images",
    augment=True,
    image_size=(256, 256)
)

print(f"Found {len(dataset)} image pairs")
sample = dataset[0]
print(f"Sample homography matrix:\n{sample['homography'].numpy()}")

2. Train a model

python models/train_homography.py \
  --data_dir "C:\Users\admin\Projects\autopilot\datasets\ya_go_maps\images" \
  --epochs 50 \
  --batch_size 16 \
  --lr 1e-3 \
  --output_dir "runs/my_experiment"

3. Perform inference

python models/infer_homography.py \
  --model_path "runs/my_experiment/checkpoint_best.pth" \
  --mode single \
  --google_path "path/to/google.png" \
  --yandex_path "path/to/yandex.png" \
  --output_vis "alignment_result.png"

File Structure

models/
├── homography.py              # Dataset class and data loaders
├── homography_cnn.py          # CNN model architecture
├── train_homography.py        # Training script
├── infer_homography.py        # Inference script
├── example_homography.py      # Example usage
├── homography.ipynb           # Jupyter notebook (empty)
└── README_homography.md       # This file

Model Architecture

The homography estimation model (HomographyCNN) consists of:

Dual encoders - Separate feature extraction for Google and Yandex images
Residual blocks - For deep feature learning
Fusion layers - Combine features from both images
Regression head - Predict 3x3 homography matrix

Key Features:

Residual connections for stable training
Batch normalization options
Dropout for regularization
Geometric consistency loss

Training Configuration

Default training parameters:

Optimizer: Adam with learning rate 1e-3
Loss function: Combined matrix + geometric + regularization loss
Batch size: 32
Image size: 256x256
Train/val split: 80/20
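An 80/20 split can be made reproducible by shuffling pair indices with a fixed seed. The sketch below only illustrates the idea; `split_indices`, its `seed` argument, and the rounding rule are assumptions, and the training script may split the data differently.

```python
import random


def split_indices(num_pairs, val_fraction=0.2, seed=0):
    """Shuffle pair indices reproducibly and split them into (train, val) lists."""
    indices = list(range(num_pairs))
    random.Random(seed).shuffle(indices)  # private RNG, global state untouched
    num_val = int(round(num_pairs * val_fraction))
    return indices[num_val:], indices[:num_val]
```

Both lists index into the dataset, so they can be passed to `torch.utils.data.Subset` to build the training and validation loaders.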

Inference Modes

The inference script supports three modes:

1. Single image pair

python models/infer_homography.py --mode single \
  --google_path google.png --yandex_path yandex.png

2. Dataset evaluation

python models/infer_homography.py --mode dataset \
  --dataset_dir path/to/dataset --num_samples 100

3. Batch processing

python models/infer_homography.py --mode batch \
  --input_dir path/to/input --output_dir path/to/output

Evaluation Metrics

The system computes several metrics:

Matrix MSE: Mean squared error of homography matrix elements
Corner error: Average pixel error at image corners
Geometric consistency: Warping error across grid points
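The first two metrics can be written down in a few lines of NumPy. This is a sketch under stated assumptions rather than the project's evaluation code: the function names are hypothetical, matrices are rescaled so that H[2, 2] equals 1 before comparison, and the 256x256 default mirrors the image size listed under Training Configuration.

```python
import numpy as np


def warp_points(H, points):
    """Apply a 3x3 homography to an (N, 2) array of (x, y) points."""
    points = np.asarray(points, dtype=np.float64)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    warped = homogeneous @ np.asarray(H, dtype=np.float64).T
    return warped[:, :2] / warped[:, 2:3]  # divide by w to return to pixels


def matrix_mse(H_pred, H_true):
    """Mean squared error over the nine entries, after scaling so H[2, 2] == 1."""
    H_pred = np.asarray(H_pred, dtype=np.float64)
    H_true = np.asarray(H_true, dtype=np.float64)
    return float(np.mean((H_pred / H_pred[2, 2] - H_true / H_true[2, 2]) ** 2))


def corner_error(H_pred, H_true, width=256, height=256):
    """Mean Euclidean distance (pixels) between the four warped image corners."""
    corners = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    diff = warp_points(H_pred, corners) - warp_points(H_true, corners)
    return float(np.mean(np.linalg.norm(diff, axis=1)))
```

A geometric-consistency style error can reuse `warp_points` by replacing the four corners with a regular grid of points. Corner error is usually easier to interpret than matrix MSE because it is expressed in pixels.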

Data Augmentation

The dataset applies homography-based augmentation:

Random rotation (-30° to 30°)
Random scaling (0.8x to 1.2x)
Random translation (-50 to 50 pixels)
Small perspective distortion
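One way to sample such a transform is to compose a rotation and scaling about the image centre with a translation and a weak perspective term. The sketch below is illustrative only: `random_homography` is a hypothetical name, the composition order is a design choice, and `max_perspective=1e-4` is an assumed value because the source does not quantify "small".

```python
import numpy as np


def random_homography(rng, size=256, max_perspective=1e-4):
    """Sample a 3x3 homography from the ranges listed above."""
    angle = np.deg2rad(rng.uniform(-30.0, 30.0))
    scale = rng.uniform(0.8, 1.2)
    tx, ty = rng.uniform(-50.0, 50.0, size=2)
    px, py = rng.uniform(-max_perspective, max_perspective, size=2)

    c = (size - 1) / 2.0  # rotate and scale about the image centre
    to_origin = np.array([[1, 0, -c], [0, 1, -c], [0, 0, 1]], dtype=np.float64)
    from_origin = np.array([[1, 0, c], [0, 1, c], [0, 0, 1]], dtype=np.float64)
    cos_a, sin_a = scale * np.cos(angle), scale * np.sin(angle)
    similarity = np.array([[cos_a, -sin_a, tx], [sin_a, cos_a, ty], [0, 0, 1]])
    perspective = np.array([[1, 0, 0], [0, 1, 0], [px, py, 1]], dtype=np.float64)

    H = from_origin @ perspective @ similarity @ to_origin
    return H / H[2, 2]  # normalise so the last element is 1
```

With `rng = np.random.default_rng(0)`, the returned matrix can be passed to OpenCV as `cv2.warpPerspective(image, H, (size, size))` to produce the augmented image, and the same matrix serves as the ground-truth label for that sample.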

Integration with Autopilot System

The homography estimation can be integrated into the autopilot system:

from models.infer_homography import HomographyInference

# Initialize inference
inference = HomographyInference(model_path="path/to/model.pth")

# During flight loop
homography = inference.predict(google_img, yandex_img)
# Use homography to update drone position

Troubleshooting

Common Issues

Out of memory
- Reduce batch size
- Use smaller image size
- Enable gradient checkpointing
Poor convergence
- Adjust learning rate
- Increase model capacity
- Add more data augmentation
Inference errors
- Check image formats (must be RGB)
- Verify model was trained with same image size
- Ensure proper normalization

Debugging Tips

Use example_homography.py to test components
Enable TensorBoard for training visualization
Check homography matrix normalization (the last element, H[2, 2], should be ~1)

Advanced Usage

Custom Model Architecture

from models.homography_cnn import create_homography_model

model = create_homography_model(
    model_type="cnn",
    input_size=(512, 512),
    hidden_channels=128,
    num_blocks=6,
    dropout_rate=0.5
)

Custom Loss Function

from models.homography_cnn import HomographyLoss

loss_fn = HomographyLoss(
    matrix_weight=0.7,
    geometric_weight=0.3,
    reg_weight=0.05,
    grid_size=16
)

Transfer Learning

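import torch

# `model` is assumed to be a HomographyCNN created as in
# "Custom Model Architecture" above.

# Load pretrained model (map to CPU so this also works without a GPU)
checkpoint = torch.load("pretrained.pth", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])

# Fine-tune on new data
for param in model.google_encoder.parameters():
    param.requires_grad = False  # Freeze the Google encoder only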

Performance Tips

Training speed
- Use multiple DataLoader worker processes (num_workers > 0) for data loading
- Enable mixed precision training
- Use gradient accumulation for larger effective batch sizes
Memory efficiency
- Use gradient checkpointing
- Implement progressive resizing
- Use memory-efficient optimizers
Inference speed
- Use TensorRT or ONNX for deployment
- Implement model quantization
- Use batch inference when possible

Future Improvements

Model enhancements
- Transformer-based architecture
- Multi-scale feature fusion
- Uncertainty estimation
Training improvements
- Self-supervised pre-training
- Curriculum learning
- Adversarial training
Deployment features
- Real-time inference optimization
- Mobile deployment
- Web API

Citation

If you use this system in your research, please cite:

@software{homography_estimation_2024,
  title = {Homography Estimation for Map Alignment},
  author = {Autopilot Team},
  year = {2024},
  url = {https://github.com/your-repo/homography-estimation}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For questions and issues:

Check the troubleshooting section
Review the example scripts
Open an issue on GitHub
Contact the development team

Last updated: October 2024
