Michal Humpula 5fbd72b370 working base
2026-02-15 17:36:08 +01:00


# Route-Switcher
A Linux-based network failover system written in Rust that automatically switches routing between dual network interfaces based on connectivity monitoring.
## Overview
Route-Switcher pings specified IP addresses through each of two network interfaces and manages the routing table to keep the host connected. When the primary interface fails, it switches routing to the secondary interface, and it fails back automatically once the primary connection has been stable again.
## Architecture
### Core Components
1. **Async Pingers** (`src/pinger.rs`)
   - Dual-interface ICMP monitoring
   - Explicit interface binding (equivalent to `ping -I <interface>`)
   - Configurable ping targets and intervals
   - Async/await implementation with tokio
2. **Route Manager** (`src/routing.rs`)
   - Netlink-based route manipulation
   - No external dependencies on `ip` command
   - Route addition and deletion
   - Metric-based route prioritization
3. **State Machine** (`src/main.rs`)
   - Failover logic with anti-flapping protection
   - Three consecutive failures trigger failover
   - One minute of stable connectivity triggers failback
   - Prevents switching when both interfaces fail
4. **Configuration**
   - Interface definitions (primary/secondary)
   - Gateway configurations
   - Ping targets and timing
   - Route metrics
## Key Features
- **Dual Interface Monitoring**: Simultaneous ping testing via both network interfaces
- **Automatic Failover**: Switches to secondary interface after 3 consecutive ping failures
- **Smart Failback**: Returns to primary interface after 1 minute of stable connectivity
- **Anti-Flapping**: Prevents frequent switching between interfaces
- **Edge Case Handling**: Won't switch if secondary interface is also down
- **Netlink Integration**: Direct kernel communication for route management
- **Async Architecture**: Non-blocking monitoring and management
## Requirements
- Linux operating system
- Rust toolchain with 2024 edition support (Rust 1.85 or newer)
- Two network interfaces with internet connectivity
- Root privileges for route manipulation
- Netlink kernel support
## Configuration
### Network Setup Example
```
# Primary interface
eth0: 192.168.1.10/24, gateway 192.168.1.1
# Secondary interface
eth1: 192.168.2.10/24, gateway 192.168.2.1
```
### Application Configuration
The application is configured through command-line arguments:
```bash
sudo cargo run -- \
  --primary-interface eth0 \
  --secondary-interface eth1 \
  --primary-gateway 192.168.1.1 \
  --secondary-gateway 192.168.2.1 \
  --ping-target 8.8.8.8
```
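As a rough illustration of how these flags could map onto a configuration value, here is a minimal sketch that uses only the standard library. The `Config` struct, the `parse_args` and `parse_ip` functions, and the error messages are hypothetical names chosen for this example; the project's actual argument parsing is not shown in this README and may differ. In particular, this sketch requires every flag, whereas the application can evidently run with defaults.

```rust
use std::net::Ipv4Addr;

/// Hypothetical configuration holder; field names mirror the flags above.
#[derive(Debug, PartialEq)]
pub struct Config {
    pub primary_interface: String,
    pub secondary_interface: String,
    pub primary_gateway: Ipv4Addr,
    pub secondary_gateway: Ipv4Addr,
    pub ping_target: Ipv4Addr,
}

/// Parses an IPv4 address and names the offending flag on failure.
fn parse_ip(flag: &str, value: &str) -> Result<Ipv4Addr, String> {
    value
        .parse::<Ipv4Addr>()
        .map_err(|e| format!("invalid address for {}: {}", flag, e))
}

/// Parses `--flag value` pairs (program name already stripped).
/// Assumption: all five flags are required; defaults are not modelled.
pub fn parse_args<I>(args: I) -> Result<Config, String>
where
    I: IntoIterator<Item = String>,
{
    let mut primary_interface = None;
    let mut secondary_interface = None;
    let mut primary_gateway = None;
    let mut secondary_gateway = None;
    let mut ping_target = None;

    let mut iter = args.into_iter();
    while let Some(flag) = iter.next() {
        let value = iter
            .next()
            .ok_or_else(|| format!("missing value for {}", flag))?;
        match flag.as_str() {
            "--primary-interface" => primary_interface = Some(value),
            "--secondary-interface" => secondary_interface = Some(value),
            "--primary-gateway" => primary_gateway = Some(parse_ip(&flag, &value)?),
            "--secondary-gateway" => secondary_gateway = Some(parse_ip(&flag, &value)?),
            "--ping-target" => ping_target = Some(parse_ip(&flag, &value)?),
            other => return Err(format!("unknown flag {}", other)),
        }
    }

    Ok(Config {
        primary_interface: primary_interface.ok_or("missing --primary-interface")?,
        secondary_interface: secondary_interface.ok_or("missing --secondary-interface")?,
        primary_gateway: primary_gateway.ok_or("missing --primary-gateway")?,
        secondary_gateway: secondary_gateway.ok_or("missing --secondary-gateway")?,
        ping_target: ping_target.ok_or("missing --ping-target")?,
    })
}
```

In a binary, this could be called as `parse_args(std::env::args().skip(1))`. Parsing the gateways into `Ipv4Addr` up front means a mistyped address is rejected before any route is touched.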
## Usage
### Basic Usage
```bash
# Run with default settings
sudo cargo run
# Run with custom configuration
sudo cargo run -- \
  --primary-interface eth0 \
  --secondary-interface eth1 \
  --primary-gateway 192.168.1.1 \
  --secondary-gateway 192.168.2.1
```
### Development
```bash
# Build
cargo build
# Run tests
cargo test
# Run with debug logging
RUST_LOG=debug sudo cargo run
# Run with custom log level
RUST_LOG=info sudo cargo run
```
## Testing Environment
### Podman-Compose Setup
The project includes a complete testing environment using podman-compose:
```bash
# Start test environment
podman-compose up -d
# View logs
podman-compose logs -f route-switcher
# Stop test environment
podman-compose down
```
### End-to-End Testing
```bash
# Simulate primary interface failure
podman-compose exec primary ip link set eth0 down
# Observe failover in logs
podman-compose logs -f route-switcher
# Restore primary interface
podman-compose exec primary ip link set eth0 up
# Observe failback in logs after 1 minute of stable connectivity
podman-compose logs -f route-switcher
```
## Implementation Details
### State Machine
```
[Boot] -> [Primary] (after initial connectivity check)
[Primary] -> [Fallback] (after 3 consecutive failures)
[Fallback] -> [Primary] (after 60 seconds of stability)
```
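The transitions in the diagram can be written as a small pure function. This is a minimal sketch, not the code in `src/main.rs`: the `State` and `Event` enums and the `next_state` function are hypothetical names, and the events assume that counting failures and timing stability happen elsewhere.

```rust
/// States taken from the diagram above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Boot,
    Primary,
    Fallback,
}

/// Hypothetical events, one per arrow in the diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    /// The initial connectivity check succeeded.
    InitialCheckPassed,
    /// Three consecutive pings via the primary interface failed.
    PrimaryFailedThreeTimes,
    /// The primary interface has answered pings for 60 seconds.
    PrimaryStableFor60s,
}

/// Returns the state that follows `state` when `event` occurs.
/// Any combination not drawn in the diagram leaves the state unchanged.
pub fn next_state(state: State, event: Event) -> State {
    match (state, event) {
        (State::Boot, Event::InitialCheckPassed) => State::Primary,
        (State::Primary, Event::PrimaryFailedThreeTimes) => State::Fallback,
        (State::Fallback, Event::PrimaryStableFor60s) => State::Primary,
        (unchanged, _) => unchanged,
    }
}
```

Keeping the transition table free of I/O and timers makes it straightforward to unit-test each arrow in isolation.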
### Route Management
- Primary route, equivalent to `ip route add default via <primary-gw> dev <primary-iface> metric 10`
- Secondary route, equivalent to `ip route add default via <secondary-gw> dev <secondary-iface> metric 20`
- Routes are installed and removed via netlink; the `ip` commands above only illustrate the resulting routes and are never executed
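With two default routes that differ only in metric, the kernel prefers the one with the lower metric. The sketch below models that selection in plain Rust and renders the equivalent `ip route` command for logging. It does not talk to netlink; `DefaultRoute`, `as_ip_command`, and `preferred` are hypothetical names, and how the real route manager performs the switch (for example by deleting the primary route) is an assumption not confirmed by this README.

```rust
use std::net::Ipv4Addr;

/// Hypothetical in-memory model of one default route.
#[derive(Debug, Clone, PartialEq)]
pub struct DefaultRoute {
    pub gateway: Ipv4Addr,
    pub interface: String,
    pub metric: u32,
}

impl DefaultRoute {
    /// Renders the equivalent `ip route` command, for logs or documentation only.
    pub fn as_ip_command(&self) -> String {
        format!(
            "ip route add default via {} dev {} metric {}",
            self.gateway, self.interface, self.metric
        )
    }
}

/// Returns the route that wins among the installed ones: lowest metric first.
pub fn preferred(routes: &[DefaultRoute]) -> Option<&DefaultRoute> {
    routes.iter().min_by_key(|route| route.metric)
}
```

With both routes installed, `preferred` yields the metric 10 route via the primary interface; if the primary route is removed from the slice, the metric 20 route is selected instead.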
### Failover Logic
1. **Detection**: 3 consecutive ping failures on primary interface
2. **Verification**: Secondary interface must be responsive
3. **Switch**: Update routing table to use secondary gateway
4. **Monitor**: Continue monitoring both interfaces
5. **Recovery**: After 60 seconds of stable primary connectivity, switch back
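The five steps above can be sketched as a small tracker that is fed one pair of ping results per round. This is an illustrative model under stated assumptions, not the project's implementation: the `Failover` struct, the `Active` enum, and the `observe` method are hypothetical, and "stable" is taken to mean that no primary ping has failed since the stability timer started. The current time is passed in as a parameter so that the logic can be tested without waiting.

```rust
use std::time::{Duration, Instant};

/// Thresholds stated in this README.
pub const FAILURE_THRESHOLD: u32 = 3;
pub const FAILBACK_STABILITY: Duration = Duration::from_secs(60);

/// Which gateway currently carries the default route.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Active {
    Primary,
    Secondary,
}

/// Hypothetical tracker for failover and failback decisions.
#[derive(Debug)]
pub struct Failover {
    active: Active,
    primary_failures: u32,
    primary_stable_since: Option<Instant>,
}

impl Failover {
    pub fn new() -> Self {
        Failover {
            active: Active::Primary,
            primary_failures: 0,
            primary_stable_since: None,
        }
    }

    /// Records one round of ping results and returns the gateway to use.
    pub fn observe(&mut self, primary_ok: bool, secondary_ok: bool, now: Instant) -> Active {
        if primary_ok {
            self.primary_failures = 0;
            // Start the stability timer on the first success after a failure.
            let since = *self.primary_stable_since.get_or_insert(now);
            if self.active == Active::Secondary
                && now.duration_since(since) >= FAILBACK_STABILITY
            {
                self.active = Active::Primary;
            }
        } else {
            self.primary_failures = self.primary_failures.saturating_add(1);
            // Any failure resets the stability timer (anti-flapping).
            self.primary_stable_since = None;
            // Only switch if the secondary is responsive.
            if self.active == Active::Primary
                && self.primary_failures >= FAILURE_THRESHOLD
                && secondary_ok
            {
                self.active = Active::Secondary;
            }
        }
        self.active
    }
}
```

Because a single failed ping clears the timer, the primary must answer continuously for the full 60 seconds before traffic returns to it, which is what keeps an unreliable link from causing repeated switches.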
### Error Handling
- Graceful degradation on interface failures
- Comprehensive logging for debugging
- Signal handling for clean shutdown
- Recovery from temporary network issues
## Dependencies
- `tokio` - Async runtime
- `pnet` - Packet networking
- `netlink-sys` - Netlink kernel communication
- `anyhow` - Error handling
- `log` + `env_logger` - Logging
- `crossbeam-channel` - Inter-thread communication
- `signal-hook` - Signal handling
## Development Phases
- [ ] End-to-end automated tests
## License
GPLv3