cleanup docs
98
README.md
@@ -8,31 +8,25 @@ Route-Switcher monitors connectivity to specified IP addresses via multiple netw

## Architecture

### Core Components
Route-Switcher consists of three main components:

1. **Async Pingers** (`src/pinger.rs`)
- Dual-interface ICMP monitoring
- Explicit interface binding (equivalent to `ping -I <interface>`)
- Configurable ping targets and intervals
- Async/await implementation with tokio
1. **Async Pingers** (`src/pinger.rs`) - ICMP monitoring with explicit interface binding
2. **Route Manager** (`src/routing.rs`) - Netlink-based route manipulation
3. **State Machine** (`src/main.rs`) - Failover logic with anti-flapping protection

2. **Route Manager** (`src/routing.rs`)
- Netlink-based route manipulation
- No external dependencies on `ip` command
- Route addition and deletion
- Metric-based route prioritization
### State Machine
```
Boot → Primary: After 10 seconds of sampling
Primary → Fallback: After 3 consecutive failures AND secondary is healthy
Fallback → Primary: After 60 seconds of stable primary connectivity
```
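
The transitions above can be sketched as a pure function. This is a minimal illustration, not the code in `src/main.rs`: the names `State`, `Health`, `next_state`, and the constants are assumptions made for the example. Only the thresholds (10 seconds, 3 failures, 60 seconds) come from the list above.

```rust
use std::time::Duration;

/// The three states named in the transition list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Boot,
    Primary,
    Fallback,
}

/// Hypothetical snapshot of link health handed to the state machine.
#[derive(Debug, Clone, Copy)]
struct Health {
    /// Time spent in the current state.
    in_state: Duration,
    /// Consecutive ping failures on the primary interface.
    primary_failures: u32,
    /// Whether the secondary interface currently answers pings.
    secondary_healthy: bool,
    /// How long the primary has been continuously reachable.
    primary_stable: Duration,
}

const BOOT_SAMPLING: Duration = Duration::from_secs(10);
const FAILURE_THRESHOLD: u32 = 3;
const FAILBACK_STABILITY: Duration = Duration::from_secs(60);

/// Return the state that follows `state` given the current health snapshot.
/// Any combination not covered by a rule leaves the state unchanged.
fn next_state(state: State, h: Health) -> State {
    match state {
        State::Boot if h.in_state >= BOOT_SAMPLING => State::Primary,
        State::Primary
            if h.primary_failures >= FAILURE_THRESHOLD && h.secondary_healthy =>
        {
            State::Fallback
        }
        State::Fallback if h.primary_stable >= FAILBACK_STABILITY => State::Primary,
        unchanged => unchanged,
    }
}
```

Keeping the transition logic free of I/O makes the anti-flapping rules easy to unit-test: when both interfaces fail, `secondary_healthy` is false and the function stays in `Primary`.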

3. **State Machine** (`src/main.rs`)
- Failover logic with anti-flapping protection
- Three consecutive failures trigger failover
- One minute of stable connectivity triggers failback
- Prevents switching when both interfaces fail
### Route Management Strategy
- **Primary route**: metric 10 (default priority)
- **Secondary route**: metric 20 (lower priority)
- **Failover route**: metric 5 (highest priority, added only during failover)

4. **Configuration**
- Interface definitions (primary/secondary)
- Gateway configurations
- Ping targets and timing
- Route metrics
The system maintains both base routes continuously and adds/removes the failover route as needed.
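
A minimal sketch of how the three metrics interact. It assumes the failover route points at the secondary gateway (implied by switching to the secondary during failover) and relies on the Linux rule that the route with the lowest metric is preferred. `DefaultRoute`, `route_table`, `preferred`, the interface names, and the gateway addresses (taken from the documentation address ranges) are hypothetical. The real tool changes the kernel routing table via netlink rather than an in-memory list.

```rust
/// Hypothetical in-memory model of a default route.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DefaultRoute {
    gateway: &'static str,
    interface: &'static str,
    metric: u32,
}

// Metrics as listed in the strategy above.
const PRIMARY_METRIC: u32 = 10;
const SECONDARY_METRIC: u32 = 20;
const FAILOVER_METRIC: u32 = 5;

// Placeholder gateways (assumptions, not values from the project).
const PRIMARY_GW: &str = "192.0.2.1";
const SECONDARY_GW: &str = "198.51.100.1";

/// Build the set of default routes: both base routes are always present,
/// and the failover route exists only while failover is active.
fn route_table(failover_active: bool) -> Vec<DefaultRoute> {
    let mut routes = vec![
        DefaultRoute { gateway: PRIMARY_GW, interface: "eth0", metric: PRIMARY_METRIC },
        DefaultRoute { gateway: SECONDARY_GW, interface: "eth1", metric: SECONDARY_METRIC },
    ];
    if failover_active {
        routes.push(DefaultRoute {
            gateway: SECONDARY_GW,
            interface: "eth1",
            metric: FAILOVER_METRIC,
        });
    }
    routes
}

/// The route the kernel would prefer: the one with the lowest metric.
fn preferred(routes: &[DefaultRoute]) -> Option<&DefaultRoute> {
    routes.iter().min_by_key(|r| r.metric)
}
```

Because the base routes never change, failback only requires deleting the metric 5 route, after which the metric 10 primary route is preferred again.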

## Key Features

@@ -105,63 +99,38 @@ RUST_LOG=debug sudo cargo run
RUST_LOG=info sudo cargo run
```

## Testing Environment

### Podman-Compose Setup
The project includes a complete testing environment using podman-compose:
## Testing

### Quick Test
```bash
# Start test environment
podman-compose up -d

# Run automated failover test
./scripts/test-failover.sh

# View logs
podman-compose logs -f route-switcher

# Stop test environment
# Stop environment
podman-compose down
```

### End-to-End Testing
### Manual Testing
```bash
# Simulate primary interface failure
podman-compose exec primary ip link set eth0 down
# Test primary connectivity
podman-compose exec route-switcher ping -c 3 -I eth0 192.168.202.100

# Observe failover in logs
podman-compose logs -f route-switcher
# Test secondary connectivity
podman-compose exec route-switcher ping -c 3 -I eth1 192.168.202.100

# Restore primary interface
podman-compose exec primary ip link set eth0 up
# Simulate primary router failure
podman-compose exec primary-router ip link set eth0 down

# Observe failback after 1 minute
# Check routing table
podman-compose exec route-switcher ip route show
```

## Implementation Details

### State Machine
```
[Boot] -> [Primary] (after initial connectivity check)
[Primary] -> [Fallback] (after 3 consecutive failures)
[Fallback] -> [Primary] (after 60 seconds of stability)
```

### Route Management
- Primary route: `ip r add default via <primary-gw> dev <primary-iface> metric 10`
- Secondary route: `ip r add default via <secondary-gw> dev <secondary-iface> metric 20`
- Routes are managed via netlink, not external commands

### Failover Logic
1. **Detection**: 3 consecutive ping failures on primary interface
2. **Verification**: Secondary interface must be responsive
3. **Switch**: Update routing table to use secondary gateway
4. **Monitor**: Continue monitoring both interfaces
5. **Recovery**: After 60 seconds of stable primary connectivity, switch back
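
The detection and recovery steps both depend on turning a stream of ping results into two facts: "N failures in a row" and "reachable without interruption for some window". One way to track that per interface is sketched below. `LinkTracker` and its methods are hypothetical names for illustration and are not taken from the project's source; timestamps are passed in explicitly so the logic can be tested without sleeping.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-interface tracker fed with one result per ping.
#[derive(Debug)]
struct LinkTracker {
    /// Failures seen since the last successful ping.
    consecutive_failures: u32,
    /// Start of the current unbroken run of successful pings, if any.
    up_since: Option<Instant>,
}

impl LinkTracker {
    fn new() -> Self {
        Self { consecutive_failures: 0, up_since: None }
    }

    /// Record one ping result observed at `now`.
    fn record(&mut self, success: bool, now: Instant) {
        if success {
            self.consecutive_failures = 0;
            // Only the first success of a run sets the start time.
            if self.up_since.is_none() {
                self.up_since = Some(now);
            }
        } else {
            self.consecutive_failures += 1;
            // Any failure ends the stable run.
            self.up_since = None;
        }
    }

    /// Detection step: true once `threshold` failures have occurred in a row.
    fn is_down(&self, threshold: u32) -> bool {
        self.consecutive_failures >= threshold
    }

    /// Recovery step: true if every ping has succeeded for at least `window`.
    fn stable_for(&self, window: Duration, now: Instant) -> bool {
        match self.up_since {
            Some(since) => now.duration_since(since) >= window,
            None => false,
        }
    }
}
```

With the values from the list, the primary would count as failed when `is_down(3)` holds, and failback would be allowed once `stable_for(Duration::from_secs(60), now)` holds. A single failed ping during recovery resets the window, which is what prevents flapping.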

### Error Handling
- Graceful degradation on interface failures
- Comprehensive logging for debugging
- Signal handling for clean shutdown
- Recovery from temporary network issues

## Dependencies

- `tokio` - Async runtime
@@ -169,12 +138,7 @@ podman-compose exec primary ip link set eth0 up
- `netlink-sys` - Netlink kernel communication
- `anyhow` - Error handling
- `log` + `env_logger` - Logging
- `crossbeam-channel` - Inter-thread communication
- `signal-hook` - Signal handling

## Development Phases

- [ ] End-to-end automated tests
- `clap` - Command line parsing

## License
