new logging systems
Your Name
2025-09-20 09:04:29 +12:00
parent 7c7c60e969
commit 47a51ec176
18 changed files with 1597 additions and 0 deletions

logserver/TODO.md Normal file

@@ -0,0 +1,244 @@
# LogServer Template - Implementation TODO
## Phase 1: Core Infrastructure (Priority 1)
### Configuration Files
- [ ] Create `config/.template_info.env` with template metadata
- [ ] Create `config/service.env` with user-configurable settings (sketch after this list)
- [ ] Define all required environment variables (ports, passwords, heap sizes)
- [ ] Set appropriate default values for zero-config experience
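A minimal sketch of what `config/service.env` might hold; every variable name and default below is illustrative rather than the template's actual schema:
```bash
# config/service.env -- user-tunable settings (names are hypothetical)
LOGSERVER_ES_PORT=9200        # Elasticsearch HTTP API
LOGSERVER_KIBANA_PORT=5601    # Kibana web UI
LOGSERVER_BEATS_PORT=5044     # Logstash Beats input (TLS/mTLS)
LOGSERVER_SYSLOG_PORT=514     # Syslog input (UDP/TCP, unauthenticated)
ES_HEAP_SIZE=1g               # Elasticsearch JVM heap
LS_HEAP_SIZE=512m             # Logstash JVM heap
LOGSERVER_RETENTION_DAYS=30   # ILM delete-phase age
```
Defaults like these keep the zero-config path working: `dropshell install logserver` should run with nothing overridden.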
### Docker Compose Setup
- [ ] Create `docker-compose.yml` with ELK stack services (sketch after this list)
- [ ] Configure Elasticsearch single-node setup
- [ ] Configure Logstash with Beats input pipeline
- [ ] Configure Kibana with Elasticsearch connection
- [ ] Set up proper networking between services
- [ ] Define named volumes for data persistence
- [ ] Configure health checks for each service
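A single-node sketch of the compose file; image tags, ports, and paths are assumptions, and security stays off here because Phase 6 layers it on. Compose's default network already gives the services DNS by service name:
```yaml
# docker-compose.yml -- illustrative, not the final template
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.14.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false          # hardened later (Phase 6)
      - ES_JAVA_OPTS=-Xms${ES_HEAP_SIZE:-1g} -Xmx${ES_HEAP_SIZE:-1g}
    volumes:
      - esdata:/usr/share/elasticsearch/data  # named volume -> survives uninstall
    healthcheck:                              # similar checks go on the other two
      test: ["CMD-SHELL", "curl -fs http://localhost:9200/_cluster/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5

  logstash:
    image: docker.elastic.co/logstash/logstash:8.14.0
    ports:
      - "5044:5044"      # Beats (TLS)
      - "514:5514/udp"   # Syslog; container binds >1024 as non-root
      - "514:5514"
    volumes:
      - ./pipeline:/usr/share/logstash/pipeline:ro
    depends_on:
      elasticsearch:
        condition: service_healthy

  kibana:
    image: docker.elastic.co/kibana/kibana:8.14.0
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    depends_on:
      elasticsearch:
        condition: service_healthy

volumes:
  esdata:
```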
### Required Scripts
- [ ] Implement `install.sh` - Pull images, create volumes, start services
- [ ] Implement `uninstall.sh` - Stop and remove containers (preserve volumes!)
- [ ] Implement `start.sh` - Start all ELK services with docker-compose
- [ ] Implement `stop.sh` - Gracefully stop all services
- [ ] Implement `status.sh` - Check health of all three services (sketch after this list)
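`status.sh` can stay a thin wrapper over each service's own health endpoint; a sketch assuming default ports and security not yet enabled:
```bash
#!/usr/bin/env bash
# status.sh -- report health of the three ELK services (sketch)
set -euo pipefail

check() {
    local name="$1" url="$2"
    if curl -fs --max-time 5 "$url" >/dev/null; then
        echo "OK      $name"
    else
        echo "FAILED  $name ($url)"
    fi
}

check elasticsearch "http://localhost:9200/_cluster/health"
check kibana        "http://localhost:5601/api/status"
# 9600 is Logstash's monitoring API; publish it in compose to check from the host
check logstash      "http://localhost:9600/_node/stats"
```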
## Phase 2: Logstash Configuration (Priority 1)
### Input Configuration
- [ ] Configure Beats input on port 5044 with TLS/SSL (input sketch after this list)
- [ ] Set up mutual TLS (mTLS) authentication
- [ ] Configure client certificate validation
- [ ] Add API key authentication option
- [ ] Implement IP whitelisting
- [ ] Add Syslog input on port 514 (UDP/TCP) - unauthenticated
- [ ] Add Docker Fluentd input on port 24224 (optional)
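A sketch of the matching input block, using the beats input's long-standing TLS options (newer Logstash releases rename some of them); certificate paths are illustrative, and `ssl_verify_mode => "force_peer"` is what turns plain TLS into mTLS:
```conf
# pipeline/inputs.conf -- sketch; paths are illustrative
input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/usr/share/logstash/certs/logstash.crt"
    ssl_key => "/usr/share/logstash/certs/logstash.pkcs8.key"  # beats input wants PKCS#8
    ssl_certificate_authorities => ["/usr/share/logstash/certs/ca.crt"]
    ssl_verify_mode => "force_peer"   # reject clients without a valid client cert
  }
  syslog {
    port => 5514   # host port 514 maps here; non-root can't bind <1024
  }
  # Optional: a tcp input with the fluent codec can accept Docker's
  # fluentd logging driver on 24224.
}
```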
### Filter Pipeline
- [ ] Create Docker log parser (extract container metadata)
- [ ] Create Syslog parser (RFC3164 and RFC5424)
- [ ] Add JSON parser for structured logs
- [ ] Implement multiline pattern for stack traces
- [ ] Add timestamp extraction and normalization
- [ ] Create field enrichment (add host metadata)
- [ ] Implement conditional routing based on log type (filter sketch after this list)
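A filter sketch for the JSON and routing items; the field names (`[app]`, `[@metadata][index_prefix]`) are illustrative, and multiline stack traces are normally joined on the shipper (Filebeat) before they reach Logstash:
```conf
# pipeline/filters.conf -- sketch; field names are assumptions
filter {
  # Structured logs: parse the body as JSON when it looks like JSON
  if [message] =~ /^\{/ {
    json {
      source => "message"
      target => "app"
      tag_on_failure => ["_jsonparsefailure"]
    }
  }
  # Normalize an application-supplied timestamp into @timestamp
  date {
    match => ["[app][timestamp]", "ISO8601"]
    remove_field => ["[app][timestamp]"]
  }
  # Route by source; Beats-shipped Docker logs carry container metadata
  if [container][name] {
    mutate { add_field => { "[@metadata][index_prefix]" => "docker" } }
  } else {
    mutate { add_field => { "[@metadata][index_prefix]" => "syslog" } }
  }
}
```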
### Output Configuration
- [ ] Configure Elasticsearch output with index patterns (output sketch after this list)
- [ ] Set up index templates for different log types
- [ ] Configure index lifecycle management (ILM)
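A sketch of the output, steering each event by the prefix set in the filter above; attaching the ILM retention policy to these index patterns happens through an index template on the Elasticsearch side (Phase 3):
```conf
# pipeline/output.conf -- sketch
output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "%{[@metadata][index_prefix]}-%{+YYYY.MM.dd}"
  }
}
```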
## Phase 3: Elasticsearch Setup (Priority 1)
### System Configuration
- [ ] Set appropriate heap size defaults (ES_HEAP_SIZE)
- [ ] Check at install time that vm.max_map_count meets the Elasticsearch minimum (sketch after this list)
- [ ] Set up single-node discovery settings
- [ ] Configure data persistence volume
- [ ] Set up index templates for:
- [ ] Docker logs (docker-*)
- [ ] System logs (syslog-*)
- [ ] Application logs (app-*)
- [ ] Error logs (errors-*)
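The vm.max_map_count check is a few lines in `install.sh`; 262144 is Elasticsearch's documented minimum:
```bash
# install.sh fragment -- Elasticsearch needs vm.max_map_count >= 262144
required=262144
current=$(sysctl -n vm.max_map_count)
if [ "$current" -lt "$required" ]; then
    echo "vm.max_map_count is $current; Elasticsearch requires >= $required" >&2
    echo "Fix with: sudo sysctl -w vm.max_map_count=$required" >&2
    exit 1
fi
```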
### Index Management
- [ ] Configure ILM policies for log rotation
- [ ] Set retention period (default 30 days; policy sketch after this list)
- [ ] Configure max index size limits
- [ ] Set up automatic cleanup of old indices
- [ ] Create snapshot repository configuration
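A sketch of a retention-only ILM policy; the policy name is illustrative, and since `min_age` counts from index creation, daily indices simply age out after 30 days:
```bash
curl -fsS -X PUT "http://localhost:9200/_ilm/policy/logserver-default" \
  -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "delete": { "min_age": "30d", "actions": { "delete": {} } }
    }
  }
}'
```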
## Phase 4: Kibana Configuration (Priority 2)
### Initial Setup
- [ ] Configure Kibana with Elasticsearch URL (kibana.yml sketch after this list)
- [ ] Set up basic authentication
- [ ] Configure server base path
- [ ] Set appropriate memory limits
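A `kibana.yml` sketch for the items above; the values are illustrative, with authentication and HTTPS deferred to Phase 6:
```yaml
# kibana.yml -- sketch
server.host: "0.0.0.0"
# server.basePath: "/logs"   # only when behind a path-prefixing proxy
elasticsearch.hosts: ["http://elasticsearch:9200"]
# Basic auth once security is enabled (Phase 6):
# elasticsearch.username: "kibana_system"
# elasticsearch.password: "${KIBANA_PASSWORD}"
# Memory: cap the Node heap with NODE_OPTIONS=--max-old-space-size=1024
# in the container environment rather than in this file.
```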
### Pre-built Dashboards
- [ ] Create System Overview dashboard
- [ ] Create Docker Containers dashboard
- [ ] Create Error Analysis dashboard
- [ ] Create Security Events dashboard
- [ ] Create Host Metrics dashboard
### Saved Searches
- [ ] Error logs across all sources
- [ ] Authentication events
- [ ] Container lifecycle events
- [ ] Slow queries/performance issues
- [ ] Critical system events
### Index Patterns
- [ ] Configure docker-* pattern
- [ ] Configure syslog-* pattern
- [ ] Configure app-* pattern
- [ ] Configure filebeat-* pattern (scripted sketch for all four after this list)
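The patterns can be created non-interactively through Kibana's saved objects API (newer Kibana also offers a data views API); a sketch assuming Kibana on localhost:5601 with security not yet enabled:
```bash
for pattern in "docker-*" "syslog-*" "app-*" "filebeat-*"; do
    curl -fsS -X POST "http://localhost:5601/api/saved_objects/index-pattern" \
      -H 'kbn-xsrf: true' -H 'Content-Type: application/json' \
      -d "{\"attributes\": {\"title\": \"$pattern\", \"timeFieldName\": \"@timestamp\"}}"
    echo
done
```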
## Phase 5: Optional Scripts (Priority 2)
### Operational Scripts
- [ ] Implement `logs.sh` - Show logs from all ELK services
- [ ] Implement `backup.sh` - Snapshot Elasticsearch indices (sketch after this list)
- [ ] Implement `restore.sh` - Restore from snapshots
- [ ] Implement `destroy.sh` - Complete removal including volumes
- [ ] Implement `ports.sh` - Display all exposed ports
- [ ] Implement `ssh.sh` - Shell into a specific container
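`backup.sh` can lean on Elasticsearch's snapshot API; the repository name and path below are illustrative, and the location must be listed under `path.repo` in elasticsearch.yml:
```bash
#!/usr/bin/env bash
# backup.sh -- snapshot all indices (sketch)
set -euo pipefail
ES="http://localhost:9200"

# Register a filesystem snapshot repository (safe to repeat)
curl -fsS -X PUT "$ES/_snapshot/logserver_backup" \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/usr/share/elasticsearch/backup"}}'

# Take a timestamped snapshot and wait for completion
curl -fsS -X PUT \
  "$ES/_snapshot/logserver_backup/snap-$(date +%Y%m%d-%H%M%S)?wait_for_completion=true"
```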
### Helper Scripts
- [ ] Create `_volumes.sh` for volume management helpers
- [ ] Add health check script for all services
- [ ] Create performance tuning script
- [ ] Add certificate generation script for SSL
## Phase 6: Security Features (Priority 1 - CRITICAL)
### Certificate Authority Setup
- [ ] Create CA certificate and key for signing client certs
- [ ] Generate server certificate for Logstash (openssl sketch after this list)
- [ ] Create certificate generation script for clients
- [ ] Set up certificate storage structure
- [ ] Implement certificate rotation mechanism
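An openssl sketch of the CA plus the Logstash server certificate; subjects, SANs, and lifetimes are illustrative, and the SAN must match whatever name clients actually dial:
```bash
#!/usr/bin/env bash
# Sketch: one-shot CA + server cert; keep ca.key offline and secret
set -euo pipefail

# Certificate authority
openssl req -x509 -newkey rsa:4096 -sha256 -days 3650 -nodes \
  -keyout ca.key -out ca.crt -subj "/CN=logserver-ca"

# Server key + CSR, signed by the CA with a SAN for the public name
openssl req -newkey rsa:4096 -sha256 -nodes \
  -keyout logstash.key -out logstash.csr -subj "/CN=logserver"
openssl x509 -req -in logstash.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -days 825 -sha256 -out logstash.crt \
  -extfile <(printf "subjectAltName=DNS:logserver,IP:192.0.2.10")

# The Logstash beats input expects the key in PKCS#8
openssl pkcs8 -topk8 -nocrypt -in logstash.key -out logstash.pkcs8.key
```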
### mTLS Authentication
- [ ] Configure Logstash for mutual TLS
- [ ] Set up client certificate validation
- [ ] Create client certificate generation script
- [ ] Implement certificate revocation list (CRL)
- [ ] Add certificate expiry monitoring
### API Key Authentication
- [ ] Create API key generation script
- [ ] Configure Logstash to accept API keys
- [ ] Implement API key storage (encrypted)
- [ ] Add API key rotation mechanism
- [ ] Create API key revocation process
### Network Security
- [ ] Implement IP whitelisting in front of Logstash (firewall sketch after this list)
- [ ] Configure firewall rules
- [ ] Set up rate limiting
- [ ] Add connection throttling
- [ ] Implement DDoS protection
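The beats input has no allowlist of its own, so whitelisting and rate limiting sit in front of Logstash at the firewall. An iptables sketch with an illustrative trusted range: Docker-published ports bypass INPUT, so the rules go in the DOCKER-USER chain, and since `-I` prepends, they are inserted in reverse match order:
```bash
# Drop syslog traffic beyond a crude rate limit
iptables -I DOCKER-USER -p udp --dport 5514 -j DROP
iptables -I DOCKER-USER -p udp --dport 5514 -m limit --limit 500/second -j ACCEPT
# Beats port: only the trusted range gets through
iptables -I DOCKER-USER -p tcp --dport 5044 ! -s 10.0.0.0/8 -j DROP
```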
### Kibana Security
- [ ] Configure Kibana HTTPS
- [ ] Set up basic authentication
- [ ] Create user management scripts
- [ ] Implement session management
- [ ] Add audit logging
## Phase 7: Performance & Optimization (Priority 3)
### Resource Management
- [ ] Configure CPU limits for each service
- [ ] Set memory limits appropriately
- [ ] Add swap handling configuration
- [ ] Configure JVM options files
- [ ] Add performance monitoring
### Optimization
- [ ] Configure pipeline workers (logstash.yml sketch after this list)
- [ ] Set batch sizes for optimal throughput
- [ ] Configure queue sizes
- [ ] Add caching configuration
- [ ] Optimize index refresh intervals
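Most of the throughput knobs above live in `logstash.yml` (refresh intervals belong to the Elasticsearch index settings); a sketch with starting-point values rather than tuned numbers:
```yaml
# logstash.yml -- sketch
pipeline.workers: 2        # defaults to the number of CPU cores
pipeline.batch.size: 250   # events per worker per batch (default 125)
pipeline.batch.delay: 50   # ms to wait while filling a batch
queue.type: persisted      # disk-backed queue survives restarts
queue.max_bytes: 1gb       # cap the disk the queue may use
```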
## Phase 8: Testing & Documentation (Priority 3)
### Testing
- [ ] Test installation process
- [ ] Test uninstall (verify volume preservation)
- [ ] Test log ingestion from sample client
- [ ] Test all dashboard functionality
- [ ] Test backup and restore procedures
- [ ] Load test with high log volume
- [ ] Test failover and recovery
### Documentation
- [ ] Create README.txt for dropshell format
- [ ] Document all configuration options
- [ ] Add troubleshooting guide
- [ ] Create quick start guide
- [ ] Document upgrade procedures
- [ ] Add performance tuning guide
## Phase 9: Integration Testing (Priority 3)
### With LogClient
- [ ] Test automatic discovery
- [ ] Verify log flow from client to server
- [ ] Test reconnection scenarios
- [ ] Verify all log types are parsed correctly
- [ ] Test SSL communication
- [ ] Measure end-to-end latency
### Compatibility Testing
- [ ] Test with different Docker versions
- [ ] Test on various Linux distributions
- [ ] Verify with different log formats
- [ ] Test with high-volume producers
- [ ] Validate resource usage
## Phase 10: Production Readiness (Priority 4)
### Monitoring & Alerting
- [ ] Add Elasticsearch monitoring
- [ ] Configure disk space alerts
- [ ] Set up index health monitoring
- [ ] Add performance metrics collection
- [ ] Create alert rules in Kibana
### Maintenance Features
- [ ] Add automatic update check
- [ ] Create maintenance mode
- [ ] Add data export functionality
- [ ] Create migration scripts
- [ ] Add configuration validation
## Notes
### Design Principles
1. **Minimum configuration**: Should work with just `dropshell install logserver`
2. **Data safety**: Never delete volumes in uninstall.sh
3. **Non-interactive**: All scripts must run without user input
4. **Idempotent**: Scripts can be run multiple times safely
5. **Clear feedback**: Provide clear status and error messages
### Dependencies
- Docker and Docker Compose
- Sufficient system resources (4GB+ RAM recommended)
- Network connectivity for clients
- Persistent storage for logs
### Testing Checklist
- [ ] All required scripts present and executable
- [ ] Template validates with dropshell test-template
- [ ] Services start and connect properly
- [ ] Logs flow from client to Kibana
- [ ] Data persists across container restarts
- [ ] Uninstall preserves data volumes
- [ ] Resource limits are enforced
- [ ] Error handling works correctly