new logging systems
All checks were successful
Test and Publish Templates / test-and-publish (push) Successful in 40s

Your Name
2025-09-20 09:04:29 +12:00
parent 7c7c60e969
commit 47a51ec176
18 changed files with 1597 additions and 0 deletions

logclient/TODO.md Normal file

@@ -0,0 +1,294 @@
# LogClient Template - Implementation TODO
## Phase 1: Core Infrastructure (Priority 1)
### Configuration Files
- [ ] Create `config/.template_info.env` with template metadata
- [ ] Create `config/service.env` with minimal required settings
- [ ] Define `LOGSERVER_HOST` and `LOGSERVER_PORT` variables
- [ ] Add `AUTH_MODE` variable (`mtls`, `apikey`, `basic`)
- [ ] Add certificate/key path variables for mTLS
- [ ] Add `API_KEY` variable for API key auth
- [ ] Add `USERNAME`/`PASSWORD` for basic auth
- [ ] Add optional performance and filtering variables
- [ ] Set sensible defaults where possible
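
A minimal sketch of how the scripts might apply defaults after sourcing `config/service.env`. The function name `apply_defaults` and the default values are assumptions for illustration, not settled decisions. Port 5044 is the conventional Beats port on Logstash, and defaulting `AUTH_MODE` to `mtls` is only a guess at a sensible default.

```shell
# Sketch only: apply defaults for optional settings (POSIX sh compatible).
# apply_defaults is a hypothetical helper; the default values are assumptions.
apply_defaults() {
    # LOGSERVER_HOST is the one setting with no usable default.
    if [ -z "${LOGSERVER_HOST:-}" ]; then
        echo "LOGSERVER_HOST must be set in config/service.env" >&2
        return 1
    fi
    # ':=' assigns the default only when the variable is unset or empty.
    : "${LOGSERVER_PORT:=5044}"
    : "${AUTH_MODE:=mtls}"
    return 0
}
```

A script such as `start.sh` could source the env file, call `apply_defaults`, and abort on a non-zero return before touching the container.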
### Filebeat Configuration
- [ ] Create base `filebeat.yml` configuration template
- [ ] Configure Docker input using Docker API (not autodiscover with hints)
- [ ] Set `containers.ids: ["*"]` to collect from all containers
- [ ] Set up system log inputs for host logs
- [ ] Configure output to Logstash
- [ ] Add error handling and retry logic
- [ ] Set up local disk buffering
- [ ] Configure `stream: "all"` to capture both stdout and stderr
### Required Scripts
- [ ] Implement `install.sh` - Pull Filebeat image, configure auth, start
- [ ] Implement `uninstall.sh` - Stop and remove container (preserve config and certs)
- [ ] Implement `start.sh` - Start Filebeat with auth config and proper mounts
- [ ] Implement `stop.sh` - Gracefully stop Filebeat
- [ ] Implement `status.sh` - Check Filebeat health and auth status
- [ ] Create `setup-auth.sh` - Helper script to configure authentication
## Phase 2: Docker API Log Collection (Priority 1)
### Docker API Input Configuration
- [ ] Configure Docker input type (NOT autodiscover, use direct Docker input)
- [ ] Mount Docker socket (`/var/run/docker.sock`) with proper permissions
- [ ] Configure Docker API endpoint (`unix:///var/run/docker.sock`)
- [ ] Set up real-time log streaming from Docker daemon
- [ ] Enable collection from ALL logging drivers (`local`, `json-file`, `journald`, etc.)
- [ ] Configure `since_time` to get recent logs on startup
### Container Metadata Extraction
- [ ] Extract container name, ID, image name, and image tag
- [ ] Map container labels to fields
- [ ] Handle docker-compose project names and service names
- [ ] Add container state information
- [ ] Include container environment variables (filtered)
- [ ] Handle container lifecycle events (start, stop, restart)
### Container Filtering
- [ ] Implement include/exclude by container name patterns
- [ ] Add label-based filtering (`containers.labels`)
- [ ] Create ignore patterns for system containers
- [ ] Add support for custom filter expressions
- [ ] Configure `combine_partial` to handle partial log lines
- [ ] Document filtering examples with Docker API syntax
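
The include/exclude logic above could be prototyped in plain shell before it is wired into the Filebeat configuration. This sketch assumes comma-separated glob patterns. The helper names `matches_any` and `should_collect` are hypothetical, and it illustrates only the matching rules, not Filebeat's own filter syntax.

```shell
# Sketch only: name-based include/exclude filtering (POSIX sh compatible).
# matches_any NAME PATTERNS: succeeds if NAME matches any comma-separated glob.
# The body runs in a subshell so 'set -f' and IFS changes do not leak out.
matches_any() (
    name=$1
    set -f      # no pathname expansion while splitting the pattern list
    IFS=','
    for pat in $2; do
        # $pat is deliberately unquoted so 'case' treats it as a glob.
        case "$name" in
            $pat) exit 0 ;;
        esac
    done
    exit 1
)

# should_collect NAME [INCLUDE] [EXCLUDE]
# A container is collected if it matches INCLUDE (default: everything)
# and does not match EXCLUDE (default: nothing).
should_collect() {
    matches_any "$1" "${2:-*}" || return 1
    if matches_any "$1" "${3:-}"; then
        return 1
    fi
    return 0
}
```

For example, `should_collect "web-1" "*" "filebeat*,k8s_*"` succeeds, while the same call with `filebeat-agent` fails because the exclude list wins over the include list.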
## Phase 3: System Log Collection (Priority 1)
### Log File Inputs
- [ ] Configure `/var/log/syslog` or `/var/log/messages`
- [ ] Add `/var/log/auth.log` or `/var/log/secure`
- [ ] Include `/var/log/kern.log`
- [ ] Monitor `/var/log/dpkg.log` or `/var/log/yum.log`
- [ ] Add custom log path support via environment variable
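
Because the log paths above differ between Debian-style and Red Hat-style hosts, the install script could probe for whichever file exists. A minimal sketch, assuming a hypothetical helper named `first_existing`:

```shell
# Sketch only: print the first readable path from the arguments (POSIX sh).
# first_existing is a hypothetical helper name.
first_existing() {
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage sketch:
#   SYSLOG_PATH=$(first_existing /var/log/syslog /var/log/messages)
#   AUTH_LOG_PATH=$(first_existing /var/log/auth.log /var/log/secure)
```

A non-zero return means none of the candidates exist, in which case the corresponding input could simply be skipped.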
### Journald Integration
- [ ] Detect if systemd/journald is available
- [ ] Configure journald input if present
- [ ] Set up unit filtering
- [ ] Extract systemd metadata
- [ ] Handle binary journal format
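
One way the detection step might look: systemd hosts have a `/run/systemd/system` directory, and the journal lives under `/var/log/journal` (persistent) or `/run/log/journal` (volatile). The helper name `has_journald` and its optional root-prefix argument are assumptions for illustration. Inside a container these host paths would need to be mounted for the check to be meaningful.

```shell
# Sketch only: detect a systemd host with an on-disk journal (POSIX sh).
# has_journald [ROOT]: ROOT is an optional path prefix, e.g. a host mount
# point inside the container. It defaults to the real filesystem root.
has_journald() {
    root="${1:-}"
    # /run/systemd/system exists only when systemd is the init system.
    [ -d "$root/run/systemd/system" ] || return 1
    # Persistent journals live in /var/log/journal, volatile ones in /run/log/journal.
    [ -d "$root/var/log/journal" ] || [ -d "$root/run/log/journal" ]
}
```

`install.sh` could then enable the journald input only when `has_journald` succeeds and fall back to the plain log files otherwise.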
### Log Parsing
- [ ] Configure syslog parsing
- [ ] Extract severity levels
- [ ] Parse timestamps correctly
- [ ] Handle different syslog formats
- [ ] Add timezone handling
## Phase 4: Output Configuration (Priority 1)
### Logstash Output
- [ ] Configure primary Logstash endpoint
- [ ] Set up connection parameters (timeout, retry)
- [ ] Configure bulk operations settings
- [ ] Add compression support
- [ ] Implement backpressure handling
### Connection Management
- [ ] Configure automatic reconnection
- [ ] Set exponential backoff for retries
- [ ] Add connection pooling
- [ ] Configure keepalive settings
- [ ] Handle DNS resolution failures
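
For wrapper scripts that probe the server themselves (a connectivity test, for instance), exponential backoff can be computed as below. This is a sketch under assumptions: `backoff_delay` is a hypothetical helper, and the 1-second base and 60-second cap are illustrative values rather than requirements from this plan.

```shell
# Sketch only: capped exponential backoff in whole seconds.
# backoff_delay ATTEMPT [BASE] [MAX] prints min(BASE * 2^ATTEMPT, MAX).
backoff_delay() {
    local attempt="$1" base="${2:-1}" max="${3:-60}" delay
    # Cap the exponent so the shift cannot overflow.
    if [ "$attempt" -gt 20 ]; then
        attempt=20
    fi
    delay=$(( base << attempt ))
    if [ "$delay" -gt "$max" ]; then
        delay="$max"
    fi
    printf '%s\n' "$delay"
}

# Usage sketch (check_connection is a hypothetical probe function):
#   attempt=0
#   until check_connection; do
#       sleep "$(backoff_delay "$attempt")"
#       attempt=$((attempt + 1))
#   done
```

With the defaults, successive attempts wait 1, 2, 4, 8, 16 and 32 seconds, then stay at 60.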
### Authentication Configuration (Priority 1 - CRITICAL)
- [ ] Implement mTLS authentication support
- [ ] Configure client certificate and key loading
- [ ] Add CA certificate validation
- [ ] Implement API key authentication
- [ ] Add basic auth as fallback option
- [ ] Create authentication mode selection logic
- [ ] Handle authentication failures gracefully
- [ ] Add certificate expiry checking
- [ ] Implement secure credential storage
- [ ] Document authentication setup process
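
The mode selection logic could start as a validation step that fails early when the chosen mode lacks its credentials. In this sketch `AUTH_MODE`, `API_KEY`, `USERNAME` and `PASSWORD` come from the configuration list in Phase 1. `CLIENT_CERT`, `CLIENT_KEY`, `CA_CERT` and both function names are hypothetical placeholders for the certificate path variables that are still to be defined.

```shell
# Sketch only: validate credentials for the selected AUTH_MODE.
# CLIENT_CERT, CLIENT_KEY and CA_CERT are assumed variable names.
validate_auth() {
    case "${AUTH_MODE:-}" in
        mtls)
            [ -r "${CLIENT_CERT:-}" ] && [ -r "${CLIENT_KEY:-}" ] && [ -r "${CA_CERT:-}" ] || {
                echo "mtls requires readable CLIENT_CERT, CLIENT_KEY and CA_CERT" >&2
                return 1
            }
            ;;
        apikey)
            [ -n "${API_KEY:-}" ] || {
                echo "apikey mode requires API_KEY" >&2
                return 1
            }
            ;;
        basic)
            [ -n "${USERNAME:-}" ] && [ -n "${PASSWORD:-}" ] || {
                echo "basic mode requires USERNAME and PASSWORD" >&2
                return 1
            }
            ;;
        *)
            echo "AUTH_MODE must be one of: mtls, apikey, basic" >&2
            return 1
            ;;
    esac
    return 0
}

# Succeeds if the certificate in $1 expires within $2 days (default 30).
# 'openssl x509 -checkend N' exits non-zero when the certificate expires
# within N seconds, so an unreadable file also counts as expiring soon.
cert_expires_soon() {
    local days="${2:-30}"
    ! openssl x509 -checkend "$(( days * 86400 ))" -noout -in "$1" >/dev/null 2>&1
}
```

Some shells and platforms already export `USERNAME`, so a more specific variable name may be worth considering for basic auth.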
## Phase 5: Reliability Features (Priority 2)
### Local Buffering
- [ ] Configure disk queue for reliability
- [ ] Set queue size limits
- [ ] Configure memory queue settings
- [ ] Add overflow handling
- [ ] Set up automatic cleanup of old events
### Error Handling
- [ ] Add retry logic for failed sends
- [ ] Configure dead letter queue
- [ ] Add circuit breaker pattern
- [ ] Log transmission errors appropriately
- [ ] Add metrics for monitoring failures
### Performance Optimization
- [ ] Configure worker count
- [ ] Set batch size for sending
- [ ] Add compression level setting
- [ ] Configure CPU and memory limits
- [ ] Optimize for high-volume scenarios
## Phase 6: Optional Scripts (Priority 2)
### Operational Scripts
- [ ] Implement `logs.sh` - Show Filebeat logs
- [ ] Implement `destroy.sh` - Complete removal
- [ ] Implement `ssh.sh` - Shell into Filebeat container
- [ ] Create `test.sh` - Test connectivity to server
- [ ] Add `metrics.sh` - Show Filebeat statistics
### Diagnostic Scripts
- [ ] Create connectivity test script
- [ ] Add configuration validation script
- [ ] Create debug mode enabler
- [ ] Add log sampling script
- [ ] Create performance benchmark script
## Phase 7: Monitoring & Health (Priority 2)
### Health Checks
- [ ] Configure Filebeat HTTP endpoint
- [ ] Add Docker health check
- [ ] Monitor queue status
- [ ] Check connection to Logstash
- [ ] Track dropped events
### Metrics Collection
- [ ] Enable Filebeat monitoring
- [ ] Export metrics endpoint
- [ ] Track events sent/failed
- [ ] Monitor resource usage
- [ ] Add performance counters
### Status Reporting
- [ ] Implement detailed status in status.sh
- [ ] Show connection state
- [ ] Display queue status
- [ ] Report recent errors
- [ ] Show throughput metrics
## Phase 8: Advanced Features (Priority 3)
### Processors
- [ ] Add field renaming processor
- [ ] Configure drop_event conditions
- [ ] Add rate limiting processor
- [ ] Include fingerprinting for deduplication
- [ ] Add custom field enrichment
### Multiline Handling
- [ ] Configure patterns for common languages
- [ ] Java stack trace handling
- [ ] Python traceback handling
- [ ] Go panic handling
- [ ] Custom pattern support via environment
### Field Management
- [ ] Configure field inclusion/exclusion
- [ ] Add custom fields via environment
- [ ] Set up field type conversions
- [ ] Add timestamp parsing
- [ ] Configure field aliasing
## Phase 9: Testing (Priority 3)
### Unit Testing
- [ ] Test configuration generation
- [ ] Verify volume mounts
- [ ] Test environment variable substitution
- [ ] Validate filtering logic
- [ ] Test error conditions
### Integration Testing
- [ ] Test with logserver template
- [ ] Verify Docker log collection
- [ ] Test system log collection
- [ ] Validate SSL connectivity
- [ ] Test reconnection scenarios
- [ ] Verify buffering during outages
### Load Testing
- [ ] Test with high log volume
- [ ] Measure resource usage
- [ ] Test queue overflow handling
- [ ] Verify rate limiting
- [ ] Benchmark throughput
## Phase 10: Documentation (Priority 3)
### User Documentation
- [ ] Create README.txt for dropshell
- [ ] Document all configuration options
- [ ] Add troubleshooting guide
- [ ] Create quick start guide
- [ ] Add FAQ section
### Configuration Examples
- [ ] Minimal configuration example
- [ ] High-volume configuration
- [ ] Secure SSL configuration
- [ ] Filtered configuration
- [ ] Custom paths configuration
### Integration Guides
- [ ] Integration with logserver
- [ ] Docker Compose examples
- [ ] Kubernetes DaemonSet example
- [ ] Swarm mode configuration
- [ ] Custom application integration
## Phase 11: Production Readiness (Priority 4)
### Security Hardening
- [ ] Run as non-root user where possible
- [ ] Minimize container capabilities
- [ ] Add secrets management
- [ ] Configure log sanitization
- [ ] Add audit logging
### Updates & Maintenance
- [ ] Add update notification
- [ ] Create upgrade script
- [ ] Add configuration migration
- [ ] Document breaking changes
- [ ] Create rollback procedure
### Compatibility
- [ ] Test with different Filebeat versions
- [ ] Verify Docker API compatibility
- [ ] Test on different Linux distributions
- [ ] Validate with various log formats
- [ ] Ensure Logstash version compatibility
## Notes
### Design Principles
1. **Minimal configuration**: Only `LOGSERVER_HOST` must be set for the client to work
2. **Docker API access**: Use Docker API for driver-independent log collection
3. **Automatic discovery**: Find all container logs without manual configuration
4. **Reliability first**: Never lose logs; buffer locally if needed
5. **Low overhead**: Minimal resource usage on host
6. **Non-intrusive**: No changes to existing containers needed
7. **Driver flexibility**: Allow containers to use any logging driver (especially `local`)
### Key Requirements
- Must work with zero configuration beyond server address
- Must use Docker API input, not file-based collection
- Must support all Docker logging drivers (`local`, `json-file`, etc.)
- Must handle Docker socket permissions properly
- Must be resilient to network failures
- Must not impact host performance significantly
- Must preserve configuration on uninstall
### Testing Checklist
- [ ] Validates with dropshell test-template
- [ ] Connects to logserver successfully
- [ ] Collects Docker logs automatically
- [ ] Collects system logs properly
- [ ] Handles server downtime gracefully
- [ ] Reconnects automatically
- [ ] Resource usage stays within limits
- [ ] Uninstall preserves configuration