
# Nushell Infrastructure Runtime Task Service

> **Security-First Shell Runtime for Cloud Native Infrastructure**

A secure, auditable Nushell runtime designed specifically for infrastructure servers, providing powerful scripting capabilities while maintaining strict security controls and operational safety.

## 🎯 Purpose

The Nushell task service provides a modern, secure shell environment for:

- **Remote Script Execution**: Safe execution of automation scripts on infrastructure servers
- **Observability**: Structured data collection for metrics, logs, and telemetry
- **Infrastructure Management**: Configuration-driven server management and orchestration
- **Security Auditing**: Complete audit trail of all shell operations and command execution

## 🏗️ Architecture

### Core Components

```
taskservs/nushell/
├── default/
│   ├── prepare                 # Environment preparation script
│   ├── install-nushell.sh      # Nushell binary installation
│   ├── env-nushell.j2          # Environment configuration template
│   ├── config.nu.j2            # Secure Nushell configuration
│   ├── env.nu.j2               # Environment variables template
│   └── remote-exec.nu.j2       # Remote execution library
└── observability/
    ├── collect.nu              # System metrics collection
    ├── process.nu              # Log processing and analysis
    └── telemetry.nu            # Telemetry and monitoring
```

### Security Model

- **Read-Only Mode**: Default execution prevents destructive operations
- **Command Filtering**: Allowlist/blocklist for command execution
- **Path Restrictions**: Limited file system access to safe directories
- **Session Timeouts**: Automatic session termination for security
- **Audit Logging**: Complete command and operation logging
- **Resource Limits**: CPU, memory, and network usage controls

## 📋 Installation Options

### 1. Standalone Installation

Install Nushell as a dedicated task service:

```bash
./core/nulib/provisioning taskserv create nushell --infra <infrastructure-name>
```

### 2. Integrated with OS Base Layer

Enable during OS installation for automatic deployment:

```toml
# In your infrastructure configuration
[taskserv]
install_nushell = true
nushell_readonly = true
nushell_plugins = false
nushell_execution_mode = "restricted"
```

### 3. Conditional Installation by Server Role

Configure role-based installation:

```toml
# Control nodes: Always install
# Worker nodes: Optional based on needs
[server.control]
install_nushell = true

[server.worker]
install_nushell = false  # or true if needed
```

## ⚙️ Configuration

### Security Settings

| Setting | Default | Description |
|---------|---------|-------------|
| `nushell_readonly` | `true` | Enable read-only mode (blocks write operations) |
| `nushell_execution_mode` | `"restricted"` | Execution mode: `restricted`, `normal`, `privileged` |
| `nushell_plugins` | `false` | Enable Nushell plugins (kcl, tera, polars) |
| `nushell_network` | `false` | Allow network operations |
| `nushell_audit` | `true` | Enable audit logging |
| `nushell_session_timeout` | `900` | Session timeout in seconds (15 minutes) |

### Resource Limits

| Setting | Default | Description |
|---------|---------|-------------|
| `nushell_max_memory` | `"256MB"` | Maximum memory usage |
| `nushell_max_cpu_time` | `"30s"` | Maximum CPU time per command |
| `nushell_remote_timeout` | `300` | Remote execution timeout (seconds) |

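The limits above are given as strings such as `"256MB"` and `"30s"`. This README does not show how the task service enforces them. As a rough sketch only, on Linux they could be translated into `ulimit -v` (KiB of virtual memory), `ulimit -t` (CPU seconds), and the coreutils `timeout` command (wall-clock seconds). The helper names `mem_to_kib`, `duration_to_seconds`, and `run_limited` are hypothetical, and treating 1 MB as 1024 KiB is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; not part of the task service itself.

# Convert a size such as "256MB" or "1GB" to KiB (assumes binary units).
mem_to_kib() {
  local value="$1"
  case "$value" in
    *GB) echo $(( ${value%GB} * 1024 * 1024 )) ;;
    *MB) echo $(( ${value%MB} * 1024 )) ;;
    *KB) echo $(( ${value%KB} )) ;;
    *)   echo "unsupported size: $value" >&2; return 1 ;;
  esac
}

# Convert a duration such as "30s" or "5m" to seconds.
duration_to_seconds() {
  local value="$1"
  case "$value" in
    *m) echo $(( ${value%m} * 60 )) ;;
    *s) echo $(( ${value%s} )) ;;
    *)  echo "unsupported duration: $value" >&2; return 1 ;;
  esac
}

# Usage: run_limited <max_memory> <max_cpu_time> <timeout_seconds> <command...>
# Applies the limits in a subshell so the calling shell is unaffected.
run_limited() {
  local mem_kib cpu_secs wall_secs
  mem_kib="$(mem_to_kib "$1")" || return 1
  cpu_secs="$(duration_to_seconds "$2")" || return 1
  wall_secs="$3"
  shift 3
  ( ulimit -v "$mem_kib"; ulimit -t "$cpu_secs"; timeout "$wall_secs" "$@" )
}
```

With the defaults from the table, a call might look like `run_limited 256MB 30s 300 nu /tmp/monitoring-script.nu`. When the wall-clock limit is hit, `timeout` exits with status 124.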
### Command Restrictions

```toml
[taskserv]
nushell_allowed_commands = "ls,cat,grep,ps,df,free,uptime,systemctl,kubectl"
nushell_blocked_commands = "rm,mv,cp,chmod,chown,sudo,su"
nushell_allowed_paths = "/tmp,/var/log,/proc,/sys"
```

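To make the interaction between the two command lists concrete, here is a minimal sketch of one way they could be evaluated. It assumes that the blocklist takes precedence and that any command missing from the allowlist is denied; the README does not state the actual precedence. The function names `in_list` and `is_command_allowed` are hypothetical.

```bash
#!/usr/bin/env bash
# Values copied from the [taskserv] example above.
ALLOWED_COMMANDS="ls,cat,grep,ps,df,free,uptime,systemctl,kubectl"
BLOCKED_COMMANDS="rm,mv,cp,chmod,chown,sudo,su"

# Return 0 if $1 appears as an exact entry in the comma-separated list $2.
in_list() {
  case ",$2," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Assumed policy: the blocklist always wins, then the allowlist must match.
is_command_allowed() {
  local cmd="$1"
  in_list "$cmd" "$BLOCKED_COMMANDS" && return 1
  in_list "$cmd" "$ALLOWED_COMMANDS"
}

for cmd in ls sudo vim; do
  if is_command_allowed "$cmd"; then
    echo "$cmd: allowed"
  else
    echo "$cmd: denied"
  fi
done
```

Wrapping both the list and the candidate in commas forces exact matches, so `su` on the blocklist does not accidentally match `sudo` or `sum`.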
## 🚀 Usage Examples

### Basic System Information

```nu
# Check system health
use /home/admin/nushell/observability/collect.nu *
status-check

# Collect system metrics
collect-system-metrics | to json

# Monitor for 5 minutes with 30s intervals
health-monitor --duration 300 --interval 30
```

### Log Analysis

```nu
# Parse and analyze logs
use /home/admin/nushell/observability/process.nu *

# Parse system logs
cat /var/log/syslog | lines | parse-logs --format syslog

# Detect anomalies in recent logs
collect-logs --since "1h" --level "error" | detect-anomalies --threshold 2.0

# Generate comprehensive log summary
collect-logs --since "24h" | generate-summary --include-patterns --include-anomalies
```

### Remote Execution

```nu
# Execute script with security validation
use /home/admin/nushell/lib/remote-exec.nu *

# Validate script before execution
nu-validate-script "/tmp/monitoring-script.nu"

# Execute with audit logging
nu-remote-exec "/tmp/monitoring-script.nu" --readonly --audit --timeout 120

# Stream live data with filtering
nu-remote-stream "ps aux" --filter "where cpu > 10" --format json --lines 20
```

### Telemetry Integration

```nu
# Initialize telemetry
use /home/admin/nushell/observability/telemetry.nu *

# Configure telemetry endpoint
init-telemetry --endpoint "https://monitoring.example.com/api/metrics" --enable-health

# Start health monitoring with alerts
health-monitoring --check-interval 60 --alert-endpoint "https://alerts.example.com/webhook"

# Send custom metrics
let custom_metrics = {app_name: "web-server", response_time: 250, status: "healthy"}
send-telemetry $custom_metrics --format "prometheus"
```

## 🔧 Integration Patterns

### SSH Remote Execution

Execute Nushell scripts remotely via SSH:

```bash
# Execute health check on remote server
ssh server-01 'nu -c "use /home/admin/nushell/observability/collect.nu *; status-check | to json"'

# Stream metrics collection
ssh server-01 'nu -c "use /home/admin/nushell/observability/collect.nu *; health-monitor --duration 60 --interval 10"'
```

### Container Integration

Use in containerized environments:

```dockerfile
# Add to your infrastructure containers
RUN curl -L https://github.com/nushell/nushell/releases/download/0.107.1/nu-0.107.1-x86_64-unknown-linux-gnu.tar.gz | tar xz
COPY nushell-config/ /home/app/.config/nushell/
ENV NUSHELL_EXECUTION_MODE=restricted
ENV NUSHELL_READONLY_MODE=true
```

### Kubernetes Jobs

Deploy as Kubernetes jobs for infrastructure tasks:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nushell-health-check
spec:
  template:
    spec:
      restartPolicy: Never  # Jobs only accept Never or OnFailure
      containers:
        - name: nushell
          image: infrastructure/nushell-runtime:latest
          command: ["nu"]
          args: ["/scripts/health-check.nu"]
          env:
            - name: NUSHELL_EXECUTION_MODE
              value: "restricted"
            - name: NUSHELL_AUDIT_ENABLED
              value: "true"
```

## 🛡️ Security Features

### Command Validation

- **Pre-execution validation**: All commands are validated before execution
- **Pattern matching**: Dangerous command patterns are blocked
- **Path validation**: File access is restricted to safe directories
- **Resource monitoring**: CPU, memory, and network usage are tracked

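As an illustration of the path validation idea, the sketch below checks a path against the `nushell_allowed_paths` value shown elsewhere in this README. It is an assumption about how such a check might work, not the task service's implementation: the function name `is_path_allowed` is hypothetical, and symlinks are not resolved.

```bash
#!/usr/bin/env bash
# Value copied from the nushell_allowed_paths example in this README.
ALLOWED_PATHS="/tmp,/var/log,/proc,/sys"

# Hypothetical check: accept only absolute paths without ".." segments
# that are equal to, or nested under, one of the allowed directories.
is_path_allowed() {
  local path="$1" dir
  local -a dirs
  case "$path" in
    /*) ;;                 # must be absolute
    *)  return 1 ;;
  esac
  case "/$path/" in
    */../*) return 1 ;;    # refuse parent-directory traversal
  esac
  IFS=',' read -r -a dirs <<< "$ALLOWED_PATHS"
  for dir in "${dirs[@]}"; do
    case "$path" in
      "$dir"|"$dir"/*) return 0 ;;
    esac
  done
  return 1
}
```

Matching on `"$dir"/*` rather than a bare prefix keeps `/tmpfoo/x` from passing as if it were under `/tmp`. A stricter variant would canonicalize the path first so that symlinks pointing outside the allowed directories are also rejected.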
### Audit Trail

Complete audit logging includes:
- Command execution timestamps
- User context and session information
- Input/output data (configurable)
- Error conditions and failures
- Resource usage metrics

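The audit log format itself is not documented here. Purely as a sketch, a record covering the fields listed above could be written as one JSON object per line. The field names, the `audit_log` and `json_escape` helpers, and the default log location are all assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical audit writer; field names and log location are assumptions.
AUDIT_LOG="${AUDIT_LOG:-/tmp/nushell-audit.log}"

# Escape backslashes and double quotes so a value fits inside a JSON string.
# Control characters such as newlines are not handled in this sketch.
json_escape() {
  local s="$1"
  s="${s//\\/\\\\}"
  s="${s//\"/\\\"}"
  printf '%s' "$s"
}

# Usage: audit_log <session-id> <command> <exit-status>
# Appends one JSON line with a UTC timestamp and the current user.
audit_log() {
  local session="$1" command="$2" status="$3"
  printf '{"ts":"%s","user":"%s","session":"%s","command":"%s","exit":%d}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    "$(json_escape "${USER:-unknown}")" \
    "$(json_escape "$session")" \
    "$(json_escape "$command")" \
    "$status" >> "$AUDIT_LOG"
}
```

For example, `audit_log "sess-42" 'ls "/var/log"' 0` appends a single line. One object per line keeps the file append-only and easy to read back with Nushell's `lines` and `from json` commands.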
### Session Management

- **Automatic timeouts**: Sessions expire after inactivity
- **Resource limits**: Memory and CPU usage constraints
- **Network isolation**: Optional network access controls
- **Privilege separation**: Non-privileged execution by default

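The inactivity timeout can be pictured as a comparison between the time of the last recorded activity and the current time. The sketch below assumes Unix timestamps and uses the documented default of 900 seconds for `nushell_session_timeout`; the function name `session_expired` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical idle check. Arguments: last-activity and current time as
# Unix timestamps, plus an optional timeout in seconds (default 900).
# Returns 0 (expired) once the idle time reaches the timeout.
session_expired() {
  local last_activity="$1" now="$2" timeout="${3:-900}"
  [ $(( now - last_activity )) -ge "$timeout" ]
}

# Example: a session last active 10 minutes ago, checked against the
# 300-second timeout used in the high-security profile.
now="$(date +%s)"
if session_expired "$(( now - 600 ))" "$now" 300; then
  echo "session expired"
fi
```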
## 📊 Monitoring and Observability

### Health Checks

```nu
# System health overview
status-check

# Detailed health monitoring
health-monitor --interval 30 --duration 600 --output "/tmp/health-data.json"

# Remote health validation
nu-health-check
```

### Metrics Collection

Automated collection of:
- **System metrics**: CPU, memory, disk, network
- **Process metrics**: Running processes, resource usage
- **Container metrics**: Docker/Podman container status
- **Kubernetes metrics**: Pod status, resource utilization

### Log Processing

Advanced log analysis capabilities:
- **Multi-format parsing**: JSON, syslog, Apache, Nginx
- **Pattern extraction**: Error patterns, IP addresses, URLs
- **Anomaly detection**: Statistical analysis of log patterns
- **Time-series aggregation**: Bucketed metrics over time

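The README does not say which statistic `detect-anomalies` uses. One common reading of a `--threshold 2.0` option is a z-score cutoff: a time bucket is flagged when its count lies more than two standard deviations from the mean. The following sketch shows that technique with `awk`; the function name and output format are hypothetical, and it is not the code in `process.nu`.

```bash
#!/usr/bin/env bash
# Hypothetical z-score filter: reads one count per line (for example,
# errors per time bucket) and prints the buckets whose count lies more
# than THRESHOLD population standard deviations from the mean.
detect_anomalies() {
  local threshold="${1:-2.0}"
  awk -v t="$threshold" '
    { v[NR] = $1; sum += $1 }
    END {
      if (NR == 0) exit
      mean = sum / NR
      for (i = 1; i <= NR; i++) ss += (v[i] - mean) ^ 2
      sd = sqrt(ss / NR)
      if (sd == 0) exit                 # all buckets identical
      for (i = 1; i <= NR; i++) {
        z = (v[i] - mean) / sd
        if (z > t || z < -t) printf "bucket=%d count=%s z=%.2f\n", i, v[i], z
      }
    }'
}

# Nine quiet buckets and one spike: mean 19, standard deviation 27.
printf '%s\n' 10 10 10 10 10 10 10 10 10 100 | detect_anomalies 2.0
# prints: bucket=10 count=100 z=3.00
```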
## 🔄 Deployment Strategies

### Development/Testing

```toml
[taskserv]
install_nushell = true
nushell_readonly = false
nushell_plugins = true
nushell_execution_mode = "normal"
nushell_network = true
```

### Production

```toml
[taskserv]
install_nushell = true
nushell_readonly = true
nushell_plugins = false
nushell_execution_mode = "restricted"
nushell_network = false
nushell_audit = true
nushell_session_timeout = 600
```

### High-Security Environments

```toml
[taskserv]
install_nushell = true
nushell_readonly = true
nushell_plugins = false
nushell_execution_mode = "restricted"
nushell_network = false
nushell_audit = true
nushell_session_timeout = 300
nushell_allowed_commands = "ls,cat,ps,df,free"
nushell_blocked_commands = "rm,mv,cp,chmod,chown,sudo,su,wget,curl"
```

## 🚨 Troubleshooting

### Common Issues

**1. Permission Denied Errors**
```nu
# Check file permissions
ls -la /home/admin/nushell/
# Verify ownership
ls -la ~/.config/nushell/
```

**2. Command Blocked in Read-Only Mode**
```nu
# Check execution mode
echo $env.NUSHELL_EXECUTION_MODE
# Review blocked commands
echo $env.NUSHELL_BLOCKED_COMMANDS
```

**3. Session Timeout Issues**
```nu
# Check session timeout setting
echo $env.NUSHELL_SESSION_TIMEOUT
# Review audit log for session activity
cat ~/nushell/audit.log | tail -20
```

**4. Plugin Loading Failures**
```nu
# Register plugins manually
nu ~/nushell/scripts/register-plugins.nu
# Check plugin availability
plugin list
```

### Debug Mode

Enable debug logging for troubleshooting:

```bash
# Set debug environment
export NUSHELL_LOG_LEVEL=debug
export NU_LOG_LEVEL=debug

# Run with verbose output
nu --log-level debug /path/to/script.nu
```

### Performance Issues

Monitor resource usage:

```nu
# Check memory usage
free -h

# Monitor CPU usage (Nushell subexpressions use parentheses, not $(...))
top -p (pgrep nu | lines | str join ",")

# Review resource limits
echo $env.NUSHELL_MAX_MEMORY
echo $env.NUSHELL_MAX_CPU_TIME
```

## 📈 Performance Considerations

### Resource Usage

- **Memory footprint**: ~50-100MB base usage
- **CPU overhead**: Minimal during idle, scales with workload
- **Disk usage**: ~20MB for binary + configs
- **Network impact**: Controlled by security settings

### Optimization Tips

1. **Use streaming operations** for large datasets
2. **Enable read-only mode** for security and performance
3. **Limit session timeouts** to prevent resource leaks
4. **Use structured data formats** (JSON, YAML) for efficiency
5. **Batch telemetry data** to reduce network overhead

## 🔗 Integration with Other Services

### KCL Configuration Language

Optional integration with KCL for configuration validation:

```nu
# Install KCL plugin (if enabled)
plugin add nu_plugin_kcl

# Validate KCL configurations
kcl run infrastructure.k | to json | from json
```

### Monitoring Systems

Compatible with popular monitoring solutions:
- **Prometheus**: Native metrics export format
- **InfluxDB**: Line protocol support
- **Grafana**: JSON data source integration
- **ELK Stack**: Structured log output

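For the Prometheus case, the text exposition format puts one sample per line as `name{label="value",...} value`. The sketch below shows how the custom metrics record from the telemetry example could be rendered in that shape. The `to_prometheus` function is hypothetical and is not the formatter used by `send-telemetry`; label values containing quotes or backslashes would need extra escaping.

```bash
#!/usr/bin/env bash
# Hypothetical formatter: prints one sample in the Prometheus text
# exposition format, i.e. name{label="value",...} value
to_prometheus() {
  local name="$1" value="$2" labels="" pair
  shift 2
  for pair in "$@"; do                # remaining args are key=value labels
    labels+="${labels:+,}${pair%%=*}=\"${pair#*=}\""
  done
  if [ -n "$labels" ]; then
    printf '%s{%s} %s\n' "$name" "$labels" "$value"
  else
    printf '%s %s\n' "$name" "$value"
  fi
}

to_prometheus response_time 250 app_name=web-server status=healthy
# prints: response_time{app_name="web-server",status="healthy"} 250
```

Here the numeric field becomes the sample value and the string fields become labels, which is the usual way to map a flat record onto a single Prometheus sample.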
### CI/CD Pipelines

Use in automation workflows:

```yaml
# GitHub Actions example
- name: Infrastructure Health Check
  run: |
    ssh ${{ secrets.SERVER_HOST }} 'nu -c "
      use /home/admin/nushell/observability/collect.nu *;
      status-check | to json
    "'
```

## 📚 Best Practices

### Security

1. **Always use read-only mode** in production
2. **Implement session timeouts** appropriate for your environment
3. **Enable audit logging** for compliance and debugging
4. **Restrict network access** unless specifically required
5. **Regularly review command allowlists** and access patterns

### Performance

1. **Use streaming operations** for processing large datasets
2. **Implement data pagination** for long-running queries
3. **Cache frequently accessed data** using structured formats
4. **Monitor resource usage** and adjust limits as needed

### Operational

1. **Document custom scripts** and their intended usage
2. **Test scripts in development** before production deployment
3. **Implement proper error handling** in automation scripts
4. **Use version control** for custom observability scripts
5. **Run regular security audits** of command patterns and access logs

## 🆕 Version History

- **v1.0.0** (2025-09-23): Initial release with core security features
  - Secure configuration templates
  - Remote execution library
  - Observability and telemetry integration
  - OS base layer integration
  - Comprehensive audit logging

## 🤝 Contributing

This task service is part of the cloud-native provisioning system. For improvements or issues:

1. Follow the project's architecture principles (PAP)
2. Ensure all changes maintain the security-first approach
3. Test configurations in development environments
4. Document the security implications of changes
5. Include appropriate test coverage for new features

## 📄 License

Part of the cloud-native provisioning system. See project license for details.