Jesús Pérez 3c3ef47f7f

feat(taskserv): implement real-time version checking with configurable HTTP client

- Add: GitHub API integration for live version checking in taskserv management
- Add: HTTP client configuration option (http.use_curl) in config.defaults.toml
- Add: Helper function fetch_latest_version with curl/http get support
- Fix: Settings path structure for prov_data_dirpath access pattern
- Remove: Legacy simulation code for version checking
- Update: Core configuration name from "provisioning-system" to "provisioning"
- Clean: Remove obsolete example configs and infrastructure files

2025-09-24 01:55:06 +01:00

13 KiB

Raw Blame History

Nushell Infrastructure Runtime Task Service

Security-First Shell Runtime for Cloud Native Infrastructure

A secure, auditable Nushell runtime designed specifically for infrastructure servers, providing powerful scripting capabilities while maintaining strict security controls and operational safety.

🎯 Purpose

The Nushell task service provides a modern, secure shell environment for:

Remote Script Execution: Safe execution of automation scripts on infrastructure servers
Observability: Structured data collection for metrics, logs, and telemetry
Infrastructure Management: Configuration-driven server management and orchestration
Security Auditing: Complete audit trail of all shell operations and command execution

🏗️ Architecture

Core Components

taskservs/nushell/
├── default/
│   ├── prepare                    # Environment preparation script
│   ├── install-nushell.sh        # Nushell binary installation
│   ├── env-nushell.j2            # Environment configuration template
│   ├── config.nu.j2              # Secure Nushell configuration
│   ├── env.nu.j2                 # Environment variables template
│   └── remote-exec.nu.j2         # Remote execution library
└── observability/
    ├── collect.nu                # System metrics collection
    ├── process.nu                # Log processing and analysis
    └── telemetry.nu              # Telemetry and monitoring

Security Model

Read-Only Mode: Default execution prevents destructive operations
Command Filtering: Allowlist/blocklist for command execution
Path Restrictions: Limited file system access to safe directories
Session Timeouts: Automatic session termination for security
Audit Logging: Complete command and operation logging
Resource Limits: CPU, memory, and network usage controls

📋 Installation Options

1. Standalone Installation

Install Nushell as a dedicated task service:

./core/nulib/provisioning taskserv create nushell --infra <infrastructure-name>

2. Integrated with OS Base Layer

Enable during OS installation for automatic deployment:

# In your infrastructure configuration
[taskserv]
install_nushell = true
nushell_readonly = true
nushell_plugins = false
nushell_execution_mode = "restricted"

3. Conditional Installation by Server Role

Configure role-based installation:

# Control nodes: Always install
# Worker nodes: Optional based on needs
[server.control]
install_nushell = true

[server.worker]
install_nushell = false  # or true if needed

⚙️ Configuration

Security Settings

Setting	Default	Description
`nushell_readonly`	`true`	Enable read-only mode (blocks write operations)
`nushell_execution_mode`	`"restricted"`	Execution mode: `restricted`, `normal`, `privileged`
`nushell_plugins`	`false`	Enable Nushell plugins (kcl, tera, polars)
`nushell_network`	`false`	Allow network operations
`nushell_audit`	`true`	Enable audit logging
`nushell_session_timeout`	`900`	Session timeout in seconds (15 minutes)

Resource Limits

Setting	Default	Description
`nushell_max_memory`	`"256MB"`	Maximum memory usage
`nushell_max_cpu_time`	`"30s"`	Maximum CPU time per command
`nushell_remote_timeout`	`300`	Remote execution timeout (seconds)

Command Restrictions

[taskserv]
nushell_allowed_commands = "ls,cat,grep,ps,df,free,uptime,systemctl,kubectl"
nushell_blocked_commands = "rm,mv,cp,chmod,chown,sudo,su"
nushell_allowed_paths = "/tmp,/var/log,/proc,/sys"

🚀 Usage Examples

Basic System Information

# Check system health
use /home/admin/nushell/observability/collect.nu *
status-check

# Collect system metrics
collect-system-metrics | to json

# Monitor for 5 minutes with 30s intervals
health-monitor --duration 300 --interval 30

Log Analysis

# Parse and analyze logs
use /home/admin/nushell/observability/process.nu *

# Parse system logs
cat /var/log/syslog | lines | parse-logs --format syslog

# Detect anomalies in recent logs
collect-logs --since "1h" --level "error" | detect-anomalies --threshold 2.0

# Generate comprehensive log summary
collect-logs --since "24h" | generate-summary --include-patterns --include-anomalies

Remote Execution

# Execute script with security validation
use /home/admin/nushell/lib/remote-exec.nu *

# Validate script before execution
nu-validate-script "/tmp/monitoring-script.nu"

# Execute with audit logging
nu-remote-exec "/tmp/monitoring-script.nu" --readonly --audit --timeout 120

# Stream live data with filtering
nu-remote-stream "ps aux" --filter "where cpu > 10" --format json --lines 20

Telemetry Integration

# Initialize telemetry
use /home/admin/nushell/observability/telemetry.nu *

# Configure telemetry endpoint
init-telemetry --endpoint "https://monitoring.example.com/api/metrics" --enable-health

# Start health monitoring with alerts
health-monitoring --check-interval 60 --alert-endpoint "https://alerts.example.com/webhook"

# Send custom metrics
let custom_metrics = {app_name: "web-server", response_time: 250, status: "healthy"}
send-telemetry $custom_metrics --format "prometheus"

🔧 Integration Patterns

SSH Remote Execution

Execute Nushell scripts remotely via SSH:

# Execute health check on remote server
ssh server-01 'nu /home/admin/nushell/observability/collect.nu -c "status-check | to json"'

# Stream metrics collection
ssh server-01 'nu -c "use /home/admin/nushell/observability/collect.nu *; health-monitor --duration 60 --interval 10"'

Container Integration

Use in containerized environments:

# Add to your infrastructure containers
RUN curl -L https://github.com/nushell/nushell/releases/download/0.107.1/nu-0.107.1-x86_64-unknown-linux-gnu.tar.gz | tar xz
COPY nushell-config/ /home/app/.config/nushell/
ENV NUSHELL_EXECUTION_MODE=restricted
ENV NUSHELL_READONLY_MODE=true

Kubernetes Jobs

Deploy as Kubernetes jobs for infrastructure tasks:

apiVersion: batch/v1
kind: Job
metadata:
  name: nushell-health-check
spec:
  template:
    spec:
      containers:
      - name: nushell
        image: infrastructure/nushell-runtime:latest
        command: ["nu"]
        args: ["/scripts/health-check.nu"]
        env:
        - name: NUSHELL_EXECUTION_MODE
          value: "restricted"
        - name: NUSHELL_AUDIT_ENABLED
          value: "true"

🛡️ Security Features

Command Validation

Pre-execution validation: All commands validated before execution
Pattern matching: Blocked dangerous command patterns
Path validation: File access restricted to safe directories
Resource monitoring: CPU, memory, and network usage tracked

Audit Trail

Complete audit logging includes:

Command execution timestamps
User context and session information
Input/output data (configurable)
Error conditions and failures
Resource usage metrics

Session Management

Automatic timeouts: Sessions expire after inactivity
Resource limits: Memory and CPU usage constraints
Network isolation: Optional network access controls
Privilege separation: Non-privileged execution by default

📊 Monitoring and Observability

Health Checks

# System health overview
status-check

# Detailed health monitoring
health-monitor --interval 30 --duration 600 --output "/tmp/health-data.json"

# Remote health validation
nu-health-check

Metrics Collection

Automated collection of:

System metrics: CPU, memory, disk, network
Process metrics: Running processes, resource usage
Container metrics: Docker/Podman container status
Kubernetes metrics: Pod status, resource utilization

Log Processing

Advanced log analysis capabilities:

Multi-format parsing: JSON, syslog, Apache, Nginx
Pattern extraction: Error patterns, IP addresses, URLs
Anomaly detection: Statistical analysis of log patterns
Time-series aggregation: Bucketed metrics over time

🔄 Deployment Strategies

Development/Testing

[taskserv]
install_nushell = true
nushell_readonly = false
nushell_plugins = true
nushell_execution_mode = "normal"
nushell_network = true

Production

[taskserv]
install_nushell = true
nushell_readonly = true
nushell_plugins = false
nushell_execution_mode = "restricted"
nushell_network = false
nushell_audit = true
nushell_session_timeout = 600

High-Security Environments

[taskserv]
install_nushell = true
nushell_readonly = true
nushell_plugins = false
nushell_execution_mode = "restricted"
nushell_network = false
nushell_audit = true
nushell_session_timeout = 300
nushell_allowed_commands = "ls,cat,ps,df,free"
nushell_blocked_commands = "rm,mv,cp,chmod,chown,sudo,su,wget,curl"

🚨 Troubleshooting

Common Issues

1. Permission Denied Errors

# Check file permissions
ls -la /home/admin/nushell/
# Verify ownership
ls -la ~/.config/nushell/

2. Command Blocked in Read-Only Mode

# Check execution mode
echo $env.NUSHELL_EXECUTION_MODE
# Review blocked commands
echo $env.NUSHELL_BLOCKED_COMMANDS

3. Session Timeout Issues

# Check session timeout setting
echo $env.NUSHELL_SESSION_TIMEOUT
# Review audit log for session activity
cat ~/nushell/audit.log | tail -20

4. Plugin Loading Failures

# Register plugins manually
nu ~/nushell/scripts/register-plugins.nu
# Check plugin availability
plugin list

Debug Mode

Enable debug logging for troubleshooting:

# Set debug environment
export NUSHELL_LOG_LEVEL=debug
export NU_LOG_LEVEL=debug

# Run with verbose output
nu --log-level debug /path/to/script.nu

Performance Issues

Monitor resource usage:

# Check memory usage
free -h

# Monitor CPU usage
top -p $(pgrep nu)

# Review resource limits
echo $env.NUSHELL_MAX_MEMORY
echo $env.NUSHELL_MAX_CPU_TIME

📈 Performance Considerations

Resource Usage

Memory footprint: ~50-100MB base usage
CPU overhead: Minimal during idle, scales with workload
Disk usage: ~20MB for binary + configs
Network impact: Controlled by security settings

Optimization Tips

Use streaming operations for large datasets
Enable read-only mode for security and performance
Limit session timeouts to prevent resource leaks
Use structured data formats (JSON, YAML) for efficiency
Batch telemetry data to reduce network overhead

🔗 Integration with Other Services

KCL Configuration Language

Optional integration with KCL for configuration validation:

# Install KCL plugin (if enabled)
plugin add nu_plugin_kcl

# Validate KCL configurations
kcl run infrastructure.k | to json | from json

Monitoring Systems

Compatible with popular monitoring solutions:

Prometheus: Native metrics export format
InfluxDB: Line protocol support
Grafana: JSON data source integration
ELK Stack: Structured log output

CI/CD Pipelines

Use in automation workflows:

# GitHub Actions example
- name: Infrastructure Health Check
  run: |
    ssh ${{ secrets.SERVER_HOST }} 'nu -c "
      use /home/admin/nushell/observability/collect.nu *;
      status-check | to json
    "'

📚 Best Practices

Security

Always use read-only mode in production
Implement session timeouts appropriate for your environment
Enable audit logging for compliance and debugging
Restrict network access unless specifically required
Regularly review command allowlists and access patterns

Performance

Use streaming operations for processing large datasets
Implement data pagination for long-running queries
Cache frequently accessed data using structured formats
Monitor resource usage and adjust limits as needed

Operational

Document custom scripts and their intended usage
Test scripts in development before production deployment
Implement proper error handling in automation scripts
Use version control for custom observability scripts
Regular security audits of command patterns and access logs

🆕 Version History

v1.0.0 (2025-09-23): Initial release with core security features
- Secure configuration templates
- Remote execution library
- Observability and telemetry integration
- OS base layer integration
- Comprehensive audit logging

🤝 Contributing

This task service is part of the cloud-native provisioning system. For improvements or issues:

Follow the project's architecture principles (PAP)
Ensure all changes maintain security-first approach
Test configurations in development environments
Document security implications of changes
Include appropriate test coverage for new features

📄 License

Part of the cloud-native provisioning system. See project license for details.

13 KiB Raw Blame History