infrastructure-monitoring

Set up comprehensive infrastructure monitoring with Prometheus, Grafana, and alerting systems for metrics, health checks, and performance tracking.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install the "infrastructure-monitoring" skill with this command:

npx skills add aj-geddes/useful-ai-prompts/aj-geddes-useful-ai-prompts-infrastructure-monitoring

Infrastructure Monitoring

Overview

Implement comprehensive infrastructure monitoring to track system health, performance metrics, and resource utilization with alerting and visualization across your entire stack.

When to Use

  • Real-time performance monitoring
  • Capacity planning and trends
  • Incident detection and alerting
  • Service health tracking
  • Resource utilization analysis
  • Performance troubleshooting
  • Compliance and audit trails
  • Historical data analysis

Quick Start

Minimal working example:

# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: "infrastructure-monitor"
    environment: "production"

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093

# Rule files
rule_files:
  - "alerts.yml"
  - "rules.yml"

scrape_configs:
  # Prometheus itself
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
# ... (see reference guides for full implementation)
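
The configuration above lists alerts.yml under rule_files but does not show its contents. A minimal sketch of what such a file might contain follows, assuming a single availability alert built on Prometheus's built-in `up` metric. The group name, alert name, 5-minute window, and severity label are illustrative choices, not taken from the upstream skill.

```yaml
# alerts.yml (hypothetical contents; the upstream file is not shown here)
groups:
  - name: availability            # illustrative group name
    rules:
      # Fire when a scrape target has been unreachable for 5 minutes.
      # `up` is set by Prometheus itself: 1 if the last scrape succeeded, 0 otherwise.
      - alert: InstanceDown       # illustrative alert name
        expr: up == 0
        for: 5m                   # assumed threshold; tune to your scrape_interval
        labels:
          severity: critical      # assumed label; must match your Alertmanager routing
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "Target {{ $labels.instance }} of job {{ $labels.job }} has failed scrapes for more than 5 minutes."
```

With the 15s scrape_interval and evaluation_interval from the Quick Start, a 5-minute `for` window spans roughly 20 evaluations, which helps avoid paging on a single missed scrape.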

Reference Guides

Detailed implementations in the references/ directory:

Guide                        Contents
Prometheus Configuration     Prometheus Configuration
Alert Rules                  Alert Rules
Alertmanager Configuration   Alertmanager Configuration
Grafana Dashboard            Grafana Dashboard
Monitoring Deployment        Monitoring Deployment
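
The Quick Start points Prometheus at an Alertmanager on localhost:9093, so a matching Alertmanager configuration is needed before alerts are delivered anywhere. The following is a minimal sketch only, not the upstream reference file: the receiver name, grouping labels, timing values, and webhook URL are all assumptions chosen for illustration.

```yaml
# alertmanager.yml (hypothetical sketch; see the Alertmanager Configuration guide for the real one)
route:
  receiver: "default-webhook"          # assumed receiver name, defined below
  group_by: ["alertname", "job"]       # batch alerts that share these labels
  group_wait: 30s                      # wait before sending the first notification for a group
  group_interval: 5m                   # wait before notifying about new alerts in an existing group
  repeat_interval: 4h                  # re-send interval for alerts that are still firing

receivers:
  - name: "default-webhook"
    webhook_configs:
      # Placeholder endpoint; replace with your own notification service.
      - url: "http://localhost:5001/alerts"
```

A single catch-all route keeps the sketch short. Real deployments typically add child routes that match on labels such as severity to send critical alerts to a different receiver.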

Best Practices

✅ DO

  • Follow established patterns and conventions
  • Write clean, maintainable code
  • Add appropriate documentation
  • Test thoroughly before deploying

❌ DON'T

  • Skip testing or validation
  • Ignore error handling
  • Hard-code configuration values

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
