Incidents

Track and manage incidents in your system.

What is an Incident?

An incident is any event that disrupts normal service operation. In Uptinio, incidents can be triggered by:

Monitors: When a monitored endpoint fails health checks
Servers: When server metrics exceed thresholds or the server becomes unreachable

Monitor incidents cover the following failure types:

HTTP/HTTPS Failures: When a website or API endpoint is unreachable or returns error responses
SSL Certificate Issues: When SSL certificates are invalid or about to expire
Domain Expiry: When monitored domains approach their expiration dates
Keyword Monitoring: When expected content is missing from responses
TCP Port Issues: When monitored ports become unreachable
Ping Failures: When basic connectivity checks fail
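
To make the HTTP/HTTPS and keyword cases above concrete, here is a minimal sketch of how a single check result could be classified. It is an illustration only: `CheckResult`, `classify_http_check`, and the rule that any status code of 400 or above counts as an error response are assumptions, not Uptinio's actual schema or logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    """Outcome of one HTTP check (hypothetical shape, not Uptinio's schema)."""
    status_code: Optional[int]  # None means the endpoint could not be reached
    body: str = ""


def classify_http_check(result: CheckResult, keyword: Optional[str] = None) -> Optional[str]:
    """Return a failure reason, or None if the check passed."""
    if result.status_code is None:
        return "unreachable"
    # Assumption: 4xx and 5xx responses are treated as error responses.
    if result.status_code >= 400:
        return f"error response ({result.status_code})"
    # Keyword monitoring: the expected content must appear in the body.
    if keyword is not None and keyword not in result.body:
        return "keyword missing"
    return None
```

A monitor would open an incident whenever this function returns a reason rather than `None`.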

Server incidents fall into two categories:

Server Downtime: When the server becomes unreachable or stops sending metrics
Resource Thresholds:
- CPU usage exceeds configured threshold
- Memory (RAM) usage exceeds configured threshold
- Disk space usage exceeds configured threshold
- Network traffic (in/out) exceeds configured thresholds
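
The resource threshold checks listed above amount to comparing each reported metric against its configured limit. The sketch below shows that comparison under stated assumptions: the function name, the metric keys (`cpu`, `memory`, `disk`), and the use of percentages are all hypothetical and chosen for illustration.

```python
from typing import Dict, List


def exceeded_thresholds(metrics: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the names of metrics whose value is above its configured threshold.

    Metrics without a configured threshold, and thresholds without a
    reported metric, are ignored.
    """
    return [
        name
        for name, limit in thresholds.items()
        if name in metrics and metrics[name] > limit
    ]


# Hypothetical values, in percent.
reported = {"cpu": 93.0, "memory": 41.5, "disk": 88.0}
configured = {"cpu": 90.0, "memory": 85.0, "disk": 80.0}
print(exceeded_thresholds(reported, configured))  # ['cpu', 'disk']
```

Each name in the returned list would correspond to one threshold incident for that server.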

An incident typically moves through the following stages:

1. Detection: The system automatically detects issues based on configured thresholds
2. Creation: An incident is created with an initial status and details
3. Notification: Configured integrations are notified (email, Slack, webhooks)
4. Investigation: The team investigates and updates the incident status
5. Resolution: The issue is fixed and the incident is marked as resolved
6. Post-mortem: The team reviews the incident details to identify improvements and prevent similar issues in the future
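
The stages above, from Detection through Post-mortem, can be modeled as a simple linear state machine. This is a sketch under the assumption that stages always advance one step at a time in the listed order; the `Stage` enum and `next_stage` helper are hypothetical names and do not reflect Uptinio's internal status values.

```python
from enum import Enum


class Stage(Enum):
    """Incident stages in the order listed above (hypothetical names)."""
    DETECTION = 1
    CREATION = 2
    NOTIFICATION = 3
    INVESTIGATION = 4
    RESOLUTION = 5
    POST_MORTEM = 6


def next_stage(stage: Stage) -> Stage:
    """Return the stage that follows `stage`; the post-mortem has no successor."""
    if stage is Stage.POST_MORTEM:
        raise ValueError("post-mortem is the final stage")
    # Enum values are consecutive integers, so the successor is value + 1.
    return Stage(stage.value + 1)
```

Keeping the order in one place makes it easy to reject out-of-order updates, such as marking an incident resolved before it has been investigated.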