Dynatrace - What Is Observability
Dynatrace’s treatment of observability goes beyond the three pillars, addressing unknown unknowns, AIOps integration, and user experience as first-class observability dimensions.
Summary
The article defines observability as the ability to measure the internal state of a system from its external outputs — a framing derived from control theory. The key distinction from monitoring is that monitoring answers “is something wrong?” (known knowns: predefined metrics and thresholds), while observability answers “why is something wrong?” (unknown unknowns: ad hoc exploration of data without predefined queries). This distinction matters because modern distributed systems fail in ways that weren’t anticipated when dashboards were built.
While logs, metrics, and traces are necessary pillars of observability, Dynatrace argues they are not sufficient. AIOps integration adds automated anomaly detection that can surface unknown-unknown failures without requiring engineers to manually explore all possible failure dimensions. SLO-based observability shifts alerting from technical metrics (CPU usage, error rates) to user-facing service level objectives — aligning operations with business outcomes. User experience is treated as a first-class observability dimension: if the system is technically healthy but users are experiencing slowness, observability has failed.
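The shift from technical thresholds to SLOs can be made concrete with an error-budget calculation. The sketch below is illustrative, not Dynatrace's implementation; the function name and parameters are hypothetical. The idea: alert when the user-facing objective is at risk, not whenever a raw metric spikes.

```python
# Illustrative sketch (not Dynatrace's API): SLO-based alerting compares
# consumed error budget against the budget implied by an SLO target,
# instead of alerting on a raw technical threshold like CPU or error rate.

def error_budget_remaining(slo_target: float, total_requests: int,
                           failed_requests: int) -> float:
    """Fraction of the error budget still available for the period.

    slo_target: e.g. 0.999 means at most 0.1% of requests may fail.
    """
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures == 0:
        return 0.0
    return 1.0 - failed_requests / allowed_failures

# 400 failures out of 1,000,000 requests: a raw error-rate alert might fire
# on the spike, but the 99.9% SLO permits 1,000 failures, so 60% of the
# budget remains and users' objective is not yet at risk.
remaining = error_budget_remaining(slo_target=0.999,
                                   total_requests=1_000_000,
                                   failed_requests=400)
print(f"{remaining:.0%} of the error budget remains")  # → 60%
```

The design choice this illustrates: the alert condition is expressed in terms of the user-facing promise (the SLO), so operations and business outcomes stay aligned by construction.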
The “single source of truth” goal is the aspirational state: all telemetry unified into a coherent model that allows any question about system behavior to be answered without context-switching between tools.
Key Arguments
- Monitoring answers “is something wrong?”; observability answers “why is something wrong?” — the difference is unknown unknowns
- The three pillars (logs, metrics, traces) are necessary but not sufficient for true observability
- AIOps automates anomaly detection, enabling discovery of unknown-unknown failures without predefined queries
- SLO-based observability aligns operations with user-facing outcomes rather than technical metrics
- User experience must be included as an observability dimension — technical health ≠ user experience health
- Observability enables proactive operations (catch problems before users report them) vs. reactive operations (respond to incidents)
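The AIOps argument above can be sketched with a toy baseline detector. Real AIOps engines use far richer models; this is only a minimal stand-in showing how a deviation can surface without anyone writing a query for that failure mode in advance. All names and thresholds here are assumptions for illustration.

```python
# Toy stand-in for an AIOps baseline: flag a data point that deviates
# strongly from recent history, without a predefined query or threshold
# specific to the failure mode.
import statistics

def is_anomalous(history: list[float], latest: float,
                 threshold: float = 3.0) -> bool:
    """Flag `latest` if it lies more than `threshold` standard deviations
    from the mean of recent history."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        return latest != mean
    return abs(latest - mean) / stdev > threshold

latencies_ms = [120, 118, 125, 121, 119, 123, 122, 120]
print(is_anomalous(latencies_ms, 124))  # within normal variation → False
print(is_anomalous(latencies_ms, 400))  # sudden spike → True
```

The point of the sketch: nobody had to anticipate "latency will spike to 400 ms" when building dashboards; the anomaly is discovered relative to a learned baseline, which is what lets unknown-unknown failures surface automatically.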
Concepts Covered
- Observability — comprehensive treatment; monitoring vs. observability distinction is a key contribution
- Distributed Tracing — traces as the third pillar; context propagation across services
- Quality Attributes — reliability and performance as quality attributes that observability supports
- Self-Healing Systems — AIOps connection; automated detection as a step toward self-healing
Quality Notes
Vendor perspective (Dynatrace) but reasonably balanced. Best used for the monitoring vs. observability conceptual distinction and the argument that three pillars are necessary but not sufficient. Cross-reference with Honeycomb’s treatment for the high-cardinality angle.