The complexity and dynamism of distributed systems reflect both the teams that develop them and the products they support. Our observability tools need to be dynamic enough to keep up with our changing infrastructure and flexible enough to take in observability data of all types, yet they must understand that data well enough to show us only what we care about.
Knowledge graphs have become an increasingly popular method for storing data because they encode the relationships between entities explicitly yet flexibly. Using both domain knowledge and user interaction data, we can train models to encode the vast amounts of data produced by observability tools into a knowledge graph, discover how the pieces are interlinked, and imbue them with meaning.
This talk will go into the details of how to construct these knowledge graphs, dubbed Observability Graphs, to reflect the innate structure (but also the uncertainty) in today’s dynamic infrastructure. Then we will show how the knowledge graphs can be used to power automated alerting, alert clustering, and automated root-cause analysis.
The complexity of the systems we are responsible for has grown significantly over the last decade. High-performing teams use monitoring and observability to increase the reliability of these systems. Come along on one practitioner's journey and let his observations guide the way you think about, and change, observability practices in your company.
This talk starts with my early experience on a team struggling to monitor our newfangled cloud. That experience inspired me to join the industry designing and developing the next generation of monitoring systems. I’ll discuss how every next-gen system in turn inspires the next monitoring, and observability, paradigm. Let my observations on observability help you think about, and change, observability practices in your company.