1. Nobody "likes" getting alerts. Best case, an alert tells you something went (or is about to go) wrong. But more often, alerts are meaningless, trivial, or just plain wrong - a source of constant interruptions, false alarms, unplanned work, and noise.

    While some say this is the inherent nature of alerts (and monitoring in general), the truth is that well-crafted alerts based on insightful monitoring are a gift - saving hours of investigation and thousands of dollars.

    Whether your organization views alerts as a curse or a blessing depends on the design and implementation of those alerts, more so than any specific monitoring tool or technique. And, like most things in technology, good design can be taught and learned.

    In this talk, I'll give a brief tour of the alerting hall of horrors, and then provide real-world, vendor-agnostic techniques to make alerts meaningful, effective, valuable, and actionable (as a bonus, I'll show how to make them manageable, too!). By breaking a few bad habits; understanding how and why vendors put their tools together in particular ways; and learning a few new concepts, you'll have people emailing you to say "thank goodness I got that alert!".

    Now there's something you probably don't hear every day.

    # vimeo.com/843992066 Uploaded 129 Views 0 Comments
  2. How do we make use of the increasing volume of observability data that we collect? Observability was inspired by control theory, but current observability solutions are missing a key element of that theory: the system model. We can’t understand the state of the system, or ‘answer unknown unknowns’, if we don’t know how the system works. We’re drowning in data but starved of answers! Discover how graphs (think social network graph, not line graph) and OpenTelemetry semantic conventions can help us connect the dots.

    # vimeo.com/843992979 Uploaded 58 Views 0 Comments
  3. It's important to have consistent data across an organisation, but to ensure data consistency, we have to ensure the people responsible for producing that data have a common understanding. In implementing an observability framework for Ikea, we had to overcome not only technical hurdles but also issues with taxonomy, semantics and language barriers to ensure a common understanding among teams. I want to share my experiences as a senior engineer in an observability pipeline team and how we slowed down to speed up our company's observability journey.

    # vimeo.com/843999151 Uploaded 86 Views 0 Comments
  4. Shopify served 75.98 million requests per minute during Black Friday/Cyber Monday 2022, and our OpenResty deployments handled each of these requests before they hit an application server (OpenResty is a technology that lets you embed arbitrary Lua scripts into NGINX configuration files). Until recently, our routing stack was completely untraced, which left a huge blind spot in our view of our infrastructure.

    In 2022, we finally implemented tracing in our OpenResty deployments, and it wasn’t easy. In this talk, I’ll describe how we got a working tracing implementation. Along the way, I’ll explain the dangers of custom trace propagation formats, the joys of working in a well-specified open source project, the wonders (and challenges) of the OpenResty runtime, and the mental challenges that accompany the modification of NGINX, that famously performant HTTP server and reverse proxy.

    # vimeo.com/843997148 Uploaded 25 Views 0 Comments
  5. Advances in monitoring and observability have given so many of us the confidence that we know what's happening in every corner of our systems, but for many teams, one system facet remains stubbornly un-observable: cost. This talk will tell the story of how a surprise giant AWS bill sent our growing startup engineering team on a mission to be able to observe our cloud spend with the same clarity and immediacy as performance, reliability, or any other important system characteristic. I’ll walk you through the techniques we’ve attempted to observe cost, sharing the pros and cons of each, and I’ll also talk about how we’ve used this data as a basis for re-shaping team practices, building out an internal training and support program that has helped our whole engineering organization get fluent in balancing cost against our other operational and business concerns.

    # vimeo.com/843992516 Uploaded 71 Views 0 Comments

Monitorama PDX 2023 - Portland, OR

Monitorama
