Revolutionizing Performance Management

Dan Kuebrich

Top Stories by Dan Kuebrich

Our fundamental unit of performance data is the trace, an incredibly rich view into the performance of an individual request moving through your web application. Given all this data and the diversity of the contents of any individual trace, it’s important to have an interface for understanding what exactly was going on when a request was served. How did it get handled? What parts were slow, and what parts were anomalous? Over the past year, the TraceView team has been listening to your thoughts on this topic as well as hatching some of our own. Today we get to share the fruit of our labors: Trace Details, redesigned. RUM, meet trace details. Trace details and RUM are old friends, so it’s no surprise they’re here together now. But there are a few details that might be surprising to you: Using full-page caching (e.g., Varnish, WP Super Cache, …)? Now you can measure ... (more)

Tracing Celery Performance for Web Applications

Are you using Celery to process Python back-end tasks asynchronously? Have you wanted to get insight into their resource consumption and efficiency? Here are a few useful ways to get insight into Celery performance when running tasks.

A simple Celery task

For a quick review, Celery lets you turn any Python function into an asynchronous task. Here’s a simple one:

    from celery.task import task

    @task
    def add(x, y):
        return x + y

Let’s trace Celery

We’ll start with the good stuff. In the latest release of our Python instrumentation, oboeware-1.0, we have an updated API that makes ... (more)
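To give a rough feel for the kind of per-task data that tracing collects, here is a minimal sketch that records how long each call to a function takes. It uses only the standard library and is not the oboeware API, which the excerpt above does not show; the `timed` decorator and the `timings` dictionary are hypothetical names made up for this illustration, and the function is called directly rather than through a Celery worker.

```python
import functools
import time

# Hypothetical in-memory store: function name -> list of durations in seconds.
timings = {}


def timed(func):
    """Record the wall-clock duration of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether func returned or raised, so failures are timed too.
            elapsed = time.perf_counter() - start
            timings.setdefault(func.__name__, []).append(elapsed)
    return wrapper


@timed
def add(x, y):
    return x + y
```

After a few calls to `add`, `timings["add"]` holds one duration per call, which is enough to compute an average or spot an outlier. A real tracing library would also capture context such as which request triggered the task.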

Amazon Outage

You don’t have to be a pre-cog to find and deal with infrastructure and application problems; you just need good monitoring.  We had quite a day Monday during the EC2 EBS availability incident.  Thanks to some early alerts - which started coming in about 2.5 hours before AWS started reporting problems - our ops team was able to intervene and make sure that our customers’ data was safe and sound. I’ll start with screenshots of what we saw and experienced, then get into what metrics to watch and alert on in your environment, as well as how to do so in TraceView. 10:30 AM EST: Incr... (more)

Web Performance at Surge 2012

Surge was my favorite event last year, not only because it’s located in Baltimore (“The Charm City”), but also because it attracts a great crowd of people: hardcore systems practitioners. We were lucky to sponsor Surge this year, and once again it didn’t disappoint! The people you meet at an event like Surge make it worthwhile, but on top of that, there are talks, too! In case you didn’t make it to Surge this year, here are a few of my favorite talks and trends: 1. TCP/SSL Optimization The best optimization for any system is doing less work; so when Bryce Howard, Chief Archi... (more)

Performing Under Pressure | Part 1

Many types of performance problems can result from the load created by concurrent users of web applications, and all too often these scalability bottlenecks go undetected until the application has been deployed in production. Load testing, the generation of simulated user requests, is a great way to catch these types of issues before they get out of hand. I recently presented on load testing with Canonical's Corey Goldberg at the Boston Python Meetup and thought the topic deserved blog discussion as well. In this two-part series, I'll walk through generating lo... (more)
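As a rough illustration of what generating simulated user requests can look like, here is a minimal sketch using only the Python standard library. It is not the tooling from the talk or the series, which the excerpt does not show; the helper names (`fetch`, `timed_call`, `run_load`, `summarize`) are hypothetical, and a thread pool is just one simple way to create concurrency.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Issue one GET request and read the whole response body."""
    with urlopen(url, timeout=timeout) as response:
        response.read()


def timed_call(request_fn):
    """Run request_fn once; return (elapsed seconds, True if it did not raise)."""
    start = time.perf_counter()
    try:
        request_fn()
        ok = True
    except Exception:
        ok = False
    return time.perf_counter() - start, ok


def run_load(request_fn, total_requests, concurrency):
    """Call request_fn total_requests times across `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(timed_call, request_fn)
                   for _ in range(total_requests)]
        return [future.result() for future in futures]


def summarize(results):
    """Reduce (elapsed, ok) pairs to a few headline numbers."""
    latencies = sorted(elapsed for elapsed, _ in results)
    return {
        "requests": len(results),
        "errors": sum(1 for _, ok in results if not ok),
        "mean_s": sum(latencies) / len(latencies),
        "max_s": latencies[-1],
    }
```

For example, `summarize(run_load(lambda: fetch("http://localhost:8000/"), 200, 20))` would send 200 requests, 20 at a time, to a placeholder local URL. Passing the request as a callable keeps the load generator independent of what is being tested, so the same loop can drive a stub during development. Only point it at systems you are allowed to load.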