Events versus logs for serverless

Found via Four Short Links (side note: this feed should be on your must-read list; it is consistently superb), a fascinating Twitter thread from Charity Majors:

In distributed systems, the hardest part is often not finding the bug in your code, but tracking down which component is actually the source of the problem so you know what code to look at. Or finding the requests that exhibit the bug, and deducing what they all have in common.

The most effective way to structure your instrumentation, so you get the maximum bang for your buck, is to emit a single arbitrarily wide event per request per service hop. We’re talking wiiiide. We usually see 200-500 dimensions in a mature app. But just one write.

🎀 Any and all unique identifying bits you can get your paws on: UUID, request ID, shopping cart ID, any other ID <<- HIGHEST VALUE DETAILS

🎀 Any other useful application context, starting with service name

🎀 Possibly system resource state at point in time e.g. /proc/net/ipv4
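To make the idea concrete for myself, here is a minimal sketch of "one wide event per request per service hop" in Python. It is not from the thread: the `wide_event` helper, the `handle_checkout` function, and every field name are made up for illustration, and a real app would send the event to whatever backend it uses rather than stdout. The only point is the shape: collect context in one dictionary as the request runs, then do a single write at the end.

```python
import json
import sys
import time
import uuid
from contextlib import contextmanager


@contextmanager
def wide_event(service, sink=sys.stdout, **fields):
    """Collect fields during one request, then write them once at the end.

    Hypothetical helper: the name, arguments, and field names are assumptions.
    """
    # Start with the identifying bits: service name, a request ID, caller-supplied IDs.
    event = {"service": service, "request_id": str(uuid.uuid4()), **fields}
    start = time.monotonic()
    try:
        yield event  # the handler adds more dimensions as it learns them
    except Exception as exc:
        # Record the failure on the same event instead of a separate log line.
        event["error"] = type(exc).__name__
        event["error_message"] = str(exc)
        raise
    finally:
        event["duration_ms"] = round((time.monotonic() - start) * 1000, 3)
        # Exactly one write per request, as one JSON line.
        sink.write(json.dumps(event, default=str) + "\n")


def handle_checkout(cart_id, user_id, sink=sys.stdout):
    """Toy request handler standing in for one service hop."""
    with wide_event("checkout", sink=sink, cart_id=cart_id, user_id=user_id) as ev:
        items = ["sku-1", "sku-2"]  # stand-in for the real work
        ev["item_count"] = len(items)
        ev["payment_provider"] = "example-pay"  # made-up application context
        return len(items)
```

Calling `handle_checkout("cart-42", "user-7")` prints one JSON line carrying the service name, the IDs, the application context, and the duration. In a serverless function, stdout typically lands in the platform's log stream, so that single line is the whole record for the invocation.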

The entire thread is worth at least two read-throughs. I’m still pondering.

My current team is more about sending structured logs into Splunk and extracting metrics and call geometry from UUIDs and spans, so the idea of a 200-500 element event per call is new, compelling and … feels correct. Like, I need to figure out how to start doing this awesome new thing here too. It matters especially for serverless, where you can't log into the server and poke around; all you have are logs and/or events.

Posted by

Computer engineer from San Diego. Obsessed with hardware, software, timekeeping and elegance.
