About Corey Quinn

Over the course of my career, I’ve worn many different hats in the tech world: systems administrator, systems engineer, director of technical operations, and director of DevOps, to name a few. Today, I’m a cloud economist at The Duckbill Group, the author of the weekly Last Week in AWS newsletter, and the host of two podcasts: Screaming in the Cloud and, you guessed it, AWS Morning Brief, which you’re about to listen to.

Links

CHAOSSEARCH@QuinnyPigChris Short’s LinkedIn: https://www.linkedin.com/in/thechrisshort/


Transcript

Corey: Welcome to AWS Morning Brief: Whiteboard Confessional. I’m Cloud Economist Corey Quinn. This weekly show exposes the semi-polite lie that is whiteboard architecture diagrams. You see, a child can draw a whiteboard architecture, but the real world is a mess. We discuss the hilariously bad decisions that make it into shipping products, the unfortunate hacks the real-world forces us to build, and that the best to call your staging environment is “theory”. Because invariably whatever you’ve built works in the theory, but not in production. Let’s get to it.



This episode is sponsored in part by ParkMyCloud, fellow worshipers at the altar of turned out [BLEEP] off. ParkMyCloud makes it easy for you to ensure you're using public cloud like the utility it's meant to be. just like water and electricity, You pay for most cloud resources when they're turned on, whether or not you're using them. Just like water and electricity, keep them away from the other computers. Use ParkMyCloud to automatically identify and eliminate wasted cloud spend from idle, oversized, and unnecessary resources. It's easy to use and start reducing your cloud bills. get started for free at parkmycloud.com/screaming.



When you're building on a given cloud provider, you're always going to have concerns. If you're building on top of Azure, for example, you're worried your licenses might lapse. If you're building on top of GCP, you're terrified that they're going to deprecate all of GCP before you get your application out the door. If you're building on Oracle Cloud, you're terrified, they'll figure out where you live and send a squadron of attorneys to sue you just on general principle. And if you build on AWS, you're constantly living in fear, at least in a personal account, that they're going to surprise you with a bill that's monstrous.



Today, I want to talk about a particular failure that a friend of this podcast named Chris Short experienced. Chris is not exactly a rank neophyte to the world of Cloud. He currently works at IBM Hat, which I'm told is the post-merger name. He was deep in the Ansible community. He's a Cloud Native Computing Foundation Ambassador, which means that every third word out of his mouth is now contractually obligated to be Kubernetes.



He was building out a static website hosting environment in his AWS account, and it was costing him between $10 and $30 a month. That is right aligned with what I tend to cost. And he wound up getting his bill at the end of the month: “Welcome to July, time to get your bill,” and it was a bit higher. Instead of $30, or even $40 a month, it was $2700. And now there was actual poop found in his pants.



This is a trivial amount of money to most companies, even a small company, and I say this from personal experience, runs on burning piles of money. However, a personal account is a very different thing. This is more than most people's mortgage payments if you don't make terrible decisions like I do, and live in San Francisco. This is an awful lot of money, and his immediate response was equivalent to mine. First, he opened a ticket with AWS support, which is an okay thing to do. Then he immediately turned to Twitter, which is the better thing to do because it means that suddenly these stories wind up in the public eye.



I found out roughly 10 seconds later, as my notifications blew up with everyone saying, “Hey, have you met Corey?” Yes, Chris and I know each other. We're friends. He wrote the DevOps’ish newsletter for a long time, and the secret cabal of DevOps-y type newsletters runs deep. We secretly run all kinds of things that aren't the billing system for cloud providers.



So, he hits the batphone. I log into his account once we get a credential exchange going, and I start poking around because, yeah, generally speaking, 100x bill increase isn't typical. And what I found was astonishing. He was effectively only running a static site with S3 in this account making the contents publicly available, which is normal. This is a stated use case for S3, despite the fact that the console is going to shriek it's damn fool head off at you at every opportunity, that you have exposed an S3 bucket to the world.



Well, yes, that is one of its purposes. It is designed to stand there, or sit there depending on what a bucket does—lay there, perhaps—and provide a static website to the world. Now, in a two-day span, someone or something downloaded data from this bucket, which is normal, but it was 30 terabytes of data, which is not. At approximately nine cents a gigabyte, this adds up to something rather substantial, specifically after free tier limits are exhausted, that's right: $2700.



Now, the typical responses to what people should do to avoid bill shocks like this don't actually work. “Well, he should have set up a billing alarm.” Yeah, aspirationally the AWS billing system runs on an eight-hour eventual consistency model, which means that at the time the bill starts spiking. He has at least 8 hours, and in some cases as many as 24 to 48, before those billing alarms would detect. The entire problem took less time than that.



So, at that point, it would be alerting after something had already happened. “Oh, he shouldn't have had the bucket available to the outside world.” Well, as it turns out, he was fronting this bucket with CloudFlare. But what he hadn't done is restrict bucket access to CloudFlare’s endpoints, and for good reason. There's no way to say, “Oh, CloudFlare’s, identity is going to be defined in an IAM managed policy.” He has to explicitly list out all of CloudFlare’s IP ranges, and hope and trust that those IP ranges will never change despite whatever networking enhancements CloudFlare makes, it's a game of guess and check and having to build an automated system around this. Again, all he wanted to do was share a static website. I've done this myself. I continue to do this myself and it costs me, on a busy month, pennies. In some rare cases, dozens of pennies.



Corey: This episode is sponsored in part by ChaosSearch. Now their name isn’t in all caps, so they’re definitely worth talking to. What is ChaosSearch? A scalable log analysis service that lets you add new workloads in minutes, not days or weeks. Click. Boom. Done. ChaosSearch is for y...

Twitter Mentions