About the guest:

Adrian Cockcroft is a technologist and strategist with broad experience from the bits to the boardroom, in both enterprise and consumer-oriented businesses, from startups to some of the largest companies in the world. He is equally at home with hardware and software, development, and operations. He’s best known as the cloud architect for Netflix during their trailblazing migration to AWS and was a very early practitioner and advocate of DevOps, microservices, and chaos engineering, helping bring these concepts to the wider audience they have today.

Find our guest on:

Mastodon
Medium
GitHub

Find us on:

On-Call Me Maybe Podcast Twitter
On-Call Me Maybe Podcast LinkedIn Page
On-Call Me Maybe Podcast Mastodon
On-Call Me Maybe Podcast Instagram
On-Call Me Maybe TikTok
On-Call Me Maybe Podcast YouTube Channel
Adriana’s Twitter
Adriana’s Mastodon
Adriana’s LinkedIn
Adriana’s Instagram
Adriana’s Bluesky
Ana’s Twitter
Ana’s Mastodon
Ana’s LinkedIn
Ana’s Instagram
Ana’s Bluesky

Show Links:

Nubank
OrionX
Assembly language
ChatGPT
BASIC
Lunar Lander
Sun Microsystems
ALGOL
Pascal
R
S-PLUS
AWK
Go
HPC Cluster
Amazon Web Services
Netflix
Reed Hastings (Netflix co-founder)
Storage Area Networks
Oracle
Hadoop
Cassandra
Adrian @ QCon London ’23
Chaos Engineering
Adrian + Ana @ AWS re:Invent 2018: Breaking Containers: Chaos Engineering for Modern Applications on AWS
Catchpoint
Monitorama speaker schedule
Harness
Linux Foundation
Cloud Native Computing Foundation (CNCF)
Green Software Foundation
CNCF’s Kepler
European Union’s Climate-Change Tax Plan
Liz Rice
KubeCon EU ’23 - Amsterdam
Adrian on Platform Engineering
Spinnaker
The Value Flywheel Effect by David Anderson
Wardley Mapping
AWS CDK Patterns
Jeremy Daly
Ampt
AWS’s GameDay
Pre-Accident Podcast
Sidney Dekker
Chernobyl
Gluecon 2023
Lean Agile Scotland

Additional Links:

Laguna Seca Racetrack
Alvarado Street Brewery
Duckzilla from Duck Foot Brewing
Netflix’s Journey to the Cloud

 

Transcript:

ANA: Hey, y'all. Welcome to On-Call Me Maybe, the podcast about DevOps, SRE, Observability Principles, on-call, and everything in between. I'm your host, Ana Margarita Medina, and with my awesome co-host... 

ADRIANA: Adriana Villela.

ANA: Today, we're talking to Adrian Cockcroft, who's an Advisor in Tech. And we get to hear about all his experiences throughout these years. We're so happy to have you join us.

ADRIAN: Thanks. Great to be here, and good to see you again.

ANA: Definitely. Where are you calling from today?

ADRIAN: So I live in California. A few years ago, I used to live in Los Gatos near Netflix and eBay, where I worked for many years. And then a few years ago, during COVID, we sort of did the, hey, we're going to retire and get out of the Bay Area thing and bought a house down about halfway between Monterey and Salinas. Anyone that knows racetracks, I can see Laguna Seca racetrack out of my window, and I can hear it when the cars are running. So we're right by a racetrack, which is part of my fun and games.

ANA: Nice. My favorite breakfast burritos are in Salinas, and I'm a huge fan of Alvarado beer out there too.

ADRIAN: Yeah, it's nice to be out of the Bay Area sort of craziness. I mean, there's still a bit of traffic here, and there aren't so many roads, but it's a much more relaxed place. And we've got a lovely place down here, and the house prices were a tiny fraction of what it would cost to have a nice place in Los Gatos. So that was the main thing.

ANA: That is awesome. And for the second question, it's our traditional On-Call Me Maybe Podcast question, what is your drink of choice for today?

ADRIAN: I'm actually drinking some water because it's about noon. But if it was a bit later in the day, I'd be drinking a gluten-free beer. There's a brewery that I really like called Duck Foot, duckfootbeer.com. They make a beer called Duckzilla. They make a stout. They make all kinds of interesting beers. And Stout Mask Replica is the name of one of their styles, which has a picture of Captain Beefheart on it. 

ANA: [laughs]

ADRIAN: And there's the "Trout Mask Replica," which is an album that some people might remember from the late '60s or something. Anyway, it's hard to find gluten-free beer. This is a brewery founded by somebody who has celiac. So all of their beers they don't make a big fuss about it, but all their beers have the gluten taken out. So the proper beer is made the normal way. There's an enzyme you put in that takes the gluten out. They do a full range of interesting beers. But they're in San Diego. They'll ship to California. If you're outside California, it's pretty much impossible to get. 

ADRIANA: [laughs]

ADRIAN: So every now and again, I go on their website, click buttons, and then a big box of beer turns up full of stuff for me that I can actually drink. Unfortunately, I like beer, but I got gluten intolerant a few years ago and now have to manage what I eat.

ANA: It is really nice that we're getting more inclusion in the beer and alcohol space that now there's less sugar or gluten taken out or just the way that they're being promoted thinking about, hey, everyone has different dietary restrictions.

ADRIAN: Yeah. And if you get a stomach ache every time you have pizza or drink beer, it's probably because you've got some gluten intolerance, and when you eventually just stop, you discover all the stomach aches went away and carry on with your life. 

ANA: Just experiment. [laughs]

ADRIANA: Cool. Well, we should probably get into some of our more techy questions. So a question that we love to ask all of our guests is, how did you get into tech?

ADRIAN: I'm quite old, so this was a long time ago. [laughs] My dad was in technology, basically. He was testing anti-aircraft missiles back in the 1960s when I was born. I was christened on a missile test-firing ship whose motto was aim high. [laughter] So I thought that was...if you're going to have a motto, that's fine. And then, in the 1960s, he went to college and was programming a little bit. He did a mature degree in statistics when he was in his 20s and then got a job at a university as a statistics lecturer. So, basically, I grew up with my dad doing computing and statistics through to the time I was growing up. 

And somewhere along the way, the school had a computer, and my dad used to sort of put things in front of me. He didn't really teach me stuff. He just left things laying around that I looked at. [laughter] Computer Weekly magazine was a weekly newspaper that was always around the house. And so it was always things like that. But I taught myself to code at high school in 1972. We had a teletype connected to a DECsystem-10, and it had BASIC on it. And I played Lunar Lander and tried to figure out how to write things in BASIC and a few other programming languages. 

And then, when I went to college, I did physics, but I ended up doing applied physics and electronics, which there was a bunch of designing computers that could control physics experiments. It was kind of what they were trying to teach us. So I did Assembly Language and some C programming and basically embedded stuff. And then, the first job I got was at a place called Cambridge Consultants in Cambridge, UK. I was basically doing embedded real-time development full-time. That was my day job. I did that for about six, seven years, just writing code every day.

Most of the code ended up burning it into a PROM and sticking it in a machine, leaning on the stop button while you press the start button in case the code went wrong, and all the motors went backwards or something. So we had fun with that. And then we were using Sun machines as development machines. 

During that time, I ended up basically as a one-day-a-week sysadmin because there was a full-time admin looking after all of the VAXes and things. And there was one day a week I basically looked after all the Unix machines as a part-time job. So this is the early days of DevOps since I was deving four days a week and opsing one day a week. It mostly consisted of making sure the backup tapes got changed for a few Sun machines. 

And then, I joined Sun in 1988. They opened an office across the street, and eventually, I went...I wanted to know what they were going to do next. And basically, I talked my way into what you'd now call a solutions architect job though we called them systems engineers then. And I did that for about six years in the UK. 

And then I ended up moving to the U.S. to be in sort of the central group that supported that function globally, so writing white papers and trying to get all the information in the field people needed to learn about the new machines as they came out. And various things happened after that, but that's kind of how I got into tech, basically, by my dad bringing home random computers and calculators and things as I was growing up.

ADRIANA: That is so cool. You mentioned learning BASIC. BASIC was my first language. I have very fond memories of BASIC. How do you find the evolution in programming languages from the early days of writing BASIC code to the stuff that's out there now? As you reflect on that evolution over the years, what's your thought?

ADRIAN: I mean, some languages I find easy, and others I find hard to figure out. BASIC was just like; I had no idea what I was doing, absolutely no clue.

ADRIANA: [laughs]

ADRIAN: You could type things into this computer. And it was a roll of paper going up the screen. And you just, I don't know, you could make this thing do stuff by typing stuff in. So that was pretty...I had no theoretical background of what was happening at all. At some point, I decided to learn ALGOL, which is a more structured language. And that was where I kind of got into structured programming a bit more. And then there's like Pascal kind of things, those sorts of languages. 

And then the first job I had I was mostly programming in C. At some point, they gave me a terrible job with a really horrible language and development environment. And to sort of keep myself sane, I built myself a home computer and ported a C compiler to it. So I ended up rewriting the code generator for a C compiler as sort of a home project. 

ADRIANA: [laughs]

ADRIAN: It was a very small, think of a small C. But what the effect of that was is I ended up a much better C programmer because I now had a code generator in my head. I could look at the code, and I'd look at the C code, and I actually knew what it was going to do to the machine and much more. So actually, having spent a bit of time working on the guts of a language, it's very simple, but it gives you a much deeper understanding. 

I stopped being a professional programmer just as C++ came along, so I never learned C++. And then Java I used a bit, but I was never really that happy with Java. It's just too much stuff there. It's just too complicated and wordy. And then, let's see; I learned S-PLUS, which is now R. So most of this analysis code I do is in R. And because I know R, I never tried to learn Python because if I need to do something, I do it in R. 

Similarly, I never learned Perl because I could always do it in Awk or something like that. I learned Awk before Perl came along. And there are all these sorts of things where I know how to do things, so I don't need to learn another thing or another way of doing it. When I was working for a venture capital firm in about 2014, everything started being written in Go, and it was all open source. So a startup would come in, and I'd start rummaging around in their repo on GitHub and trying to read the Go code, and I went, okay, I need to learn Go. 

ANA: [laughs]

ADRIAN: So I learned Go, and I built some pretty complicated things in it. I learned the language fairly well. For a few years, I was sort of just experimenting and playing around in it. So that's the last time I really learned a language. And I liked Go quite a lot. I like the fact that it's just go build, and it's done. And it's a very lightweight environment to get into. And I liked the fact that they haven't changed the language much. So I probably still know Go [laughter], although I haven't written code with it for quite a few years. [laughs] Most of the other languages have changed so much you actually don't know how to use them anymore.

ADRIANA: That's a really good point.

ADRIAN: So you sort of have...I think languages come along, and then they wear out. They keep adding features to them until nobody new can really learn them anymore. 

ANA: Yeah, it's true.

ADRIAN: They're just too top-heavy. And then you need a new language that's simple. That happens with a lot of technology. You get this sort of technology cycle where the big complicated thing is like the innovator's dilemma. It gets too big and complicated. And you get the new thing coming in underneath it, and it's just like a simple version of whatever it does, like all things.

ANA: Especially with the adaptation of people being like, oh, this doesn't solve the needs I have in my organization. Or when we look at open-source work specifically, it's like, well, that company is not using the language anymore. So why are they going to continue contributing to it? And they moved on to something else.

ADRIAN: I had the same thing with the cloud. When I was at Netflix, we were using EC2-Classic. That's like one big flat network. You just push a button, and a machine appears. There's none of this complex stuff with networking and all this really complicated stuff. So actually, I don't know how to use the cloud now. [laughter] But I did get a ChatGPT account recently. It generates code which looks plausible for stuff that might work in the cloud. But then I tried visiting some URLs that it brought up, and those didn't exist. So it invented some GitHub repos. [laughter] It sounded like it'd be really good if this thing existed. 

ANA: Interesting.

ADRIANA: Damn.

ADRIAN: But I think if you're in a place that's very well documented and understood, and you're trying to understand it, it's probably got enough background information to actually give you good answers. But if you get off into the weeds to do something that's a bit marginal, it'll start making things up that are implausible. So still playing around with it. But I figure now that the AI has learned all the languages, I can figure it out instead of using Stack Overflow. 

ANA: [laughs]

ADRIAN: The other thing I find is if I'm trying to build something complicated, eventually, I end up deep in the guts of some JavaScript, and I hate JavaScript.

ADRIANA: [laughs]

ADRIAN: It's like everything I do ends up stuck in some stupid JavaScript thing that I end up trying to debug something that doesn't work in JavaScript, and I wasn't trying to use JavaScript in the first place.

ANA: [laughs]

ADRIANA: It's funny because I am not a fan of JavaScript, either. And I'm sure...I'm sorry, listeners who love JavaScript. I just could never get the idea behind JavaScript. So I can completely relate. [laughs]

ADRIAN: Yeah, it's all 1 + 1 equals 11.

[laughter]

ANA: As much as I liked writing it growing up, now it's definitely like, I'm so far away from it that I'm like, you know what? I'm okay without it. And then all of a sudden, I'm working on a project, and they're like, can you write TypeScript? And it's like, wait, why? [laughs] Why are we still here?

ADRIAN: Yeah, TypeScript is probably a bit better. But there's a limit to how much time I...the other end of this thing was you said how to get into tech, but how to get out of tech; I'm not sure quite how I've got out. But last summer, I retired from my sort of day job. I was a VP at Amazon. There are all kinds of stuff you have to do at a big company. And it was really difficult to do podcasts because you get PR and have to be onboard and a whole bunch of other stuff. 

So I'm really enjoying the freedom of being my own boss to do what I want. And I spend most of the time on vacation somewhere. And then in between, I do advisory work and try to be useful around technology and see if I can do a few conferences that I feel like that are in...sort of when I feel like it kind of basis. And I'm doing some advisory work in return for shares and little bits of consulting and paid speaking engagements and things. So that's my kind of how do you get out of technology? I'm sort of half out, but I've still got one foot in.

ADRIANA: [laughs] I feel like you can never fully get out, especially if you really love technology. It's always going to find a way to find you, which isn't a bad thing.

ADRIAN: Yeah, young enough to still keep working on stuff. My dad is still alive, but he's in his late 80s now, and he has Parkinson's, and he's getting more and more frail. And I asked him some statistics questions, and he said, "I've forgotten all that stuff." [laughter] It was long enough ago.

ADRIANA: Technology languages have changed a lot since you started your career. And you found yourself in the cloud space at one point in your career. How did you get into it?

ADRIAN: When I was at Sun in the early 2000s, about 2002 to 2003, we had a high-performance computing group that was looking at these big clusters that we used that were HPC, clusters of Linux machines. They're massively scaled Linux clusters. And we were looking at that. The other thing that we were looking at was building a machine which never went to market, which would have been so big that we wouldn't have been able to ship it. It was more than one rack in size. So you had to disassemble it and put it in several racks and then sort of glue them together to get your machine. 

And we said, well, why don't we just leave it in the place where we built it and not ship it to customers and have them connect to it over the internet? Which was sort of conceptually, yeah, that's cloud. It made sense. And you're seeing that a little bit now with quantum computers. You really don't want a quantum computer in your data center. It's a monstrous thing. It's like having a nuclear reactor in your data center. It's full of weird plumbing and stuff. So you'd want to access it over the network. 

So we thought about it a bit, but Sun couldn't figure out how to sell directly to consumers with credit cards. Sun just didn't understand that market at all. And when we talked to the CIOs, they didn't want cloud. They wanted to build bigger data centers. And then Amazon came along and figured out...only knew how to sell stuff on credit cards, so it got in via that route. And the CIOs, many of them still don't like cloud. It was sort of an end run around the ability of CIOs to control where all of the technology came from, in that you could just buy it on a credit card. So that was the thing that I think bootstrapped cloud as a thing. 

And then, in about 2008, 2009, I was at Netflix, and we were trying to figure out how to scale what we needed for streaming, which was growing vastly faster than the capacity we needed for DVD shipping. And we had a big outage in 2008. We could talk about outages if you like. It was SAN. Remember Storage Area Networks?

ADRIANA: Mm-hmm.

ADRIAN: We had a virtual SAN controller, and all our machines were connected to the SAN network, and then all the storage was out the back. And one of the SAN controllers decided to silently corrupt all of the blocks, every now and again, silently corrupt a block going through. And then Oracle decided all our disks were bad. Everything collapsed. And then we restored everything, and then it corrupted it again. And then, we rebuilt the entire system from scratch out of new hardware, but that took three days. So we were down for several days. 

The outcome from that...but this is one of those things...you tend to see people having a huge outage, and then you go, okay, so what could we do differently? It's when management pays attention to the systems and the assumptions. And we said maybe we should not be trying to run our own infrastructure. And Reed Hastings, in particular, sort of said, "Could we go use this Amazon thing? Can we just rent it from somebody else?" And we went to Amazon, and Amazon said, "Come back in a year. We're not ready to do that." 

ANA: [laughs]

ADRIAN: This was in 2008. 

ADRIANA: Oh damn.

ADRIAN: So we came back in '09 and started with some stuff that wasn't customer-facing: movie encoding and some Hadoop cluster stuff. And then, during the whole of 2010, is when we went and moved the entire front end to the cloud sort of one page at a time. It took about nine months. And then, in 2011, we moved the backend off Oracle onto Cassandra. So it was sort of roughly a two-year process to get what people think of as the Netflix streaming product. 

There was still data center stuff and other things happening in the data center for quite a few years. But they gradually moved corporate IT and all the other billing and things like that out. It was a multi-year thing. But it was an interesting opportunity driven by a very visionary kind of management team that Netflix is not scared of trying something new, certainly at that time. It was like; we'll try something new. Let's see if we can use that as an advantage to get ahead of the competition.

ANA: As I came on to learn about cloud computing, I was always impressed with Netflix's story of just jumping in. And, I mean, it was one of the first companies to do it and for it to kind of go well. And I guess I didn't realize also how long it had taken.

ADRIAN: And the other thing was in about 2010, 2011, I switched. I was managing a team. I was managing one of the development teams, and I switched roles to being the overall cloud architect. And I started going out and trying to explain what we were doing. So since then, my job has been to try and explain what Netflix did in 2010 to people over and over and over again. That's my latest career, if you like.

ANA: [laughs]

ADRIAN: Because people still aren't doing some of the things. I've had some of my talks...I did a talk at QCon in March, which was something like what we learned and didn't learn from Netflix as a microservices retrospective. The video isn't up yet, but they'll post that at some point. It was one of those things where I was at Sun. I was a distinguished engineer, and I was out doing talks and training classes and lots of public speaking. 

And then, when I was at eBay and Netflix, I really didn't do a lot of that for a year or two. And I said, "I could go out and start talking about what we're doing." And they said, "Oh, okay." They'd never seen me doing talks at conferences. 

ADRIANA: [laughs]

ADRIAN: I have a decade of experience of doing that. So I just went, okay, started going out, and made it more explicit. We did it as a technology marketing exercise. And the open-source program was part of that as well. We wanted to create a technology brand around Netflix, so everyone would think it was a high-tech company so they could attract better people to work there and have it be a sort of a halo over the movie brand, which is the main Netflix brand. 

So it's actually difficult when you have these two brands, like, you've got a consumer brand, then you've got a technology brand. Something that eBay never wanted to do because it's a marketplace. They never wanted to be a technology brand, and they had a lot of good technology most people don't know about. And it's something that I see other companies doing a better job of nowadays: having a big open-source brand around them.

ANA: It is really awesome to see a lot of companies like, one, realizing that open source has helped them and that they want to contribute back. But then again, when they get to look at their own tech stack, and it's like, we've done really awesome work. Like, we could actually make this vendor-neutral and ship it out there and take all of our proprietary, confidential things and kind of just be like, hey, we build awesome tech. And there are good people here, and you get to know about the brand. But it is definitely a recruiting thing as well at play. 

And since you mentioned the open-source work at Netflix, I think if it wasn't for someone at work, I would have never come across you and gotten a chance to meet you since I got my start in cloud computing next to chaos engineering. And, of course, we have Netflix's open-source Simian Army Chaos Monkey as one of those golden poster childs.

ADRIAN: Yeah, that really helps. And then I was advising Gremlin while you were at Gremlin. We did a re:Invent talk together at some point.

ANA: Yeah, that's right. Folks can watch our 2018 AWS re:Invent talk around chaos engineering.

ADRIAN: Yeah, I was advising Gremlin before I joined AWS. AWS made me shut down all my advisory work because of conflict of interest, but I still got to pull you in for that one. And I'm now doing more advisory things, and that's part of this sort of new role I've got. I'm advising Catchpoint, which is an internet monitoring tool, more observability. You see them at Monitorama. And Harness is CI/CD. They've got a new thing for continuous resilience, basically adding chaos engineering to the pipeline. I gave a talk at Chaos Carnival last month. 

And then Nubank, which is a big end user, Brazilian, next-generation banking company that's got all kinds of cool people working there like [inaudible 21:09] Michael Nygard. And so I like hanging out with them. So that was why I ended up advising them.

ANA: Very nice. Another fun fact is that Adriana is actually from Brazil. [laughs] So when I saw that you're working with Nubank, I was like, hey, Adriana, look, [laughter] he's also working with like the largest bank out there. 

ADRIANA: [laughs]

ANA: As you're advising folks and you get a chance to not be carrying a brand next to you, is there anything that's exciting you right now about your work or the industry?

ADRIAN: The thing I'm trying to push forward...I've got this position where I sort of have credibility, and I know people, but I'm a neutral party in the middle. So what I'm trying to do right now is get the cloud providers to agree on a real-time carbon standard. So what I want is...be it something like CloudWatch, or Prometheus, or whatever, I want there to just be a stream of one-minute interval metrics saying this is how much energy or how much carbon this machine, or this environment, or this pod, or whatever is using. 

And right now, if you go and get the data from the cloud providers, it's monthly. You get a monthly total. And there are some ways of trying to guess what the numbers are from the billing records and things like that. But I think that what I'm proposing, I proposed this when I was at QCon London, what I'm proposing is that we should define what the standard is so that they will emit the same data, whether you're getting it from CloudWatch, or Prometheus, or whatever the...I forget whatever Azure has but whatever. 

But they'd all have the same schema. They'd all be formatted the same way with the same meaning behind the data. So I've been on vacation for the last month. So I just launched the idea, and I'm just going to let it marinate. Sometimes you just launch an idea. And if you get too hot and heavy with it, you burn out. So I stuck the idea out there. I let it sit for a little bit. 

And then, in the next month or two, I'm going to be trying to talk to people about is this possible? Can you get it together? Can we get people to agree that this could be done? And can I get at least one cloud provider to build it and then see if we can beat up the other ones to go and copy it and do the same thing? Rather than having very different data and interfaces and resolutions as the way we have it. Really, the data is all over the place. 

The talk I gave at Monitorama last year was really about that because the way I think about it is that carbon should just be a metric like utilization, latency, throughput, carbon power. It should just be another metric that you have. And all of the monitoring tools should just be using it and passing it through. So you should see it in, you know, pick your favorite monitoring tool, you know, whatever, Datadog or something.

It should just be there as a metric they can provide. And that's one of the reasons I'm talking to Harness and Catchpoint was to try and sort of push that forward with them. But then they can't build the tools if data isn't there. I'm sort of a vaguely neutral third party sitting in the middle. I know who's who at the different cloud providers, and I'm going to try and talk them into doing something or at least explaining to me why they can't, something like that.
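To make the idea of a shared schema concrete, here is a minimal Python sketch of what one provider-neutral, one-minute carbon sample might look like. Nothing here comes from an actual standard or cloud API: the `CarbonSample` fields, the units, the `from_provider_record` helper, and both provider record layouts are assumptions invented for illustration. The point is only that two differently shaped provider records can normalize to the same sample with the same meaning.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CarbonSample:
    """One hypothetical one-minute sample in a provider-neutral schema."""
    timestamp: datetime        # start of the interval, in UTC
    resource_id: str           # machine, environment, or pod identifier
    energy_wh: float           # energy used in the interval (watt-hours)
    carbon_g: float            # estimated emissions in the interval (grams CO2e)
    interval_seconds: int = 60 # one-minute resolution, as proposed in the talk


def from_provider_record(record: dict, field_map: dict) -> CarbonSample:
    """Map a provider-specific record onto the shared schema.

    field_map names the provider field that feeds each schema field.
    The provider field names used below are made up.
    """
    return CarbonSample(
        timestamp=datetime.fromtimestamp(record[field_map["timestamp"]], tz=timezone.utc),
        resource_id=str(record[field_map["resource_id"]]),
        energy_wh=float(record[field_map["energy_wh"]]),
        carbon_g=float(record[field_map["carbon_g"]]),
    )


# Two invented records describing the same minute on the same resource,
# shaped the way two different providers might emit them.
provider_a = {"ts": 1700000000, "instance": "vm-1", "wh": 1.5, "gco2e": 0.6}
provider_b = {"time": 1700000000, "pod": "vm-1", "energy": 1.5, "carbon": 0.6}

sample_a = from_provider_record(
    provider_a,
    {"timestamp": "ts", "resource_id": "instance", "energy_wh": "wh", "carbon_g": "gco2e"},
)
sample_b = from_provider_record(
    provider_b,
    {"timestamp": "time", "resource_id": "pod", "energy_wh": "energy", "carbon_g": "carbon"},
)

# After normalization, a monitoring tool could treat both samples identically.
print(sample_a == sample_b)
```

With a shared shape like this, a monitoring tool would only need one code path to chart carbon alongside utilization or latency, whichever provider the data came from.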

ANA: This also sounds like a great idea or opportunity to bring forward to the Cloud Native Foundation since they also work on standardizing some stuff across cloud. But I also know when projects are, like, early start, you kind of just need to do a prototype first.

ADRIAN: Yeah. The Green Software Foundation is another Linux Foundation group where I've sort of pushed it there, and that we're going to sort of validate it there. There are definitely some things happening at CNCF, and I know the people there too. I was on the board there for a while. My team at AWS wrote the document that got us to join. 

ANA: [laughs] Awesome.

ADRIAN: So I had the budget to pay for KubeCon and all that stuff. So for several years, I was driving that from AWS' point of view. So yeah, I think I know the people there. There are actually so many projects going on that I was trying to just use ChatGPT to discover them all [laughs] and explain them to me. 

ADRIANA: Yeah, there are a lot.

ADRIAN: It sort of worked a bit. And some other people popped up and said, "Hey, there's this project and this project." There's a nice project called Kepler, which looks really interesting; that's a CNCF project.

ADRIANA: Just going back to your carbon metric, I hope we get it to be part of standard metrics that are emitted because it comes full circle really nicely with...Ana and I were discussing in the first episode of the season. Ana, I don't know if you remember we were talking about sustainability and computing. And it's kind of cool because this is the last episode of the season. 

But, I mean, even cooler is the fact that we are in an industry that is not great for the environment for the sheer fact that we are using computers. And when you're using data centers and cloud computing, that stuff uses a lot of energy. So to make carbon consumption, like, make it part of the conversation, I think is so, so valuable because I think we do need to be more cognizant of these types of things. Because the whole thing of, like, we've only got the one planet left, and if we have a way of tracking it, then we have a way of making it better.

ADRIAN: Yeah. It's pretty complicated as well. There are a lot of things that aren't very obvious. If you look at a solid-state disk, it takes more carbon to make it than it will use in electricity in its lifetime. There's so much silicon in it, and the carbon is all in that, whereas the spinning disk is the other way around. There's a 10-watt motor spinning the disk all the time. So it's storage, and you put it on tape, and it's basically zero carbon.

And then the other thing is if you get green energy everywhere, you're not done. In a few years' time, it's pretty easy to get the green energy. Progress is going really well there. But then you've got all the concrete in the buildings, which buildings are much bigger emitters of carbon. It's about, I think, 40% of the carbon emissions of the world is in buildings, building, and operating buildings, whereas IT is a few percent and in transport tens of percent, and things like that. 

So if you're in a company like Amazon, most of Amazon's carbon footprint is deliveries. It's the aircraft and the packages and all the shipping. The AWS piece is a smaller part. If you can use more compute to optimize your real carbon footprint and your product carbon footprint, you know, moving at... we call it moving electrons and moving atoms. In the IT industry, we're moving electrons around. Electrons are much easier to move around than atoms.

If you're moving physical things around or making stuff, or mining, or whatever, or transport, or buildings, that's where the carbon really is. So you have to think about your supply chains, and this is starting to come along. I saw an announcement today that the European Union is now going to start taxing the carbon footprint of things that they import. They've just finally agreed to that rule. It takes a while to kick in. But you can't just say, oh, they are only doing that in Europe; it doesn't matter because we're in the U.S. If you want to export something to Europe, you're going to be paying a carbon tax on it in the future. 

ANA: [laughs]

ADRIAN: So whatever they do in Europe will eventually spread around the world, even if the U.S. is a few years behind what's happening there. So, interesting times.

ANA: It's always neat to see what Europe is doing around reliability or sustainability because they seem to pass laws a few years before the U.S. and it's like, oh yeah, they're doing it. Like, let's give it another five years, and America will kind of hop on.

ADRIANA: Yeah, it was very cool being in Amsterdam and seeing how much more environmentally conscious they are compared to many cities in North America, and I don't mean just from a biking perspective. You will find different bins for different things like for your compost and your different types of recycling and all that stuff. And we're not even there.

I go to a hotel room in the States, and there's just a garbage can. I go to a hotel room in Canada or in other parts of the world that are a little more environmentally conscious, and you'll find a recycling bin alongside the garbage can. So it's quite interesting. 

ADRIAN: Yeah. I think I saw Liz Rice actually cycled to KubeCon from London or wherever she lives somewhere in the southeast. 

ADRIANA: Damn.

ADRIAN: She actually was posting about it. 

ADRIANA: That's awesome.

ANA: That's awesome.

ADRIANA: One of the questions that I wanted to ask you is, since you've gotten a chance to see all of the cloud movement and be part of it, where do you see DevOps and SRE going in the next five years?

ADRIAN: I did a blog post on platform engineering, which was sort of related to this [laughter] latest hot word, and then somebody...was it Jeff Cessna* [SP] who was muttering about this? He basically said something like, well, you can sort of divide the world into the enterprise world and what we used to call scale digital natives, if you know what I mean, like Netflix, Uber, companies like that grew up in this world. And they are organized very differently, and they behave differently. There are a few things that maybe don't fit neatly into that category. 

But what we did in the scale digital natives is we just combined the development and operations stuff. We didn't create a separate team to do it and call it the DevOps team. We just taught our developers to operate and automated everything. 

[laughter]

ADRIANA: Oh my God.

ADRIAN: And built all this tooling so that you wrote code, you hit check-in, and then you waited 15 minutes, and it was running. That was it. You didn't have to go --

ANA: Assuming Jenkins doesn't break. 

[laughter]

ADRIAN: Well, yeah. We did break Jenkins. But Netflix built a thing called Spinnaker because we were doing so much Jenkins abuse, basically. 

ANA: That's great.

ADRIAN: So somebody's there making the pipeline work. But you should just check code in, and it should just appear in production. You shouldn't be talking to operations to make that happen. So that was kind of the thing. And then we put all the developers on-call so that if your thing broke, you'd just shipped it that day, that you didn't have a meeting to explain to somebody else how it worked. So you just were on-call for anything that you'd done. 

And then, of course, for people to get off-call, they had to have somebody else that also knew how it worked. So that kind of got people to do more pair programming or at least code reviews with somebody else. Like, whoever was on-call for that shift for a few days would be code reviewing all of the changes that went out to production that they didn't make themselves. So you ended up with sort of a buddy system where there were at least two people that knew how everything worked. And all we did to get there was put people on-call. Developers don't like to be on-call, but if it goes wrong, we're going to call you anyway.  

ANA: That's true. [laughs]

ADRIAN: So you may as well admit that upfront, and we put you on-call. And if you're changing it every day, like you could, then you didn't have a handoff meeting to explain how it works. There's no TOI, you know, transfer of information meetings. And you didn't write a huge amount of documentation about the way it works today versus yesterday. And there are probably five different versions running in production because you've got all these different test versions and old versions. 

And it's complicated. You need the person that currently has their head around that little part of the system, which is why it's a microservice. It gives you a nice boundary. I know these bunch of microservices, and if something goes wrong with one of them, I know who to call because I know who built that service. So that was the way we did DevOps.

But what happened in enterprise was they had an ops team, and then they found they couldn't hire ops people unless they called them DevOps. So they bought them a copy of Chef or something and started to call them DevOps. And then, The SRE book came out, so then they started hiring SREs and bought them some different tools. And now they're calling it the platform team and buying some more tools or whatever. 

But it's sort of roughly the same team. And it's a separate team. And the developers are on one side. And there are more operators on the other side. And they didn't really solve for some of the problems that you really had where the developers didn't understand what it meant to operate something. I've always bridged the two. Like I said, I was a one-day ops person and a four-day dev person back in the 1980s, so I count that as DevOps. But I think that that's my sort of take on it.

Then, where's it going? I think the interesting thing about where it might be going...because you get this quote from William Gibson that "The future is already here. It's just not evenly distributed." And I've been living that for many years. I'm still explaining things that we did ten years ago at Netflix to people. 

The sort of bits of the future I've seen, there's a book called "The Value Flywheel Effect" by David Anderson. And it's largely about what they did at Liberty Mutual, which is an old company, but they got to go incredibly fast. It's a serverless-first approach, a lot of Wardley mapping to figure out what to do. And they ended up with the ability to deploy things in hours, completely new products, not a new version or a push of some functional addition. They built a completely new product from scratch and delivered it tomorrow. And most people don't understand how you could do that, like, a new insurance product. 

Okay, I have an idea for a product; okay, next day, we've built it. It took us a few hours. We've built it. We've shipped it. It's running now. We want to change it, okay, we'll change it again. They're going at a pace that's just unreasonably fast. And I think that that's kind of taking serverless-first, taking serverless to the limit. There are lots of CDK patterns in there. In fact, cdkpatterns.org or .com, I forget which, was created by them. So they've just templated everything. If you want to build, most of the things you'd want to build are on common patterns. So that's one thing. 

The other thing is, you know, this whole lot of fuss recently about ChatGPT. Something that you'll be able to do is have a conversation about the code you want to build, and it'll end up being built. That's where we're going. You don't need to know the language. You need to know what it is you're trying to solve for. And it's a conversation like you have a conversation with your business person about whatever, whoever wants to build the thing, the product manager, then you go and try and build it. 

Effectively, the product manager needs to know a bit more about development, how development might look. But I think we'll end up in a place where most things are being managed and modified by having conversations with these sorts of AI bot things that are doing it. So that's, I think, where we go.

ANA: It is interesting to think about how ChatGPT/AI is really going to transform the way we do platform engineering or build applications, like, the scaffolding aspect of it of your organization has these little blocks, and you just kind of get to merge them in together to get up and running as needed.

ADRIAN: Yeah. And there are things that are too complicated. Like Kubernetes is too hard for any one person to understand. I think Java, the language, the entire environment around Java is too hard to understand. And it's used absolutely every day.

ADRIANA: [laughs]

ADRIAN: There are too many pieces to it. And the cloud, there are too many services on AWS for everyone to understand. But your chatbot is the thing that can understand all of them. So it's more about, well, I want to do this, or I'll pull these different services you've never heard of that actually do that and put that together. So I think that's where we're going. 

Jeremy Daly, who talks of serverless patterns, is well known for that. He's got a startup called Ampt that I'm also helping advise. And with Ampt, you have a chunk of code, and you annotate a little bit. Then you say deploy, and it says, oh, it looks like you have an endpoint that is going to take an input request, so I need to create a web front end. And it just provisions everything around it. That's kind of the idea. 

ANA: Whoa.

ADRIAN: And I think if you take that and then you teach ChatGPT stuff around it, there's something there where you basically describe what you want, a little bit of annotation here and there to give it some hints. And maybe you don't even need that. It just looks at the code you've written and says, oh, this is a web server, so I'd better stand it up as a web server and open the right ports and put a firewall on and all these things. 

And every time you deploy it, it can say, oh, there's a new service now. I'll use the new one. It's going to be different each time you deploy. So it's not the same as just sort of automatically generating a load of YAML and then getting stuck with editing it. It sort of takes all that out. 

ADRIANA: That's very cool. 

ADRIAN: And it's sort of a side effect. Like I said, back in the day when we first started using AWS, I did know how to use every AWS service there was because there weren't enough of them to confuse me. [laughter] And now --

ANA: You definitely can't do that in 2023. [laughs]

ADRIAN: I am a pretty useless developer in terms of cloud development now. I'd actually have no idea how to get anything running. I'd have to use something else to do it.

ANA: Yeah, I'm down to a select few AWS services that I'm like, I can still use you. There are too many services that are coming out, and I can't keep up.

ADRIAN: I think the last time I actually was hands-on in an account was there's some competition thing they run at re:Invent where everyone gets in there, and they have little quizzes where they have to work through.

ANA: GameDay.

ADRIAN: Yeah, one of those things. I joined one of the teams and was trying to help figure out how to solve for...it was something about unicorn rides or something like that.

ANA: Unicorn Rentals. 

ADRIAN: Unicorn Rentals, yeah. [laughs]

ANA: Yeah, I think it's Unicorn Rentals. I think they've been running with the same software stack, and they've just been adding a lot more challenges. But it's always really fun to go to re:Invent and see how it's evolved every single year. Well, as we're getting ready to wrap up the episode, we wanted to ask if you have any practical advice for our listeners today.

ADRIAN: I guess a few sort of larger-scale things. One of the things I've done throughout my career is share a lot. And I found that I became the expert in stuff by sharing what I was...sharing answers to things. So you can kind of be the expert and say, "You have to come to me to get the answer," and try to lock it down so you can be the only expert. Well, that causes other people to go to other experts or set themselves up. 

What I found was...I wrote a book on Solaris performance tuning back in the '90s that was all of the answers to all the questions that I'd seen over and over again, and blogging more recently, and all the presentations I do. So if you've half-figured something out, try to write it down. It's going to help you figure it out better. And then, if you share it and discover whether it's right or not, other people will give you feedback on it. So stick your neck out in public. It's worth, you know, even if you're wrong, you just go, oh, okay, there's a better way of doing that. 

So that's kind of to encourage people to not get too hung up on imposter syndrome or whatever; just try things out. After a while, you get more confident because you...it's like standing on stage. The first few times you do it, you're scared. After you've done it a long time, it's no big deal. You get, oh yeah, there are 10,000 people in the audience, fine, okay. I'm just going to say whatever I say. It doesn't matter anymore. 

But there's a sense of building up that confidence for sharing what you know is sort of a good way to build a personal brand that will get you the next job. It'll make the industry better, things like that. And then the rest of it is just trying to find a little gap where people don't see...if some people seem to find things difficult and you find them easy, that's probably a little space to go build some expertise. Most of my career has been trying to find a path like that, which means I'm now --

ANA: Experiment and share. [laughs] 

ADRIAN: Yeah. The trouble is you end up...I have this problem. I'm not a typical thing, right? I'm not a VP of engineering. I'm not a developer. And I have a weird mixture of backgrounds. I've worked in marketing; I think probably more than I've worked in engineering, reporting to the CMO at Amazon for AWS for a while. I'm sort of branded as a technologist, but I've been able to leverage that into something that's much more communicating about technology really rather than anything else. 

So anyway, that's what I've found. I'm not sure if that's useful. One of the things is just thinking about chaos engineering and where that's going. I think it's important to understand how computers are controlling more and more of what we do. And they're getting more involved in controlling our physical lives, and there are more safety-critical things happening. So I think it's important to understand safety. 

And there's...Todd Conklin has a really good podcast called The Pre Accident Podcast that I listen to. I've been listening to it for years. I've actually been on it a couple of times to talk about chaos engineering. And Sidney Dekker. There's a whole bunch of other books and authors there. It's really important to understand what's happening out in the physical world because a few years later, you're implementing some software that's actually modifying or controlling something that's going to be safety critical. 

And there's a bunch of principles there and just trying to make sure the usability...the usability of things when they're in failure modes is probably the weakest thing I see, and that's why you need chaos engineering. If you don't test it, you don't know what it's going to do when it's unhappy. And it will probably --

ANA: It's going to break on its own.

ADRIAN: It will break on its own, but it will break in a way that confuses you. And then, when you try and fix it, you'll make it worse. And you see that over and over again. That's pretty much what happened at Chernobyl and all these other sorts of big runaway events. It's like operators got confused, and then they pushed all the wrong buttons and pulled the wrong levers and rebooted the wrong machines, and everything got worse. So that's kind of the...you have to sort of practice: how do you fix it when it's unhappy? And how do you get it to fix itself so it becomes more stable?

ADRIANA: I think it's really good advice because I think sometimes, as developers, we're almost too scared to push our systems to the limits because it's like, okay, I got it working. Don't anger the code, right? 

ADRIAN: Yeah.

ANA: [laughs]

ADRIANA: And we have to get over that fear because otherwise, weird things are going to happen in prod, and we will be completely unprepared, and it'll be worse. 

ADRIAN: Yes. Like test to pass or test to fail, right?

ANA: Yes.

ADRIANA: Mm-hmm.

ADRIAN: Like, that's really...and the whole test-driven development, which people should probably be...one of the things actually...this is interesting with the AI chatbots as well. If you write a test and tell your chatbot to write code that passes this test, they've got something they can work against. And they can run that code, and you say, no, this is the error message I got. And you've actually...if you do test-driven development and get the chatbot to do the rest of the work, right? That's a lot of, I think, where we're going to end up. 
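
The test-first loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `slugify`, `check_slugify`, `feedback`, and the two hand-written "attempts" are hypothetical names standing in for whatever a chatbot might generate, and no real AI service is called.

```python
import traceback


def check_slugify(slugify):
    """The test, written before any implementation exists."""
    assert slugify("Hello World") == "hello-world"
    assert slugify("  On-Call  Me Maybe ") == "on-call-me-maybe"


def feedback(candidate):
    """Run the test against a candidate implementation.

    Returns None when the test passes, otherwise the failure text that
    would be handed back to the code generator for another try.
    """
    try:
        check_slugify(candidate)
        return None
    except AssertionError:
        return traceback.format_exc()


def first_attempt(text):
    # Hypothetical first answer: mishandles leading and repeated spaces.
    return text.lower().replace(" ", "-")


def second_attempt(text):
    # Hypothetical revised answer after seeing the failure text.
    return "-".join(text.lower().split())


for attempt in (first_attempt, second_attempt):
    error = feedback(attempt)
    print(attempt.__name__, "passed" if error is None else "failed")
```

The test is the fixed target; only the candidate code changes between rounds.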

What are the things this has got to do? And then what happens when it fails? How do you want it to behave? And that's really this sort of continuous delivery and having a...in your delivery pipeline for a developer, having chaos engineering be part of that pipeline, rather than something operations does like the data center failover that's supposed to be done every year.

So move it from being part of an annual exercise to being something that's part of the continuous development process. That's what Harness is, is a CI/CD company. They just added that capability. They launched that last month. So those are the sorts of things that I think are interesting right now.
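
As a minimal sketch of what a chaos check inside a delivery pipeline might look like, the Python below injects a dependency failure and verifies that the service degrades gracefully instead of crashing. Everything here is hypothetical (`CatalogClient`, `home_page`, `chaos_experiment`, the fallback list); it does not represent Harness's or any other vendor's tooling.

```python
class CatalogClient:
    """Hypothetical downstream dependency."""

    def __init__(self):
        self.inject_failure = False  # chaos switch, flipped only by the experiment

    def top_titles(self):
        if self.inject_failure:
            raise ConnectionError("injected failure: catalog unreachable")
        return ["title-a", "title-b"]


FALLBACK_TITLES = ["cached-title"]


def home_page(client):
    """Return titles, falling back to a static list when the dependency fails."""
    try:
        return {"titles": client.top_titles(), "degraded": False}
    except ConnectionError:
        return {"titles": FALLBACK_TITLES, "degraded": True}


def chaos_experiment():
    """Check steady state, inject a failure, verify degradation, then recover."""
    client = CatalogClient()
    assert home_page(client)["degraded"] is False  # steady state

    client.inject_failure = True  # break the dependency on purpose
    page = home_page(client)
    assert page["degraded"] is True and page["titles"]  # still serves something

    client.inject_failure = False  # remove the fault
    assert home_page(client)["degraded"] is False  # recovers on its own
    return True


print("chaos experiment passed:", chaos_experiment())
```

Run as an ordinary pipeline step, a check like this fails the build whenever a change breaks the fallback path, rather than waiting for an annual failover exercise to find it.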

ADRIANA: That's awesome. Just before we wrap up, I know you've got some upcoming talks. So, do you mind telling us what some of them are?

ADRIAN: I'll be at GlueCon in Denver in May, Speeding up Innovation and the Tipping Point. So this is generally talking about the sort of nature of innovation and the way that we see these different things moving. And that whole conference is really turning into...there's a lot of talks there around this sort of ChatGPT, like, what's that going to do? 

The whole conference is one of those ones that tend to follow the latest bleeding-edge stuff. And so GlueCon doesn't have a fixed theme. The theme is whatever is new this year, and they just jumped on it. So the agenda is full of discussions around what is the impact of this sort of AI-driven sort of development operations stuff? So that's going to be interesting. And then I'm going to be at Monitorama. I'll see maybe both of you there.

ANA: Yes. We'll be there. 

ADRIANA: Yeah, my talk's right after yours. [laughter]

ADRIAN: I came up with this really cool idea. I called it a tale of two histograms. It was the best of response times, it was the worst of response times, which is the opening line from "A Tale of Two Cities" by Dickens. And somehow, I've got to get the famous quote from the end; it was a far, far better thing, something or other. I'm going to see if I can work that into my talk somewhere. 

ANA: [laughs]

ADRIANA: Cool.

ADRIAN: But anyway, I've been messing around with histograms of response times. And I had this very geeky idea for a talk, which is mostly a bunch of code in R that I've been fussing with. And then Jason said, "We'll make that the opening talk." And -- [laughter] 

ADRIANA & ANA: No pressure. 

ADRIANA: No pressure. [laughs]

ADRIAN: All right. So I got to spend some more time coding to get that a bit more sorted out into some better, slicker talk if it's going to be the opening one. And then I think I'm going to be at Lean Agile Scotland in Edinburgh in September and then maybe at an event in Athens in Greece at the end of September. So I'm generally accepting invites from places that are interesting that I'd like to visit. [laughter] That's my sense rather than...Since I don't work for anyone anymore, that's my kind of algorithm for deciding where I want to talk, other than places like Monitorama, which is really a tribe. 

ADRIANA: [laughs]

ADRIAN: It's a tribal gathering, and I like to be part of the tribe there.

ADRIANA: That's awesome. I'm definitely looking forward to meeting you in person at Monitorama. That'll be awesome. 

ADRIAN: Yeah. It's been very cool.

ADRIANA: Thank you so much, Adrian, for joining us on today's podcast. 

Don't forget to subscribe and give us a shout-out on Twitter via oncallmemaybe and Mastodon. We're on all the socials these days, I think. Be sure to check out [laughs] the show notes on oncallmemaybe.com for additional resources and to connect with us and our guests on social media. For On-Call Me Maybe, we're your hosts, Adriana Villela and...

ANA: Ana Margarita Medina, signing off with...

ADRIAN: Peace, love, and code.

ANA: Whoo.

[laughter]

ANA: Yay.
