This is PART 2 of my conversation with Michael Hart. View PART 1.

About Michael Hart

Michael has been fascinated with serverless, and managed services more generally, since the early days of AWS because he’s passionate about eliminating developer pain. He loves the power that serverless gives developers by reducing the number of moving parts they need to know and think about. He has written libraries like dynalite and kinesalite to help developers test by replicating AWS services locally. He enjoys pushing AWS Lambda to its limits. He wrote a continuous integration service that runs entirely on Lambda, as well as docker-lambda, which he maintains and updates regularly and which has gone on to become the underpinning of AWS SAM Local (now AWS SAM CLI).

Twitter: @hichaelmart
GitHub: github.com/mhart
Medium: medium.com/@hichaelmart


Transcript:

Jeremy: Alright, so now we're going to go to the next level stuff, right? So if you’ve been...

Michael: That's not next level enough for you.

Jeremy: Well, that's what I’m saying. If you made it this far, I hate to tell you what we just talked about was kid's stuff, right? We're going to the next level. Alright. So you have been working on a new project called Yumda, right? Tell us about this. Because this thing — this blows my mind.

Michael: Right. So this is basically what was born out of the realization that people have struggled traditionally to get things compiled, native binaries or anything like that, for Lambda. For example, if you do want to write a CI system like lambci, then you will need some sort of git binary or a git library. But I would suggest using the git binary because libgit is just not there with all the features. But, you know, you'll need the git binary running on your Lambda so you can do a git clone of the repo that you're then going to do your CI test on, and getting that on Amazon Linux 1 was kind of hard enough. Getting it on Amazon Linux 2 is much harder, because there are so many fewer dependencies that exist there. I think Amazon Linux 1 already had curl on it. You know, if you're in Node.js 8, you could just shell out to curl, and git has curl as a dependency. So if you're compiling git for, you know, the older runtimes, you didn't need to worry about curl or anything like that. You just needed to worry about git. On Amazon Linux 2, you don't have curl, you don't have some really, really basic system libraries. So if you want to get git running on Amazon Linux 2, you need to pull in a lot of stuff yourself. And I got to thinking, well, what would be the best way to provide, you know, a bunch of pre-built packages out of the box? Yes, you could use layers. And I think layers are a great idea for very high-level packages, very, very large binaries that have a huge tree of dependencies or certain utilities. But it's impractical to be creating a layer for every single dependency that your native binary's going to use. You don't want to be creating one layer for libcurl, and another layer for libssh and another layer for this. Firstly, you're limited to five layers that you can currently use in your Lambda, so you'd need to be squashing them together anyway.
And secondly, with layers, certainly as they stand at the moment, there's no particularly good discovery around them. It's nothing like doing an `npm install` or a `yum install` or something like that.

Jeremy: Well, and I also think that many of those layers, that if you install five layers, that a lot of those might be sharing dependencies under the hood as well, like they might have shared dependencies and then you might be installing those twice or three times. I don't know if they would...

Michael: Right, right, they could be clashing.

Jeremy: Yeah.

Michael: No, no.

Jeremy: But anyway, sorry.

Michael: No, no. So that's another consideration. So I thought, well, ideally, what people want to do, and this is certainly what people do in the container world, if you're writing a Docker container, you know, and you need native dependencies, one of the first steps you'll do in your Dockerfile is you'll do `yum install` of whatever dependency you need. And that'll go down and pull all the sub-dependencies and then that will be installed in your Docker container. And then you can, you know, run your app from there knowing that this stuff exists. We don't have anything like that for Lambda, so I thought, well, I want to run `yum install` essentially and have all those packages, all those Amazon Linux 2 packages that are there, you know, why couldn't I just get them and install them for Lambda? And the reason that you can't do that is when you run a `yum install`, it installs in the system directories; it installs software in /usr/bin, or /usr/lib64 if it's a dynamic library. And you can’t install to those places on Lambda. You can only install to, if you're using layers, /opt, so /opt/bin is in the PATH and /opt/lib is in the LD_LIBRARY_PATH, which is where dynamic libraries get loaded from. So you need to make sure that your binaries and your dynamic libraries sit in those paths. That's where they'll be unzipped to, essentially, when your layer is mounted, or to /var/task if you've bundled them up with your Lambda function. So you need to make sure that the binaries that you're shipping and the dynamic libraries that you're shipping are okay living in those paths, and there's a lot of binaries and libraries out there that aren’t. You can't just copy them from /usr/bin to /opt/bin because something's been compiled into that binary that assumes it's living in /usr/bin. There are a bunch that you can just move around, and that is a good first test. You may as well try it out, see if you can move a library from here to there or see if you can move a binary from here to there.
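As a rough illustration of why a binary may not survive being moved from /usr/bin to /opt/bin, the sketch below scans a file's bytes for embedded strings that start with /usr/, similar in spirit to running `strings` and filtering the output. The helper names `find_hardcoded_paths` and `scan_file` are hypothetical and not part of any tool mentioned in the conversation. An empty result does not prove a binary is relocatable, and a non-empty one does not prove it will break; it is only a hint about what to check before trying the move.

```python
from __future__ import annotations

import re
from pathlib import Path


def find_hardcoded_paths(data: bytes, prefix: bytes = b"/usr/") -> list[str]:
    """Return the unique printable-ASCII strings in `data` that start with `prefix`.

    Hypothetical helper for illustration only: it looks for runs of
    non-space printable ASCII bytes beginning with the prefix.
    """
    pattern = re.compile(re.escape(prefix) + rb"[\x21-\x7e]*")
    return sorted({m.group().decode("ascii") for m in pattern.finditer(data)})


def scan_file(path: str, prefix: bytes = b"/usr/") -> list[str]:
    """Read a binary or shared library from disk and scan it for embedded paths."""
    return find_hardcoded_paths(Path(path).read_bytes(), prefix)


if __name__ == "__main__":
    # Example input: a fake "binary" with two embedded paths separated by NUL bytes.
    sample = b"\x00/usr/share/git-core\x00abc/usr/lib64/libcurl.so\x00"
    for found in find_hardcoded_paths(sample):
        print(found)
```

Calling `scan_file` on a real binary before copying it into a layer would list any /usr/ paths compiled into it, which are the ones most likely to cause the "I can't find this file" failures described here.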
But there might just be something down the track while you're using it, where it’s suddenly like, hey, I can't find this file, or maybe it's depending on a configuration file in a path that's been hard-coded as well, and you can't get your configuration file to that path because it's not writeable by you. So what I did was I took the Amazon Linux RPMs, and, you know, all these RPMs are open source. You can get the source RPMs. RPM is this sort of Red Hat Package Manager format for what a native package looks like on Red Hat Linux and all of its various children, including Amazon Linux, which sort of stemmed from Red Hat. So RPMs are what `yum install` will use to install. So I pulled all these RPMs down, and then I just re-compiled them. Instead of /usr being the path they were compiled for, I compiled them for /opt. So then, you know, I had all these packages that I had re-compiled, and then I created just a little Docker container that has yum on it that is configured to install these RPMs in the right place, because if you ever do want to `yum install` a package in a non-system directory, you have to provide a bunch of configuration to it to let it know that you're doing that. So I sort of pre-configured all that and to talk to the yum repo that I had se...
