The Modern .NET Show

S07E03 - Chainguard and Securing Your Containers with Adrian Mouat

Sponsors

Support for this episode of The Modern .NET Show comes from the following sponsors. Please take a moment to learn more about their products and services:

Please also see the full sponsor message(s) in the episode transcription for more details of their products and services, and offers exclusive to listeners of The Modern .NET Show.

Thank you to the sponsors for supporting the show.

Supporting The Show

If this episode was interesting or useful to you, please consider supporting the show with one of the above options.

Episode Summary

In this episode Adrian Mouat, who works in Developer Relations at Chainguard, discusses his extensive background in computer science, technology, and, specifically, containerisation. His journey began with studying computer science and high-performance computing, leading to various roles in the field, including a significant tenure at EPCC (The University of Edinburgh) and writing one of the first books on Docker. Now at Chainguard, Adrian’s focus is on secure and minimal container images, directly engaging with the community through blogs and videos.

The conversation transitions into the topic of containerisation, where Adrian explains what containers are and the problems they solve. Containers can be likened to lightweight virtual machines that encapsulate software and its dependencies, providing reliability and reproducibility across environments. Adrian emphasizes how containers help developers avoid conflicts in dependencies, making them key tools for creating consistent development environments. This is particularly relevant for developers working on legacy projects who may not be able to incorporate containerisation readily.

Adrian elaborates on the evolution of container technology, highlighting the benefits of tools like Docker and the emergence of development containers which optimize the developer experience. The discussion touches upon the learning curve associated with adopting containerisation, the complexities of building containers, and the need for practices like caching to improve development speed. Adrian acknowledges the initial drawbacks of containerisation but emphasizes that modern tools have significantly mitigated these issues, allowing for easier adoption and integration into development workflows.

Security is also a major theme, and Adrian dives into the challenges surrounding container security, especially concerning vulnerabilities within base images. He details the importance of minimizing attack surfaces by using smaller, more secure base images and discusses the evolution of Google’s Distroless images as a response to common security threats. Adrian draws attention to the critical nature of keeping images up to date, addressing potential vulnerabilities swiftly, and the role of automated scanning tools in protecting against security risks.

Furthermore, the conversation touches on the concept of the Software Bill of Materials (SBOM) and the significance of these documents in the context of regulatory compliance and supply chain security. Adrian highlights that transparent documentation of dependencies and their sources will eventually empower organizations to assess and manage vulnerabilities more effectively. He emphasizes the need for continuous collaboration with third-party suppliers to ensure that SBOMs cover all layers of software in a given stack.

As the discussion concludes, Adrian outlines how Chainguard facilitates regular updates and provides security advisories for their containers, aiming to ease the burden on developers managing vulnerabilities. He encourages organizations to automate updates and to leverage the security features offered by Chainguard’s images. Ultimately, Adrian invites listeners to explore Chainguard’s resources through their academy site and GitHub repositories, where they can learn more about secure image practices, and engage directly with them for questions or demos.

Overall, the interview showcases Adrian’s expertise and the proactive measures Chainguard is taking to enhance security and usability in the rapidly evolving landscape of container technology. The insights shared are particularly valuable for developers seeking to integrate containerisation securely and efficiently into their workflows, ultimately fostering a culture of security and accountability in software development.

Episode Transcription

Okay. So I’ll come on to that point, as that’s obviously something I’d like to talk about. But a couple of things I should mention, I guess. I think you’re absolutely right with all the points you raised, but we are trying to work on everything there. So a couple of things are worth pointing out: one is docker init; so nowadays if you start a new project with Python or Node or whatever, you can run the docker init command, and what that will do is create a Dockerfile and a couple of other files, I think, to help you get started, and it sort of contains the best practices. So it tries to help you get over the hump of understanding how to create a Dockerfile, and all the different ways you can build that, without needing to know everything. So I think that really helps.

- Adrian Mouat

Welcome friends to The Modern .NET Show; the premier .NET podcast, focusing entirely on the knowledge, tools, and frameworks that all .NET developers should have in their toolbox. We are the go-to podcast for .NET developers worldwide, and I am your host: Jamie “GaProgMan” Taylor.

In this episode, Adrian Mouat joined us to talk about Chainguard, what a distroless container is, a number of tools that you can use to check whether your containers have any CVEs present, attestations and reproducibility, and a number of ways to secure your applications once they are running in the wild.

Yeah, I like your point there about showing your receipts. So in attestations, you can also say things like, you know, “we did do this on this image.” You can create an attestation that says, “hey, I ran a scanner on this image and I had this output at this time.” And because it’s all signed, you know that that did happen, if you like. Yeah, and also like, you know, you could have an attestation that said, “I ran these tests on this image at this time and this was the output,” sort of thing. So it’s sort of proving that certain steps were taken.

- Adrian Mouat

Anyway, without further ado, let’s sit back, open up a terminal, type in dotnet new podcast and we’ll dive into the core of Modern .NET.

Jamie : [0:00] So Adrian, thank you ever so much for being on the show. I am always really appreciative of people taking time out of their day or evening or whatever time of day it is—I don’t know, I’m just making stuff up here—whatever time period it is, where people are willing to sit and chat with me. So I really appreciate it, thank you.

Adrian : [0:19] Oh, thank you very much for having me.

Jamie : [0:21] Amazing, amazing. Now we were going to be joined by your colleague Jordi, but he’s rather busy at the moment, so he isn’t able to join us. But I just want to make an audio shout out and just say, “thank you, Jordi, for this.” I really appreciate him getting in touch and saying, “hey, you should check out what Adrian’s doing,” and all that kind of stuff. So that’s pretty cool of him.

Adrian : [0:41] Yeah, he is on a flight to Brussels, I believe. So we’ll let him off for this one.

Jamie : [0:46] Yeah, yeah. I mean, he could have dialed in from the plane, but it would have been a bit loud, right?

Adrian : [0:52] Yeah.

Jamie : [0:54] Amazing, amazing. Okay, so I wonder before we start would it be possible to, sort of, give the listeners perhaps an elevator pitch or a little bit of a flavour about you and the kind of work you’re doing; or maybe your history with, you know, computers and technology, or neither of those. Whatever you want to share, you know, just so that people can get an idea of who Adrian is.

Adrian : [1:13] Yeah. So I jotted down a few notes, I’m not really sure how much to give you. I have to say, when I give a talk live, quite often some people give a whole slide of who they are, [but] I tend to skip over it. But yeah, in this case, I think probably a couple of things are interesting. So currently I’m a technical community advocate—or Dev Rel—for Chainguard; and at Chainguard we basically make secure container images. So that means I’m, kind of, always engaging with the community, and writing blogs, and recording videos on everything related to secure and minimal container images.

Adrian : [1:49] Now I’m not sure how far to go back, but basically I did computer science at uni, and then I did an MSc in high performance computing. So that was all about parallelism and supercomputers and so on. And then I worked a few jobs, and I returned to where I did the MSc—EPCC (The University of Edinburgh)—and I worked there for, like, almost a decade, I think, on various European projects.

Adrian : [2:13] But then I was doing contracting. And I was working with Pini Reznik and Jamie Dobson, and we started looking at containers. And that actually ended up with me writing “Using Docker”, which is one of the first books on Docker and containers for O’Reilly. And then after that, I was chief scientist at Container Solutions for a while, which basically meant that I led the research projects and a few interesting things came out of there.

Adrian : [2:40] But most recently, yeah, I’m Dev Rel at Chainguard. Yeah, so just teaching people about secure and minimal containers, really.

Jamie : [2:49] Nice. Nice.

Jamie : [2:52] So, I guess, we are going to be talking about securing containers and stuff in our conversation. But I wonder, I mean, you wrote the book, right, on Docker. So I wonder if you could give some of our listeners a brief overview of what containers are, what they help us to achieve, and stuff like that. And this is literally because—and I’m not trying to throw shade on anyone who’s listening who’s in this situation—a lot of folks in software development are working on those, sort of, brownfield projects or those older projects where it’s perhaps not feasible to containerise things, and to do stuff like that. And so because of that, they haven’t had a chance to, sort of, use them in their day-to-day work. So I wonder, for those folks, who maybe containers have passed them by and they’re like, “I really wish someone would explain it,” would you be able to perhaps speak to what they are and what kind of problems they solve?

Adrian : [3:50] Yeah of course. So it’s 10 years ago [that] I wrote that book, so apologies I can’t remember exactly how I described it there. Yeah, it absolutely makes sense that people may not have encountered containers, especially if you’re perhaps more on the development side of things. I guess it’s more of a DevOps and Ops tool than a pure development tool, although I do see it becoming more and more relevant to development over time.

Adrian : [4:17] So I guess the reason I saw the usefulness of containers straight away is because I’d been using Vagrant. Can you remember Vagrant? It was basically a way of programmatically creating a VM and, essentially, containers do the same thing but much faster and simpler. So a container is really just, you can think of it as a lightweight VM; that’s not what it is, because it doesn’t do virtualisation. But basically it’s a black box—well I shouldn’t say “black box”—but it’s a box that you can stick your software, and all its dependencies down to the operating system level, into. And the nice thing about that is, if I create a container and it works on my machine, I should be able to give anybody else the same container and it should work identically, because it has all its dependencies down to the operating system level except for the kernel.

Adrian : [5:09] Yeah so that’s where it really comes in: it’s really about, you know, having this isolated artifact that you can distribute.

Jamie : [5:19] Sure sure. And I think for a lot of folks they, kind of, they haven’t come across this, I guess, especially. So I had issues with when .NET first went cross-platform right; when they first created .NET vNext, and I was like, “amazing! I can run this app, it runs on my Mac. And then I can move over to one of my Linux boxes, you know, linux-on-the-desktop, and it runs there. And I can move over to Windows and I can run it there.” And then I was like, “wait, hang on. But the .NET framework folks are going to get all confused here, because all of their dependencies are maybe Windows specific, and things like that.” And I think that’s, you know, for those people who’ve been in non-cross-platform languages or non-cross-platform frameworks, they might be hitting upon these issues that they didn’t even realise were a thing for the longest time.

Jamie : [6:12] Or, indeed, it’s also if you’re not, from my perspective, even if you’re not going cross-platform right. Let’s say I’m going to pick on .NET Framework again: let’s say you’re using .NET Framework and you’re using, I don’t know Progress Telerik’s tools, and you’re locked to a specific version for maybe licensing or support reasons. You don’t want to accidentally update that every time you build, so you load that into your container right. So then like you said, all of your dependencies are already there, including their version numbers and all that kind of stuff, right?

Adrian : [6:43] Yeah, exactly. So you’re hitting on the point of reproducibility. Like, you know, it’s going to work exactly the same each time and have exactly the same dependencies and that you can specify that for other people. Yeah.

Jamie : [6:54] Cool, cool.

Jamie : [6:54] So I know you said earlier on it’s becoming more and more a tool, containerisation is becoming more of a tool, that’s related to developers. I’m not claiming any kind of, like, genius here, but I remember when, in the .NET world, when Blazor first came out, you had to install like a preview version of .NET to get it working. I’m using “.NET” here, it was “.NET Core” at the time; but yeah, I’m saying “.NET”.

Jamie : [7:20] You had to install this preview version and, you know, .NET Core and modern .NET are really nice in that they install side by side, but there’s still a little bit of setup there. And what I realised was, “hey. If I wanted to sacrifice the rich debugging tools,” which weren’t available anyway because it was preview, “I could totally just write the code and build it using a containerisation platform.” So I chose docker. I was like, “hey there is a docker image for the build and the run, so why don’t I just write the code, do a docker build, see if it fails, fix the failures, do a docker build, hey look it’s running!” right. And, you know, fast forward to today and we’ve got dev containers which is essentially that, but not. It’s really cool.

Adrian : [8:04] Yeah exactly, I mean back in 2014, I was using it for Python. So rather than use virtual environments, I used Docker containers to contain my Python applications; which is, you know, a very similar way of working to virtual environments, but arguably a little bit more versioned and contained. I think the downside at that point was it’s probably a little bit slower because you had to keep running your docker build. But yeah, that’s what they address with dev containers.

Jamie : [8:32] Yeah, yeah. And I remember, oh my goodness, I’m forgetting the name of the person. I’m furiously looking it up whilst I’m—ah, there we go. Yeah, so I remember watching a talk like eight years ago by Jess(ie) Fraz(elle). And I’ve never spoken to Jess Frazelle, so I don’t know the correct pronouns to use. I’m going to assume she/her, I apologise if that’s not the case. But during her talk, she was like, “right. I’ve got this bare bones Linux system, it’s got no UI. Guess what? I’m firing up Skype because it’s inside of a docker container,” and up it came. And that was like a super extreme version of what containerisation… now. I’m using the word “docker” here, we’re talking about Docker because it’s one of the many products; but it’s kind of become like, you know, the Americans would say “kleenex”, right; because nobody says, “hand me a tissue,” they say, “hand me a kleenex,” right.

Jamie : [9:28] So when you’re using containers, being able to take even your running apps that you didn’t author, and all of their dependencies, and maybe a window manager, and throw that up onto a small system is fantastic. Because then you’re isolating everything right.

Adrian : [9:44] Yeah, exactly. I know the talk you’re talking about from Jessie Frazelle. Oh, she had dotfiles or something, and she also had, like, examples for running Spotify, and Firefox, and so on from a container. So yeah, that was a really interesting project.

Jamie : [10:00] It really was. It really was.

Jamie : [10:01] What I’ll do for the folks who haven’t seen it, I will get the link to the video and put it in the show notes. I think it was at one of the DockerCons or something like that. It was fantastic. Just, sort of, showing off what you can do with it. But anyway, we talked about that for a bit too long.

Jamie : [10:15] So yeah, those are kind of the pros. The wrapping up of your dependencies, perhaps making it easier to build things, even things with like dev containers, for folks who haven’t… So real quick, for folks who haven’t done that: that is essentially—and this is working from memory here, I haven’t done anything with dev containers in about three months—you essentially describe, let’s say you’ve got a project that’s a .NET project… well I’ll use node, I’ll pick on node. I don’t mean to but I’ll pick on node right. Say you’ve got a project that uses npm version, I don’t know, 15, but the latest version is npm 18, and you know that it won’t run on the latest version. You can set up a small JSON file that says, “hey, I need this particular container because it’s got this particular version of all of my dev dependencies. And when I hit run inside of my IDE, I don’t want you to actually build it with the local tools. I want you to spin up the dev container, build it in the dev container, run it in that container, and then expose that out, but also give me rich debugging. So like I can actually jump in.” Maybe JavaScript is a bad example here because you don’t get breakpoints in that in the same way that you do in, say, .NET, but you can have that sort of feedback loop without having to install the SDKs, the runtimes, all of your dependencies on your host machine; keeps it sort of cleaner and easier to jump between projects.
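
For listeners who haven’t seen one, a minimal sketch of the kind of JSON file Jamie is describing is below. The image tag, port, and install command are illustrative only, not taken from a real project:

```jsonc
// .devcontainer/devcontainer.json
{
  "name": "legacy-node-app",
  // Pin the toolchain by pinning the container image, rather than a host install.
  "image": "mcr.microsoft.com/devcontainers/javascript-node:18",
  "forwardPorts": [3000],
  // Runs inside the container once it has been created.
  "postCreateCommand": "npm install"
}
```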

Adrian : [11:35] Yeah, sure.

Jamie : [11:37] Cool. So we’ve talked about, we’ve talked about the positives there. There’s lots of positives to containerisation. Um, some of the negatives, I guess the one that I jump onto straight away is, “hey, there’s a bit of learning to do.” But I know that there’s loads more, or rather there’s a couple more negatives. And I wonder, would you be able to talk to some of those?

Adrian : [12:00] Sure. I’m wondering which ones you’re thinking of.

Adrian : [12:05] So I used to try to run all my development work with containers. And sometimes you get stuck on the cycle.

Adrian : [12:16] Is it Microsoft that sometimes talk about the OODA (observe, orient, decide, act) loop? Where you want to be able to sort of debug and run things as fast as possible, so you don’t lose your sort of train of focus. And I think this has probably changed with dev containers, but it used to be that, you know, if you had to, like, build a container and run it to get things to work again, then it kind of broke your stream of focus, if you like. I’m not sure “stream of focus” is the term; I’m not sure what I mean.

Adrian : [12:43] But, you know, with dev containers I think that’s much, much better, but that has been an issue with it; like builds taking too long, and to address that you really need to look at caching and tools like dev containers. And definitely the learning curve: people bring that up. Honestly I don’t think it’s too bad, especially with Dockerfiles, because typically, if you know what Linux is, and how to use Linux, then that should be fairly familiar. I think it’s harder for people, like you know, from the .NET or Microsoft background, because you’re still typically building containers with Linux commands. There’s a learning curve there as well, yeah. But I’m wondering what other drawbacks you’re thinking about.

Jamie : [13:29] So I’m thinking mainly around the complexity, like you were saying there, there’s some complexity of like caching and all that kind of stuff. If you’re using a container to build your apps and your services, then you’re literally building from scratch every single time; whereas, like, inside of an IDE, you know, your IDE—or the tooling around your IDE—will be clever enough to go, “hey, you only changed this portion of this source file. I don’t need to rebuild everything, just the bit that needs to be rebuilt.” That only works with certain, you know, pre-built languages and stuff like that—pre-packaged languages—so I’m not sure whether that works that way with Python. But as an example, you know, with .NET, if I have two or three class libraries, and I only change one class library, and there’s no knock-on effect, I don’t need to recompile the other two. Whereas if I’m using a dev container, maybe I need to recompile all three. So there’s a little bit of complexity there. There’s also the complexity of sort of learning it.
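
Layer caching in the Dockerfile itself is the usual way to claw some of that speed back. A sketch for a .NET build is below; the project names are hypothetical, and the key point is that the restore step is only re-run when the .csproj files change, not on every source edit:

```dockerfile
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src

# Copy only the project files first, so the "dotnet restore" layer can be cached.
COPY MyApp/MyApp.csproj MyApp/
COPY MyApp.Core/MyApp.Core.csproj MyApp.Core/
RUN dotnet restore MyApp/MyApp.csproj

# Now copy the rest of the source; edits here don't invalidate the restore layer.
COPY . .
RUN dotnet publish MyApp/MyApp.csproj -c Release -o /app

FROM mcr.microsoft.com/dotnet/aspnet:8.0
WORKDIR /app
COPY --from=build /app .
ENTRYPOINT ["dotnet", "MyApp.dll"]
```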

Jamie : [14:25] And then the big one, which I guess we’ll come on to, which is the most important one: which is that sometimes, not always, but sometimes the container builders—I can’t think of the right word, the “developers” of the original container that you’re using—might be a bit slow to update for CVEs and things like that.

Adrian : [14:46] Okay. So I’ll come on to that point, as that’s obviously something I’d like to talk about. But a couple of things I should mention, I guess. I think you’re absolutely right with all the points you raised, but we are trying to work on everything there. So a couple of things are worth pointing out: one is docker init; so nowadays if you start a new project with Python or Node or whatever, you can run the docker init command, and what that will do is create a Dockerfile and a couple of other files, I think, to help you get started, and it sort of contains the best practices. So it tries to help you get over the hump of understanding how to create a Dockerfile, and all the different ways you can build that, without needing to know everything. So I think that really helps.
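
For anyone who hasn’t tried it, the flow looks roughly like this; the exact prompts and generated files vary with the Docker version you have installed:

```console
$ cd my-new-python-project
$ docker init
? What application platform does your project use?  Python
? What version of Python do you want to use?        3.12
? What port do you want your app to listen on?      8000
# docker init then writes a Dockerfile, .dockerignore, compose.yaml and
# README.Docker.md, pre-filled with current best practices (non-root user,
# cached dependency installs, and so on).
```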

Adrian : [15:33] And the other one is docker compose watch. So there’s been a lot of work recently, especially with Docker, on trying to be better about dealing with updates when you have lots of small files. I think that was one of the biggest problems, especially with Node and Python: they had lots of small files, and you’d change one thing and it would take forever to update, because it had a hard time figuring out when all these small files changed. But yeah, docker compose watch is much better at that now.
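
A sketch of the relevant Compose configuration is below; the service name, paths, and port are illustrative. Running docker compose watch with a section like this syncs source edits straight into the running container and only rebuilds when dependency files change:

```yaml
# compose.yaml
services:
  web:
    build: .
    ports:
      - "3000:3000"
    develop:
      watch:
        # Sync changed source files straight into the running container...
        - action: sync
          path: ./src
          target: /app/src
        # ...but trigger a full rebuild when the dependency manifest changes.
        - action: rebuild
          path: package.json
```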

Adrian : [16:04] But yeah, going on to your point about base images: that has been a problem for a long, long time. And there’s a couple of different things that go on there, and it ties into best practices and how to think about things, and the “latest” tag, and things like that. So, but, basically you’re quite right; so the issue is people, by default, will do something like “FROM debian”. So they’ll start their containers from a Debian or Ubuntu or other base image. And sometimes those base images are relatively large. So they might be 100 megabytes in size. And they might not necessarily be up to date.

Adrian : [16:49] Now, that might not be the distro’s fault. It might be and it might not be. But, yes, the problem with these large images is sort of twofold: one, they can just be large, which bloats your own image size and makes it more difficult to iterate on and also distribute. But also they tend to be, if you run a scanner… so we have to talk about scanners, which is quite a big thing.

Adrian : [17:19] So you have these tools like Snyk, Grype, Docker Scout, that you can run on a container image. And the idea is it will tell you if there are any CVEs present. CVEs being “Common Vulnerabilities and Exposures”. And these are known vulnerabilities that have been reported to people like the NVD (National Vulnerability Database). And so, ideally, you’re probably thinking, “well, I don’t want to have any known vulnerabilities in my container.” Unfortunately life is not that simple. And what you’re going to find is, if you run a scanner on a typical base image, you will probably find CVEs. And that might be a surprise to you if you’re not used to the industry. But that is kind of the way it works. In reality the CVEs that you find, like 90-odd percent of them, are unlikely to affect you; but the big issue is, “how do you know? What if the CVE is one of the one percent that is bad and could end up with my application being hacked?” So there’s a big movement to try and reduce the number of CVEs in images. And one of the best ways you can do that is just reducing the size of the image, because [if] there’s less software in the image then there’s less that can be hacked, if you like. There’s less attack surface there.
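
In practice this is a one-line check per tool. The image below is just an arbitrary public example:

```console
$ grype debian:bookworm
$ docker scout cves debian:bookworm
# Both list installed packages alongside known CVE IDs and severities; on a
# typical full-distro base image, expect a non-empty report even when nothing
# is actually exploitable in your particular deployment.
```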

Adrian : [18:43] Does that make sense?

Jamie : [18:44] Yeah, yeah. And that makes sense from like a security point of view even outside of containers, right? Because if I—what’s the best way to think about this? I usually think in, like, metaphors outside of software, right, because everybody deals with things outside of software. So maybe if I have… okay, I’ll use my canonical example of stealing a car, right? I’m going to have to start thinking like a criminal. Please don’t steal a car. But if you want to steal a car for the pure joy of stealing a car, you’re going to walk up and down the street and find the car that looks the least secure, right? You’re going to look for a car that doesn’t have one of those steering locks. You’re going to look for a car where maybe the folks who locked it had to use a physical key and turn it in the car door to lock it; because then that means that there’s a physical mechanism that you can work with. Maybe you’re going to look for ones where, once someone is parked up, they haven’t had to push a button to immobilize the car or set an alarm, right? Because if all you want to do is steal the car you’re looking for that greater attack surface of: it doesn’t have a lock, it doesn’t have a fob, it doesn’t have an immobilizer. And so to make your car more secure, you add those things right?

Adrian : [19:55] Right. Actually, I think what you’re talking about there talks to another point, which we sometimes refer to as “defence in depth.” And what that’s saying is, “you never rely on just one security method.” So as well as having your key fob, you might also have a steering lock or one of those gear stick things or something like that. And you might also have a tracker. So you can have multiple levels of security that are all trying to prevent the same sort of attack. But you never just rely on one level.

Adrian : [20:22] So it sometimes used to be, I certainly remember 20 years ago working with a company that thought they were secure because they had a firewall. But if you just rely purely on the firewall, the problem is [if] an attacker gets past the firewall then everything is open, and that’s not what you want. So you need to have, like, multiple layers of defence in depth. So definitely reducing, sort of, size can help with security; but you also want to be looking at things like Linux capabilities, so reducing what functions can be called from your containers; you don’t want to be running as root, so if an attack does happen then the attacker has fewer privileges. There’s things like seccomp, and yeah, basically it’s just multiple things you can kind of layer up, and then hopefully if one level doesn’t catch an attack then another level will.
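
At the Docker level, several of those layers can be stacked with standard run-time flags. This is a sketch; "my-app:latest" is a placeholder image name:

```console
#   --user            run as an unprivileged UID instead of root
#   --cap-drop ALL    drop every Linux capability the app doesn't need
#   --read-only       mount the container's filesystem read-only
#   --security-opt    block privilege escalation (a custom seccomp profile
#                     can be supplied the same way)
$ docker run --user 1000:1000 --cap-drop ALL --read-only \
    --security-opt no-new-privileges my-app:latest
```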

Jamie : [21:17] Sure, sure. Yeah, you’re right. There’s lots of stuff there. And yeah, I know that in the .NET space, there’s been a movement recently, I think, based on, you know, like you were saying there about not running as root. In the .NET space for containers there’s been a big thing recently, in the last year, of, “hey, finally our containers don’t run as root.” Which is great because then, like you said, right, for those who are maybe not a Linux or Unix experienced person: when you’re running as root you have full control over the computer, you can do all of the things that a privileged administrator user can do, up to and including, you know, opening ports, exporting data, wiping the machine, you know, doing things like that, installing apps, things like that. And so what you want to do is you want to, obviously, lock that down like you’re saying, by reducing the amount of privileges, reducing that attack surface. If the user that my app is running under only has access to the app, and maybe the directory that that application is running in in a non-container space, then the operating system will block a malicious user who has taken over that user from doing anything else that they shouldn’t be able to do. At least that’s the trust that we could place in the operating system, right. If we’re saying, “hey, do this thing,” it will always make sure that the thing is done, that the security is in place.
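
Baking that choice into the image itself is only a couple of lines. Recent official .NET images ship an unprivileged "app" user; the published output path and assembly name below are hypothetical:

```dockerfile
FROM mcr.microsoft.com/dotnet/aspnet:8.0
WORKDIR /app
COPY ./publish/ .
# Everything from here on, including the running process, is not root.
USER app
ENTRYPOINT ["dotnet", "MyApp.dll"]
```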

Adrian : [22:45] Yeah exactly that was a good explanation thank you.

Jamie : [22:47] Thanks. So then, okay, so we’ve got all of these things that could potentially be upsetting to our security friends relating to containers: they can be slow to update with CVEs; they can be set up so that they’re only running as the root user; they might be set up without certain security hardening in place, certain security apps in place. So, now that you’ve scared everyone, what’s the solution?

Adrian : [23:18] Right. So I wonder if, yeah, okay, let me start, go back a few years. So at this point, I would’ve been working at Container Solutions. One thing that people started doing was they started looking at what was the smallest base image you could get away with. And some people realised, if you can compile to a static binary—so some languages like C and C++, Rust and Go can compile to a completely static binary. So the idea is this binary is completely standalone. And you can copy a binary like that into what’s called the scratch container, which is a completely empty container. So basically, there’s no operating system at all inside this container. It just runs directly on top of the kernel, if you like. Just executes this binary, and that’s all it can do. And so in some ways, that’s the most minimal thing you can do, and in some ways, the most secure thing you can do.
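
A sketch of that pattern, with Go standing in for any language that can produce a fully static binary (paths and names are illustrative):

```dockerfile
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# CGO_ENABLED=0 asks the Go toolchain for a statically linked binary.
RUN CGO_ENABLED=0 go build -o /out/server .

# "scratch" is the completely empty image: no shell, no libc, no files at all.
FROM scratch
COPY --from=build /out/server /server
ENTRYPOINT ["/server"]
```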

Adrian : [24:15] But in reality, it’s quite rare that you can create a completely static binary. The vast majority of applications out there need a few more things to be in the container. So, for example, you might need, well, most applications need TLS certificates. A lot of applications expect things like a /tmp directory or temporary directories to be available. Or there’d be some structure like a /usr and a /home directory to be available. And that’s where this idea of the Google Distroless project came up. So at Google they—“they” actually being Matt Moore and Dan Lorenc, who went on to found Chainguard—did this project where they took the Debian operating system and they stripped it down as much as they could. So they had this really minimal container that’s only a few megabytes in size, 3 megabytes maybe, and contains just enough to run the average static binary.

Adrian : [25:17] So the idea is you have your TLS certificates, you’ve got some sort of basic Unix or Linux structure there, and nothing else. But that is perfect for just copying the binary in and running it. And notably what it doesn’t have: it doesn’t have a shell and it doesn’t have a package manager. And so they call this, because you strip out all the distro components, what’s called distroless. So that can be really fantastic for security, because if an attacker gets into a distroless container they don’t even have a shell, so they can’t even poke at things. It creates a really, sort of, I suppose a hostile environment—that’s maybe not the right term—but it certainly makes attackers’ lives harder, because there are fewer tools they can use for, sort of, living-off-the-land attacks and things like that. Yeah, so that was the idea behind distroless. So totally, it’s worth having a look at the Google Distroless project.

Adrian : [26:11] Now, the Google Distroless project created this container that you can use for static binaries. And they also created very similar ideas for like Java and Python, I think. Now, obviously, Java and Python, they can’t create static binaries. You have to have the JRE or the Python runtime available. So in those cases, they created runtime images where you could copy in your Java jar or your Python application directly. And so these runtime images were very cut down as well. Still didn’t have a package manager or a shell, but they had enough to run the Java application or the Python application.

Adrian : [26:50] Now, both Matt Moore and Dan Lorenc founded Chainguard. And at Chainguard, we took this even further. And we wanted to create a whole suite of distroless images. So we provide distroless runtime images for things like Java, but also .NET, and Node, and all the various programming languages. But we also provide application containers. So if you want a really cut-down, secure version of Redis or nginx or something like that, then we have that as well.
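
As a sketch of what that looks like from the .NET side, a multi-stage build can pair an SDK image with a distroless-style runtime image. The cgr.dev/chainguard image names and tags should be checked against Chainguard’s image directory, "MyApp" is a placeholder project, and depending on the SDK image’s default (non-root) user you may need to adjust file ownership:

```dockerfile
FROM cgr.dev/chainguard/dotnet-sdk:latest AS build
WORKDIR /src
COPY . .
RUN dotnet publish MyApp.csproj -c Release -o /app

# Distroless-style runtime: no shell, no package manager, just enough to run .NET.
FROM cgr.dev/chainguard/dotnet-runtime:latest
WORKDIR /app
COPY --from=build /app .
ENTRYPOINT ["dotnet", "MyApp.dll"]
```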

Jamie : [27:26] Okay, okay so what you’re saying is, I can check out the things that Chainguard do, and that will help keep our security friends happy.

Adrian : [27:39] Yeah I hope so.

Adrian : [27:40] So like if you run one of those scanners that we talked about earlier on our images, in most cases they should have zero CVEs, and certainly, you know, should be low to zero CVEs for most of the scanners on our images. Just, basically, a few ways we manage to do that: the first one is, you know, there’s an approach to minimalism just by having less in there there’s less software that can have a CVE. But also we’re very aggressive about keeping things up to date.

Adrian : [28:12] So I should say before we go any further, that our images, we have free versions of the images but they’re only on the latest version; so if you want to use our developer tier of image, which I thoroughly recommend you try out, but do be aware it’s always the latest version, so you’ll be on the newest version of Python or node or whatever. If you need an older version, unfortunately that’s a, “go and talk to sales,” conversation.

Adrian : [28:40] But the point I’m trying to get to is that one thing that’s really good for security and avoiding CVEs is keeping everything up to date, because you want to be running the latest software with the latest security updates in it. And that’s probably one of the number one ways I see people getting attacked: it’s just because they haven’t kept things up to date, and, you know, you’re running an old version of Python with a known vulnerability that attackers can potentially take advantage of. Yeah, so those were the two major things: we keep things very aggressively up to date, and we keep things small.

Adrian : [29:16] And the third thing is, because we have our own distribution called Wolfi, we can issue security advisories, and so we can, like, investigate any CVE that is flagged against our images, and we can look into it, and we can say, “actually, no. We’ve mitigated this,” or, “no, this isn’t a real CVE,” or, “actually we’ve updated this package ourselves so this is mitigated in our image,” sort of thing.

Jamie : [29:44] Yeah, yeah. I’m just looking at the .NET SDK image on the Chainguard website. And yeah, it’s the latest non-preview version. And looking at the advisories, there’s a couple of them. I mean, the ones that are listed as saying either “fixed” or “not affected.” So I guess the “not affected” is it doesn’t matter because this particular, like you said, “we’ve already upgraded this thing within the container itself,” right?

Adrian : [30:14] It could be that. So what you find is that the scanners, they have a lot of work to do. So they tend to be quite fuzzy about how they match things. So some of the “not affected” I’ve seen before are like, I think there’s one in Redis, and it was flagging a CVE, but the CVE had to do with the particular way another distribution had bundled it. So it was only a CVE in this one specific distribution; but still it was flagged against a lot of people’s build of Redis. So in that case we can just say, “hey, you know, we’re not affected by this vulnerability.”

Jamie : [30:50] Right, right. I see, I see. I really like the inclusion of SBOMs as well. Sorry, I was just speaking out loud there, but yeah, carry on.

Adrian : [31:00] No. Please continue.

Jamie : [31:02] Yeah, yeah. It’s just the inclusion of SBOMs as well. Because I think that we’re moving into an arena of software development where there are parts of the world where it is now a legal requirement to provide a software bill of materials—or an SBOM—and just saying, “well, I’m on the container and so I don’t need to create an SBOM,” or “I don’t need an SBOM for my runtime,” well, actually maybe you do. So I like that too; like I said, I’m just clicking around in the user interface and there is an SBOM button, and I can actually download it as JSON, which is fantastic. And it actually lists everything that’s installed there, which is, yeah, super important for those who need that kind of stuff.

Adrian : [31:47] Yeah, it’s probably worth talking about how that’s done for a second. So the way we store the SBOMs is in an attestation. So what you can do with container images and the Sigstore project is you can make various attestations about your containers. So the first thing, and most obvious thing you can do, is you can sign your containers. So we ship signatures so you can prove that the container you’re running was built by Chainguard and came from us. But we also ship other attestations, one of which is the SBOM; another one of which is the build config. So you can download the config for our images, and that will provide you the apko build configuration for our images. And [if] you run the apko tool on the build config from the attestation, you should be able to build an exactly identical copy of the image, which is actually relatively unusual in the container world. So you should be able to build a bit-for-bit identical version of the container yourself. Yeah, which is pretty amazing. You should be able to create a 100% identical copy of our images.
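
For anyone who wants to poke at those signatures and attestations themselves, Sigstore’s cosign CLI is the usual tool. A sketch is below; the image name and the signing-identity values are illustrative, so take the real ones from Chainguard’s documentation for the image you’re using:

```console
$ cosign verify cgr.dev/chainguard/dotnet-runtime:latest \
    --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
    --certificate-identity-regexp='https://github.com/chainguard-images/.*'

# Attestations (SBOM, build configuration, and so on) attached to the image:
$ cosign download attestation cgr.dev/chainguard/dotnet-runtime:latest
```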

Adrian : [33:02] The one issue is with the APK files. So all our images are basically just a bunch of APK packages put into a container image. Now, you may not be able to rebuild those APKs identically, even though the source is available. And that’s because the APKs include a signature inside them. You don’t need the private key to recreate the image, because the signature file is stored in a separate attestation. So you can’t recreate the attestation, but you still have the attestation for the signed image that was created with our private key, if you like. Does that make sense? I feel I’ve confused everybody now.

Jamie : [33:45] It makes sense to me, it makes sense to me.

Jamie : [33:48] So for the folks who are listening who may not know what an attestation is: to attest to something, just in case you don’t know, is quite literally, “hey, I can say…” I think Adrian did a really good job of describing it without actually saying the word; it’s a way of proving that some steps lead to some known output, and it’s a way of, almost like, showing your receipts to say, “this is definitely how we got here, and we can prove that this is correct,” which is kind of, not exactly, but kind of the idea behind things like SBOMs and things like that. To be able to say to those in regulatory circumstances, “we have built this software, and we have used these particular pieces of technology, and these particular steps, and if you follow these steps completely you will always get,” like you said, you end up in that provenance setup. You will always get the exact same build out the other end, you know. Nothing will ever change because we’re pinning our dependencies on these particular versions of packages, and these particular containers, and these particular tools, and things like that. So you should always get the exact same output.

Adrian : [34:59] Yeah, I like your point there about showing your receipts. So in attestations, you can also say things like, you know, “we did do this on this image.” You can create an attestation that says, “hey, I ran a scanner on this image and I had this output at this time.” And because it’s all signed, you know that that did happen, if you like. Yeah, and also like, you know, you could have an attestation that said, “I ran these tests on this image at this time and this was the output,” sort of thing. So it’s sort of proving that certain steps were taken.

Adrian : [35:30] In a lot of ways it’s all about supply chain security—that’s the buzzword, if you like. And you can also make comparisons to other industries, so if you look at, you know, the food industry for example: people that supply food have to be able to show where the ingredients came from, especially with things like livestock; but also to do with ingredients and so on. And, you know, you have to be able to show where the farm was, and even if you look on your packet of food there’ll be statements about where the food came from, when it’s good until, and things like that. So there’s a lot of analogies you can, sort of, see there as well.

Jamie : [36:09] Sure, and I mean, the SBOM itself comes from the bill of materials that manufacturers use, right. So it’s a similar sort of setup in the manufacturing space: if I’m building something, if I’m fabricating some thing, I need to, for regulatory reasons, tell someone—maybe the regulator, maybe the government, whatever—“this is what this thing is made of, in these particular steps,” so that I can then make that proof statement; show my receipts of saying, you know, “this is not harmful,” as in it isn’t made of harmful materials, and, you know, handling this thing will not cause you any problems; or indeed, if it does, these are the materials that are within it, right? “This is a container full of goo. The goo is corrosive, so please make sure that everyone who handles it has these particular things in place to make sure that they won’t get harmed.” Obviously, that’s the manufacturing side of things. We don’t usually have to worry about that in software.

Adrian : [37:08] I think you’ve hit on a great analogy there. So if you look back at something like Log4j, what people would really like to say is, “okay, I’ve got all these SBOMs. I can tell exactly which version of software is being run throughout my systems. And I can tell that my versions of Log4j are not vulnerable to this attack.” And hopefully you’d be able to, if it was all automated and you had the full information, you should be able to see that instantly. As it was with Log4j, people were dealing with this issue for weeks or even months, trying to figure out all their components and systems which were vulnerable to Log4j. So the sort of, you know, the sort of golden promise of SBOMs is being able to instantly answer that question about whether or not you have vulnerable components in your systems. Now I’m not sure we’ll ever get there, but certainly SBOMs, and producing complete SBOMs, are a step along the way towards that.

Jamie : [38:12] A hundred percent, yeah. Because when some big vulnerability happens, you can then have a team of people who then say, “are we affected by this? Everybody gather around with your SBOMs or whatever proof you need of the libraries you’re using with the version numbers. And we will search through, we can then decide if the sky isn’t falling. No one’s head is going to roll because we’re not using this particular version of this particular thing,” or indeed, “if we are using this particular version of a vulnerable package or a vulnerable system, we can then say categorically we know that two of the 50 systems are using this, and so we need to invest some engineering time,” and then you can figure out how much that engineering time’s going to cost and all that kind of stuff. So it keeps all of those people who need to know these things happy, right.

Adrian : [39:02] Exactly. But do note that this sounds simple when you put it like that. I think the main reason it becomes complicated is transitive dependencies. So okay, you can say this for the software you build, maybe not easily but certainly more easily than you can about anybody else’s software; because, like, say you’re using components from somewhere else, or you’re building software using, you know, third-party libraries, but you may also be using third-party software as a black box, essentially, in your systems. And so you need to know the full SBOMs for all of those, and in turn you need to know the SBOMs for the software they’re using. So it’s a, you know, whole “turtles all the way down” problem.

Jamie : [39:50] 100%, 100%. Yeah, it’s certainly not something we can fix overnight; it’s certainly not something that fixes immediately just by producing an SBOM for our pieces of software. But I guess it gets us, if we can make it an industry… I mean, there are much more influential and much smarter people than I shouting about SBOMs, but if we can make it a thing that just happens by default then we’ll be able to get them for everybody. Which is why I really appreciate that the containers that you all are creating over at Chainguard have them anyway, because then I can say, “well, okay, so my software that I’m building, maybe I’m using some third-party libraries and software. Maybe I’m using a Progress Telerik or something.” I’m only picking on them because they’re a big name. And quite frankly, they can take me mentioning them once or twice. There’s no maliciousness or anything. It’s just something that everybody knows.

Jamie : [40:44] My app is being built with, say, .NET 8 and it is using a Progress Telerik plugin, and it is running on, for instance, a Chainguard container, I know that the very top and the very bottom I have SBOMs. I’ve got an SBOM for my software as it was built, I’ve got an SBOM for the container. All I need to worry about is the bit in the middle, so then I can then reach out to whomever at—I don’t know if they provide SBOMs, but let’s pretend that they do—I can reach out to Progress and say, “hey, can I get an SBOM for this particular version of this particular app? My, you know, where I’m selling this software requires a full SBOM all the way down,” and I’m sure that they will have the ability to create those. But with the creation of these things we will need to be poking at people and saying, “hey, can I create one or can you create one?” you know; just so we can get a good idea of what’s going on everywhere, because like you say, I think that’s really important that it’s not just the software we build, right?

Jamie : [41:45] I always tell people who are getting into the industry, you aren’t going to write every single line of code. You’re going to pull in a library. You’re going to communicate with that library, maybe transform the data from your format into a format it wants, push it into that library. And then when you get the response, translate it back into the format you want and you’re done. So, you know, a lot of us are slowly becoming kind of middleware engineers and we’re plugging the Lego bricks together, but it’s needing, we need to know where those Lego bricks have come from and what they’re made of and how they were made, right?

Adrian : [42:18] Yeah, yeah. I think that’s perfect. I think, also, there are some people that are sceptical of SBOMs. And the reason, or one of the major reasons, is it feels like they’re of very limited use until they’re ubiquitous, right? Until I can get an SBOM for everything all the way down, it’s all a bit flawed. And there’s a lack of tools at the minute for building SBOMs, and there’s a lack of tools for making sure SBOMs are complete, and there’s a lack of tools for just interrogating SBOMs and getting information out of them. So it feels like there are still quite a few steps to go through until SBOMs are truly useful, but hopefully we’re getting there.

Jamie : [43:01] Yeah, totally, totally. And you know, Chainguard is helping by producing SBOMs by default, like I said earlier right. I’m really impressed by that, it makes sense you know, you guys are after making things more secure, why would you not produce SBOMs right? If you didn’t, I’d be asking, “what’s going on?”

Adrian : [43:22] Yeah exactly.

Jamie : [43:24] Okay, so let’s talk about how often these get updated then. Like, say I’ve pulled a Chainguard container so that I can do some… I can host some app. And then I hear, you know, two weeks down the line, “hey, there’s a problem with,” maybe I’m writing something in Python, “hey, there’s a problem with this particular library or this particular thing. You need to go update your stuff.” Do I then, like, are you already on that? And do I just do, like, you know, a docker pull and re-pull the image and then away we go? Or, like, how does that all work?

Adrian : [44:01] Right, so that’s a good question. So unfortunately, there’s a few different answers here. Let’s start with the containers. So we have our own Linux distribution that I’ve alluded to a few times, but it’s called Wolfi. We sometimes call it an Undistro because we don’t ship the—it’s not a full Linux distribution. Mainly because we don’t ship a Linux kernel. It’s just a bunch of packages that have been built from source, and distributed via the APK format. APK is the Alpine package manager format; we’re not an Alpine distribution but we use some of the same tools and formats as Alpine.

Adrian : [44:41] We’re actually not compatible with Alpine packages, so you can’t mix and match Alpine and Wolfi packages. Wolfi is compiled against glibc all the way down, unlike Alpine, which uses musl. But for all the Wolfi packages, we have a whole bunch of automation in place which tries to update our packages whenever an upstream project releases a new version. So if package foo comes out with a new release, typically within four hours we’ll have the new package version in Wolfi. And then, of course, we’ve got to build a container from it, and that should be done within a day as well. So typically, from a new release hitting an upstream project, the next day we’ll have a container with that new version of the software in it.

Adrian : [45:29] Now, you did mention like Python packages. So if one of our images is built with a Python package, we should get the update. Assuming the upstream project has released a new version [which] addresses that vulnerability, we should get the update within four hours and have a new image ready the next day. Now, you might be pip installing that package in your own dockerfile in which case, unfortunately, you’re going to be responsible for updating that. I’m hoping, over time perhaps, we’ll have more packages within Wolfi that you can install via Wolfi and we’ll take care of updating; but at the minute if you pip install something you would be responsible for updating that yourself.

Jamie : [46:11] Sure, sure. That makes sense, right. In the same way that if I’m doing a .NET thing and I’m doing a dotnet add package, I’m responsible for making sure that I have a pinned version that has an update. But it’s really impressive that potentially within, you know, you’re saying within four hours, there might be a new build, which means that maybe, you know, within a day, there’s a new version of the container. That’s incredibly impressive.

Jamie : [46:38] You know, especially with things like, you know, enterprise-level software; I’ve been in a position where I’m like, “right, we need an update for this particular library,” and the author—because it’s that old xkcd comic of, you know, the whole world is built on software by one person who lives in a shack in Nebraska or whatever, right—you’re waiting for that person to update it. Or indeed, if you’re getting the permission from, you know, the person in charge of your project, you’re then submitting a patch to actually update it for everybody. And then it goes through the build process and pushing out to different places. Like, for instance, if you’re relying on an operating-system-level package: let’s say we’re not using Chainguard, we’re not using Wolfi, we’re using maybe—I don’t know—Ubuntu or Debian or something to host our app. If something in the operating system has been updated, [and] you’re on the cloud, you’re at the mercy of whenever they update that cloud image with your operating system. That could be, you know, a couple of hours, it could be tomorrow, it could be next week. If you’re running internally, now you’ve got your whole IT team screaming at you—your whole ops team screaming at you—saying, “we can’t update this image because, you know, some software somewhere or some process somewhere relies on the current version of it and we can’t verify it,” right. So you’re in that horrible situation; whereas if you wrap it up in a container which is updated within 24 hours, perhaps, “hey, you got no problems,” right.

Adrian : [48:11] Hopefully, yeah.

Adrian : [48:12] I think there’s a couple of interesting points there. So you mentioned, like, the random person in Nebraska, and I think that’s an excellent point. But 99% of the time that random person in Nebraska does push out a new release for any problems quite quickly. But sometimes, like, your Linux distributions are a little bit slow at picking up the new release. And that’s where we really shine. We pick up that new release immediately. But say, you know, there’s a project that has a vulnerability flagged against it, but it’s not a vulnerability in that project’s source code. It’s like a transitive dependency. Well, that project may well not do a release just because one of the dependencies has a vulnerability because it’s not in their code. So that, so you know, they might be in no rush to update it especially if it’s just a minor vulnerability, because you know they have other things to be doing with their time.

Adrian : [49:06] In those cases what you might find is Chainguard may do a new build that updates that transitive dependency ourselves, and that’s where we’d issue a security advisory, because we’ll still be using the, sort of, same version of the software. So say it’s, you know, package Foo at version 2.01: we’re still getting that release from the upstream, it’s just that we’ve patched it to address this transitive dependency. And that, you know, goes back to this point about security advisories.

Jamie : [49:34] Yeah. The “getting clearance for everything, getting everything updated” bit is just, it’s a nightmare. But yeah, if I can just docker pull and get a later version, that is—and I don’t even need to do docker pull, right; because if I’m using CI/CD perhaps it’s just going to do that by default when I push out a new version anyway.

Adrian : [49:58] Right. I mean, that’s one thing I really advise people to do: put some automation in place to try and update things as much as possible. Obviously you want to be running tests against stuff you update; don’t just update something and run it blindly. You want tests to be in place to catch any issues, and probably staging and other things like that.
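
One common way to wire that up, as a sketch, is to let a bot watch your base images and open pull requests that your existing test pipeline then has to pass; the directory and schedule here are illustrative:

```yaml
# .github/dependabot.yml — Dependabot (Renovate works similarly) watches the
# FROM lines in your Dockerfiles and raises a PR when a newer tag appears.
version: 2
updates:
  - package-ecosystem: "docker"
    directory: "/"        # where the Dockerfile lives
    schedule:
      interval: "daily"
```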

Adrian : [50:15] I think where Chainguard images and distroless images really save people time is just in the step about scanning. So like I said before, a lot of companies have invested in scanning to try and make sure they’re not running with vulnerabilities, but the problem is these tools report a lot of vulnerabilities. Like, if you just try running Snyk or Grype or something on a typical Node image, you’re going to find vulnerabilities, and the problem is at that stage: what do you do with these reports? You can spend time going to investigate each one, or you can ask your development team to say, “hey, these don’t matter,” or, you know, do something with them. But whatever you do, unless you’re just ignoring them, it requires a lot of work. And I think that’s where distroless and Chainguard images really shine: we can just say, “hey, use these images and you’re not going to have all this work.”

Jamie : [51:14] Sure, sure. Yeah, and I think that’s something that a lot of people don’t tend to do, at least in my experience. Now, my experience is very limited to just the companies I’ve worked with or the projects I’ve worked on; where, yeah, we’ll use this container image but we don’t know what’s in there. So we won’t use Snyk, we won’t use any of these tools to do the static analysis, and we won’t use any of these tools to sort of try and strip things out. It’s just, “cool”—I keep saying Docker just because Docker is the ubiquitous app—“what you do is docker pull, cool. I’ve got Docker, I’ve got the container running for my Python app, I’ve got the container running for my .NET app or whatever, and then I’m done.” And then it’s like, “well, hang on a second there. What about the security stuff?” and they’re like, “what security stuff?” I’m like, “you’re just running code on your computer without having actually done any kind of investigation as to what it is.” And here comes—this is a bugbear of mine, I’m sorry but—“It’s open source. Someone else has done the check for me.” I’m like, “cool, I’m really glad that you trust that. That’s really cool. Let me just tell you about xz,” or whatever the library was that got overtaken earlier this year, you know. That’s open source, you trust them?

Jamie : [52:31] But yeah, I think it’s really important that folks do this; or indeed, look to you guys to actually help do this. Because then you’ve got, like, a trusted person, a trusted system, right; because you don’t want to be just running random binaries on your computer. At least I don’t, anyway, right.

Adrian : [52:51] Yeah. I mean, this is another very difficult issue, I think. The first thing I’d say is the only reason we spotted xz, or whatever it’s called, is because it was open source, so people could, you know, build it themselves. Like, if it hadn’t been an open source library, we probably would never have noticed. So open source probably saved our bacon there, to be honest. What’s the other point I was gonna make? Sorry, I lost my train of thought.


A Request To You All

If you're enjoying this show, would you mind sharing it with a colleague? Check your podcatcher for a link to show notes, which has an embedded player within it and a transcription and all that stuff, and share that link with them. I'd really appreciate it if you could indeed share the show.

But if you'd like other ways to support it, you could:

  • Leave a rating or review on your podcatcher of choice
  • Consider buying the show a coffee
    • The BuyMeACoffee link is available on each episode's show notes page
    • This is a one-off financial support option
  • Become a patron
    • This is a monthly subscription-based financial support option
    • And a link to that is included on each episode's show notes page as well

I would love it if you would share the show with a friend or colleague or leave a rating or review. The other options are completely up to you, and are not required at all to continue enjoying the show.

Anyway, let's get back to it.


Jamie : [53:22] That’s okay, that’s okay. For those, just in case folks are not entirely sure what happened with xz, or whatever it’s called: tl;dr, it was part of a compression library, and someone, or some group of someones, spent multiple years building up trust in the open source community to take over this library whilst one of the developers was having a mental health break; and then put some code in there that allowed anyone with SSH on their machine to have their machine taken over during an SSH session. Which, if you’re a Windows dev and you’ve never used SSH: it’s a way of communicating between two computers using the terminal, right? You can open up a session, dial into that other computer, and run arbitrary commands on it. Which is scary enough as it is, but having a backdoor into that is also scary. And like you said, Adrian, had it not been open source, and, from what I can remember, had it not been someone at Microsoft wanting to do Postgres stuff on a particular distro, we may not have seen it as fast as we did. It’s ridiculous, it really is.

Adrian : [54:28] There was actually a fantastic interview on the Oxide and Friends podcast—I’m not sure if I’m allowed to mention other podcasts—where they go through this in depth, and they talk to the person who found the vulnerability. So I totally recommend checking that out.

Adrian : [54:48] I would like to actually mention a couple of things that we do at Chainguard to try and protect against that. So one of the things we’re moving to is, like I said, we always publish, like, the released version of software; we don’t grab head, we grab the last release. Now, one of the interesting things about the xz exploit was that some of the, sort of, nefarious code was in the release build; so the bit that actually took the source code and produced the release tar.gz or whatever it was. So one thing that we’re looking at is going back to GitHub and building from, sort of, the tagged release version on the actual source repository; as opposed to blindly using the tarball or whatever from the released version.

Adrian : [55:40] And the other thing we’re looking at is we have a tool called, I think it’s bincapz, which was built by one of our engineers and attempts to try and figure out if, sort of, the capabilities of a package or application have changed over time. And I’m sure we can link to that in the show notes, hopefully.

Jamie : [55:58] Sure, sure. Yeah, yeah. So how, so currently then, you’re pulling a tar.gz of, like, some source code from GitHub, and, just like, you’re going into a release, downloading that, and bundling that into your—I’m doing a really bad job of describing it, I’m sure you can talk a little bit better about it—bundling that into your build, into your containers; but you’re saying that you want to be able to go back to GitHub and grab the release version or something like that. I wonder if you could expand on that a little bit.

Adrian : [56:30] Yeah. So if you look on GitHub at the minute, you know, there’s, like, a releases tab. Now typically, if you go there you can download a tar.gz, and that tar.gz will contain the source code for that release. So all our software at Chainguard is built from source, but there’s an important step here where we want to make sure that we’ve got, you know, the correct source, and limit the ability for an attacker to play with the source between us getting it and compiling it. And, you know, when it was actually written and who wrote it, if you like. So, you know, you want to get the source as pure as possible, if you like, to limit the number of places where an attacker has an opportunity to inject something. So one of those places is around, you know, the tagged release version; so in git you can tag a release. So I can go to my source code repository and say, “I want the version that was tagged 2.01 or whatever.”

Adrian : [57:29] But you can also just go to the project and download the tar.gz corresponding to 2.01; and so what I’m saying is we want to try and make sure, wherever possible, we’re going to the actual source repository and getting the tagged version, so that we don’t have to trust, like, the tarball. Because something could have gone on between the source and that tarball being created, if you like, in that part of the process. Which is exactly what happened in the xz attack.

Jamie : [58:00] Sure. Yeah. So when I create a release on GitHub, all I have to do to create that release—just to fill in some of the gaps here—is, in my local git, I do git tag, add a new tag with a description, hit go; that then tags that source code, and I push that up to GitHub. But when I create a release, it is from that tag, but I can actually grab a different, I could grab a tar.gz of anything and upload that as part of the release. So that’s what you’re getting at: I can actually change that tar.gz before releasing. So then the “release” of the tagged version of the code in the source code repository is the version that we’re wanting to release, but actually I can go ahead [and] do a build locally, inject some code in like we saw with xz, and then upload that tar.gz into that release; everybody rushes over, hits the download button on that release, and who knows what’s in that code, right.
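
Spelling that distinction out as commands (the repository name and version number here are made up):

    # Tag a release locally and push the tag; the tag lives in git history,
    # which is what "build from the tagged source" relies on.
    git tag -a v2.01 -m "Release 2.01"
    git push origin v2.01

    # GitHub generates a source archive for that tag automatically:
    #   https://github.com/example/foo/archive/refs/tags/v2.01.tar.gz
    # But a release can also carry manually uploaded assets (for example a
    # tarball built on someone's machine), and nothing forces those to match
    # the tagged source.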

Adrian : [59:04] Yeah. So that’s, it’s even more nefarious, to be honest, because the process that built the release was all on GitHub; I think it was still running an action, don’t quote me on that. But it was, you know, it was the actual code, and even the code to build the release was available, but that was the bit that had been attacked, and that’s what they were able to, like, pinpoint and say, “hey, here’s where something went on.” It wasn’t that it had been done quietly on somebody’s laptop. The code was there, it just wasn’t in GitHub: it wasn’t in the actual source code itself, it was in the bit of code that did the release that the evil part was going on.

Adrian : [59:48] So yeah, it was quite an astonishing… Yeah, exactly. So yeah, ideally we don’t have to trust that part of the CI/CD pipeline, which should be doing very little; it should just be taking the source code and creating the tarball, if you like. But it’s still an extra little part to trust.

Jamie : [1:00:09] 100 percent. Yeah. It’s a really interesting thing. It’s crazy that we have all of these steps involved to build and release our software, and that at literally any point we can throw these mitigations in. But I like that you’re—because, like, all of the source code for free open source software is available on GitHub. So it makes sense that you would have the repository in, I don’t know, in your build pipeline or whatever for all of your dependencies of, say, Python or whatever, right. So then you can get the actual source code and build that for your containers; which I guess leads on to a thought that I had with a previous guest, Niels Tanis, who was talking about, “how do we create reproducible builds of our external sources?” Instead of using your package manager to pull down your packages, you should go to the GitHub repository, if there is a GitHub repository or a GitLab or whatever, pull that source code locally, take ownership of a copy of it, and build from that, right. That’s likely the only way to ensure that you are getting the code as the author put it out there.
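
A sketch of what “go to the repository and build from the tag” can look like in practice (the project name and build steps are placeholders):

    # Fetch exactly the tagged source, rather than a pre-built tarball or package.
    git clone --depth 1 --branch v2.01 https://github.com/example/foo.git
    cd foo

    # If the maintainer signs their tags, the signature can be checked too.
    git verify-tag v2.01

    # Then build with whatever the project's documented build is, for example
    # ./configure && make, dotnet build, or npm ci, and use that artefact.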

Adrian : [1:01:23] Yeah, I mean, I guess that’s true. But, you know, you’re going through quite a few steps there. Yeah it’s true.

Adrian : [1:01:33] I mean, I think at the end of the day, we all have to, you have no choice but to trust some people. And it’s: who do you trust? So we generally trust our Linux distributions, and they’re generally trustworthy. Yeah, I guess it’s all about putting checks in place, and being able to prove where things came from. I think that’s where we have a little bit of work left to do, and that will really, sort of, help to improve things. So, you know, again, with open source we could at least go back and see where things had gone wrong, where the nefarious code had been injected, and things like that. But yeah. Yeah, I don’t know what to say, really.

Jamie : [1:02:18] Well, I mean what you’re saying is Chainguard makes it easier, because you don’t have to worry about it so much right.

Adrian : [1:02:25] Well, what I’m saying is, you know, it’s not even that. What I’m saying is, you have to trust somebody.

Jamie : [1:02:31] Yeah.

Adrian : [1:02:32] By trusting Chainguard, we will try and work to minimize the amount of trust that we have in other people behind us, if you like. So we’re doing extra checks and so on, so that if you trust us to do those checks, hopefully you’re in a better place.

Jamie : [1:02:50] Amazing. Amazing.

Jamie : [1:02:51] In that case, I think that’s a great place for us to talk about where folks can go to learn a little bit more about the container images that you all are putting together, maybe the technology underlying it; maybe they want to poke around in Wolfi and see, you know, “oh, can I trust this? Yes, I can, because I’m going to audit it myself.” You know, I’ve always said to people, “you need to have someone that you trust in that chain.” That’s why I mentioned, you know, why we got on to the thing about xz. But anyway, I’ve gone way off topic there, so what’s the best place for folks to go to learn a bit more about Chainguard and the images you all are creating?

Adrian : [1:03:24] Yeah, so we actually have an academy site. I believe that’s edu.chainguard.dev; so at Chainguard we have a .dev domain. But on the academy site you will find lots of tutorials and more information on both Wolfi and Chainguard images. Also, on GitHub, you can check out Wolfi. I think it’s, I’ll need to check, sorry, I’ll need to check exactly what the…

Jamie : [1:03:55] I’ll get the links for the GitHub, but it’s on GitHub, right? I think I’ve got it on my screen as github.com/wolfi-dev. That’s one place that I found it.

Adrian : [1:04:05] Yes, wolfi-dev, that’s what I was trying to remember, exactly. And there’s also a Chainguard repository as well, but wolfi-dev is where all the build files for all our packages are. So like I said, all our stuff’s built from source, and we try to bootstrap everything as well. So there’s a nice post on how we went back to an early Java version to try and bootstrap our versions of Java and so on.

Jamie : [1:04:27] Right. That’s cool. What about you then? If folks have maybe got a question for you, or they’ve got a thought about, “hey, you know, it’d be great if I could get, like, a demo or something.” I know you said that the paid version includes access to all of the older containers, but maybe, you know, if folks are like, “hey, wouldn’t it be great if I could get, like, a video so that I can really sell it to the team? Hey, Adrian, where do I go to get those?” Can they reach out to you directly? Like on the artist formerly known as Twitter, or are you on, like, Mastodon, things like that?

Adrian : [1:05:02] Yeah. So I should be most places. I’ve seen LinkedIn getting a lot more interest recently, but yes, I’m on X, LinkedIn, and Mastodon. I have a relatively unusual surname, which is Mouat—M-O-U-A-T—so you should be able to find me relatively easily; and I’m sure we’ll stick links in the notes. Also, you mentioned videos, so if you go to YouTube there’s a Chainguard channel, and there’s quite a few videos, some by me, on how to use Chainguard images there.

Jamie : [1:05:36] Sure, excellent. I’ll get all of those links and I’ll put them in the show notes for people to check out, so they don’t have to do “the Google it.” They can just push the button and find it all out.

Jamie : [1:05:48] Amazing, it has been absolutely fantastic to chat with you today, Adrian. It’s been a really eye-opening experience for me because I’ve learned… I, personally, have learned a whole bunch more stuff about provenance and about, like, “how do we ensure that the software that we’re building is the software that we’re building, all the way down to the silicon,” and how it is a really tough problem to solve. So I really appreciate you coming on the show. Thank you ever so much.

Adrian : [1:06:16] Thank you very much for having me.

Wrapping Up

Thank you for listening to this episode of The Modern .NET Show with me, Jamie Taylor. I’d like to thank this episode’s guest for graciously sharing their time, expertise, and knowledge.

Be sure to check out the show notes for a bunch of links to some of the stuff that we covered, and full transcription of the interview. The show notes, as always, can be found at the podcast's website, and there will be a link directly to them in your podcatcher.

And don’t forget to spread the word, leave a rating or review on your podcatcher of choice—head over to dotnetcore.show/review for ways to do that—reach out via our contact page, or join our discord server at dotnetcore.show/discord—all of which are linked in the show notes.

But above all, I hope you have a fantastic rest of your day, and I hope that I’ll see you again, next time for more .NET goodness.

I will see you again real soon. See you later folks.

Follow the show

You can find the show on any of these places